Microsoft, OpenAI's biggest stan, shows off its own smaller AI models that could rival ChatGPT
- Microsoft appears to be communicating its OpenAI backup plan.
- The company has shown off small-but-powerful language models.
Microsoft is showing off its own small-scale AI models that it says might perform just as well as ChatGPT.
Or to put it another way, it's unveiled its OpenAI backup plan.
Reminder: Microsoft has invested $10 billion in OpenAI this year to capitalize on ChatGPT's popularity. It has also integrated the startup's technology into products like Bing, and developed an "everyday AI companion" for customers based on OpenAI's technology.
But following the OpenAI board's shock move in November to fire and then reinstate CEO Sam Altman, the tech giant appears keen to show it has its own powerful technologies. As Business Insider's Ashley Stewart wrote, Microsoft's executives have recently put distance between the firm and OpenAI.
Meet Microsoft's small language models
On Wednesday, Microsoft announced Phi-2, a new iteration of an AI model from the company's research arm, Microsoft Research. This model is small but mighty, the firm's researchers say.
It's a small language model that was trained in 14 days and comprises 2.7 billion parameters. Think of parameters as the variables, or coefficients, inside the model whose learned values determine the response you get when you ask something like ChatGPT a question.
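For the technically curious, here's a minimal sketch in Python (using PyTorch) of what counting parameters looks like. The toy architecture below is purely illustrative and has nothing to do with Phi-2's actual design:

```python
# A toy model (NOT Phi-2's architecture) showing what "parameters" means:
# the learned weights and biases a model adjusts during training.
import torch.nn as nn

toy_model = nn.Sequential(
    nn.Embedding(num_embeddings=50_000, embedding_dim=256),  # token embeddings
    nn.Linear(256, 1024),                                    # one hidden layer
    nn.ReLU(),
    nn.Linear(1024, 50_000),                                 # project back to vocabulary
)

# Every weight and bias counts toward the total.
total = sum(p.numel() for p in toy_model.parameters())
print(f"{total:,} parameters")  # ~64 million here; Phi-2 has ~2.7 billion
```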
For context, OpenAI's older model GPT-3 had 175 billion parameters. Meta's Llama 2 comes in a range of models, with the smallest starting at 7 billion parameters and the largest being 70 billion parameters.
That's a pretty significant size difference, so Microsoft isn't yet trying to suggest Phi-2 can be a GPT killer. OpenAI's GPT-4 is rumored to have more than 1 trillion parameters.
But Phi-2, which is cheaper to run than much larger models because it needs less intensive computing power, does seem to outgun its bigger rivals on some counts.
"On complex benchmarks Phi-2 matches or outperforms models up to 25x larger, thanks to new innovations in model scaling and training data curation," Microsoft researchers Mohan Javaheripi and Sébastien Bubeck wrote in a blog.
Specifically, compared with Meta's 7-billion- and 13-billion-parameter Llama 2 models, Phi-2 wins on common-sense reasoning, language understanding, and math and coding tasks.
It also beats out a 7-billion-parameter model from French startup Mistral on these benchmarks, and puts to shame Gemini Nano, the smallest of the three new models Google announced last week, per Microsoft's analysis.
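Readers who want to try Phi-2 for themselves can load it with a few lines of Python. The sketch below assumes the checkpoint Microsoft published on Hugging Face under the "microsoft/phi-2" ID; the prompt and generation settings are illustrative only:

```python
# A minimal sketch of running Phi-2 locally via Hugging Face transformers,
# assuming the public "microsoft/phi-2" checkpoint is available.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2")
model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2")

prompt = "Instruct: Explain what a language model parameter is.\nOutput:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=80)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```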
Then there's Orca 2, another AI model announced by Microsoft on November 20 – three days after OpenAI boss Altman was fired.
Microsoft says Orca 2 is all about getting smaller AI models to "achieve enhanced reasoning abilities, which are typically found only in much larger language models." Performance levels are similar to or better than models 5-10 times larger, per Microsoft.
The company acknowledges that "frontier" AI models like GPT-4, which powers ChatGPT, have "demonstrated a remarkable ability to reason." They can answer complex questions directly, without needing to break them into several processing steps.
But its researchers say smaller models, like its Orca 2, can take on these complex tasks too, just with a different approach. "While an extremely capable model like GPT-4 can answer complex tasks directly, a smaller model may benefit from breaking the task into steps," they said in a blog post.
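Here's a rough sketch of that step-by-step pattern in Python. The generate function is a hypothetical stand-in for any prompt-to-completion call (for example, a locally loaded Orca 2 checkpoint), not a real Microsoft API:

```python
# A hypothetical sketch of the "break the task into steps" approach the
# researchers describe. `generate` stands in for any language-model call;
# plug in the model of your choice.
def generate(prompt: str) -> str:
    raise NotImplementedError("Call your language model of choice here.")

def answer_directly(question: str) -> str:
    # What a frontier model like GPT-4 can often do in a single pass.
    return generate(f"Answer the question: {question}")

def answer_step_by_step(question: str) -> str:
    # What a smaller model may benefit from: plan first, then execute.
    plan = generate(f"List the steps needed to answer: {question}")
    return generate(
        f"Question: {question}\n"
        f"Plan:\n{plan}\n"
        "Follow the plan step by step, then state the final answer:"
    )
```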
Data Drifters, a community of coders and enthusiasts, published its own analysis suggesting Orca 2 nears GPT-4 on reasoning.
Combined, Orca 2 and Phi-2 don't give Microsoft its own in-house version of GPT-4. But they may show that AI for Microsoft doesn't begin and end with OpenAI.