Microsoft debuts its smallest AI model Phi-3 Mini

WASHINGTON, UNITED STATES — Microsoft has unveiled Phi-3 Mini, the first in a series of small yet powerful artificial intelligence (AI) language models.
Despite its modest 3.8 billion parameters, Phi-3 Mini delivers performance rivaling models up to 10 times larger, like GPT-3.5.
It’s now accessible on platforms such as Azure, Hugging Face, and Ollama, with two more models, Phi-3 Small and Phi-3 Medium, scheduled for future release.
Enhanced performance and accessibility of Microsoft’s Phi-3 Mini
Eric Boyd, Corporate Vice President of Microsoft Azure AI Platform, highlighted the model’s capabilities to The Verge, stating, “Phi-3 Mini is as capable as larger LLMs like GPT-3.5 but in a smaller form factor.”
This new model promises to deliver complex AI functionalities, which were previously only possible with bigger models, in a more compact and cost-efficient package.
Phi-3 Mini is engineered to be more affordable and efficient, making it particularly suitable for personal devices such as smartphones and laptops. The release follows reports from The Information about Microsoft’s focus on lightweight AI models, an effort that also includes specialized models such as Orca-Math, which is aimed at solving mathematical problems.
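Since the model is already listed on Hugging Face, here is a minimal, hedged sketch of how one might load and query it with the transformers library. The repository name microsoft/Phi-3-mini-4k-instruct and the generation settings are assumptions based on common Hugging Face conventions, not details from Microsoft’s announcement.

```python
# Hedged sketch: trying Phi-3 Mini through the Hugging Face transformers library.
# The model id below is an assumption; check the Hugging Face hub for the exact name.
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "microsoft/Phi-3-mini-4k-instruct"  # assumed repository name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",        # requires the accelerate package
    trust_remote_code=True,   # may be needed depending on your transformers version
)

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)

# Phi-3 Mini is an instruction-tuned chat model, so we format the input as a chat turn.
messages = [{"role": "user", "content": "Explain what a small language model is in one sentence."}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

print(generator(prompt, max_new_tokens=64, do_sample=False)[0]["generated_text"])
```

The same model can also be pulled through Azure AI Studio or Ollama, per the availability noted above; the snippet only illustrates the Hugging Face route.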
Innovative ‘curriculum’ training approach
The key to Phi-3’s performance is its “curriculum” training approach, inspired by the way children learn from simple stories.
“There aren’t enough children’s books out there, so we took a list of more than 3,000 words and asked an LLM to make ‘children’s books’ to teach Phi,” Boyd explained.
Phi-3 builds on what earlier iterations learned: Phi-1 focused on coding, while Phi-2 focused on reasoning.
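To make the idea concrete, the sketch below shows one way such synthetic “children’s books” could be produced from a restricted word list. The word list, prompt wording, and llm_generate helper are illustrative assumptions, not Microsoft’s published data pipeline.

```python
# Hedged sketch of the "curriculum" idea described above: ask a larger LLM to write
# simple, children's-book-style stories built from a restricted word list, then
# collect those stories as training text for a small model.
# The word list, prompt wording, and llm_generate() stub are hypothetical.
import random

# Stand-in for the roughly 3,000-word list Boyd describes.
SIMPLE_WORDS = ["dog", "ball", "run", "happy", "tree", "sun", "friend", "play"]

def llm_generate(prompt: str) -> str:
    """Placeholder for a call to whichever LLM API you use; returns a stub string here."""
    return f"[LLM story written for prompt: {prompt[:60]}...]"

def make_synthetic_story(num_words: int = 5) -> str:
    """Build a prompt from a few sampled simple words and ask the LLM for a short story."""
    words = random.sample(SIMPLE_WORDS, k=num_words)
    prompt = (
        "Write a short children's story, three to five sentences long, "
        f"using only simple vocabulary and featuring these words: {', '.join(words)}."
    )
    return llm_generate(prompt)

# Repeating this over many word combinations yields a corpus of simple, story-like
# text in the spirit of the approach Boyd describes.
corpus = [make_synthetic_story() for _ in range(3)]
print(corpus[0])
```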
Competing small and large language models
Google offers the Gemma 2B and 7B models, which are suitable for simple chatbots and language-related tasks. Anthropic’s Claude 3 Haiku can read dense research papers with graphs and summarize them quickly. Meta recently released Llama 3 8B, which may be used for chatbots and coding assistance.
Some of the larger and more capable AI models that Phi-3 aims to rival include OpenAI’s GPT-3.5 (reportedly around 175B parameters), Google’s Gemini 1.5 Pro, and Anthropic’s Claude 3 Opus and Sonnet models. Cohere also offers mid-sized models such as Command and Command-R.
Meta’s Llama 2 Chat comes in 7B, 13B, and 70B parameter versions, with the largest rivaling GPT-3 in performance. Mistral has released models ranging from 7B to 8x22B parameters under open licenses.
While Microsoft’s Phi-3 family currently covers the small to mid-range model space, competitors are also exploring compact, efficient models for on-device AI, such as Anthropic’s Claude Instant and Cohere’s Command Light, while OpenAI’s GPT-4 anchors the high end of the market.