
Microsoft Introduces Phi-3, LLM That Runs on the Phone

Despite its compact size, Phi-3-Mini boasts performance levels that rival larger models such as Mixtral 8x7B and GPT-3.5.


While speaking to AIM, Harkirat Behl, one of the creators of the Phi model, said that his team was working on the next version of Phi-2 and making it more capable. “Phi-1.5 started showing great coding capabilities, Phi-2 was code with common sense abilities, and the next one would be even more capable,” he said. 

Microsoft has now unveiled Phi-3-Mini, a 3.8 billion parameter language model trained on an extensive dataset of 3.3 trillion tokens. Despite its compact size, Phi-3-Mini boasts performance levels that rival larger models such as Mixtral 8x7B and GPT-3.5. 

“One of the things that makes Phi-2 better than Meta’s Llama 2 7B and other models is that its 2.7 billion parameter size is very well suited for fitting on a phone,” said Behl. Phi-3 pushes that advantage further.

For instance, Phi-3-Mini achieves 69% on the MMLU benchmark and a score of 8.38 on MT-bench, despite being small enough to deploy on a mobile phone.

Phi-3-Mini, being highly capable, can run locally on a cell phone. Its small size allows it to be quantized to 4 bits, occupying approximately 1.8GB of memory. Microsoft tested the quantized model by deploying Phi-3-Mini on an iPhone 14 with an A16 Bionic chip, running natively on the device and fully offline, achieving more than 12 tokens per second.
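The ~1.8GB figure follows directly from the model's size: at 4 bits per weight, 3.8 billion parameters take roughly half a byte each. A quick back-of-the-envelope check:

```python
# Rough memory estimate for a 4-bit quantized 3.8B-parameter model.
# Ignores activation memory, the KV cache, and any weights kept at
# higher precision, so this is a lower bound on real usage.
params = 3.8e9
bits_per_weight = 4

bytes_total = params * bits_per_weight / 8  # 4 bits = half a byte per weight
gib = bytes_total / 2**30                   # convert to GiB

print(f"{gib:.2f} GiB")  # ~1.77 GiB, consistent with the ~1.8GB figure
```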

The innovation behind Phi-3-Mini lies in its training dataset, an expanded version of the one used for its predecessor, Phi-2. This dataset comprises heavily filtered web data and synthetic data. The model has also been optimised for robustness, safety, and chat format.

Microsoft has also introduced Phi-3-Small and Phi-3-Medium models, both significantly more capable than Phi-3-Mini. Phi-3-Small, with 7 billion parameters, utilises the tiktoken tokenizer for improved multilingual tokenization. It boasts a vocabulary size of 100,352 and a default context length of 8K. 

The 7-billion-parameter Phi-3-Small achieves an MMLU score of 75.3, outperforming Meta’s recently launched Llama 3 8B Instruct, which scores 66.

The model follows the standard decoder architecture of the 7B model class, featuring 32 layers and a hidden size of 4096. To minimise the KV cache footprint, Phi-3-Small employs grouped-query attention, with four queries sharing one key. 
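The KV cache saving from grouped-query attention is easy to quantify. A minimal sketch, assuming the stated hidden size of 4096 with a conventional head dimension of 128 (an assumption, not a figure from the article):

```python
# Sketch of the KV-cache saving from grouped-query attention (GQA).
# Assumes hidden size 4096 split into heads of dim 128; head_dim is
# a hypothetical value chosen for illustration.
hidden = 4096
head_dim = 128
n_q_heads = hidden // head_dim        # 32 query heads
group_size = 4                        # four queries share one key/value head
n_kv_heads = n_q_heads // group_size  # 8 KV heads

# KV cache entries per token per layer: 2 tensors (K and V) * heads * head_dim
mha_cache = 2 * n_q_heads * head_dim   # standard multi-head attention
gqa_cache = 2 * n_kv_heads * head_dim  # grouped-query attention

print(mha_cache // gqa_cache)  # 4x smaller KV cache per token
```

The saving factor equals the group size: caching keys and values for 8 heads instead of 32 cuts the per-token cache to a quarter, which is what makes long contexts affordable on memory-constrained hardware.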

Additionally, it utilises alternating layers of dense attention and a novel blocksparse attention to maximise KV cache savings while maintaining long-context retrieval performance. An additional 10% multilingual data was used for training this model.

However, Phi-3-Mini has its limitations. While it demonstrates a similar level of language understanding and reasoning ability as much larger models, it is fundamentally limited by its size for certain tasks. For example, it lacks the capacity to store extensive “factual knowledge,” resulting in lower performance on tasks such as TriviaQA. 

Microsoft believes such weaknesses can be addressed by augmenting the model with a search engine. Additionally, the model’s language capabilities are mostly restricted to English, highlighting the need to explore multilingual capabilities for Small Language Models.
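The search-augmentation idea is the familiar retrieval pattern: fetch facts the small model cannot store in its weights and prepend them to the prompt. A toy sketch of that pattern (not Microsoft's implementation; the word-overlap retriever and the knowledge base here are purely illustrative):

```python
# Toy retrieval-augmentation sketch: compensate for a small model's limited
# factual memory by retrieving relevant facts and adding them to the prompt.
def retrieve(query, knowledge_base):
    """Hypothetical retriever: return facts sharing any word with the query."""
    q_words = set(query.lower().split())
    return [fact for fact in knowledge_base
            if q_words & set(fact.lower().split())]

def build_prompt(question, knowledge_base):
    """Prepend retrieved facts as context before the question."""
    context = "\n".join(retrieve(question, knowledge_base))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

# Illustrative one-entry knowledge base.
kb = ["The A16 Bionic chip powers the iPhone 14 tested by Microsoft."]
print(build_prompt("Which chip is in the iPhone 14?", kb))
```

In practice the retriever would be a search engine or vector index rather than word overlap, but the division of labour is the same: the index stores the facts, and the small model supplies the reasoning.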

Mohit Pandey

Mohit dives deep into the AI world to bring out information in simple, explainable, and sometimes funny words. He also holds a keen interest in photography, filmmaking, and the gaming industry.