Microsoft’s 1.3-Billion-Parameter Model Outperforms Llama 2

phi-1.5 is trained mostly on synthetic data, suggesting that high-quality data may indeed be all you need.

Microsoft Research has done it once again. After outperforming Meta’s LLaMA with phi-1 in July, the researchers have now introduced phi-1.5, a language model of 1.3 billion parameters that outperforms Llama 2’s 7-billion-parameter model on several benchmarks. Microsoft has also decided to open-source the model.

The 1.3-billion-parameter phi-1.5 is built to perform well across multiple domains, making it suitable for a wide range of applications. It particularly shines on queries in the question-answering (QA) format, as well as in chat interactions and code-related tasks.

Click here to check out the open-source model on Hugging Face
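
For readers who want to try it, here is a minimal sketch of loading the checkpoint with Hugging Face’s transformers library and prompting it in the QA format. The model id microsoft/phi-1_5 matches the Hugging Face listing; the prompt and generation settings are illustrative.

```python
# Minimal sketch: load phi-1.5 and generate a QA-format completion.
# Assumes the `transformers`, `torch` and `accelerate` packages and a GPU;
# older transformers releases may also need trust_remote_code=True.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-1_5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # matches the fp16 precision noted below
    device_map="auto",
)

# QA-format prompt of the kind the article says the model handles well.
prompt = "Question: Why is the sky blue?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```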

While phi-1 was trained on high-quality, textbook-like data, phi-1.5 is trained largely on synthetic data. What sets phi-1.5 apart is its training mix, which draws on diverse data pools: Python code snippets from StackOverflow, code from competitive programming contests, synthetic Python textbooks, and exercises generated by GPT-3.5-turbo-0301.
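
The exact prompts behind that synthetic data are not public, but the general recipe is easy to sketch: ask a strong model to write textbook-style material. Below is an illustrative example using the OpenAI Python client (v1.x); the prompt is an assumption made for illustration, not the actual phi-1.5 pipeline.

```python
# Hypothetical sketch of synthetic textbook-style data generation.
# Assumes the `openai` (v1.x) package and an OPENAI_API_KEY in the
# environment; the phi-1.5 report cites the gpt-3.5-turbo-0301 snapshot,
# which is no longer served, so the current alias is used here.
from openai import OpenAI

client = OpenAI()
resp = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{
        "role": "user",
        "content": (
            "Write a short, textbook-style Python lesson on list "
            "comprehensions, followed by one exercise with a worked solution."
        ),
    }],
)
print(resp.choices[0].message.content)
```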

Click here to read the paper: Textbooks Are All You Need II: phi-1.5 technical report

Key Details of the phi-1.5 Model:

  • Architecture: Transformer-based model trained with a next-word prediction objective, as sketched after this list
  • Dataset Size: a training corpus of 30 billion tokens
  • Training Tokens: trained on 150 billion tokens, roughly five passes over the corpus
  • Precision: fp16
  • GPUs: 32x A100-40G
  • Training Time: 8 days
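
The next-word prediction objective mentioned above is standard causal language modelling: predict token t+1 from tokens 1..t and score the prediction with cross-entropy. A minimal, illustrative sketch of that loss on the public checkpoint (the actual training corpus and loop are not public):

```python
# Illustrative sketch of the next-word prediction loss; assumes a GPU and
# the `transformers`, `torch` and `accelerate` packages.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-1_5")
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-1_5", torch_dtype=torch.float16, device_map="auto"
)

batch = tokenizer("def add(a, b):\n    return a + b", return_tensors="pt").to(model.device)
# With labels=input_ids, transformers shifts the labels internally and
# returns the causal language-modelling (next-token cross-entropy) loss.
out = model(input_ids=batch["input_ids"], labels=batch["input_ids"])
print(f"next-token cross-entropy: {out.loss.item():.3f}")
```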

The Microsoft Research team behind phi-1.5 asserts that the model achieves nearly state-of-the-art performance among models with fewer than 10 billion parameters. Benchmark tests of common sense, language comprehension, and logical reasoning position phi-1.5 as a formidable contender.

Notably, phi-1.5 outperforms Meta’s Llama-2 7B on AGIEval and approaches parity with Llama-2 7B on the GPT4All benchmark suite, as measured by the LM Evaluation Harness.
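
Those numbers can be spot-checked with EleutherAI’s LM Evaluation Harness. A hedged sketch, assuming a recent lm-eval release (the Python API and task names differ across harness versions) and a few of the GPT4All-suite tasks:

```python
# Hedged sketch: score phi-1.5 on some GPT4All-suite tasks with the
# LM Evaluation Harness (`pip install lm-eval`). Illustrative only.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=microsoft/phi-1_5,dtype=float16",
    tasks=["hellaswag", "winogrande", "arc_challenge"],
    batch_size=8,
)
print(results["results"])
```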

Mohit Pandey
Mohit dives deep into the AI world to bring out information in simple, explainable, and sometimes funny words. He also holds a keen interest in photography, filmmaking, and the gaming industry.
