Reinforcement Learning Won Again, This Time With Microsoft

Phi-4 Reasoning Plus is the latest model that uses RL to achieve impressive scores on benchmarks.
While Microsoft is widely known for backing OpenAI with both infrastructure and capital, its own open-source Phi family of models, a product of its in-house AI research and development, is not as widely recognised. The Phi series of lightweight models is designed to consume less compute and storage. Thanks to various techniques and optimisations applied during research, these models have historically outperformed the competition, both in their lightweight segment and, in some cases, among larger models. The latest addition is Phi-4 Reasoning, a 14-billion-parameter model built by applying supervised fine-tuning (SFT) to the Phi-4 base model. The researchers also derived the Phi-4 Reasoning Plus model by using reinforcement learning

Supreeth Koundinya
Supreeth is an engineering graduate who is curious about the world of artificial intelligence and loves to write stories on how it is solving problems and shaping the future of humanity.