At Microsoft Inspire today, Meta and Microsoft announced the release of Llama 2, the next generation of Meta's open source LLM, LLaMA.
The model will also be available through the Microsoft Azure model catalog and Amazon SageMaker. Developers can also use it on Windows through tools such as Windows Subsystem for Linux (WSL), Windows Terminal, Microsoft Visual Studio and VS Code.
Llama 2 is released as open source for both research and commercial use.
The model can also be found on Hugging Face.
Building on the original LLaMA release earlier this year, which was welcomed by the open source community, Llama 2 introduces a number of enhancements to performance and safety. The release also includes a fine-tuned model, Llama 2-Chat, optimised for dialogue use cases.
Yann LeCun (@ylecun) tweeted on July 18, 2023: "This is huge: Llama-v2 is open source, with a license that authorizes commercial use! This is going to change the landscape of the LLM market. Llama-v2 is available on Microsoft Azure and will be available on AWS, Hugging Face and other providers. Pretrained and fine-tuned…"
The highlight of this release is a set of pre-trained and fine-tuned models with 7B, 13B and 70B parameters. Compared with its predecessor, Llama 2 was pre-trained on 40% more data, uses a longer context length, and adopts GQA (grouped-query attention) for more efficient inference in the larger model.
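To give a feel for what grouped-query attention does, here is a minimal pure-Python sketch. It is not Meta's implementation: the function names, the nested-list layout and the tiny dimensions are assumptions made for illustration, and it leaves out the learned projections and the causal mask a real model would use. The one idea it shows is that several query heads share a single key/value head, so fewer keys and values have to be stored during inference.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for one query vector (no causal mask)."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def grouped_query_attention(queries, keys, values):
    """
    queries: [n_query_heads][seq_len][head_dim]
    keys, values: [n_kv_heads][seq_len][head_dim]
    Each consecutive group of query heads reads from one shared key/value head.
    """
    n_query_heads, n_kv_heads = len(queries), len(keys)
    if n_kv_heads == 0 or n_query_heads % n_kv_heads != 0:
        raise ValueError("query heads must be a multiple of key/value heads")
    group_size = n_query_heads // n_kv_heads

    outputs = []
    for head_index, query_head in enumerate(queries):
        kv_index = head_index // group_size  # shared key/value head for this group
        outputs.append(
            [attend(q, keys[kv_index], values[kv_index]) for q in query_head]
        )
    return outputs
```

With four query heads and two key/value heads, for example, query heads 0 and 1 read from key/value head 0 while heads 2 and 3 read from key/value head 1. Standard multi-head attention is the special case where the two counts are equal.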
Along with Microsoft, Meta also partnered with Amazon, Hugging Face, NVIDIA, Qualcomm, IBM, Zoom, Dropbox and a number of academic leaders from around the world to release the model, underlining the importance of open source software.
According to the paper, the model is trained on publicly available data sources. Notably, the company states that this does not include data from Meta’s products or services. “We made an effort to remove data from certain sites known to contain a high volume of personal information about private individuals,” said the researchers.
The fine-tuned Llama 2-Chat models have been optimised for dialogue. They were trained with a combination of supervised fine-tuning and multiple RLHF (Reinforcement Learning from Human Feedback) methods, including rejection sampling and PPO (Proximal Policy Optimization), using over 1 million human annotations. This approach is intended to improve the quality of the models' responses in dialogue.
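The rejection sampling step mentioned above can be illustrated with a toy best-of-K selection. This is only a sketch of the general idea, not Meta's training code: `generate` and `reward` are hypothetical callables standing in for a language model and a reward model, and `num_candidates` is an assumed parameter name.

```python
from typing import Callable, Tuple


def rejection_sample(
    prompt: str,
    generate: Callable[[str], str],       # hypothetical: samples one response
    reward: Callable[[str, str], float],  # hypothetical: scores (prompt, response)
    num_candidates: int = 4,
) -> Tuple[str, float]:
    """Sample several responses and keep the one the reward function scores highest."""
    if num_candidates < 1:
        raise ValueError("num_candidates must be at least 1")

    candidates = [generate(prompt) for _ in range(num_candidates)]
    scored = [(reward(prompt, candidate), candidate) for candidate in candidates]

    # Select by score only; on ties, max() returns the first highest-scoring candidate.
    best_score, best_response = max(scored, key=lambda pair: pair[0])
    return best_response, best_score
```

In an RLHF pipeline, the selected responses would then typically be used as fine-tuning targets, whereas PPO updates the model directly from the reward signal.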
LLaMA2 Live Demo: pic.twitter.com/tJEuASN5ZB
— Marco Mascorro (@Mascobot) July 18, 2023
Evaluated on academic benchmarks and in human evaluations against both open and closed alternatives, Llama 2 proves highly competitive. Researchers and developers can now use it for a wide range of applications and projects in language processing and understanding.
Meta’s release of Llama 2 marks a significant step towards advancing AI research and making it accessible to the wider community. By making Llama 2 open source and available for commercial use, Meta aims to foster collaboration, inspire further innovation, and open the door to new possibilities in generative AI.