Meta Launches New LLM LLaMA which Outperforms GPT-3 at a Fraction of the Size

Meta earlier released two LLM-based chatbots, BlenderBot 3 and Galactica, both of which were taken down for giving incorrect results.

Meta AI has unveiled LLaMA, a set of foundation language models ranging from 7B to 65B parameters. 

LLaMA-13B surpasses OpenAI’s GPT-3 (175B) while being over ten times smaller, and LLaMA-65B is comparable to DeepMind’s Chinchilla-70B and Google’s PaLM-540B. 



The study differs from previous ones in showing that state-of-the-art performance is achievable by training solely on publicly accessible data, without resorting to proprietary datasets. Smaller models trained on a greater number of tokens, which are fragments of words, are also easier to retrain and fine-tune for particular product use cases. LLaMA-65B and LLaMA-33B were trained on 1.4 trillion tokens, while the smallest model, LLaMA-7B, was trained on one trillion tokens.

Like any other LLM, LLaMA works by taking a sequence of words as input and predicting the next word, repeating the process to generate text. The team trained the model on text from the 20 languages with the most speakers, focusing on those written in Latin and Cyrillic alphabets.
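The generation loop described above can be sketched in a few lines. This is a minimal illustration, not LLaMA's actual code: `toy_next_word` is a hypothetical stand-in for the real model, which scores every token in its vocabulary given the context.

```python
# Autoregressive (next-word) generation, the loop LLaMA and other LLMs use.
def toy_next_word(context):
    # A real LLM computes a probability for every vocabulary token given
    # the full context; here a canned lookup plays that role.
    canned = {"the": "cat", "cat": "sat", "sat": "down"}
    return canned.get(context[-1], "<eos>")

def generate(prompt, max_new_words=10):
    words = prompt.split()
    for _ in range(max_new_words):
        nxt = toy_next_word(words)
        if nxt == "<eos>":        # stop token ends generation
            break
        words.append(nxt)         # each prediction becomes new context
    return " ".join(words)

print(generate("the"))  # -> "the cat sat down"
```

The key point the sketch captures is the recursion: every predicted word is appended to the input before the next prediction, which is why context length and token count matter so much during training.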


“We hope that releasing these models to the research community will accelerate the development of large language models, and help efforts to improve their robustness and mitigate known issues such as toxicity and bias,” said the official blog post.  

Meta aims to release larger models trained on more extensive pre-training datasets in the future, as it has observed steady improvements in performance with scale.

The launch has caused a significant increase in AI-based crypto tokens. SingularityNET’s AGIX has risen by over 6% following the launch, while Fetch.ai’s FET was not far behind with a gain of over 4.5%.

Meta’s Wildcard Entry to the AI Race

In the ultimate race for AI supremacy, OpenAI led the way with the release of ChatGPT, a powerful chatbot fueled by GPT-3.5. Google soon followed suit with its ‘experimental’ chatbot Bard, while Chinese tech giant Baidu is planning to enter the fray with its own Ernie Bot, built on ERNIE 3.0. Then there is Bing Chat, aka Sydney, which Microsoft says is built on ‘a new, next-generation OpenAI large language model’ more advanced than ChatGPT, and which is also integrated with Bing search. 

Read more: Why is Meta Shying Away from LLMs?

Unfortunately, Meta has made multiple failed attempts in this space, although it made headlines as one of the first companies to release a chatbot built on an LLM, BlenderBot 3. The excitement was short-lived, however, as the bot quickly turned into an AI disaster, spewing racist remarks and questioning Mark Zuckerberg’s ethics. 

But Meta was not deterred, and it continued to experiment with LLM-based models. It introduced Galactica, a model designed specifically for scientific research. Unfortunately, Galactica met the same fate as BlenderBot 3: it hallucinated results, and Meta took it down.

Meta has acknowledged in its blog post that further research is needed to address the risks of bias, toxic comments, and hallucination in LLMs, including LLaMA. Even so, it claims that LLaMA is versatile and can be applied to a range of use cases, unlike fine-tuned models designed for specific tasks. 

It will be interesting to see how Meta’s surprise entry into the race turns out. With its previous models ending up in the dustbin of history, the question on everyone’s mind is: can Meta really step up to the plate this time? 

As the battle for AI supremacy heats up, all eyes are on Meta to see if it has what it takes to go toe-to-toe with the big guns. Will it emerge as a strong contender or fade away with LLaMA as it did with previous models? Only time will tell, but one thing is certain – the fate of AI hangs in the balance.


Shritama Saha
Shritama is a technology journalist keen to learn about AI and analytics. A graduate in mass communication, she is passionate about exploring the influence of data science on fashion, drug development, films, and art.


