Meta Launches New LLM LLaMA which Outperforms GPT-3 at a Fraction of the Size

Meta previously released two LLM-based chatbots, BlenderBot 3 and Galactica, which were taken down for giving incorrect results.

Meta AI has unveiled LLaMA, a set of foundation language models ranging from 7B to 65B parameters.

LLaMA-13B surpasses OpenAI’s GPT-3 (175B) while being over ten times smaller, and LLaMA-65B is comparable to DeepMind’s Chinchilla-70B and Google’s PaLM-540B. 

Read the full research paper here

The study differs from previous work in showing that state-of-the-art performance is attainable by training solely on publicly available data, without resorting to proprietary datasets. Smaller models trained on more tokens (fragments of words) are easier to retrain and fine-tune for particular product use cases. LLaMA 65B and LLaMA 33B were trained on 1.4 trillion tokens, while the smallest model, LLaMA 7B, was trained on one trillion tokens.
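To put these figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted in this story and treats labels such as "7B" as exact parameter counts, which is an assumption made for illustration. The function names are hypothetical.

```python
# Nominal parameter counts and training-token counts quoted in the article.
# Treating "7B" as exactly 7e9 parameters is a simplifying assumption.
PARAMS = {"LLaMA-7B": 7e9, "LLaMA-33B": 33e9, "LLaMA-65B": 65e9}
TOKENS = {"LLaMA-7B": 1.0e12, "LLaMA-33B": 1.4e12, "LLaMA-65B": 1.4e12}


def tokens_per_parameter(name):
    """Training tokens seen per model parameter."""
    return TOKENS[name] / PARAMS[name]


def size_ratio(larger_params, smaller_params):
    """How many times larger one model is than another."""
    return larger_params / smaller_params


for name in PARAMS:
    print(f"{name}: {tokens_per_parameter(name):.1f} tokens per parameter")

# GPT-3 (175B) compared with LLaMA-13B, as cited in the article.
print(f"GPT-3 is {size_ratio(175e9, 13e9):.1f}x the size of LLaMA-13B")
```

On these numbers, the smallest model sees far more training tokens per parameter than the largest, which is consistent with the point about smaller models being trained on more data.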

Like any other LLM, LLaMA takes a sequence of words as input and predicts the next word, repeating the process to generate text. To train the model, the team used text from the 20 most widely spoken languages, focusing on those that use the Latin and Cyrillic alphabets.
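The next-word loop can be illustrated with a toy Python sketch. This is not LLaMA's code: the lookup table `NEXT_WORD` and the functions `predict_next` and `generate` are hypothetical stand-ins, and a real model works on tokens and conditions on the whole sequence rather than only the last word.

```python
# Toy next-word table: a hypothetical stand-in for a trained language model.
NEXT_WORD = {
    "the": "model",
    "model": "predicts",
    "predicts": "the",
}


def predict_next(words):
    """Return the predicted next word, or None if there is no prediction.

    This toy version looks only at the last word; a real LLM conditions
    on the entire sequence seen so far.
    """
    if not words:
        return None
    return NEXT_WORD.get(words[-1])


def generate(prompt, max_new_words):
    """Repeatedly append the predicted next word to the running sequence."""
    words = prompt.split()
    for _ in range(max_new_words):
        next_word = predict_next(words)
        if next_word is None:
            break  # nothing to predict, stop early
        words.append(next_word)  # the output becomes part of the next input
    return " ".join(words)


print(generate("the", 4))
```

Each predicted word is fed back in as part of the input for the next prediction, which is why a single model can produce arbitrarily long text from a short prompt.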

“We hope that releasing these models to the research community will accelerate the development of large language models, and help efforts to improve their robustness and mitigate known issues such as toxicity and bias,” said the official blog post.  

Meta aims to publish bigger models that are trained on more extensive pre-training datasets in the future, as it has observed steady enhancements in performance as it scales up.

The launch was followed by a notable rise in the prices of AI-based crypto tokens. SingularityNET’s AGIX rose by over 6% following the launch, while Fetch.ai’s FET was not far behind, gaining over 4.5%.

Meta’s Wildcard Entry to the AI Race

In the race for AI supremacy, OpenAI led the way with the release of ChatGPT, a powerful chatbot built on GPT-3.5. Google soon followed with its ‘experimental’ chatbot Bard, while Chinese tech giant Baidu plans to enter the fray with its own Ernie Bot (ERNIE 3.0). Then there is Bing Chat, aka Sydney, which Microsoft says is built on ‘a new, next-generation OpenAI large language model’ that is more advanced than ChatGPT and is also integrated with Bing search.

Meta, for its part, has made multiple failed attempts in this space, although it made headlines as one of the first to release a chatbot built on an LLM: BlenderBot 3. The excitement was short-lived, however, as the bot quickly turned into an AI disaster, spewing racist remarks and questioning Mark Zuckerberg’s ethics.

But Meta was not deterred and continued to experiment with LLM-based models. It introduced Galactica, a model designed specifically for scientific research. Unfortunately, Galactica met the same fate as BlenderBot 3: it hallucinated results and was taken down.

Meta acknowledges in its blog post that further research is needed to address the risks of bias, toxic comments, and hallucinations in LLMs, including LLaMA. Even so, it claims that LLaMA is versatile and can be applied to many use cases, unlike fine-tuned models designed for specific tasks.

It will be interesting to see how Meta’s surprise entry into the race turns out. With its previous models ending up in the dustbin of history, the question on everyone’s mind is: can Meta really step up to the plate this time?

As the battle for AI supremacy heats up, all eyes are on Meta to see if it has what it takes to go toe-to-toe with the big guns. Will it emerge as a strong contender or fade away with LLaMA as it did with previous models? Only time will tell, but one thing is certain – the fate of AI hangs in the balance.

Shritama Saha

Shritama (she/her) is a technology journalist at AIM who is passionate about exploring the influence of AI on domains including fashion, healthcare and banking.