Meta AI has unveiled LLaMA, a set of foundation language models ranging from 7B to 65B parameters.
LLaMA-13B surpasses OpenAI’s GPT-3 (175B) on most benchmarks despite being over ten times smaller, while LLaMA-65B is competitive with DeepMind’s Chinchilla-70B and Google’s PaLM-540B.
Read the full research paper here.
The study differs from previous work in showing that state-of-the-art performance is achievable by training solely on publicly available data, without resorting to proprietary datasets. Smaller models trained on more tokens (the word fragments a model processes) are also easier to retrain and fine-tune for specific product use cases. LLaMA-65B and LLaMA-33B were trained on 1.4 trillion tokens, while the smallest model, LLaMA-7B, was trained on one trillion tokens.
Like any other LLM, LLaMA works by taking a sequence of words as input and predicting the next word, generating text recursively. The team trained the model on text from the 20 languages with the most speakers, focusing on those with Latin and Cyrillic alphabets.
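To make that next-word loop concrete, here is a minimal sketch of autoregressive generation in Python. It is illustrative only: `model` is a hypothetical stand-in for the network’s forward pass, assumed to return a probability distribution over the vocabulary given the tokens so far.

```python
# Minimal sketch of autoregressive (next-token) generation, the loop
# LLaMA and other LLMs use to produce text. `model` is a hypothetical
# stand-in for the forward pass: it takes the token IDs so far and
# returns one probability per vocabulary entry.

def generate(model, tokens, max_new_tokens=20):
    for _ in range(max_new_tokens):
        probs = model(tokens)  # P(next token | tokens so far)
        # Greedy decoding: pick the highest-probability token.
        next_token = max(range(len(probs)), key=probs.__getitem__)
        tokens = tokens + [next_token]  # feed the prediction back in
    return tokens

if __name__ == "__main__":
    # Toy "model" with a 3-token vocabulary that always favours token 0.
    dummy = lambda toks: [0.9, 0.05, 0.05]
    print(generate(dummy, [1, 2], max_new_tokens=3))  # [1, 2, 0, 0, 0]
```

In practice, decoders usually sample from the distribution (with temperature, top-k, or nucleus sampling) rather than always taking the argmax, trading determinism for more varied text.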
“We hope that releasing these models to the research community will accelerate the development of large language models, and help efforts to improve their robustness and mitigate known issues such as toxicity and bias,” Meta said in its official blog post.
Meta aims to release larger models trained on more extensive pre-training datasets in the future, having observed steady performance gains as it scales up.
The launch has also driven up AI-linked crypto tokens: SingularityNET’s AGIX rose more than 6% following the announcement, while Fetch.ai’s FET was close behind with a gain of over 4.5%.
Meta’s Wildcard Entry to the AI Race
In the race for AI supremacy, OpenAI led the way with ChatGPT, a powerful chatbot built on GPT-3.5. Google soon followed with its ‘experimental’ chatbot Bard, while Chinese tech giant Baidu plans to enter the fray with Ernie Bot, built on ERNIE 3.0. Then there is Bing Chat (aka Sydney), which Microsoft says runs on ‘a new, next-generation OpenAI large language model’ that is more advanced than ChatGPT and is integrated with Bing search.
Read more: Why is Meta Shying Away from LLMs?
Unfortunately, Meta has made multiple failed attempts in this space, although it made headlines for being one of the first to release an LLM-based chatbot, BlenderBot 3. The excitement was short-lived, however, as the bot quickly turned into an AI disaster, spewing racist remarks and questioning Mark Zuckerberg’s ethics.
But Meta was not deterred and continued experimenting with LLM-based models, introducing Galactica, a model designed specifically for scientific research. Unfortunately, Galactica met the same fate as BlenderBot 3: it hallucinated results and was taken down.
Meta acknowledges in its blog post that further research is needed to address the risks of bias, toxic comments, and hallucination in LLMs, including LLaMA. Still, the company claims LLaMA is versatile and can be applied to many use cases, unlike fine-tuned models designed for a specific task.
It will be interesting to see how Meta’s surprise entry into the race turns out. With its previous models consigned to the dustbin of history, the question on everyone’s mind is: can Meta really step up to the plate this time?
As the battle for AI supremacy heats up, all eyes are on Meta to see if it has what it takes to go toe-to-toe with the big guns. Will LLaMA make it a strong contender, or will it fade away like its predecessors? Only time will tell, but one thing is certain: the fate of the AI race hangs in the balance.