
Yet Another Language Model from Meta: Atlas

The model reaches 42% accuracy on NaturalQuestions using only 64 examples and outperforms PaLM

Tech giant Meta has come out with a new language model named Atlas, a retrieval-augmented language model with strong few-shot performance on question answering and fact-checking tasks, Meta says.

In the paper titled ‘Few-shot Learning with Retrieval Augmented Language Models’, the researchers say they evaluated the model on a variety of benchmarks such as MMLU, KILT and NaturalQuestions. The model reaches 42% accuracy on NaturalQuestions using only 64 examples and outperforms PaLM (a 540B-parameter model) by 3% despite having over 50 times fewer parameters (11B).

Retrieval augmented model

In the paper, the researchers explain the motivation behind the model. LLMs have previously shown few-shot capabilities, they note, but for question answering and fact checking, where knowledge is key, “massive parameter counts to store knowledge seem to be needed”.

This is where retrieval-augmented models come in: they can handle knowledge-intensive tasks without needing as many parameters. The researchers add that they wanted to see whether these models also work in few-shot settings.

“We investigate whether few-shot learning requires models to store a large amount of information in their parameters, and if memorisation can be decoupled from generalization,” the researchers state. 

As per the researchers, Atlas retrieves relevant documents using a general-purpose dense retriever with a dual-encoder architecture based on Contriever. The retrieved documents are then processed by a sequence-to-sequence model using the Fusion-in-Decoder architecture.
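The dual-encoder idea can be sketched in a few lines: a query and each document are mapped to dense vectors by the same kind of encoder, and documents are ranked by their dot product with the query. The hashed bag-of-words `embed` below is only a toy stand-in for the Contriever transformer, meant to show the retrieval mechanics, not the real model.

```python
import math

def embed(text, dim=32):
    # Toy stand-in for a Contriever-style dense encoder (the real one is a
    # transformer): hash tokens into a fixed-size vector and L2-normalise.
    vec = [0.0] * dim
    for tok in text.lower().split():
        vec[hash(tok) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec

def retrieve(query, corpus, k=2):
    # Dual-encoder retrieval: score every document by the dot product of
    # its embedding with the query embedding, and keep the top-k.
    q = embed(query)
    scored = sorted(
        corpus,
        key=lambda d: -sum(a * b for a, b in zip(q, embed(d))),
    )
    return scored[:k]
```

Because both sides are embedded independently, document vectors can be precomputed and indexed, which is what makes dense retrieval over a large corpus practical.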

Image: Few-shot Learning with Retrieval Augmented Language Models

The researchers study the impact of different training techniques on Atlas’ few-shot performance on tasks such as fact checking and question answering. “We find that jointly pre-training the components is crucial for few-shot performance,” the paper adds. The model performs well in resource-rich as well as few-shot settings. It demonstrates SOTA results on few-shot NaturalQuestions (+2.8%), TriviaQA (+3.3%) and FEVER (+5.1%). Atlas is also very strong in traditional full-training-set settings, setting a new state of the art on NaturalQuestions by 8%, on TriviaQA by 9%, and on five KILT tasks, Meta informs.

Image: Few-shot Learning with Retrieval Augmented Language Models

Architecture

The research team follows the text-to-text framework. The tasks follow this path:

  • The system receives a text query as input
  • It generates a text output

For classification tasks, this query comes in the form of a textual input and the model generates the “lexicalized class label”.
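In the text-to-text framework, classification amounts to turning the input into a textual query and having the model generate the class label as ordinary text. A minimal sketch for a FEVER-style fact-checking task follows; the prompt format and label words are illustrative assumptions, not the paper’s exact templates.

```python
# Hypothetical verbalizer: each class id maps to a "lexicalized class
# label", i.e. the label the model generates as plain text.
LABEL_WORDS = {0: "true", 1: "false", 2: "not enough info"}

def to_text_query(claim):
    # Cast a FEVER-style claim into a plain-text query (illustrative format).
    return f"claim: {claim} question: Is this claim supported?"

def to_text_label(class_id):
    # The target the seq2seq model is trained to generate.
    return LABEL_WORDS[class_id]
```

Generating labels as text means the same model and training objective serve classification, question answering and other generation tasks alike.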

Image: Few-shot Learning with Retrieval Augmented Language Models

The model is based on two sub-models, the paper informs. 

  • The retriever – The retriever is based on Contriever, an information retrieval technique built on continuous dense embeddings.
  • The language model – The team uses the T5 sequence-to-sequence architecture with the Fusion-in-Decoder modification, which processes each retrieved document independently in the encoder.

For any task, from question answering to generating articles, the model follows a similar approach. It starts by retrieving the top-k relevant documents from a large corpus of text with the retriever. These documents are then fed to the language model along with the query, and the model generates the output. Both the retriever and the language model are based on pre-trained transformer networks, as per the paper.
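The inference loop just described can be sketched as follows. The function and prompt names are hypothetical, and the toy retriever and generator exist only so the pipeline runs end to end; in Atlas these roles are played by the Contriever-based retriever and the Fusion-in-Decoder T5 model.

```python
def answer(query, corpus, retrieve_fn, generate_fn, k=2):
    # Retrieve-then-generate sketch: fetch the top-k passages, pair each
    # with the query (Fusion-in-Decoder encodes each pair independently),
    # then decode a single output conditioned on all the pairs.
    passages = retrieve_fn(query, corpus, k)
    pairs = [f"question: {query} context: {p}" for p in passages]
    return generate_fn(pairs)

# Toy components for illustration only.
def toy_retrieve(query, corpus, k):
    # Rank documents by word overlap with the query.
    q = set(query.lower().split())
    return sorted(corpus, key=lambda d: -len(q & set(d.lower().split())))[:k]

def toy_generate(pairs):
    # Placeholder for the seq2seq decoder.
    return f"output conditioned on {len(pairs)} passages"
```

Keeping retrieval and generation as separate, swappable components is exactly what lets the corpus be edited or replaced without retraining the model, as the article notes below.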

“Atlas outperforms much larger non-augmented models on few-shot question answering (NaturalQuestions and TriviaQA) and fact checking (FEVER), and is competitive with various very large models on a wide array of real-world exams,” Meta adds.

Meta points to other benefits of Atlas too. Retrieved passages can be inspected for better interpretability, and the corpus Atlas retrieves from can be edited or even completely swapped out. This means Atlas can be kept up to date without needing to be retrained.

PS: The story was written using a keyboard.

Sreejani Bhattacharyya

I am a technology journalist at AIM. What gets me excited is deep-diving into new-age technologies and analysing how they impact us for the greater good. Reach me at sreejani.bhattacharyya@analyticsindiamag.com