Tech giant Meta has come out with a new language model named Atlas. It is a retrieval-augmented language model with strong few-shot performance on question answering and fact-checking tasks, Meta says.
In the paper titled ‘Few-shot Learning with Retrieval Augmented Language Models’, the researchers say they evaluated the model on a variety of benchmarks such as MMLU, KILT and NaturalQuestions. Atlas reaches 42 per cent accuracy on NaturalQuestions using only 64 examples and outperforms PaLM (a 540B-parameter model) by 3 per cent, despite having over 50 times fewer parameters (11B).
Retrieval augmented model
In the paper, the researchers discuss the motivation behind the model. LLMs have previously shown impressive few-shot results, they note, but for question answering and fact checking, where knowledge is key, “massive parameter counts to store knowledge seem to be needed”.
This is where retrieval augmented models come in, as they can handle knowledge-intensive tasks without needing huge parameter counts. The researchers wanted to see whether such models also work in few-shot settings.
“We investigate whether few-shot learning requires models to store a large amount of information in their parameters, and if memorisation can be decoupled from generalization,” the researchers state.
As per the researchers, Atlas retrieves relevant documents with a general-purpose dense retriever that uses a dual-encoder architecture based on Contriever. The retrieved documents are then processed by a sequence-to-sequence model using the Fusion-in-Decoder architecture.
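The dual-encoder idea can be sketched in a few lines: the query and each document are embedded into the same vector space, and documents are ranked by dot-product similarity. The hashed bag-of-words `embed` below is only a toy stand-in for Contriever's transformer encoder, and all function names are illustrative, not from the paper.

```python
import zlib
import numpy as np

def embed(text, dim=256):
    # Toy stand-in for a dense encoder such as Contriever:
    # a hashed bag-of-words vector, normalised by text length.
    v = np.zeros(dim)
    for word in text.lower().split():
        v[zlib.crc32(word.encode()) % dim] += 1.0
    return v / max(len(text.split()), 1)

def retrieve(query, corpus, k=2):
    # Dual-encoder retrieval: score every document against the
    # query by dot product and keep the top-k.
    q = embed(query)
    scores = [float(q @ embed(doc)) for doc in corpus]
    ranked = sorted(range(len(corpus)), key=lambda i: -scores[i])
    return [corpus[i] for i in ranked[:k]]

corpus = [
    "paris is the capital of france",
    "bananas are a yellow fruit",
    "the eiffel tower is in paris",
]
print(retrieve("capital of france", corpus, k=1))
```

In the real system, the corpus embeddings are precomputed once and searched with an approximate nearest-neighbour index rather than scored one by one.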
Image: Few-shot Learning with Retrieval Augmented Language Models
The researchers study the impact of different training techniques on Atlas’s few-shot performance on tasks such as fact checking and question answering. “We find that jointly pre-training the components is crucial for few-shot performance,” the paper adds. The model performs well in resource-rich as well as few-shot settings, demonstrating SOTA results on few-shot NaturalQuestions (+2.8 per cent), TriviaQA (+3.3 per cent) and FEVER (+5.1 per cent). Atlas is also very strong in traditional full-training-set settings, setting a new state of the art on NaturalQuestions by 8 per cent, on TriviaQA by 9 per cent and on five KILT tasks, Meta informs.
Image: Few-shot Learning with Retrieval Augmented Language Models
Architecture
The research team follows the text-to-text framework. The tasks follow this path:
- The system receives a text query as input
- It generates a text output
For classification tasks, the query is the textual input and the model generates the “lexicalized class label”.
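As an illustration of what a lexicalized class label means (the exact prompt format here is ours, not the paper’s), a fact-checking task in the text-to-text framework replaces integer class ids with labels the model spells out as text:

```python
# Hypothetical text-to-text framing for a fact-checking task:
# integer class ids become lexicalized labels generated as text.
LABELS = {0: "supports", 1: "refutes", 2: "not enough info"}

def to_text_example(claim, class_id):
    # The input is plain text ...
    query = f"claim: {claim} true or false?"
    # ... and the target is the class label written out as a word,
    # not a class index.
    target = LABELS[class_id]
    return query, target

query, target = to_text_example("The Eiffel Tower is in Berlin.", 1)
```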
Image: Few-shot Learning with Retrieval Augmented Language Models
The model is based on two sub-models, the paper informs.
- The retriever – This is based on Contriever, an information retrieval technique that uses continuous dense embeddings.
- The language model – The team uses the T5 sequence-to-sequence architecture with the Fusion-in-Decoder modification, which processes each retrieved document independently in the encoder.
For any task, from question answering to generating articles, the model follows the same approach. It starts by retrieving the top-k relevant documents from a large corpus of text with the retriever. These documents, along with the query, are then fed to the language model, which generates the output. Both the retriever and the language model are based on pre-trained transformer networks, as per the paper.
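The retrieve-then-generate flow can be sketched end to end. Everything below is a toy stand-in: a word-overlap retriever replaces Contriever, and the “encoder” only mimics the shape of the Fusion-in-Decoder computation, in which each (query, document) pair is encoded independently and the decoder attends over the concatenation of all encodings.

```python
def retrieve(query, corpus, k=2):
    # Stand-in retriever: rank documents by word overlap with the query.
    qwords = set(query.lower().split())
    ranked = sorted(corpus, key=lambda d: -len(qwords & set(d.lower().split())))
    return ranked[:k]

def encode(query, doc):
    # Stand-in for the seq2seq encoder run on ONE (query, document)
    # pair; Fusion-in-Decoder runs this per document, so encoding
    # cost grows linearly with k.
    return f"{query} {doc}".split()

def fid_generate(query, corpus, k=2):
    docs = retrieve(query, corpus, k)
    # Encode each retrieved document independently ...
    encodings = [encode(query, d) for d in docs]
    # ... then fuse: concatenate all encoder states so the decoder
    # can attend jointly across every retrieved document.
    fused = [state for enc in encodings for state in enc]
    # A real decoder would generate the answer conditioned on
    # `fused`; here we return the fused context to show its shape.
    return fused
```

The design choice worth noting is that cross-document interaction happens only in the decoder: documents never attend to each other in the encoder, which is what keeps the approach tractable for large k.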
“Atlas outperforms much larger non-augmented models on few-shot question answering (NaturalQuestions and TriviaQA) and fact checking (FEVER), and is competitive with various very large models on a wide array of real-world exams,” Meta adds.
Meta points to other benefits of Atlas too. Retrieved passages can be inspected for better interpretability, and the corpus that Atlas retrieves from can be edited or even completely swapped out. This means Atlas can be kept up to date without needing to be retrained.