Yet Another Language Model from Meta: Atlas

Atlas reaches 42% accuracy on NaturalQuestions using only 64 examples, outperforming PaLM

Tech giant Meta has come out with a new language model named Atlas: a retrieval-augmented language model with strong few-shot performance on question-answering and fact-checking tasks, the company says.

In the paper titled ‘Few-shot Learning with Retrieval Augmented Language Models’, the researchers say they evaluated the model on a variety of benchmarks, including MMLU, KILT and NaturalQuestions. Atlas reaches 42% accuracy on NaturalQuestions using only 64 examples and outperforms PaLM (a 540-billion-parameter model) by 3 percentage points, despite having over 50 times fewer parameters (11B).


Retrieval augmented model

In the paper, the researchers explain the motivation behind the model. LLMs have previously shown impressive few-shot results, they note, but for question answering and fact checking, where knowledge is key, “massive parameter counts to store knowledge seem to be needed”.

This is where retrieval-augmented models come in: they can handle knowledge-intensive tasks without needing as many parameters. The researchers add that they wanted to see whether these models also work in few-shot settings.

“We investigate whether few-shot learning requires models to store a large amount of information in their parameters, and if memorisation can be decoupled from generalization,” the researchers state. 

As per the researchers, Atlas retrieves relevant documents by using a general-purpose dense retriever using a dual-encoder architecture based on the Contriever. After that, the documents are processed by a sequence-to-sequence model using the Fusion-in-Decoder architecture.
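The dual-encoder retrieval idea can be sketched in a few lines. This is a toy illustration only, not the paper's code: the `embed` function below is a deterministic stand-in for Contriever's learned encoder, and documents are scored by the dot product between query and document embeddings, as dense retrievers do.

```python
import zlib
import numpy as np

def embed(text, dim=8):
    # Stand-in for the dual encoder: map text to a deterministic
    # pseudo-random unit vector. Atlas uses Contriever embeddings here.
    rng = np.random.default_rng(zlib.crc32(text.encode()))
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)

def retrieve(query, corpus, k=2):
    # Dual-encoder retrieval: score every document by the dot product
    # of its embedding with the query embedding, then keep the top k.
    q = embed(query)
    ranked = sorted(corpus, key=lambda doc: float(q @ embed(doc)), reverse=True)
    return ranked[:k]
```

Because queries and documents are encoded independently, document embeddings can be precomputed and indexed, which is what makes dense retrieval over large corpora practical.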

Image: Few-shot Learning with Retrieval Augmented Language Models

The researchers study the impact of different techniques for training Atlas on its few-shot performance on tasks such as fact checking and question answering. “We find that jointly pre-training the components is crucial for few-shot performance,” the paper adds. The model performs well in resource-rich as well as few-shot settings. It demonstrates SOTA results on few-shot NaturalQuestions (+2.8 points), TriviaQA (+3.3 points) and FEVER (+5.1 points). Atlas is also very strong in traditional full-training-set settings, improving the state of the art on NaturalQuestions by 8 points and TriviaQA by 9 points, and setting new records on five KILT tasks, Meta informs.

Image: Few-shot Learning with Retrieval Augmented Language Models

Architecture

The research team follows the text-to-text framework. The tasks follow this path:

  • The system receives a text query as input
  • It generates a text output

For classification tasks, the query contains the text to classify, and the model generates the “lexicalized class label”, i.e., the class name written out as text.
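To make the text-to-text framing concrete, here is a toy example (the query format and the "model" below are hypothetical, not Atlas's actual prompting): a fact-checking input is written as plain text, and the output is the class label generated as a word rather than a class index.

```python
def make_query(claim):
    # Frame a FEVER-style fact-checking example as a plain-text query.
    return f"fact check: {claim}"

def toy_classifier(query):
    # Placeholder for the model: it *generates* the lexicalized class
    # label as text instead of predicting a numeric class id.
    return "refutes" if "cheese" in query else "supports"

label = toy_classifier(make_query("The moon is made of cheese"))  # "refutes"
```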

Image: Few-shot Learning with Retrieval Augmented Language Models

The model is based on two sub-models, the paper informs. 

  • The retriever – based on Contriever, an information-retrieval technique that uses continuous dense embeddings.
  • The language model – the team uses the T5 sequence-to-sequence architecture with the Fusion-in-Decoder modification, which processes each retrieved document independently in the encoder.

For any task, from question answering to generating articles, the model follows a similar approach. It starts by retrieving the top-k relevant documents from a large corpus of text with the retriever. These documents are then fed to the language model along with the query, and the model generates the output. Both the retriever and the language model are based on pre-trained transformer networks, as per the paper.
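The Fusion-in-Decoder step above can be sketched as follows. This is a minimal shape-only illustration, not the paper's implementation: the toy `encode` emits one deterministic vector per token (Atlas uses a pre-trained T5 encoder), and the key idea shown is that each (query, document) pair is encoded independently before all encoder states are concatenated for the decoder.

```python
import zlib
import numpy as np

DIM = 8  # toy hidden size

def encode(text):
    # Toy encoder: one deterministic vector per whitespace token.
    vecs = []
    for tok in text.split():
        rng = np.random.default_rng(zlib.crc32(tok.encode()))
        vecs.append(rng.standard_normal(DIM))
    return np.stack(vecs)

def fuse(query, docs):
    # Fusion-in-Decoder: encode each (query, document) pair on its own,
    # then concatenate all encoder states so the decoder can cross-attend
    # over every retrieved document at once.
    states = [encode(query + " " + doc) for doc in docs]
    return np.concatenate(states, axis=0)
```

Encoding documents independently keeps the encoder's cost linear in the number of retrieved documents, while the decoder still sees all of them jointly; this is the efficiency trick that lets Fusion-in-Decoder scale to many passages.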

“Atlas outperforms much larger non-augmented models on few-shot question answering (NaturalQuestions and TriviaQA) and fact checking (FEVER), and is competitive with various very large models on a wide array of real-world exams,” Meta adds.

Meta tells us about other benefits of Atlas too. Retrieved passages can be inspected for better interpretability and the corpus that Atlas retrieves from can be edited, or even completely swapped out. This ensures that Atlas can be kept up-to-date without needing to be retrained.

Sreejani Bhattacharyya
I am a technology journalist at AIM. What gets me excited is deep-diving into new-age technologies and analysing how they impact us for the greater good. Reach me at sreejani.bhattacharyya@analyticsindiamag.com
