Tech giant Meta has come out with a new language model named Atlas. It is a retrieval-augmented language model with strong few-shot performance on question answering and fact-checking tasks, Meta says.
In the paper titled ‘Few-shot Learning with Retrieval Augmented Language Models’, the researchers say they evaluated the model on a variety of benchmarks such as MMLU, KILT and NaturalQuestions. Atlas reaches 42 per cent accuracy on NaturalQuestions using only 64 examples and outperforms PaLM (a 540B-parameter model) by 3 per cent, despite having over 50 times fewer parameters (11B).
Retrieval augmented model
In the paper, the researchers discuss the motivation behind the model. LLMs have previously shown impressive few-shot results, they note, but for question answering and fact checking, where knowledge is key, “massive parameter counts to store knowledge seem to be needed”.
This is where retrieval augmented models come in, as they can handle knowledge-intensive tasks without needing huge parameter counts. The researchers wanted to see whether such models also work in few-shot settings.
“We investigate whether few-shot learning requires models to store a large amount of information in their parameters, and if memorisation can be decoupled from generalization,” the researchers state.
As per the researchers, Atlas retrieves relevant documents with a general-purpose dense retriever that uses a dual-encoder architecture based on Contriever. The retrieved documents are then processed by a sequence-to-sequence model using the Fusion-in-Decoder architecture.
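The dual-encoder idea can be sketched in a few lines: the query and each document are embedded into the same vector space, and documents are ranked by dot-product similarity. The hashed bag-of-words `embed` below is only a toy stand-in for Contriever's transformer encoder, and all function names are illustrative, not from the paper.

```python
import zlib
import numpy as np

def embed(text, dim=256):
    # Toy stand-in for a dense encoder such as Contriever:
    # a hashed bag-of-words vector, normalised by text length.
    v = np.zeros(dim)
    for word in text.lower().split():
        v[zlib.crc32(word.encode()) % dim] += 1.0
    return v / max(len(text.split()), 1)

def retrieve(query, corpus, k=2):
    # Dual-encoder retrieval: score every document against the
    # query by dot product and keep the top-k.
    q = embed(query)
    scores = [float(q @ embed(doc)) for doc in corpus]
    ranked = sorted(range(len(corpus)), key=lambda i: -scores[i])
    return [corpus[i] for i in ranked[:k]]

corpus = [
    "paris is the capital of france",
    "bananas are a yellow fruit",
    "the eiffel tower is in paris",
]
print(retrieve("capital of france", corpus, k=1))
```

In the real system, the corpus embeddings are precomputed once and searched with an approximate nearest-neighbour index rather than scored one by one.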
Image: Few-shot Learning with Retrieval Augmented Language Models
The researchers study the impact of different training techniques on Atlas’s few-shot performance on tasks such as fact checking and question answering. “We find that jointly pre-training the components is crucial for few-shot performance,” the paper adds. The model performs well in resource-rich as well as few-shot settings, demonstrating SOTA results on few-shot NaturalQuestions (+2.8 per cent), TriviaQA (+3.3 per cent) and FEVER (+5.1 per cent). Atlas is also very strong in traditional full-training-set settings, setting a new state of the art on NaturalQuestions by 8 per cent, on TriviaQA by 9 per cent and on five KILT tasks, Meta informs.
Image: Few-shot Learning with Retrieval Augmented Language Models
Architecture
The research team follows the text-to-text framework. The tasks follow this path:
- The system receives a text query as input
- It generates a text output
For classification tasks, the query is the textual input and the model generates the “lexicalized class label”.
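As an illustration of what a lexicalized class label means (the exact prompt format here is ours, not the paper’s), a fact-checking task in the text-to-text framework replaces integer class ids with labels the model spells out as text:

```python
# Hypothetical text-to-text framing for a fact-checking task:
# integer class ids become lexicalized labels generated as text.
LABELS = {0: "supports", 1: "refutes", 2: "not enough info"}

def to_text_example(claim, class_id):
    # The input is plain text ...
    query = f"claim: {claim} true or false?"
    # ... and the target is the class label written out as a word,
    # not a class index.
    target = LABELS[class_id]
    return query, target

query, target = to_text_example("The Eiffel Tower is in Berlin.", 1)
```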
Image: Few-shot Learning with Retrieval Augmented Language Models
The model is based on two sub-models, the paper informs.
- The retriever – This is based on Contriever, an information retrieval technique that uses continuous dense embeddings.
- The language model – The team uses the T5 sequence-to-sequence architecture with the Fusion-in-Decoder modification, which processes each retrieved document independently in the encoder.
For any task, from question answering to generating articles, the model follows the same approach. It starts by retrieving the top-k relevant documents from a large corpus of text with the retriever. These documents, along with the query, are then fed to the language model, which generates the output. Both the retriever and the language model are based on pre-trained transformer networks, as per the paper.
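The retrieve-then-generate flow can be sketched end to end. Everything below is a toy stand-in: a word-overlap retriever replaces Contriever, and the “encoder” only mimics the shape of the Fusion-in-Decoder computation, in which each (query, document) pair is encoded independently and the decoder attends over the concatenation of all encodings.

```python
def retrieve(query, corpus, k=2):
    # Stand-in retriever: rank documents by word overlap with the query.
    qwords = set(query.lower().split())
    ranked = sorted(corpus, key=lambda d: -len(qwords & set(d.lower().split())))
    return ranked[:k]

def encode(query, doc):
    # Stand-in for the seq2seq encoder run on ONE (query, document)
    # pair; Fusion-in-Decoder runs this per document, so encoding
    # cost grows linearly with k.
    return f"{query} {doc}".split()

def fid_generate(query, corpus, k=2):
    docs = retrieve(query, corpus, k)
    # Encode each retrieved document independently ...
    encodings = [encode(query, d) for d in docs]
    # ... then fuse: concatenate all encoder states so the decoder
    # can attend jointly across every retrieved document.
    fused = [state for enc in encodings for state in enc]
    # A real decoder would generate the answer conditioned on
    # `fused`; here we return the fused context to show its shape.
    return fused
```

The design choice worth noting is that cross-document interaction happens only in the decoder: documents never attend to each other in the encoder, which is what keeps the approach tractable for large k.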
“Atlas outperforms much larger non-augmented models on few-shot question answering (NaturalQuestions and TriviaQA) and fact checking (FEVER), and is competitive with various very large models on a wide array of real-world exams,” Meta adds.
Meta points to other benefits of Atlas too. Retrieved passages can be inspected for better interpretability, and the corpus that Atlas retrieves from can be edited or even completely swapped out. This means Atlas can be kept up to date without needing to be retrained.