Fine-tuning language models on collections of datasets phrased as instructions has proven effective at improving model performance and generalisation to unseen tasks. Building on this line of work, Google AI has released a new open-source language model, Flan-T5, which was instruction-finetuned on more than 1,800 varied tasks.
Hyung Won Chung, the first author of the paper ‘Scaling Instruction-Finetuned Language Models’, announced the release in a Twitter thread.
The paper explores instruction finetuning with a particular focus on three aspects: scaling the number of tasks, scaling the model size, and finetuning on chain-of-thought (CoT) data. The paper reads, “We find that instruction finetuning with the above aspects dramatically improves performance on a variety of model classes (PaLM, T5, U-PaLM), prompting setups (zero-shot, few-shot, CoT), and evaluation benchmarks (MMLU, BBH, TyDiQA, MGSM, open-ended generation).”
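To make the three prompting setups concrete, here is a minimal Python sketch of how a single question might be phrased as a zero-shot, few-shot, or chain-of-thought prompt. The `Example` class, the function names and the "Q:/A:" template are illustrative assumptions, not the templates used in the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Example:
    """A hypothetical solved exemplar; not a structure taken from the paper."""
    question: str
    answer: str
    rationale: str = ""  # optional chain-of-thought reasoning text


def zero_shot(question: str) -> str:
    # Zero-shot: the model sees only the question, with no solved exemplars.
    return f"Q: {question}\nA:"


def few_shot(exemplars: List[Example], question: str, use_cot: bool = False) -> str:
    # Few-shot: solved exemplars are placed before the new question.
    # With use_cot=True, each exemplar answer is preceded by its rationale,
    # which encourages the model to reason step by step.
    blocks = []
    for ex in exemplars:
        if use_cot and ex.rationale:
            target = f"{ex.rationale} So the answer is {ex.answer}."
        else:
            target = ex.answer
        blocks.append(f"Q: {ex.question}\nA: {target}")
    blocks.append(zero_shot(question))
    return "\n\n".join(blocks)


if __name__ == "__main__":
    demo = Example("What is 2 + 3?", "5", "2 plus 3 equals 5.")
    print(few_shot([demo], "What is 4 + 4?", use_cot=True))
```

The only difference between the setups is what precedes the final question: nothing, solved exemplars, or solved exemplars with their reasoning written out.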
The team has publicly released Flan-T5 checkpoints, which achieve strong few-shot performance even when compared with much larger models such as PaLM 62B. More broadly, the paper presents instruction finetuning as a general method for improving the performance and usability of pretrained language models. The researchers claim that Flan-T5 will lead to improved prompting and multi-step reasoning abilities.
To learn more about Flan-T5, read the full paper, ‘Scaling Instruction-Finetuned Language Models’.