Fine-tuning language models on collections of datasets phrased as instructions has proven effective at improving generalisation and model performance on unseen tasks. Building on this line of work, Google AI has released a new open-source language model, Flan-T5, which is instruction-finetuned on more than 1,800 varied tasks.
Hyung Won Chung, first author of the paper ‘Scaling Instruction-Finetuned Language Models’, announced the release in a Twitter thread.
The paper primarily explores instruction finetuning along three axes: scaling the number of finetuning tasks, scaling the model size, and finetuning on chain-of-thought data. The paper reads, “We find that instruction finetuning with the above aspects dramatically improves performance on a variety of model classes (PaLM, T5, U-PaLM), prompting setups (zero-shot, few-shot, CoT), and evaluation benchmarks (MMLU, BBH, TyDiQA, MGSM, open-ended generation).”
The team has publicly released Flan-T5 checkpoints, which achieve strong few-shot performance even compared to the much larger PaLM 62B model. More broadly, instruction finetuning is a general method for improving the performance and usability of pretrained language models, and the researchers claim that Flan-T5 offers improved prompting and multi-step reasoning abilities.
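Because the checkpoints are publicly available, the model can be tried in a few lines of Python. The snippet below is a minimal sketch of zero-shot instruction prompting, assuming the Hugging Face Transformers library and the released google/flan-t5-base checkpoint; the example prompt is illustrative and not taken from the paper.

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load a publicly released Flan-T5 checkpoint from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

# Phrase the task as a natural-language instruction, the zero-shot
# setup that instruction finetuning is designed to improve.
prompt = "Answer the following question: who wrote 'Pride and Prejudice'?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Larger released variants follow the same pattern; only the checkpoint name changes.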
To learn more about Flan-T5, read the full paper here.