
Google AI Introduces FLAN, A Language Model with Instruction Fine-Tuning

Google AI hopes that the method presented will help inspire more research into models that can perform unseen tasks and learn from very little data.


Google AI recently introduced its new Natural Language Processing (NLP) model, known as Fine-tuned LAnguage Net (FLAN), which explores a simple technique called instruction fine-tuning, or instruction tuning for short.

In general, fine-tuning requires a large number of training examples, along with stored model weights for each downstream task, which is not always practical, particularly for large models. FLAN’s instruction fine-tuning technique instead fine-tunes the model not to solve one specific task, but to make it more amenable to solving NLP tasks in general.

FLAN is fine-tuned on a large set of varied instructions that use simple and intuitive descriptions of the task, such as “Classify this movie review as positive or negative” or “Translate this sentence to Danish.” Creating a dataset of instructions from scratch would take a considerable amount of resources, so the team instead uses templates to transform existing datasets into an instructional format.
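To make the template idea concrete, here is a minimal sketch of what this kind of conversion could look like. The template strings, field names, and the `to_instruction_examples` helper are illustrative assumptions, not FLAN’s actual implementation.

```python
# Illustrative sketch: expanding one labelled example into several
# instruction-style training pairs via natural-language templates.
# The templates and field names below are hypothetical, not Google's.

from typing import Dict, List

# FLAN used multiple templates per dataset to add phrasing variety;
# these three are invented examples for a sentiment dataset.
SENTIMENT_TEMPLATES: List[str] = [
    "Classify this movie review as positive or negative: {text}",
    "Is the following review positive or negative? {text}",
    "Review: {text}\nWhat is the sentiment of this review?",
]

def to_instruction_examples(example: Dict[str, str]) -> List[Dict[str, str]]:
    """Turn one (text, label) pair into several instruction/target pairs."""
    return [
        {"input": template.format(text=example["text"]), "target": example["label"]}
        for template in SENTIMENT_TEMPLATES
    ]

raw = {"text": "A moving story with terrific acting.", "label": "positive"}
for pair in to_instruction_examples(raw):
    print(pair["input"], "->", pair["target"])
```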


FLAN demonstrates that by training a model on a set of instructions, it not only becomes good at solving the kinds of instructions it has seen during training, but also at following instructions in general.
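As a rough illustration of zero-shot instruction following, the sketch below prompts the publicly released FLAN-T5 checkpoint from Hugging Face (a later, smaller member of the FLAN family; the original 137B FLAN model was not released) with an instruction and no in-context examples. The checkpoint choice and prompt are assumptions for demonstration.

```python
# Minimal zero-shot demo using the public FLAN-T5 checkpoint from
# Hugging Face; the original FLAN model itself is not publicly available.

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "google/flan-t5-small"  # small public checkpoint for a quick test
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# An instruction phrased in plain language, with no in-context examples:
prompt = "Translate this sentence to Danish: The weather is nice today."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```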


Google AI used established benchmark datasets to compare FLAN’s performance with that of existing models, evaluating FLAN on each dataset without it having seen any examples from that dataset during training. Across 25 tasks, FLAN improved over zero-shot prompting on all but four of them, beat zero-shot GPT-3 on 20 of the 25 tasks, and surpassed even few-shot GPT-3 on some.


It was also found that at smaller scales, the instruction-tuning technique actually degrades performance; only at larger scales does the model become able to generalize from the instructions in its training data to unseen tasks. This might be because models that are too small do not have enough parameters to perform a large number of tasks.

With FLAN, Google AI hopes to inspire more research into models that can perform unseen tasks and learn from very little data.


Victor Dey

Victor is an aspiring Data Scientist and holds a Master of Science in Data Science & Big Data Analytics. He is a Researcher, a Data Science Influencer and also an Ex-University Football Player. A keen learner of new developments in Data Science and Artificial Intelligence, he is committed to growing the Data Science community.