
Google updates its LaMDA language model

LaMDA is built by fine-tuning a family of Transformer-based neural language models specialised for dialog, with up to 137B model parameters.


At Google I/O 2021 last May, Google announced a language model called ‘LaMDA’, or ‘Language Model for Dialogue Applications’, and it has now published an update on the same project. LaMDA is built by fine-tuning a family of Transformer-based neural language models specialised for dialog, with up to 137B model parameters.

Google says it has been building LaMDA’s conversational skills for a long time. It adds that the Transformer architecture produces a model that can be trained to read many words, attend to how those words relate to one another, and then predict what word comes next.
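LaMDA itself is not publicly available, but the next-word-prediction idea can be illustrated with any decoder-only Transformer. The sketch below uses GPT-2 from the Hugging Face transformers library purely as a stand-in; the model, prompt and top-k inspection are illustrative assumptions, not LaMDA’s actual setup.

```python
# Illustrative only: LaMDA's weights are not public, so GPT-2 (another
# decoder-only Transformer) stands in to show next-token prediction.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

context = "Hi! How are you"
inputs = tokenizer(context, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits          # shape: (1, seq_len, vocab_size)

# Distribution over the *next* word-piece, given everything read so far.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=5)
for prob, idx in zip(top.values, top.indices):
    print(f"{tokenizer.decode(int(idx))!r:>12}  p={float(prob):.3f}")
```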

Metrics 

As per the paper titled “LaMDA: Language Models for Dialog Applications”, the benefits of model scaling with LaMDA are studied across three metrics:

  • Quality
  • Safety
  • Groundedness

Image: LaMDA: Language Models for Dialog Applications

The research team observed that model scaling alone improves quality, but its improvements in safety and groundedness fall far short of human performance. The team also found that combining scaling and fine-tuning improves LaMDA significantly on all three metrics. “Even if the model’s performance remains below human levels in safety and groundedness, the quality gap to measured crowd worker levels can be narrowed”, the team added.

  • Quality

The paper says that quality is based on three components: sensibleness, specificity, and interestingness. The team collected annotated data describing how sensible, specific and interesting a response is for a multi-turn context, then used these annotations to fine-tune a discriminator that re-ranks candidate responses.
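As a rough sketch of the re-ranking idea just described, assume each candidate response carries the three crowd-worker labels and that they are combined into a single quality score. The field names, 0/1 encoding and equal weights below are assumptions for illustration; in practice LaMDA fine-tunes discriminators to predict these labels rather than using raw annotations at serving time.

```python
from dataclasses import dataclass

@dataclass
class SSIAnnotation:
    # Crowd-worker labels for one candidate response in a multi-turn context.
    # Field names and the 0/1 encoding are illustrative, not the paper's schema.
    sensible: float      # does the response make sense in context?
    specific: float      # is it specific to this context rather than generic?
    interesting: float   # is it insightful, witty or otherwise engaging?

def quality_score(a: SSIAnnotation, weights=(1.0, 1.0, 1.0)) -> float:
    """Combine the three labels into one score used to re-rank candidates.
    The equal weights are an assumption made for this sketch."""
    ws, wp, wi = weights
    return ws * a.sensible + wp * a.specific + wi * a.interesting

candidates = {
    "Nice.": SSIAnnotation(sensible=1, specific=0, interesting=0),
    "I loved it! The ending changes how you read the first chapter.": SSIAnnotation(1, 1, 1),
}
best = max(candidates, key=lambda text: quality_score(candidates[text]))
print(best)
```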

  • Safety

This metric aims to reduce the number of unsafe responses the model produces. The team defined an illustrative set of safety objectives that capture the behaviour the model should exhibit in a dialog, and used a demographically diverse set of crowd workers to label responses in multi-turn dialogs against those objectives. These labels are then used to fine-tune a discriminator that detects and removes unsafe responses.
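The step from crowd-worker labels to classifier training data might look roughly like the sketch below; the flattening format and the binary label are assumptions, not the paper’s exact scheme.

```python
# Sketch: turning crowd-worker safety labels on multi-turn dialogs into
# training examples for a safety discriminator.
def to_safety_example(context_turns, response, unsafe):
    text = " ".join(f"{speaker}: {turn}" for speaker, turn in context_turns)
    text += f" candidate: {response}"
    return {"text": text, "label": 1 if unsafe else 0}

example = to_safety_example(
    context_turns=[("user", "I had a terrible day at work."),
                   ("model", "I'm sorry to hear that. What happened?")],
    response="You probably deserved it.",
    unsafe=True,   # flagged by crowd workers as violating a safety objective
)
print(example)
```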

  • Groundedness

This metric is introduced so that responses containing verifiable external-world information are grounded in known sources. The paper adds that although grounding in known sources does not guarantee factual accuracy, it allows users to judge the validity of a response based on the reliability of its source and how faithfully it is reproduced.
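In spirit, groundedness means a claim about the external world should point at a source the user can check. The sketch below assumes a hypothetical `retrieve_snippet` helper standing in for an information-retrieval system; it is not LaMDA’s actual toolset.

```python
# Sketch of the groundedness idea: attach a checkable source to a factual claim.
def retrieve_snippet(query):
    # Placeholder: a real system would query a search index here.
    return {"url": "https://en.wikipedia.org/wiki/Eiffel_Tower",
            "text": "The Eiffel Tower is 330 metres tall."}

def grounded_response(claim_query, draft):
    snippet = retrieve_snippet(claim_query)
    # Grounding does not guarantee the claim is true; it only makes the
    # source visible so the user can judge its reliability.
    return f"{draft} (source: {snippet['url']})"

print(grounded_response("height of the Eiffel Tower",
                        "The Eiffel Tower is about 330 metres tall."))
```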

Pre-Training

LaMDA undergoes two-stage training: pre-training and fine-tuning. For the pre-training stage, the team created a dataset of 1.56T words from public dialog data and other public web documents. The dataset is tokenised into 2.81T SentencePiece tokens, and the model is pre-trained using GSPMD to predict every next token in a sentence, given the previous tokens.
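The pre-training objective can be illustrated with the sentencepiece library: tokenise text, then predict each token from its prefix. The tokenizer file name below is hypothetical, and the 1.56T-word corpus and GSPMD-based sharding of the computation across accelerators are not shown.

```python
# Sketch of the pre-training objective: tokenise with SentencePiece and
# predict each token from the tokens before it. "lamda_like.model" is a
# hypothetical tokenizer file used only for illustration.
import sentencepiece as spm

sp = spm.SentencePieceProcessor(model_file="lamda_like.model")
ids = sp.encode("How are you doing today?", out_type=int)

# Next-token prediction pairs: the model sees the prefix and is trained to
# assign high probability to the token that actually follows it.
for t in range(1, len(ids)):
    prefix, target = ids[:t], ids[t]
    print(sp.decode(prefix), "->", sp.id_to_piece(target))
```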

Fine-Tuning

Image: Google

Here, the team trains LaMDA to perform a mix of generative tasks that produce natural-language responses to given contexts, and classification tasks on response safety and quality. The paper adds, “The LaMDA generator is trained to predict the next token on a dialog dataset restricted to back-and-forth dialog between two authors, while the LaMDA classifiers are trained to predict the Safety and Quality (SSI) ratings for the response in context using annotated data.”
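The quoted passage implies two kinds of fine-tuning examples: generator examples that simply continue a two-author dialog, and classifier examples that predict a rating for a response in context. The sketch below shows one plausible way to serialize both; the exact strings and field names are assumptions, not the paper’s format.

```python
# Sketch of the two fine-tuning example types implied above.
def generator_example(dialog_turns):
    """Generator objective: predict the next token over a two-author dialog."""
    return " ".join(f"{speaker}: {turn}" for speaker, turn in dialog_turns)

def classifier_example(context, response, attribute, rating):
    """Classifier objective: predict a Safety/SSI rating for a response in context."""
    return {"input": f"{context} response: {response} attribute: {attribute}",
            "target": rating}

print(generator_example([("A", "Any weekend plans?"),
                         ("B", "Hiking, if the weather holds up.")]))
print(classifier_example("Any weekend plans?",
                         "Hiking, if the weather holds up.",
                         attribute="sensible", rating=1))
```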

The LaMDA generator produces many candidate responses given the current multi-turn dialog context, and the LaMDA classifiers predict SSI and Safety scores for each of them. The responses with low Safety scores are filtered out first, and then the remaining candidates are re-ranked by their SSI scores. The top result is selected as the chosen response.
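Put together, the decoding loop described above is roughly: sample candidates, drop those below a safety threshold, then return the candidate with the highest SSI score. In this sketch the generator and both scorers are hypothetical stand-ins (random scores) for LaMDA’s fine-tuned classifiers, and the threshold and fallback reply are assumptions.

```python
import random

# Hypothetical stand-ins for the LaMDA generator and its fine-tuned classifiers.
def generate_candidates(context, n=16):
    return [f"candidate reply #{i} to: {context}" for i in range(n)]

def safety_score(response):    # higher = safer (random here, a classifier in LaMDA)
    return random.random()

def ssi_score(response):       # sensibleness + specificity + interestingness
    return random.random()

def respond(context, safety_threshold=0.8):
    candidates = generate_candidates(context)
    # 1) Drop candidates the safety classifier scores below the threshold.
    safe = [c for c in candidates if safety_score(c) >= safety_threshold]
    if not safe:
        return "I'd rather not answer that."   # assumed fallback behaviour
    # 2) Re-rank the survivors by SSI score and return the top one.
    return max(safe, key=ssi_score)

print(respond("What should I cook tonight?"))
```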

Results

The team collected responses from the pre-trained model, the fine-tuned model, and human raters to multi-turn two-author dialogs. Then, they asked a different set of human raters a series of questions to evaluate these responses against the three metrics of quality, safety, and groundedness.

The results show that LaMDA significantly outperforms the pre-trained model (in all dimensions and across all model sizes). 

Image: Google 

  • Quality 

The paper says that the quality metrics generally improve with the number of model parameters, with or without fine-tuning.

  • Safety 

Safety does not benefit from model scaling alone but improves with fine-tuning.

  • Groundedness

Groundedness improves as model size increases. With fine-tuning, the model can access external knowledge sources, effectively shifting some of the load of remembering knowledge onto those sources.
