Facebook Shared The Recipes For Building An Open-Domain Chatbot

Recently, the researchers at Facebook AI open-sourced a new chatbot known as Blender. According to the researchers, this new chatbot produces more human-like interactions than previous chatbots.

Along with the announcement, the researchers shared the recipe behind building and deploying the chatbot. They stated that, for the first time, a single system blends a diverse set of conversational skills, including empathy, knowledge and personality.

Building an open-domain chatbot is one of the most complex and challenging problems in machine learning. To build a high-performance chatbot, the researchers scaled neural models both in the number of parameters and in the size of the data they are trained on. The researchers stated, “Good conversation requires a number of skills that an expert conversationalist blends in a seamless way, providing engaging talking points, listening to their partners, as well as displaying knowledge, empathy and personality appropriately while maintaining a consistent persona.”


The Recipe

According to the researchers at Facebook AI, the recipe for the new chatbot incorporates not only large-scale neural models, with up to 9.4 billion parameters (3.6x more than the largest existing system), but also equally important techniques for blending skills and detailed generation strategies.

The main ingredients of the chatbot are scale, blended skills and generation strategies.

Scale

To create a high-performance chatbot, the first step is large-scale training. For this, the researchers pre-trained large Transformer neural networks, with up to 9.4 billion parameters, on large amounts of conversational data. They used previously available public-domain conversations comprising 1.5 billion training examples of extracted conversation.
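To get a feel for where a parameter count in the billions comes from, here is a rough back-of-envelope estimate of a Transformer's size. The formula and the example configuration below are illustrative assumptions, not the actual Blender architecture.

```python
def transformer_params(n_layers, d_model, vocab_size):
    """Rough parameter estimate for a Transformer: each layer has
    ~4*d^2 attention weights plus ~8*d^2 feed-forward weights
    (assuming an FFN width of 4*d), plus an embedding table."""
    per_layer = 12 * d_model ** 2
    embeddings = vocab_size * d_model
    return n_layers * per_layer + embeddings

# Hypothetical 40-layer, 4096-wide model with an 8k vocabulary
print(transformer_params(40, 4096, 8192))  # ~8 billion parameters
```

The estimate ignores biases, layer norms and positional embeddings, which contribute comparatively little at this scale.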

Blending Skills

For blending skills, the researchers selected specific tasks that make the model focus on personality and engagingness, knowledge, and empathy. They used a recently introduced task called Blended Skill Talk (BST) for training and evaluating these desirable skills; the setup targets them by providing training data and initial conversational context. BST not only emphasised the desirable traits but also showed that such tuning can minimise undesirable traits, such as toxicity, learnt from large corpora.

According to the researchers, BST covers the following skills:

  • Engaging use of personality 
  • Engaging use of knowledge
  • Display of empathy
  • Ability to blend all three seamlessly
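One simple way to picture multi-task fine-tuning across several skill datasets is to interleave their examples so every batch mixes all skills. The sketch below is a toy round-robin illustration; the dataset contents are placeholders, and the real BST setup additionally blends skills within individual conversations rather than only across datasets.

```python
from itertools import cycle, islice

def blend_tasks(*task_streams):
    """Interleave examples from several skill-specific datasets
    round-robin, so training sees all skills evenly."""
    iters = [cycle(s) for s in task_streams]
    for it in cycle(iters):
        yield next(it)

# Toy stand-ins for the three BST source tasks
convai2 = ["persona example"]
empathetic = ["empathy example"]
wizard = ["knowledge example"]

batch = list(islice(blend_tasks(convai2, empathetic, wizard), 6))
print(batch)
```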

Generation Strategies

To avoid repetition in the agents' conversations, researchers typically employ generation strategies such as beam search, next-token sampling and n-gram blocking.
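Of these, n-gram blocking is the easiest to sketch: during decoding, a candidate token is rejected if appending it would repeat an n-gram already present in the generated sequence. This is a minimal standalone illustration, not the paper's decoder.

```python
def blocked(tokens, candidate, n=3):
    """Return True if appending `candidate` to `tokens` would
    repeat an n-gram already present in the sequence."""
    if len(tokens) < n - 1:
        return False
    new_ngram = tuple(tokens[-(n - 1):] + [candidate])
    existing = {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}
    return new_ngram in existing

history = ["i", "like", "dogs", "and", "i", "like"]
print(blocked(history, "dogs"))  # True: "i like dogs" already occurred
print(blocked(history, "cats"))  # False: "i like cats" is new
```

In a beam search, each hypothesis would apply this check before extending itself, pruning extensions that would create a repeated trigram.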

In this work, the researchers considered three types of architecture: retrieval, generative, and retrieve-and-refine (RetNRef) models. They implemented the retrieval system with a poly-encoder architecture, and the generator with byte-level BPE tokenisation trained on the pre-training data.

Given a dialogue history as input, the retrieval system selects the next utterance by scoring a large set of candidate responses and outputting the highest-scoring one. The generative model instead employs a standard Seq2Seq Transformer architecture to generate responses rather than retrieve them from a fixed set. Lastly, for retrieve-and-refine, the researchers considered two variants of the retrieval step: dialogue retrieval and knowledge retrieval.
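The retrieve-and-refine idea can be sketched end to end with toy components: a scorer picks the best candidate, and that retrieved utterance is appended to the context before generation. The word-overlap scorer below is a stand-in for the actual poly-encoder, and `generator` is a hypothetical placeholder for the Seq2Seq model.

```python
def retrieve(history, candidates):
    """Score each candidate response by word overlap with the
    dialogue history and return the highest-scoring one.
    (A toy stand-in for poly-encoder scoring.)"""
    hist_words = set(" ".join(history).lower().split())
    return max(candidates, key=lambda c: len(hist_words & set(c.lower().split())))

def retrieve_and_refine(history, candidates, generator):
    """Dialogue-retrieval variant: append the retrieved utterance
    to the context, then let the generator produce the reply."""
    retrieved = retrieve(history, candidates)
    return generator(history + [retrieved])

history = ["do you like jazz music"]
candidates = ["i love jazz and blues", "pizza is great"]
print(retrieve(history, candidates))  # "i love jazz and blues"
```

The knowledge-retrieval variant works the same way, except the candidates come from a document collection (e.g. Wikipedia passages) rather than from past dialogue utterances.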

Dataset Used

For pre-training, the researchers used the pushshift.io Reddit dataset, a variant of Reddit discussions. According to the researchers, this dataset is a good candidate for training a dialogue model in the open-domain setting.

For fine-tuning, the researchers used three different datasets: ConvAI2, Empathetic Dialogues (ED) and Wizard of Wikipedia (WoW). ConvAI2 includes 140k training utterances from paired crowdworkers having a conversation in which they get to know each other. Empathetic Dialogues consists of 50k utterances of crowdworker conversations grounded in an emotional situation, and the Wizard of Wikipedia task involves discussing a given topic in depth, where the goal is both to engage the partner and to display expert knowledge.

Wrapping Up

For evaluation, the researchers benchmarked the chatbot's performance against Google’s Meena chatbot through pairwise human evaluations, using the ACUTE-Eval method, which shows evaluators a series of dialogues between humans and each respective chatbot and asks them to choose between the two.
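The core statistic in such a pairwise evaluation is simply the fraction of judgments in which annotators preferred one system over the other. The sketch below shows this with hypothetical votes; the numbers are invented for illustration and are not the paper's results.

```python
def win_rate(votes):
    """Fraction of pairwise judgments preferring model A over
    model B, given a list of 'A'/'B' votes."""
    wins = sum(1 for v in votes if v == "A")
    return wins / len(votes)

# Hypothetical annotator judgments from one pairwise comparison
votes = ["A", "A", "B", "A", "B", "A", "A", "B", "A", "A"]
print(win_rate(votes))  # 0.7
```

In practice, such a win rate would be reported together with a statistical significance test over the number of judgments.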

The researchers released 90M-, 2.7B- and 9.4B-parameter pre-trained and fine-tuned generative models, along with a script for interacting with the bot with safety filtering built in. According to the researchers, this method takes a step forward, with improved performance in terms of engagingness and humanness. However, the model still exhibits various issues, such as non-trivial repetition, lapses in knowledge and factual correctness, contradiction and forgetfulness, among others, which need to be mitigated in future studies.

Read the paper here.

Ambika Choudhury
A Technical Journalist who loves writing about Machine Learning and Artificial Intelligence. A lover of music, writing and learning something out of the box.
