BlenderBot 2.0 is a first-of-its-kind open-source chatbot with long-term memory, from Facebook’s research platform ParlAI. The latest developments point to Facebook’s bid to make the AI more empathetic, knowledgeable and capable.
Language models like OpenAI’s GPT-3 and BlenderBot 1.0 are quite articulate and generate human-like text. However, BlenderBot 1.0 was infamous for having the memory of a goldfish and a tendency to feign knowledge. Also, the long-term memory of such systems is static, i.e. limited to what the models have been taught.
Credits: Facebook AI Blog Post
All about BlenderBot 2.0
BlenderBot 2.0 can build a long-term memory that it accesses continuously. Moreover, it can do so while simultaneously searching for information on the internet and holding conversations on nearly any topic. So, if a user talked about Tom Brady weeks ago, BlenderBot 2.0 will likely bring up the NFL in future conversations.
In conversational experiments, BlenderBot 2.0 generated contextual internet searches via Bing to respond with up-to-date information and hold longer, factually correct conversations. The chatbot simultaneously stored conversational information in its long-term memory to draw on that knowledge in extended, ongoing discussions. To ensure that no data is passed between users, the memory is stored separately for each person.
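The per-user separation described above can be illustrated with a minimal sketch. This is not ParlAI's implementation: the `PerUserMemory` class, its method names, and the keyword-overlap lookup are all hypothetical stand-ins, assuming only that each user's memories live in a separate store.

```python
from collections import defaultdict


class PerUserMemory:
    """Toy long-term memory keyed by user ID (hypothetical, not ParlAI's API)."""

    def __init__(self):
        # Each user ID maps to its own private list of remembered facts.
        self._store = defaultdict(list)

    def remember(self, user_id, fact):
        self._store[user_id].append(fact)

    def recall(self, user_id, query):
        # Naive keyword overlap stands in for a learned retriever.
        query_words = set(query.lower().split())
        return [
            fact
            for fact in self._store.get(user_id, [])
            if query_words & set(fact.lower().split())
        ]


memory = PerUserMemory()
memory.remember("alice", "I am a fan of Tom Brady")
memory.remember("bob", "I like jazz")

# Only Alice's store is searched for Alice's conversation, and likewise for Bob.
alice_hits = memory.recall("alice", "what about tom brady")
bob_hits = memory.recall("bob", "what about tom brady")
```

Because lookups are always scoped to one user ID, nothing Alice said can surface in Bob's conversation, which is the property the article describes.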
“While existing systems can ask and answer basic questions about things like food, movies, or bands, they typically struggle with more complex or freeform conversations, like, for example, discussing Tom Brady’s career in detail,” Facebook research scientist Jason Weston and research engineer Kurt Shuster wrote. “But technology based on BlenderBot 2.0 could one day become a useful part of everyday life by being able to have multisession conversations on any topic that can last days, weeks, or even months, and by adding to what it knows and can talk about as the conversation evolves.”
The new system outperformed BlenderBot 1.0 at picking up earlier conversation sessions, with a 17 percent improvement in engagingness score and a 55 percent improvement in use of previous conversation sessions. The rate of hallucination also dropped from 9 percent to 3 percent. ParlAI claimed BlenderBot 2.0 outperformed the conversational abilities of existing systems. With its commonsense reasoning capacities, BlenderBot 2.0 can reduce its hallucination and limit its confusion between subtle concepts.
Tech behind
BlenderBot 2.0 uses an AI model based on Retrieval Augmented Generation, an approach that enables it to generate responses and incorporate knowledge beyond the conversation.
During a conversation, the model seeks relevant information both in its long-term memory and on the internet. The model uses an enhanced traditional encoder-decoder architecture coupled with an additional neural network module that generates relevant search queries. BlenderBot 2.0 then prepends the resulting knowledge to the conversational history, which is encoded using the Fusion-in-Decoder method. The chatbot generates a response based on the encoded knowledge.
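The flow described above can be sketched roughly as follows. Every function here is a hypothetical stub: the real system uses a neural query generator, a live search engine and a trained encoder-decoder, none of which are reproduced. The sketch only shows how knowledge from memory and from the web might each be prepended to the conversation history to form separate encoder inputs, in the spirit of Fusion-in-Decoder.

```python
def generate_search_query(history):
    # Stand-in for the neural query generator: reuse the last user turn.
    return history[-1]


def search_internet(query):
    # Stand-in for a search engine call; returns a canned document.
    return [f"web result about {query}"]


def search_memory(memory, query):
    # Naive keyword overlap stands in for learned memory retrieval.
    words = set(query.lower().split())
    return [m for m in memory if words & set(m.lower().split())]


def build_encoder_inputs(history, memory):
    query = generate_search_query(history)
    knowledge = search_memory(memory, query) + search_internet(query)
    context = " ".join(history)
    # Fusion-in-Decoder style: each knowledge item is prepended to the
    # conversation history and would be encoded on its own; a decoder
    # would then attend over all of the encodings together.
    return [f"{doc} | {context}" for doc in knowledge]


history = ["Hi there", "Who is Tom Brady"]
memory = ["User likes Tom Brady", "User owns a dog"]
inputs = build_encoder_inputs(history, memory)
```

In this toy run, `inputs` holds one string per retrieved item, one from memory and one from the stubbed web search, each ending with the full conversation history.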
ParlAI researchers used a crowdsourcing platform to put together two datasets for training the neural network. Wizard of the Internet contains human conversations augmented with new internet searches; it guides BlenderBot 2.0 in generating search engine queries and creating responses based on the results. Multisession contains long-context chats in which humans reference information from the conversation history; it teaches the bot which new knowledge to store in long-term memory.
Safety features
Language models run the risk of picking up bias from the datasets they are trained on. Facebook has introduced two methods to mitigate this issue: baked-in safety and robustness to difficulty. The methods train on data collected through a new human-and-model-in-the-loop framework, in a two-stage system, to ‘bake in’ safety to the generative model itself.
The research showed a 90 percent reduction in offensive responses and a 74.5 percent increase in safe responses.
BlenderBot 2.0 can continue a conversation as long as information about the topic is available on the web; it is limited only by what a search engine can provide. In an interview, Jason Weston said that while BlenderBot 2.0 may be able to use information in other languages if Bing provides it, the chatbot is, for now, focused only on English-language results.