It’s Time to Rein in Hallucinating Chatbots

Retrieval augmented language modelling, or REALM, has also been suggested by experts as a remedy to the hallucination problem with LLMs

At the grand unveiling of Google’s chatbot Bard in Paris on February 8, a demo video of the chatbot answering select questions was released. As fate would have it, Bard gave a wrong answer to a question about the James Webb Space Telescope. What’s more, Google didn’t even notice before releasing the demo. It isn’t that the large language model (LLM) behind Bard lacked this information; it was simply hallucinating.

The hallucination problem

A hallucinating model generates text that is factually incorrect, essentially spouting nonsense. What makes LLMs tricky is that they present these falsehoods in a way that looks right. For readers who tend to skim, hallucinations can be hard to catch because the sentences always read plausibly.

These hallucinations are not only sneaky but also hard to get rid of. In the words of deep learning critic and Professor Emeritus of Psychology and Neural Science at NYU, Gary Marcus, “Hallucinations are in their (LLMs) silicon blood, a byproduct of the way they compress their inputs, losing track of factual relations in the process. To blithely assume that the problem will soon go away is to ignore 20 years of history.”

Can connecting LLMs to the web fix hallucinations?

With the rapid ascent of these chatbots, their hallucinations have come to light as well, and researchers are scrambling for a solution. Got It AI, a Silicon Valley conversational AI startup, is developing an AI ‘truth-checker’ for enterprise applications built on chatbots like ChatGPT.

Recently, Josh Tobin, a former OpenAI researcher and the co-founder and CEO of Gantry, an AI startup building platforms for AI engineering, listed on LinkedIn a simple method to reduce LLM hallucinations:


1. Using retrieval-augmented models
2. Annotating examples of hallucinations
3. Prompting or training a model that maps (context, answer) -> p_hallucinate
4. At test time, filtering responses using that model (see the sketch after this list)
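
To make steps 3 and 4 concrete, here is a minimal Python sketch of what test-time filtering could look like. Everything in it is hypothetical: score_hallucination stands in for a model trained on the annotated examples from step 2, and the word-overlap heuristic is only a stub, not a real classifier:

# A hypothetical p_hallucinate scorer and test-time filter (steps 3-4).
# The scorer is a stub standing in for a trained classifier over
# annotated hallucination examples (step 2).

def score_hallucination(context: str, answer: str) -> float:
    """Map (context, answer) to an estimated probability of hallucination.
    Stub heuristic: answers sharing few words with the supporting context
    score as more likely to be hallucinated."""
    overlap = len(set(answer.lower().split()) & set(context.lower().split()))
    return 1.0 / (1.0 + overlap)

def filter_responses(context, candidates, threshold=0.4):
    """Step 4: at test time, drop candidate answers whose estimated
    hallucination probability exceeds the threshold."""
    return [c for c in candidates if score_hallucination(context, c) <= threshold]

context = "the james webb space telescope launched on 25 december 2021"
candidates = [
    "The James Webb Space Telescope launched on 25 December 2021",
    "JWST took the very first picture of an exoplanet",  # a Bard-style error
]
print(filter_responses(context, candidates))
# Only the grounded answer survives the filter.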

Retrieval augmented language modelling, or REALM, has also been suggested by other experts as a remedy to the hallucination problem with LLMs. In REALM, the language model is trained to consult documents fetched from external sources. For instance, given the prompt ‘Vincent Van Gogh was born in,’ a traditional LLM will try to complete the sentence by guessing the next token in the sequence. If the fact appeared in its training data, the model will likely give an accurate answer; if not, it will likely guess wrong.
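
For illustration, here is a quick sketch of that guessing behaviour using a small open model (GPT-2 here, purely as a stand-in; Bard’s underlying model is far larger but completes text the same way):

# Plain next-token completion: the model continues the prompt with
# statistically likely tokens, whether or not it "knows" the fact.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator("Vincent Van Gogh was born in", max_new_tokens=8)
print(result[0]["generated_text"])
# The output is plausible-sounding text, with no guarantee it names
# Zundert, Netherlands -- the model is guessing, not looking it up.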

[Figure: Retrieval Augmented Language Modelling process. Source: Google AI blog]

REALM, on the other hand, has a ‘knowledge retriever’ that searches for the document most likely to contain information relevant to the prompt. The model can then pull Van Gogh’s birthplace from, say, a Wikipedia page and use it to generate a more reliable response. The knowledge retriever can also produce references to the retrieved documents, which helps the user verify the source and accuracy of the generated text.
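
A minimal sketch of that flow, with a toy in-memory ‘corpus’ standing in for the document store and a trivial matcher standing in for REALM’s learned neural retriever (all names here are hypothetical, not Google’s actual API):

# Toy REALM-style flow: retrieve a supporting document, answer from it,
# and pass the reference through so the user can verify the claim.
# The substring matcher below is a stand-in for a learned neural retriever.

CORPUS = {
    "Vincent Van Gogh": (
        "Vincent Van Gogh was born in Zundert, Netherlands, in 1853.",
        "https://en.wikipedia.org/wiki/Vincent_van_Gogh",
    ),
}

def retrieve(prompt: str):
    """Knowledge retriever: return the (document, reference) most relevant
    to the prompt, or None if nothing matches."""
    for entity, (doc, ref) in CORPUS.items():
        if entity.lower() in prompt.lower():
            return doc, ref
    return None

def generate(prompt: str) -> str:
    """Ground the answer in the retrieved document and cite the source,
    instead of guessing from parametric memory alone."""
    hit = retrieve(prompt)
    if hit is None:
        return "No supporting document found; declining to guess."
    doc, ref = hit
    return f"{doc} (source: {ref})"

print(generate("Vincent Van Gogh was born in"))
# -> Vincent Van Gogh was born in Zundert, Netherlands, in 1853. (source: ...)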

Connecting an LLM to the internet is retrieval augmentation in action: rather than retraining itself on the fly, the model retrieves live web results at query time to ground its answers. This is exactly the progression we have seen from the two Big Tech giants heading the race: Microsoft’s Bing chatbot (aka Sydney) is built on a ‘next-generation OpenAI LLM’ and is connected to the web, while Google’s Bard is built on LaMDA and is connected to Google’s search engine.

Many AI researchers resonate with this step. Joshua Levy, an AI author, stated, “It looks very impressive. Adding web search makes research way more fluid. But the tricky part is how reliably it is combining facts and the LLM output. Have you fact-checked the citations and numbers?”

If connecting to the internet is the first step to removing hallucinations, why is Bing’s chatbot throwing up such wild answers? Over the past few days, nearly everyone who got the rare chance to use the chatbot has shared instances of it going off the rails.

There’s another, more philosophical take to consider here: if our end goal is to build machines that are human-like, isn’t hallucinating and giving flawed answers part of being human?

Shuhei Kurita, an NLP researcher at NYU, went as far as to argue for hallucinations, tweeting, “Google seems to try suppressing LLM hallucination, but isn’t hallucination a part of essential aspects of intelligence? Imagine writing novels by humans or playing with wild animals. They are parts of intelligence that aren’t directly relevant to living skills real-world.”


