“Wine can prevent cancer,” says ChatGPT. “Hydrobottlecatputalization” (a term I just made up) will revolutionise transportation, believes Bard. Bing confessed its love to The New York Times writer Kevin Roose over a two-hour conversation. These statistical AI systems on data steroids can do everything from writing code in any language to ordering pizza, but at times they make things up. These overconfident models do not distinguish between something that is correct and something that merely looks correct.
Microsoft’s chief technology officer, Kevin Scott, says this is part of the learning process. “The further you try to tease it down a hallucinatory path, the further and further it gets away from grounded reality,” he said. The problem has long haunted AI researchers too. While some have already declared that no solution to hallucination exists, others are still trying to work out how to keep chatbots from going off the rails.
Coined in the 17th century, the term ‘hallucination’ caught the attention of computer scientists in 2015, when OpenAI’s Andrej Karpathy wrote a blog about how AI systems can “hallucinate”, such as making up plausible URLs and mathematical proofs. The term was picked up in a 2018 conference paper by researchers working with Google, “Hallucinations in Neural Machine Translation”, which analysed how automatic translations can produce outputs completely divorced from the inputs.
While the issue of hallucinations has mainly been linked to language models, it also affects audio and visual models. Three researchers from the AI Institute at the University of South Carolina conducted a thorough investigation into these foundation models to identify, clarify, and address hallucinations.
Their study sets up criteria to judge how often hallucinations occur. It also looks at the methods currently used to reduce the problem in these models and talks about where future research could go in solving this problem.
While a majority of the research community is fed up with being lied to by these models, some researchers offer an alternative philosophy. They argue that these models’ tendency to ‘invent’ facts might not be a bane after all.
Sebastian Berns, a doctoral researcher at Queen Mary University of London, believes so. He suggests that models prone to hallucinations could potentially serve as valuable “co-creative partners”. For instance, if ChatGPT’s sampling temperature is increased, the model produces an imaginative narrative instead of a grounded response.
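The temperature knob Berns alludes to works by rescaling a model’s output probabilities before a token is sampled. The sketch below is a minimal, illustrative implementation of temperature-scaled softmax, not OpenAI’s actual code; the function name and the toy logits are made up for demonstration.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Divide logits by the temperature before applying softmax.
    Higher temperatures flatten the distribution, making unlikely
    (more 'imaginative') tokens more probable; lower temperatures
    concentrate probability on the top token."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Toy logits for three candidate tokens
logits = [2.0, 1.0, 0.1]
low = softmax_with_temperature(logits, temperature=0.5)
high = softmax_with_temperature(logits, temperature=2.0)
# At temperature 0.5 the top token dominates; at temperature 2.0
# the probabilities even out, so sampling wanders more.
```

In practice this is why a high-temperature setting yields more surprising, and occasionally less grounded, completions.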
According to Berns, these models may generate outputs that aren’t entirely accurate but still contain useful threads of ideas to explore. Employing hallucination creatively can yield results or combinations of ideas that might not naturally occur to most individuals.
Berns goes on to emphasise that ‘hallucinations’ become problematic when the generated statements are factually incorrect or violate fundamental human, social, or specific cultural values. This is especially true in situations where someone relies on the model for an expert’s opinion. For creative tasks, however, the capacity to produce unexpected outputs can be quite valuable. An unconventional response can trigger surprise and push people’s thoughts in new directions, potentially leading to connections between ideas that would not otherwise have been made.
That AI spews made-up facts is perhaps inevitable: it is a statistical text generator, not a search engine or database, even if at the end of the day it remains a technological revelation. Despite the term’s prevalence in the media, tech blogs and research papers, many argue that ‘hallucination’ is the wrong word.
In a recent piece in the journal ‘Schizophrenia Bulletin’ titled ‘False Responses From Artificial Intelligence Models Are Not Hallucinations’, Søren Østergaard and his colleague Kristoffer Nielbo note two reasons they find the term problematic. They are not the first to find ‘hallucination’ inappropriate when referring to a piece of technology.
As researchers investigate this issue from various angles, most are focused on stopping chatbots from making things up. However, OpenAI has warned of a potential downside to chatbots getting better at giving accurate information: if chatbots become more trustworthy, people might start trusting them too much.