Why Meta Took Down Its ‘Hallucinating’ AI Model Galactica

“The reality is that large language models like GPT-3 and Galactica are like bulls in a china shop, powerful but reckless”

On Wednesday, Meta AI and Papers with Code announced the release of Galactica, an open-source large language model with 120 billion parameters, trained on scientific knowledge. However, just days after its launch, Meta took Galactica down.

Interestingly, every result generated by Galactica came with the warning: “Outputs may be unreliable. Language Models are prone to hallucinate text.”

“Galactica is trained on a large and curated corpus of humanity’s scientific knowledge. This includes over 48 million papers, textbooks and lecture notes, millions of compounds and proteins, scientific websites, encyclopedias and more,” the paper said.

Galactica was designed to tackle the issue of information overload when accessing scientific information through search engines, where there is no proper organisation of scientific knowledge.

However, when members of the community started using Meta’s all-new AI model, many found the results suspicious. Many took to Twitter to point out that the results Galactica presented were highly inaccurate.

Alex Polozov, staff research scientist at Google, called Galactica an endless source of adversarial examples for hallucination, attribution, and alignment research.

False Results

“I asked Galactica about some things I know about and I’m troubled. In all cases, it was wrong or biased but sounded right and authoritative. I think it’s dangerous,” Michael Black, director at Max Planck Institute for Intelligent Systems, said.

Gary Marcus, a professor of psychology and neural science at NYU and a prominent critic of deep learning and AGI, also took to Twitter to state that Galactica got his birthday, education and research interests wrong. According to him, nearly 85% of the results Galactica presented about him were not true.

Tariq Desai, head of Data Science at ExploreAI, told AIM that he was in fact genuinely excited to try out Galactica because it seemed like a valuable way to search and synthesise scientific knowledge. “However, the few examples that I did try suggested that the model was better at mimicking the form of scientific writing than in reproducing its semantic content. For example, I prompted the model for a ‘literature review on whether HIV causes AIDS’ and was presented with text which was just wrong on this question, and which invented citations and research.

“Galactica was useful for exploring mathematical content, though, and demonstrates the potential of some interesting applications in that sphere,” Desai added.

Interestingly, the paper stated that Galactica outperforms GPT-3, one of the most popular large language models, on technical knowledge probes such as LaTeX equations, scoring 68.2% versus GPT-3’s 49.0%.

Inaccurate Results Could Be Dangerous

Julian Togelius, associate professor at NYU, also pointed out that Galactica not only got his name wrong, but failed to summarise his work. “Asking Galactica to summarise my work gives results that vary from hilariously wrong to actually mostly correct.”

He also pointed out that while it was easy for him to figure out the difference, it might not be the same for someone who does not know him personally.


Even though some of the results are hilarious, inaccurate or fabricated results could prove problematic because other members of the community could perceive them as correct, which could be highly dangerous for scientific research.

In this regard, Black said that Galactica generates text that’s grammatically correct and feels real. “This text will slip into real scientific submissions. It will be realistic but wrong or biased and hard to detect. It will influence how people think,” he said.

“It offers authoritative-sounding science that isn’t grounded in the scientific method. It produces pseudoscience based on statistical properties of science writing. Grammatical science writing is not the same as doing science. But it will be hard to distinguish,” he added.

Explaining further, Black said a Pandora’s box has been opened and that there is a possibility of deep scientific fakes. Researchers’ names could be cited on papers they did not write. Further, these papers will then be cited by other researchers in real papers. “What a mess this will be,” Black said.

Marcus concurs with Black’s views on Galactica and on how such models could prove to be dangerous. “The reality is that large language models like GPT-3 and Galactica are like bulls in a china shop, powerful but reckless. And they are likely to vastly increase the challenge of misinformation,” he said.

Pritam Bordoloi
I have a keen interest in creative writing and artificial intelligence. As a journalist, I deep dive into the world of technology and analyse how it’s restructuring business models and reshaping society.
