Last updated December 6, 2022

ChatGPT: Why Stack Overflow Banned the Celebrated OpenAI Chatbot

Ever since ChatGPT entered the scene, the site has seen an influx of ‘correct-looking’ answers which are not actually correct


Published on December 6, 2022

by Anirudh VK


ChatGPT, OpenAI’s newest innovation, has been the talk of the tech town for the past week. Fine-tuned from a model in the GPT-3.5 series, the chatbot has won the hearts of netizens the world over with its human-like answers and context awareness. However, as with any other AI innovation, ChatGPT has quickly turned into a double-edged sword.

Stack Overflow, the popular programming forum, has banned all answers created by ChatGPT, citing a high degree of inaccuracy in the bot’s responses. While it clarified that the policy is temporary, it stressed that the problem lies not only in the inaccuracy of ChatGPT’s answers but also in the way the bot phrases them.

Because of the way LLMs like GPT-3.5 work, chatbots like ChatGPT can quickly generate grammatically correct content in a formal tone. This makes the answers sound authoritative and backed by evidence, even when they are not. Stack Overflow emphasised this, stating, “[ChatGPT’s answers] typically look like they might be good and the answers are very easy to produce. There are also many people trying out ChatGPT to create answers, without the expertise or willingness to verify that the answer is correct prior to posting.”

This sheds more light on the problem of answers from large language models being accepted as fact. Even models trained on research papers and peer-reviewed journals suffer from similar pitfalls, as seen with Meta’s Galactica. Even as ChatGPT is being touted as a Google killer, will it be able to overcome the problems plaguing LLMs today?

The Stack Overflow ban

Stack Overflow functions primarily as a developer Q&A site, with wrong answers typically receiving downvotes and, by extension, lower visibility. However, the site relies on a handful of expert volunteers to curate content, weed out factually incorrect answers and resolve issues raised by developers. The forum as a whole moves a question towards its solution through interaction and conversation between posters.

Ever since ChatGPT entered the scene, the site has seen an influx of ‘correct-looking’ answers which are not actually correct. The ban was put in place so that moderators could catch up with the large volume of posts made using ChatGPT.

ChatGPT tries to formulate a solution from the data it was trained on. Apart from lacking the “real-world experience” of forum posters, it also cannot access the internet to find solutions to problems that were not covered in its training data.

If we are moving into the age of all-knowing chatbots crawling the web for information, it is important to know the drawbacks that come with using an LLM. Stack Overflow is just the tip of the iceberg when it comes to the possible misuse of LLM-based chatbots.

The problem with LLMs

The issues with LLMs are manifold. First, and most importantly, LLMs are trained on a limited amount of data. No single LLM has all the information available on the web, and even those trained on a large amount of data are held back by the nature of their datasets. Many LLM datasets are full of biases, explicit language, and incorrect information, traits which can easily be passed on to the model during training and resurface during inference.

Second, LLMs have a tendency to hallucinate information. When the model is trained using reinforcement learning, there is no single source of truth. AI systems have subjectivity baked into them by design, which leaves the final judgment to the user. However, as LLMs and chatbots become more proficient at natural language tasks, users can forget that it is their responsibility to check whether the AI is factually correct.

It is said that half-knowledge is more dangerous than ignorance, and that is especially applicable when speaking about LLMs. LLMs know just enough to be dangerous, and while they contain large amounts of data, they cannot be trusted to draw the correct conclusions from these data points.

OpenAI has put content moderation filters in place and trained the bot to decline questions that it cannot answer correctly. However, the final responsibility still lies with the user to make the right decision.

Even though OpenAI claims to have reduced the number of untruthful responses by applying what it learned from training GPT and Codex, the bot can still generate inaccurate responses. And this is from a company that says it is dedicated to responsible AI research, a commitment not shared by all companies.

So far, communities have reeled from the impact of AI tools only after those tools were released to the public, as with Getty Images’ banning of AI art or the DeviantArt scraping controversy. It is important, now more than ever, for online administrators to take a proactive approach to the possible impact of AI on their communities.
