Time for OpenAI to Open Source Toxicity Detection?

Open-sourcing toxicity filtering models would bring huge social benefits and help all open-source LLMs become less toxic

If you have followed the AI space closely over the past few years, you will have come across many instances where AI models hurled violent, sexist and racist barbs at people. This happens because most of the data used to train these models is sourced from the web and often contains toxic content.

Take, for instance, GPT-4chan, an AI chatbot that YouTuber Yannic Kilcher trained on three years’ worth of posts from 4chan, the repulsive cousin of Reddit. Kilcher fed the bot threads from /pol/, the ‘Politically Incorrect’ board, a 4chan message board notorious for racist, xenophobic and hateful content. Unsurprisingly, the chatbot spewed offensive content as a result.

Hence, it is crucial to remove toxicity from AI models.



OpenAI, the company behind ChatGPT, has managed to do exactly that. According to an article published by Time, OpenAI did this by outsourcing the task to companies such as Sama, a San Francisco-based firm that labels data for Silicon Valley clients. Sama employs data labellers in countries such as Kenya, Uganda and India, who work round the clock to make OpenAI’s AI model free of toxic content. 

How did OpenAI do it?

Making an AI model ‘toxin-free’ is not an easy task. So, how did OpenAI mitigate bias and toxicity in ChatGPT? The Sam Altman-led firm adopted a strategy similar to that of social media companies like Meta (formerly Facebook).


Facebook runs an AI tool trained to detect toxic content on its platform. Similarly, the toxicity-detecting tool built by OpenAI was fed labelled examples of violence, hate speech and sexual abuse so that, when it encounters similar content, it blocks it. According to the Time report, OpenAI sent tens of thousands of text snippets to Sama, which included disturbing content such as child sexual abuse, bestiality, murder, suicide, torture and self-harm, sourced from the darkest corners of the web.
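The idea of learning from labelled examples and then blocking similar text can be illustrated with a toy sketch. This is not OpenAI's or Meta's actual system, whose details are not public in this article; it is a minimal naive Bayes word-count classifier, and the training snippets, function names and `[blocked]` placeholder are all hypothetical choices made for illustration.

```python
import math
from collections import Counter

# Hypothetical, deliberately mild training snippets. A production system
# would rely on tens of thousands of human-labelled examples.
TRAINING_DATA = [
    ("i will hurt you", "toxic"),
    ("you are worthless and stupid", "toxic"),
    ("i hate you so much", "toxic"),
    ("have a lovely day", "safe"),
    ("thank you for the help", "safe"),
    ("the weather is nice today", "safe"),
]


def train(examples):
    """Count how often each word appears under each label."""
    word_counts = {"toxic": Counter(), "safe": Counter()}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocabulary = set(word_counts["toxic"]) | set(word_counts["safe"])
    return word_counts, label_counts, vocabulary


def toxic_log_odds(text, model):
    """Return log P(toxic | text) - log P(safe | text) under naive Bayes."""
    word_counts, label_counts, vocabulary = model
    total_examples = sum(label_counts.values())
    scores = {}
    for label in ("toxic", "safe"):
        total_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total_examples)
        for word in text.lower().split():
            if word in vocabulary:  # words never seen in training are ignored
                # Laplace (add-one) smoothing avoids log(0) for unseen pairs.
                score += math.log(
                    (word_counts[label][word] + 1) / (total_words + len(vocabulary))
                )
        scores[label] = score
    return scores["toxic"] - scores["safe"]


def filter_output(text, model, threshold=0.0):
    """Block text whose toxic log-odds exceed the threshold."""
    if toxic_log_odds(text, model) > threshold:
        return "[blocked]"
    return text


model = train(TRAINING_DATA)
print(filter_output("i hate you", model))
print(filter_output("have a nice day", model))
```

Real classifiers use far larger datasets and neural models rather than word counts, but the workflow is the same: humans label examples, a model learns from them, and a threshold decides what gets blocked.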

Should OpenAI open-source it?

When the Time article was published, questions were raised about the wages paid to these data labellers in Kenya, who earned nearly USD 2 per hour for a nine-hour shift. While that is a separate debate, there was also a call from the AI community for OpenAI to open source these tools.

“It’d be great if OpenAI open-sources its toxicity filtering models, so that no such human cost is duplicated elsewhere. It’ll come with huge social benefits and help all open-source LLMs become less toxic as well,” Jim Fan, AI Scientist at NVIDIA, said.

By involving an entire community in the development process, the tool's accuracy and reliability can be refined further. Additionally, releasing the tool as open source aligns with the original vision of OpenAI, which was co-founded in 2015 by Sam Altman, Elon Musk and others. However, as things stand, OpenAI has not announced any plans to open source its toxicity-detecting model.

Many were also of the opinion that OpenAI should offer the tool as an API, which members of the AI community could access and OpenAI could monetise. This would benefit both the community and the company. Further, sharing the tool could cut down the cost of content moderation, which is considerable. For instance, Sama had signed a contract worth USD 3.9 million with Facebook for content moderation.

OpenAI’s content moderation tool

In August 2022, OpenAI announced a free content moderation tool for OpenAI API developers. The tool aims to help developers protect their applications against possible misuse and will provide them with free access to GPT-based classifiers that can detect harmful content.

OpenAI also published a paper describing the approach, titled ‘A Holistic Approach to Undesired Content Detection in the Real World’.

“Our moderation system is trained to detect a broad set of categories of undesired content, including those that are sexual, hateful, violent, and harassing in nature. This approach generalises to a wide range of different content taxonomies and can be used to create high-quality content classifiers that outperform off-the-shelf models,” the researchers at OpenAI said.
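For developers, calling the moderation tool might look roughly like the sketch below. It assumes the endpoint URL and response fields (`results`, `flagged`, `categories`) described in OpenAI's public API documentation; these may change over time, and the helper names and the `OPENAI_API_KEY` environment variable are illustrative choices rather than part of the article.

```python
import json
import os
import urllib.request

# Assumed endpoint, as documented publicly by OpenAI; verify before use.
MODERATION_URL = "https://api.openai.com/v1/moderations"


def build_request(text, api_key):
    """Build the POST request that carries the text to be checked."""
    body = json.dumps({"input": text}).encode("utf-8")
    return urllib.request.Request(
        MODERATION_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def flagged_categories(response_json):
    """Return (flagged, sorted list of category names marked true)."""
    result = response_json["results"][0]
    hits = sorted(name for name, hit in result["categories"].items() if hit)
    return result["flagged"], hits


def moderate(text, api_key):
    """Send text to the endpoint and summarise the verdict."""
    with urllib.request.urlopen(build_request(text, api_key)) as response:
        return flagged_categories(json.load(response))


if __name__ == "__main__":
    key = os.environ.get("OPENAI_API_KEY")  # only calls the API if a key is set
    if key:
        print(moderate("Sample text to check", key))
```

An application would typically run user input or model output through such a check and refuse to display anything that comes back flagged.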


Pritam Bordoloi
I have a keen interest in creative writing and artificial intelligence. As a journalist, I deep dive into the world of technology and analyse how it’s restructuring business models and reshaping society.
