If you have followed the AI space closely over the past few years, you have likely encountered many instances where AI models hurled violent, sexist and racist barbs at people. This happens because most of the data used to train these models is sourced from the web and often contains toxic content.
Take for instance GPT-4chan, where YouTuber Yannic Kilcher created an AI chatbot and trained it on three years’ worth of posts from 4chan, the repulsive cousin of Reddit. Kilcher fed the bot threads from the politically incorrect /pol/ board, a 4chan message board notorious for racist, xenophobic, and hateful content. Unsurprisingly, the chatbot spewed similarly offensive content as a result.
Hence, it becomes crucial to remove toxicity from AI models.
OpenAI, the company behind ChatGPT, has managed to do exactly that. According to an article published by Time, OpenAI did this by outsourcing the task to companies such as Sama, a San Francisco-based firm that labels data for Silicon Valley clients. Sama employs data labellers in countries such as Kenya, Uganda and India, who work round the clock to make OpenAI’s AI model free of toxic content.
How did OpenAI do it?
Making an AI model ‘toxin-free’ is not an easy task. So, how did OpenAI mitigate the biases and toxicity in ChatGPT? The Sam Altman-led firm adopted a strategy similar to that of social media companies like Meta (formerly Facebook).
Facebook runs an AI tool trained to detect toxic content on its platform. The toxicity-detecting tool built by OpenAI was fed labelled examples of violence, hate speech, and sexual abuse, so that it can block similar content when it encounters it. Apparently, OpenAI sent tens of thousands of text snippets to Sama, which included disturbing content such as child sexual abuse, bestiality, murder, suicide, torture, and self-harm, sourced from the darkest corners of the web.
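To illustrate the basic idea of a classifier learning from labelled examples and then blocking similar content, here is a minimal sketch. This is not OpenAI’s actual model; the tiny placeholder dataset and the TF-IDF plus logistic regression pipeline are stand-ins purely for illustration.

```python
# Hypothetical sketch: a toxicity classifier trained on labelled snippets.
# Not OpenAI's model; placeholder data and a simple pipeline for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical labelled snippets: 1 = toxic, 0 = benign.
texts = [
    "I will hurt you",            # toxic (placeholder)
    "you people are worthless",   # toxic (placeholder)
    "have a great day",           # benign
    "thanks for the help",        # benign
]
labels = [1, 1, 0, 0]

classifier = make_pipeline(TfidfVectorizer(), LogisticRegression())
classifier.fit(texts, labels)

def block_if_toxic(snippet: str, threshold: float = 0.5) -> bool:
    """Return True if the snippet should be blocked."""
    toxic_probability = classifier.predict_proba([snippet])[0][1]
    return toxic_probability >= threshold

print(block_if_toxic("have a wonderful evening"))  # expected: False
```

In practice, a production filter would be trained on vast amounts of human-labelled data, which is exactly the work Sama’s labellers were contracted to do.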
Should OpenAI open-source it?
When the Time article was published, questions were raised about the wages paid to these data labellers in Kenya. These workers were paid nearly USD 2 per hour for a nine-hour shift. While that is a completely different debate, there was also a call within the AI community for OpenAI to open source these tools.
“It’d be great if OpenAI open-sources its toxicity filtering models, so that no such human cost is duplicated elsewhere. It’ll come with huge social benefits and help all open-source LLMs become less toxic as well,” said Jim Fan, AI Scientist at NVIDIA.
By involving an entire community in the development process, the tool could be further refined for accuracy and reliability. Additionally, releasing the tool as open source aligns with the original vision of OpenAI, which Sam Altman co-founded in 2015. However, as things stand, OpenAI has not announced any plans to open source its toxicity-detecting model.
Many were also of the opinion that OpenAI should offer the tool as an API, giving members of the AI community access while allowing OpenAI to monetise it. This would provide a mutually beneficial outcome for both the community and the company. Further, OpenAI could cut down on the costs involved; Sama, for instance, had signed a content moderation contract with Facebook worth USD 3.9 million. Open sourcing the tool would go a long way in helping OpenAI reduce such costs.
@sama Something you'd consider? Large positive externalities and seems mission-aligned
— David @ HASH (@nonparibus) January 22, 2023
OpenAI’s content moderation tool
In August 2022, OpenAI announced a free content moderation tool for OpenAI API developers. The tool aims to help developers protect their applications against possible misuse and provides them with free access to GPT-based classifiers that can detect harmful content.
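For developers, using the tool amounts to a single call to OpenAI’s moderation endpoint. The sketch below uses the openai Python package’s pre-1.0 interface, which was current when the tool was announced, and assumes an API key is available in the environment.

```python
# Minimal sketch of calling OpenAI's free moderation endpoint
# (openai Python package, pre-1.0 interface).
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]  # assumes the key is set

response = openai.Moderation.create(input="Sample text to check")
result = response["results"][0]

print(result["flagged"])          # True if any category is triggered
print(result["categories"])       # per-category boolean flags (hate, violence, etc.)
print(result["category_scores"])  # per-category confidence scores
```

A developer can then block, filter, or escalate any input or output that comes back flagged before it reaches end users.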
OpenAI also published a paper titled ‘A Holistic Approach to Undesired Content Detection in the Real World’ describing the system.
“Our moderation system is trained to detect a broad set of categories of undesired content, including those that are sexual, hateful, violent, and harassing in nature. This approach generalises to a wide range of different content taxonomies and can be used to create high-quality content classifiers that outperform off-the-shelf models,” the researchers at OpenAI said.