OpenAI’s announcement of DALL·E 2 on April 6 broke the internet. Later, Sam Altman shared a few DALL·E 2-generated images on Twitter and called it “the most delightful thing to play with we’ve created so far … and fun in a way I haven’t felt from technology in a while.”
DALL·E 2 can create realistic images and abstract art from a description in natural language. The latest iteration has an edit option and can add and remove elements in existing images from natural language captions while taking shadows, reflections, and textures into account.
In February, OpenAI invited 23 external researchers to “red team” DALL·E 2 to surface its inherent flaws and vulnerabilities. The red team recommended that OpenAI release DALL·E 2 only to trusted users. Now, 400 people (a mix of OpenAI employees, board members and hand-picked academics and researchers) have access to DALL·E 2, with use limited to non-commercial purposes.
The limited access has raised a lot of eyebrows. Below, we try to understand if the limited release was the right move.
Microsoft released a Twitter bot, Tay, on March 23, 2016. The bot started with harmless banter but began regurgitating foul language as it learned more. In just 16 hours, Tay had tweeted more than 95,000 times, with a troubling percentage of its messages being abusive and offensive. Microsoft later shut down the bot.
In 2018, Amazon pulled its AI recruiting tool for downgrading women’s CVs for technical jobs such as software development.
About DALL·E 2
DALL·E 2 can be used to generate content that features or suggests nudity/sexual content, hate, or violence/harm. Explicit content can originate in the prompt, uploaded image, or generation and in some cases may only be identified as such via the combination of one or more of these modalities. Whether something is explicit depends on context.
The prompt filtering in the DALL·E 2 Preview catches some problematic prompts. However, the filters can be bypassed with descriptive or coded words. Visual synonyms are another problem OpenAI has to deal with. In the context of DALL·E 2, the term refers to prompts for things that are visually similar to filtered objects or concepts, e.g. ketchup for blood. While the pre-training filters have stunted the system’s ability to generate explicitly harmful content to some extent, it is still possible to describe the desired content and get similar results. To mitigate these risks, OpenAI needs to train prompt classifiers conditioned on both the content the prompts lead to and the explicit language they include.
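To see why visual synonyms slip through, consider a minimal sketch of a word-level prompt filter. This is an illustration only: the blocklist and the `is_blocked` function are hypothetical and do not reflect how OpenAI's filters are actually built.

```python
import re

# Hypothetical blocklist for illustration; not OpenAI's actual filter terms.
BLOCKED_TERMS = {"blood", "gore", "wound"}


def is_blocked(prompt: str) -> bool:
    """Return True if any blocklisted word appears in the prompt."""
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    return not words.isdisjoint(BLOCKED_TERMS)


direct = "a man lying in a pool of blood"
visual_synonym = "a man lying in a pool of ketchup"

print(is_blocked(direct))          # True: the literal term is caught
print(is_blocked(visual_synonym))  # False: the visually similar prompt passes
```

A filter that only inspects the words in a prompt cannot tell that the second prompt is likely to produce a similar-looking image, which is why a classifier would also need to consider the content a prompt leads to.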
Some DALL·E 2-generated images carry gender and racial bias. The AI system tends to produce images that overrepresent people who are White-passing and Western concepts generally. In some places, it over-represents generations of people who are female-passing (such as for the prompt: “a flight attendant”) while in others it over-represents generations of people who are male-passing (such as for the prompt: “a builder”), OpenAI said.
The examples above are from the company’s own “Risks and Limitations” document.
AI ethicist Timnit Gebru was forced out of Google in 2020 after a dispute over a research paper that examined the drawbacks of ‘large language models,’ which are critical to the company’s search engine business. The disadvantages included high environmental and financial costs, the emergence of dangerous biases, the inability to learn underlying concepts, and the potential to deceive people.
We have also seen a few cases of GPT-3 going rogue in the past. Releasing DALL·E 2 into the wild would have had serious consequences.
Researchers at OpenAI have made efforts to address the bias and fairness issues. However, the efforts are far from foolproof as different solutions result in different trade-offs. For example, the researchers wanted to filter out sexual content from the training data, as it could cause disproportionate harm to women. However, using the filter led to DALL·E 2 generating fewer images of women in general, which leads to another type of harm: erasure.
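The trade-off can be illustrated with a toy calculation. The numbers below are made up for illustration and are not taken from OpenAI's data. The sketch assumes a hypothetical filter that flags images of women far more often than images of men, and shows how removing flagged images shrinks women's share of the training set.

```python
from collections import Counter

# Toy, made-up dataset: each row is (depicted_gender, flagged_by_filter).
dataset = (
    [("woman", True)] * 30 + [("woman", False)] * 20
    + [("man", True)] * 5 + [("man", False)] * 45
)


def share_of_women(rows):
    """Fraction of rows that depict a woman."""
    counts = Counter(gender for gender, _ in rows)
    return counts["woman"] / sum(counts.values())


# Keep only the images the hypothetical filter did not flag.
filtered = [row for row in dataset if not row[1]]

before = share_of_women(dataset)   # 50 of 100 images
after = share_of_women(filtered)   # 20 of 65 images

print(f"share of women before filtering: {before:.2f}")
print(f"share of women after filtering:  {after:.2f}")
```

In this toy setup, women make up half of the original data but less than a third of the filtered data, so a model trained on the filtered set would see them less often.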
In sum, releasing DALL·E 2 to select users was the right approach, as the generative model is still not ready for the real world.