In 2017, Facebook (now Meta) was forced to shut down one of its AI systems after it started communicating in a secret language. In an eerie throwback, Giannis Daras, a computer science PhD student at the University of Texas at Austin, has claimed that DALL·E 2 has its own secret language.
Two months back, OpenAI released DALL·E 2 (the successor to DALL·E) to much fanfare. DALL·E 2 can create realistic images and art from a description in natural language. It offers 4x greater resolution than DALL·E and can also make realistic edits to existing images from a natural language caption.
What is the claim?
In a yet-to-be peer-reviewed paper, "Discovering the Hidden Vocabulary of DALLE-2", Daras, along with Alexandros G Dimakis (UT Austin professor and researcher in machine learning and information theory), has explained their findings. The duo had query access to the model through the API.
As part of the experiment, the researchers prompted DALL·E 2 with one of the following sentences, or variations of them:
• A book that has the word vegetables written on it.
• Two people talking about vegetables, with subtitles
• The word vegetables written in 10 languages
DALL·E 2 created images, with text written on them, based on the prompts. To the human eye, the text seems like gibberish. The researchers claimed the text is actually not as random as it appears: in several cases, they pointed out, it is strongly correlated with the concept in the original prompt.
Image: arXiv:2206.00169 (arxiv.org)
The researchers gave an example:
If you prompt DALL·E 2 with the text "Two farmers talking about vegetables, with subtitles," you get the image in Figure 2(a). They transcribed the text in the image and prompted the model with the generated text, producing Figures 2(b) and 2(c). The researchers concluded that "Vicootes" means vegetables and "Apoploe vesrreaitais" means birds.
The authors said the method does not always work: there are instances where the generated text, prompted back to the model, yields random images. But with some manipulation, such as selecting a subset of words or running several produced texts, they could find words that appear random yet correlate with some visual concept.
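The probing loop the researchers describe can be sketched in a few lines. Here, `generate_image` and `read_text` are hypothetical stand-ins (they are not part of any real API) for querying the model and manually transcribing the gibberish text from the resulting image; the canned values mirror the paper's "Vicootes" example.

```python
def generate_image(prompt):
    # Hypothetical stand-in for a DALL-E 2 query; returns an image
    # identifier. The mapping below is a mock-up of the paper's example,
    # not real model output.
    canned = {
        "Two farmers talking about vegetables, with subtitles.": "image_2a",
        "Vicootes": "image_2b",
    }
    return canned.get(prompt, "image_unknown")

def read_text(image):
    # Hypothetical stand-in for transcribing the gibberish text that
    # appears inside the generated image.
    return {"image_2a": "Vicootes"}.get(image, "")

def round_trip(prompt):
    """Generate an image, transcribe its gibberish text, and feed that
    text back to the model to see what it depicts."""
    image = generate_image(prompt)
    gibberish = read_text(image)
    return gibberish, generate_image(gibberish)
```

If the second image consistently depicts the concept from the first prompt (here, vegetables), the gibberish word is a candidate "vocabulary" entry.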
Not everyone agrees
“No, DALL.E doesn’t have a secret language. (or at least, we haven’t found one yet). This viral DALL.E thread has some pretty astounding claims. But maybe the reason they’re so astounding is that, for the most part, they’re not true,” said Benjamin Hilton, a research analyst, adding, “My best guess? It’s a random chance.”
The researchers themselves have pointed out limitations, claiming that the gibberish prompts could be used for backdoor adversarial attacks. “Absurd prompts that consistently generate images challenge our confidence in these big generative models,” they said, emphasising the need for more foundational research to explain the phenomenon.
In a Hacker News thread, the commenters were split. One commenter pointed out that this kind of phenomenon is to be expected: since the models are trained on natural-language internet data (which contains typos, abbreviations, etc.), the model always tries to associate unfamiliar words with semantically close ones.
Rachael Tatman, a language technology educator, also tried to explain the phenomenon in a series of tweets. She called the paper by Daras and Dimakis helpful because it highlights how easy it is for humans to see things in a “language-y” way, and pointed out that it is a good example of how big models can get weird.
Raphael Gontijo Lopes, a research scientist at Google Brain, thinks the “secret language” claim looks mostly like tokenizer effects, and that one can perform the inverse as well. He illustrated it with an example: he picked the names of two groups of fish, “Actinopterygii” and “Placodermi”, from Wikipedia, and prompted DALL·E 2 with “placoactin knunfidg”, which consistently generated fish images.
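The tokenizer-effects hypothesis can be illustrated with a toy example. Subword tokenizers split an unknown word into pieces they do know, and those pieces can still carry meaning. The greedy longest-match splitter and tiny vocabulary below are invented for illustration; they are not CLIP's or DALL·E 2's actual tokenizer or vocabulary.

```python
# Invented toy vocabulary: fragments of the fish-group names from the
# "placoactin knunfidg" example, plus some filler pieces.
TOY_VOCAB = {"placo", "actin", "kn", "un", "fi", "dg"}

def greedy_subword_split(word, vocab):
    """Greedily split `word` into the longest matching vocab pieces,
    falling back to single characters when nothing matches."""
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):  # try longest match first
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            pieces.append(word[i])  # unknown character, keep as-is
            i += 1
    return pieces

print(greedy_subword_split("placoactin", TOY_VOCAB))
# The gibberish word decomposes into "placo" + "actin" -- fragments of
# real fish-group names, which could steer generation toward fish.
```

Under this hypothesis, the "secret words" are not a hidden language but accidental combinations of subword pieces the model already associates with real concepts.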