Chinese tech firm Baidu’s leading text-to-image synthesis model, ‘ERNIE-ViLG’, censors text prompts labelled “politically sensitive”, such as “Tiananmen Square” and the names of several political leaders.
According to MIT Technology Review, users would be able to generate images that capture the distinctiveness of Chinese culture using the new AI. The company claims that the model can create better anime art than DALL-E 2 or other existing image-making AIs.
When the software’s demo was made publicly available in late August 2022, users quickly discovered that some prompts—both direct references to political figures’ names and expressions that may be contentious solely in political contexts—were flagged as “sensitive” and prevented from generating any results.
China’s sophisticated system of online censorship has thus extended to this latest trend in AI.
ERNIE-ViLG was first developed in 2021, and some users who tested its public demos found that the model was programmed to censor phrases of a politically sensitive nature.
The model is part of Wenxin, a large-scale project in natural-language processing. It has 10 billion parameters and was trained on a data set of 145 million image-text pairs. Parameters are the values that a neural network modifies as it learns, and the AI uses them to recognise the minute variations between ideas and artistic approaches.
ERNIE-ViLG has a smaller training data set than both DALL-E 2 (650 million pairs) and Stable Diffusion (2.3 billion pairs). Baidu also released a demo version of the model on Hugging Face, a popular AI community platform.
While words like “democracy” and “government” are allowed on their own, prompts that combine them with other words, like “democracy Middle East” or “British government,” are effectively censored. Likewise, images of Tiananmen Square in Beijing cannot be generated with ERNIE-ViLG, owing to the square’s association with the infamous Tiananmen Massacre, references to which are blatantly censored in China.