Researchers at Carnegie Mellon University recently used natural language processing (NLP) to study the social biases present in Bollywood and Hollywood films. They gathered subtitles from 700 films for each industry, representing seven decades of cultural evolution, and analysed them with an AI tool. The tool was first trained on the subtitles to complete sentences and then progressed to generating its own dialogue.
The results revealed both biases that have slowly faded from the two industries and those that unfortunately remain. For instance, the practice of dowry fell out of favour in Bollywood’s popular culture, as indicated by the AI gradually associating negative words with it. When asked to fill in the blank in “A beautiful woman should have ____ skin”, the Bollywood-trained AI responded with ‘fair’ rather than ‘soft’, the usual response from AI systems trained on general language data. The association of beauty with fair skin therefore remains pervasive in Bollywood and, by extension, in most Indian cultures. The researchers also found that the AI linked different religions with very different words, some with positive connotations and others with negative ones, reflecting biases in how Bollywood portrays those communities.
But perhaps the more interesting outcome of the research was how the AI tool absorbed the cultural biases of the movies and reproduced them in filling out sentences and themes. To what extent is Artificial Intelligence free from human bias, if it can be free of it at all?
The Plight with Large Datasets
Google’s recent controversy surrounding the removal of Timnit Gebru from its ethical AI team had its origins in a research paper she co-authored. In it, she argued that Google’s deployment of large-scale language models would benefit privileged sections of society, in most cases rich white men, while being biased against minorities of colour. Her claim is, in fact, backed up by numerous studies, indicating a worrying phenomenon for the future of AI.
Large, pre-trained language models such as Google’s BERT, with 340 million parameters, and OpenAI’s GPT-2 exhibited significant racial, ethnic and gender stereotypes in a study conducted by AI researchers at MIT. And this extends beyond language models to fields like photography, facial recognition, medicine and even the judiciary.
Courts in the United States are influenced by ‘risk assessments’ when deciding whether to grant bail or how much jail time to hand down. These assessments are provided by algorithms that exhibit a negative bias towards minorities of colour, especially African Americans. As a result, individuals from these communities receive a higher risk assessment than white individuals, even for the same or a lesser crime. The greater assigned risk arises from the systemic racism pervasive in American society, which has disproportionately incarcerated African Americans and thus provided a biased dataset for the algorithm.
Google’s photo app has accidentally tagged photos of Black people as gorillas, while researchers at MIT found that medical AI suffered from representation, measurement, aggregation and historical biases against minorities. Another research study found that a risk-prediction algorithm for healthcare was biased in its metric for determining healthcare need, with white patients favoured over Black patients. The reason was that the algorithm had been trained to use only healthcare spending as a measure of need. Because historically less has been spent on healthcare for the African American community, the algorithm formed the wrong notion that these patients were less in need.
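The spending-as-need problem can be illustrated with a toy sketch. Everything here is hypothetical: the `Patient` fields, the two groups, the access factor and the numbers are invented for illustration and are not taken from the study. The sketch assumes two groups with identical underlying need, where one group receives only half the care it needs, and shows how ranking by spending shuts that group out.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    group: str     # hypothetical demographic label
    need: float    # true health need on an invented 0-10 scale
    access: float  # fraction of needed care the patient actually receives


def spending(p: Patient) -> float:
    # Observed spending is need scaled by access, so unequal access
    # makes equally sick patients look cheaper (and thus "healthier").
    return p.need * p.access * 1000.0


def select_for_care(patients, k, score):
    # Pick the k patients with the highest score for an extra-care programme.
    return sorted(patients, key=score, reverse=True)[:k]


# Both groups have the same true need; group B has half the access.
patients = [
    Patient("A", 8.0, 1.0), Patient("A", 6.0, 1.0), Patient("A", 4.0, 1.0),
    Patient("B", 8.0, 0.5), Patient("B", 6.0, 0.5), Patient("B", 4.0, 0.5),
]

by_spending = select_for_care(patients, 2, spending)
by_need = select_for_care(patients, 2, lambda p: p.need)

print([p.group for p in by_spending])  # chosen using the spending proxy
print([p.group for p in by_need])      # chosen using true need
```

In this toy setup the spending proxy selects only group A patients, while ranking by true need selects the sickest patient from each group. The bias comes from the choice of target variable, not from any explicit use of the group label.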
In another case, Amazon’s recruitment algorithm exhibited an error rate greater than 30% when it came to women, especially darker-skinned women, after being trained on over ten years of recruitment data. White men were, however, recognised and processed with 100% accuracy. Most of that data came from years when a culture of sexism was pervasive in the tech industry, with disproportionately few hires and opportunities for women, even fewer for women of colour, and inflated hiring numbers for white men. As a result, the AI falsely learned that men were more qualified for the job.
The Way Out
Even as large, randomised datasets are being criticised as detrimental to the creation of truly unbiased AI, Facebook announced a new AI project recently that would ‘learn from videos’. The AI scours social media for publicly available videos and adapts to the fast-moving world to recognise the nuances and visual cues across different cultures and regions. This is so that the AI can start understanding the world through vast amounts of observational data like humans do. The thing is, people are biased, and as past evidence indicates, so will this dataset be, and by extension, the AI.
All hope is not lost yet, however. A London-based AI startup, Synthesized, has developed a tool that seeks to mitigate and even eliminate biases in datasets. When a dataset is uploaded, the tool scans it for the various biases present and even attempts to solve issues like representation bias by filling in synthesised (hence the name) data to ensure equal representation of all communities. Besides tools like Synthesized that essentially fight fire with fire, researchers have proposed methods to reduce bias in AI language models, such as counterfactual fairness. Under this approach, a model is considered fair if it produces the same decision for an individual in a certain demographic as it would if that individual belonged to another demographic in a hypothetical ‘counterfactual’ world, which indicates that the individual’s background has not influenced the decision.
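The counterfactual idea can be sketched as a simple "flip test". This is a minimal illustration under stated assumptions, not the researchers' method: the `Applicant` fields, both scoring functions and the threshold are hypothetical, and the test only swaps the demographic label while holding everything else fixed. A full counterfactual fairness analysis would also model how the attribute causally affects the other features.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Applicant:
    group: str              # hypothetical demographic label
    years_experience: int
    test_score: float


def biased_model(a: Applicant) -> bool:
    # Hypothetical scorer that penalises group "B" directly.
    score = 0.5 * a.years_experience + 0.1 * a.test_score
    if a.group == "B":
        score -= 2.0
    return score >= 8.0


def fair_model(a: Applicant) -> bool:
    # Hypothetical scorer that ignores the group label entirely.
    return 0.5 * a.years_experience + 0.1 * a.test_score >= 8.0


def is_counterfactually_consistent(model, applicant, groups) -> bool:
    # Re-run the model with the applicant placed in every group and
    # check that the decision never changes.
    decisions = {model(replace(applicant, group=g)) for g in groups}
    return len(decisions) == 1


applicant = Applicant(group="A", years_experience=6, test_score=60.0)
groups = ["A", "B"]

print(is_counterfactually_consistent(biased_model, applicant, groups))
print(is_counterfactually_consistent(fair_model, applicant, groups))
```

The biased scorer accepts this applicant in group A but rejects the same applicant in group B, so it fails the check, while the scorer that ignores the label passes. In practice a model can still fail a proper counterfactual test through proxy features even when it never reads the label itself, which is why the simple flip test is only a first screen.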