According to a report by PwC, AI could contribute up to USD 15.7 trillion to the global economy by 2030, more than the current output of China and India combined. However, the exponential growth of AI has brought its own set of problems, and bias is one of the major issues stakeholders are grappling with. Bias in algorithms is not new: it goes back to the 1980s, when Dr Geoffrey Franglen of St George's Hospital Medical School wrote an algorithm to screen student applications, and the algorithm prioritised applicants with Caucasian names.
Below, we look at the major biases in AI.
According to the Mitigating Bias in Artificial Intelligence report by the Haas School of Business, AI systems are biased because they are human creations. They are classification technologies and are products of the context in which they are created and often mirror society. The perspectives and knowledge of those who develop AI systems are integrated into them, said the report.
The biases can enter the development phase of an AI system. “Human biases can be introduced into an AI system in multiple ways. It could be due to the training data that is used for machine learning algorithms, or it could be because of the biases carried by humans,” said Sarvagya Mishra, co-founder and director of SuperBot (PinnacleWorks).
Biases also arise from inaccurate data. A McKinsey report titled 'Tackling bias in artificial intelligence' found underlying data to be the source of bias in most cases. If the data fed into a model does not represent the demographics fairly, the model's outputs will be biased. Take a facial recognition system: if the dataset used to train it contains mostly images of white men and women, chances are the model will perform poorly on, and thus be biased against, people of colour.
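One simple guard against this kind of sampling bias is to audit the demographic composition of the training set before training. The sketch below uses invented group labels and an illustrative 15% threshold, not figures from any real dataset:

```python
from collections import Counter

# Hypothetical demographic labels attached to a face-recognition training set.
# Group names, counts, and the threshold are illustrative only.
training_labels = (
    ["white_male"] * 700 + ["white_female"] * 200 +
    ["black_male"] * 60 + ["black_female"] * 40
)

def underrepresented_groups(labels, threshold=0.15):
    """Return groups whose share of the data falls below `threshold`."""
    counts = Counter(labels)
    total = sum(counts.values())
    return sorted(g for g, n in counts.items() if n / total < threshold)

# Groups far below an even share are candidates for degraded model accuracy.
print(underrepresented_groups(training_labels))
```

A check like this does not fix the bias by itself, but it flags which subpopulations need more data before the model is trained.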
At times, existing biases in historical data get fed into the AI system. For example, in 2018, Amazon had to scrap its AI recruiting tool that was biased against women applicants. The AI was trained on patterns in resumes submitted over ten years, and most of these resumes were from men.
Unfairly aggregating different classes or populations also leads to poor representation and bias. Take the income trajectories of lawyers and athletes: a lawyer's income typically rises with age as experience accumulates, while an athlete earns most of their income early in their career. Pooling both groups into a single analysis will produce biased results for each.
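The lawyer/athlete example can be made concrete with a small least-squares calculation. The age and income figures below are invented for illustration; the point is that the pooled trend line matches neither group's real trajectory:

```python
def slope(xs, ys):
    """Ordinary-least-squares slope of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

# Illustrative (invented) age/income pairs for the two professions.
lawyer_ages,  lawyer_incomes  = [30, 40, 50, 60], [60, 90, 120, 150]
athlete_ages, athlete_incomes = [20, 25, 30, 35], [200, 150, 80, 30]

print(slope(lawyer_ages, lawyer_incomes))    # positive: income rises with age
print(slope(athlete_ages, athlete_incomes))  # negative: income falls with age

pooled_ages = lawyer_ages + athlete_ages
pooled_incomes = lawyer_incomes + athlete_incomes
print(slope(pooled_ages, pooled_incomes))    # near zero: tells neither story
```

A model trained on the pooled data would understate the age effect for both professions, which is exactly the aggregation bias the paragraph describes.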
Sometimes the data is accurate, but bias creeps in during model evaluation: models are tested against certain benchmarks, and when those benchmarks are not aligned with the model's purpose, the evaluation results are biased.
Further, bias also occurs when an AI model predicts values accurately on its training dataset but fails to generalise to new data (overfitting). According to a World Economic Forum report, biases can also accumulate over time as incoming data shifts away from the data the model was trained on (data drift).
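Both failure modes in this paragraph are measurable. A minimal sketch, with all thresholds and numbers invented for illustration: overfitting shows up as a large gap between training and held-out accuracy, and drift as a shift in a feature's distribution between two time windows.

```python
from statistics import mean, stdev

def is_overfitting(train_acc, test_acc, max_gap=0.05):
    """Flag a model whose training accuracy far exceeds its held-out accuracy."""
    return (train_acc - test_acc) > max_gap

def drift_score(reference, current):
    """Standardised mean shift of one feature between two time windows."""
    return abs(mean(current) - mean(reference)) / stdev(reference)

# Illustrative numbers: the model memorises its training data ...
print(is_overfitting(0.99, 0.81))   # large train/test gap

# ... and a feature's values shift between deployment windows.
reference_window = [10, 12, 11, 13, 12, 11]
current_window = [18, 20, 19, 21, 20, 19]
print(drift_score(reference_window, current_window))
```

In practice such checks run on every retraining or monitoring cycle, so drift is caught before it accumulates into biased decisions.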
Fixing AI biases
The first step in eliminating biases should be to ensure the data fed to the algorithms is accurate. Having the right dataset is pivotal. The developers should examine the training dataset and make sure it is representative and does not lead to sampling biases. Mishra said if the data is clean and its labelling is done extensively and accurately, it takes care of many biases.
According to McKinsey, developers should also conduct a subpopulation analysis. It is important to calculate model metrics for specific groups in the dataset. This will allow the developers to see whether the model performance is identical across subpopulations in the dataset. Also, a timely check of the models is necessary as the outcome of machine learning algorithms changes as they continue to learn or when the training data changes.
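The subpopulation analysis McKinsey describes amounts to computing the same model metric separately for each group. A minimal sketch, with invented group names and predictions:

```python
from collections import defaultdict

def per_group_accuracy(records):
    """Compute accuracy separately for each subpopulation.

    `records` is a list of (group, true_label, predicted_label) tuples.
    The groups and predictions below are hypothetical.
    """
    hits, totals = defaultdict(int), defaultdict(int)
    for group, truth, pred in records:
        totals[group] += 1
        hits[group] += (truth == pred)
    return {g: hits[g] / totals[g] for g in totals}

records = [
    ("group_a", 1, 1), ("group_a", 0, 0), ("group_a", 1, 1), ("group_a", 0, 0),
    ("group_b", 1, 0), ("group_b", 0, 0), ("group_b", 1, 0), ("group_b", 0, 1),
]
# A large per-group gap (here 1.0 vs 0.25) signals biased performance.
print(per_group_accuracy(records))
```

An overall accuracy of 62.5% would hide this disparity entirely, which is why the per-group breakdown matters.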
An algorithm might produce biased results because of an error in its development phase. But you may not realise the downstream impact until much later. “We need systems that can help us make informed decisions based on the available information while at the same time being accountable for the outcomes of the decisions made. It needs to be calibrated so that it doesn’t lead to harm or injustice,” said Mishra.
The National Institute of Standards and Technology (NIST), in its report titled 'Towards a Standard for Identifying and Managing Bias in Artificial Intelligence', said we must widen the scope of where we look for the source of these biases. To eliminate or minimise biases, we must go beyond the machine learning processes and data used to train the algorithms and consider the broader societal factors that influence technology or how it is being developed.
According to Satyakam Mohanty, chief product officer at Fosfor, it is hard to avoid the biases as fairness is not something you can crisply define in an ML pipeline and set thresholds while coding. Some biases are inherent and difficult to identify. However, some can be traced using the right methods and techniques. Bias identification is a challenge, but mitigation is possible by minimising the bias and introducing a demographic or statistical parity to equalise data representation.
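The demographic (statistical) parity check Mohanty mentions compares the rate of favourable decisions across groups; parity means the rates are equal. A minimal sketch with invented group names and decision lists:

```python
def demographic_parity_gap(outcomes):
    """Difference between the highest and lowest favourable-decision rates.

    `outcomes` maps each group to a list of binary decisions (1 = favourable).
    Group names and numbers below are illustrative.
    """
    rates = {g: sum(d) / len(d) for g, d in outcomes.items()}
    return max(rates.values()) - min(rates.values())

decisions = {
    "group_a": [1, 1, 1, 0, 1, 1, 0, 1],   # 75% favourable
    "group_b": [1, 0, 0, 0, 1, 0, 0, 0],   # 25% favourable
}
# 0.0 would mean statistical parity; a large gap flags unequal treatment.
print(demographic_parity_gap(decisions))
```

As the paragraph notes, a non-zero gap does not by itself prove unfairness, but it gives developers a concrete threshold to monitor and equalise against.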
“A more practical approach is knowing the potential bias and impact of decisions until and unless it is a mission-critical application. Sometimes little bias plus decision generation is preferred over no decision with the caveat of potential risk,” Mohanty added.
According to PwC research published last year, only 20 percent of enterprises had an AI ethics framework, and only 35 percent had plans to improve the governance of AI systems and processes. The need of the hour is to drive awareness and bring more transparency and accountability to AI systems.