Machine learning is not just about machines, at least not yet. A human element remains in the loop, and it looks set to stay there for some time; in other words, artificial general intelligence (AGI) is still a distant dream. Because humans intervene in the learning process of ML models, their underlying biases surface in the form of inaccurate results.
Building a completely unbiased model is almost impossible, since humans generate the data and a model is only as good as the data it is fed. It is therefore the data engineer's job to watch for the ways in which bias can enter the system. According to Google's developer documentation, the following are the biases commonly encountered while training a machine learning model:
Automation Bias

Automation bias occurs when a human decision-maker favours recommendations made by an automated decision-making system over information obtained without automation, even when the automated system is known to make errors.
Confirmation Bias

Confirmation bias is the tendency to search for or interpret information in a way that confirms one's existing beliefs. Machine learning developers might sometimes collect or label data in ways that satisfy their unresolved prejudices. These biases seep into the results and can sometimes blow up on a large scale.
A related form is experimenter bias, where the data scientist keeps training a model until their previously held hypothesis is confirmed.
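Experimenter bias can be sketched in code. The following toy simulation, with purely illustrative numbers, evaluates a model that is in truth no better than a coin flip, but keeps re-drawing validation sets until one run happens to "confirm" the hypothesis that it beats 70% accuracy:

```python
import random

random.seed(1)

# Toy model: each prediction is correct with probability 0.5,
# i.e. its true accuracy is 50% (no better than chance).
def evaluate(n=20):
    return sum(random.random() < 0.5 for _ in range(n)) / n

# Experimenter bias: keep re-evaluating until the result looks good,
# then stop and report only that run.
attempts = 0
while True:
    attempts += 1
    acc = evaluate()
    if acc >= 0.7:
        break

print(f"'confirmed' accuracy {acc:.2f} after {attempts} evaluations")
```

Because a 20-example validation set clears the 70% bar by chance fairly often, stopping at the first favourable run reports an accuracy the model does not actually have.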
Group Attribution Bias
This bias commonly occurs when the assumption that what is true of one person is true of the whole group is taken too far. Its effects can worsen further if convenience sampling is used for data collection, since the attributions made this way rarely reflect reality.
Out-Group Homogeneity Bias
Consider two families: one has a pair of twins, the other does not. When the non-twin family is asked to tell the twins apart, they might falter, whereas the twins' parents will identify them with ease and may even give a nuanced description. To the non-twin family, the twins look all the same. The haste with which we make assumptions about groups outside our own leads to out-group homogeneity bias. Its counterpart, in-group bias, works the other way around.
Selection Bias

Selection bias results from errors in the way sampling is done. For example, suppose we need to build an ML model that predicts audience sentiment about films. If the data is collected by handing the audience a survey form, the following forms of bias can appear:
- Coverage bias: This occurs when the population represented in the dataset does not match the population the model makes predictions about. In the movie example above, by sampling only from people who chose to see the movie, the model's predictions may not generalise to people who never expressed that level of interest in the film.
- Sampling bias: This occurs when the sample is not random or diverse. If only the reviews of the front-row audience in a theatre are taken instead of a random group, then, needless to say, we will hardly grasp the sentiment of the audience as a whole.
- Non-response bias: This bias comes from the respondents rather than the data collectors, and occurs when certain sections of the audience choose not to review the movie. If the neutral viewers stay away and only those with strong opinions, usually the fans, pile up in the reviews, the results will lean in favour of the film. This bias is also known as participation bias.
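The effect of sampling bias can be seen in a minimal simulation. The audience below is hypothetical: fans (high sentiment) are assumed to sit in the front rows, and the numbers are illustrative only. Surveying just the front rows overestimates the overall sentiment, while a random sample tracks it closely:

```python
import random

random.seed(0)

# Hypothetical audience: 50 fans in the front rows rate the film ~9/10,
# the remaining 950 viewers rate it ~5/10 (illustrative numbers).
front_rows = [random.gauss(9, 0.5) for _ in range(50)]
rest = [random.gauss(5, 1.5) for _ in range(950)]
audience = front_rows + rest

def mean(xs):
    return sum(xs) / len(xs)

# Biased survey: only the front rows. Unbiased survey: a random sample.
biased_sample = front_rows
random_sample = random.sample(audience, 50)

print(f"true mean sentiment:   {mean(audience):.2f}")
print(f"front-row sample mean: {mean(biased_sample):.2f}")
print(f"random sample mean:    {mean(random_sample):.2f}")
```

The front-row estimate lands far above the true mean, exactly the distortion described above.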
Reporting Bias

Suppose an NLP model is trained on a dataset containing news from the last few decades. Though calling news biased is an understatement, a peculiar kind of bias, known as reporting bias, emerges from the way actions are documented: noteworthy events get written down, routine ones do not. For example, if the word 'laughed' is more prevalent than 'breathed' in the stories, a machine learning model that takes the frequency of words into account will conclude that laughing is more common than breathing!
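The laughed-versus-breathed effect can be reproduced with a simple word count. The toy corpus below is invented for illustration; real news text shows the same skew, since writers report remarkable actions far more often than universal ones:

```python
from collections import Counter

# Toy "news corpus" (invented sentences, not real data): stories mention
# 'laughed' often, 'breathed' rarely, even though breathing is universal.
corpus = [
    "the crowd laughed as the comedian bowed",
    "she laughed at the headline and laughed again",
    "he breathed a sigh of relief",
    "they laughed through the entire premiere",
]

tokens = " ".join(corpus).split()
counts = Counter(tokens)

# A frequency-based model would rank laughing as the more common event.
print("laughed:", counts["laughed"])   # → 4
print("breathed:", counts["breathed"]) # → 1
```

Any model that treats corpus frequency as real-world frequency inherits this distortion.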