Outsmarting ML Biases: A Checklist

“ML optimization somehow finds a way and mitigation of bias is not always possible.”

Luca Massaron

The debate around fairness and explainability within the AI community has never been hotter. The work of Timnit Gebru and others has brought ethical AI into the spotlight. Machine learning has a reputation for being a black box, i.e., one does not know what a model learns or why. This makes validating results tricky, and you don’t want to bet a medical diagnosis on such models.

At the ongoing MLDS 2021 event organised by Analytics India Magazine, Luca Massaron, a Google Developer Expert and renowned ML author, introduced the audience to the various ways in which the popular TensorFlow framework can be leveraged to mitigate biases in models.

Luca discussed various factors that influence the results of ML-based applications, especially the fairness problems that can arise from the optimisation strategies of different machine learning algorithms, notwithstanding practitioners’ best intentions and precautions. During the presentation, Luca gave a 101 on algorithmic bias and why addressing it is so important for the widespread adoption of machine learning and AI.

Chasing The Impossible 

Image credits: Luca Massaron

Machine learning algorithms relentlessly search for a solution. In the case of GANs, the generator and discriminator networks somehow find a way to fool each other; the result is a deepfake. Deepfakes are harmful enough, but ML is also used in more critical industries such as healthcare, where a model trained on data that underrepresents some groups raises the chances of misdiagnosis. “Each ML algorithm has a strategy to answer optimally to your question,” warned Luca.

“ML is not just a matter of data and objectivity, sometimes there are loopholes.”

Luca listed different ways in which fairness can be defined (two of them are made concrete in the sketch after this list):

  • Unawareness
  • Demographic parity
  • Equalised odds
  • Predictive rate parity
  • Individual fairness
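
To ground two of these notions, here is a minimal NumPy sketch; the predictions, labels and group memberships are invented for illustration. Demographic parity asks whether the positive prediction rate matches across groups, while equalised odds asks the same of the true and false positive rates.

import numpy as np

# Hypothetical model outputs, ground truth and a binary group attribute.
y_pred = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_true = np.array([1, 0, 0, 1, 0, 1, 1, 0])
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])

for g in (0, 1):
    mask = group == g
    ppr = y_pred[mask].mean()                  # demographic parity rate
    tpr = y_pred[mask & (y_true == 1)].mean()  # equalised odds compares
    fpr = y_pred[mask & (y_true == 0)].mean()  # TPR and FPR per group
    print(f"group {g}: positive rate {ppr:.2f}, TPR {tpr:.2f}, FPR {fpr:.2f}")

On this toy data, the two groups receive positive predictions at rates of 0.75 and 0.25, so demographic parity is violated even though each individual prediction may look reasonable in isolation.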

The different definitions make things even more cumbersome for the data scientist. Citing work on the impossibility of fairness, Luca also explained why some notions of fairness are mutually incompatible and cannot be satisfied simultaneously. “There is no single universal metric for quantifying fairness that can be applied to all ML problems,” he added.

No matter how foolproof the data curation process is, loopholes might creep in. So, what are these loopholes? According to Luca, there are five:

  • Skewed sample
  • Tainted examples
  • Limited features
  • Sample size disparity
  • Proxies

“Algorithm bias is something related to how algorithms find their solutions and it is a widespread problem. Optimisation can easily hide its consequences behind performance. Implications are severe if ML solutions are powering AI whose artefacts are operating in society and economy,” explained Luca.
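
That hiding is easy to reproduce. In the hypothetical sketch below (group sizes and per-group accuracies are assumed numbers, not figures from the talk), a model reports roughly 91% overall accuracy while performing barely better than a coin flip on an underrepresented group, because the majority group dominates the average:

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical evaluation set: 950 majority samples, 50 minority samples.
correct_major = rng.random(950) < 0.93   # assumed majority-group accuracy
correct_minor = rng.random(50) < 0.55    # assumed minority-group accuracy
correct = np.concatenate([correct_major, correct_minor])

print(f"overall accuracy:  {correct.mean():.3f}")        # looks healthy
print(f"majority accuracy: {correct_major.mean():.3f}")
print(f"minority accuracy: {correct_minor.mean():.3f}")  # near chance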

What Tools Do We Have

“There is a growing academic attention to the topic, which doesn’t correspond to established practices.”

When it comes to ML fairness toolkits, Google’s TensorFlow team has been at the forefront, developing multiple tools that address specific corners of the fairness debate. The wider conversation around ML fairness is pushing companies like Google to establish an ecosystem of fairer ML practice through their tooling. Be it the What-If Tool, ML-fairness-gym or Fairness Indicators, Google’s AI team has been regularly handing the developer community new tools. Luca highlighted the many tools Google has introduced over the past couple of years.

Let’s take a look at a few of these tools:

TFCO

TFCO is a library for optimising inequality-constrained problems in TensorFlow. Both the objective function, which might be an accuracy to maximise or a loss to minimise, and the constraints are represented as Tensors, giving users maximum flexibility in specifying their optimisation problems.

import tensorflow_constrained_optimization as tfco
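
As a minimal sketch of how the library’s rate-constraint API can encode a demographic-parity-style constraint; the model, the features/labels/groups tensors and the 0.05 slack below are our assumptions, not code from the talk. The idea is to build a rate context over the model’s outputs, derive per-group subsets, and bound the gap between their positive prediction rates:

import tensorflow as tf
import tensorflow_constrained_optimization as tfco

# Hypothetical inputs: a Keras model plus features/labels/groups tensors.
context = tfco.rate_context(lambda: model(features), lambda: labels)
group_a = context.subset(lambda: groups == 0)
group_b = context.subset(lambda: groups == 1)

# Minimise the error rate subject to the two groups' positive
# prediction rates staying within 0.05 of each other.
problem = tfco.RateMinimizationProblem(
    tfco.error_rate(context),
    [tfco.positive_prediction_rate(group_a)
         <= tfco.positive_prediction_rate(group_b) + 0.05,
     tfco.positive_prediction_rate(group_b)
         <= tfco.positive_prediction_rate(group_a) + 0.05])

optimizer = tfco.LagrangianOptimizerV2(
    tf.keras.optimizers.Adagrad(learning_rate=0.1),
    num_constraints=problem.num_constraints)
# optimizer.minimize(problem, var_list=...) then runs in the training loop.

Because the objective and constraints are ordinary rate expressions, swapping in a different notion of fairness, for example bounding gaps in per-group true positive rates, only means changing the rate expressions.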

Google’s Fairness Indicators

Fairness Indicators is built on top of TensorFlow Model Analysis (TFMA), a component of TensorFlow Extended (TFX) that can be used to investigate and visualise model performance. The indicators can also be surfaced via TensorBoard alongside other evaluation metrics.
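
As a rough sketch of the workflow (the DataFrame columns and the “gender” slice are invented placeholders), one configures a TFMA evaluation with the FairnessIndicators metric and slices the results by a sensitive feature:

import pandas as pd
import tensorflow_model_analysis as tfma

# Hypothetical evaluation data: model scores, labels and a slice column.
df = pd.DataFrame({
    "prediction": [0.9, 0.2, 0.7, 0.4],
    "label": [1, 0, 1, 0],
    "gender": ["f", "m", "f", "m"],
})

eval_config = tfma.EvalConfig(
    model_specs=[tfma.ModelSpec(prediction_key="prediction",
                                label_key="label")],
    slicing_specs=[tfma.SlicingSpec(),                          # overall
                   tfma.SlicingSpec(feature_keys=["gender"])],  # per group
    metrics_specs=[tfma.MetricsSpec(metrics=[
        tfma.MetricConfig(class_name="FairnessIndicators",
                          config='{"thresholds": [0.5]}')])])

eval_result = tfma.analyze_raw_data(df, eval_config)

The resulting eval_result can then be rendered with the Fairness Indicators widget or exported for inspection in TensorBoard.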

These are a few of the tools Google has developed, and there is no doubt more are to come as AI-based applications move out of the labs and into the real world. “ML systems are complex and can hide fairness problems. Fairness, privacy concerns and explainability are facets of the same hurdle in AI/ML adoption,” concluded Luca.

Ram Sagar
I have a master's degree in Robotics and I write about machine learning advancements.
