
Google’s New ML Fairness Gym Has A Clear Mission — Track Down Bias & Promote Fairness In AI

Human societies are extremely complex. Cultural, racial and geographical differences around the globe, together with a lack of curated data, make ‘fairness’ in technology a huge challenge. Now, in an attempt to track the long-term societal impacts of artificial intelligence, Google researchers have released a machine learning fairness gym, which they built on top of OpenAI’s Gym framework.



Testing Fairness Using OpenAI Gym

OpenAI’s Gym is a toolkit for developing and comparing reinforcement learning algorithms and is compatible with any numerical computation library, such as TensorFlow or Theano.

The gym library is a collection of test problems — environments — that one can use to work out reinforcement learning algorithms. Google researchers have used this platform to build their own fairness gym.
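As a rough illustration of what an environment looks like, here is a minimal sketch that follows the classic Gym convention of `reset()` and `step(action)` without importing the library. `ToyLendingEnv`, `threshold_agent` and the +1/-1 reward are hypothetical and are not part of OpenAI Gym or Google's fairness gym. Newer Gym releases also return a slightly different tuple from `step`.

```python
import random


class ToyLendingEnv:
    """Hypothetical environment that mimics the classic Gym reset/step interface."""

    def __init__(self, n_steps=100, seed=0):
        self.n_steps = n_steps
        self.rng = random.Random(seed)
        self.t = 0
        self.applicant = None

    def _next_applicant(self):
        # Observation: the applicant's repayment probability in [0, 1).
        return self.rng.random()

    def reset(self):
        # Start a new episode and return the first observation.
        self.t = 0
        self.applicant = self._next_applicant()
        return self.applicant

    def step(self, action):
        # action: 1 = grant the loan, 0 = reject it.
        reward = 0.0
        if action == 1:
            repaid = self.rng.random() < self.applicant
            reward = 1.0 if repaid else -1.0  # assumed payoff, for illustration only
        self.t += 1
        done = self.t >= self.n_steps
        self.applicant = self._next_applicant()
        return self.applicant, reward, done, {}


def threshold_agent(observation, threshold=0.5):
    # Lend only when the repayment probability clears the threshold.
    return 1 if observation >= threshold else 0


def run_episode(env, agent):
    # The standard interaction loop: observe, act, collect the reward.
    obs = env.reset()
    total_reward, done = 0.0, False
    while not done:
        obs, reward, done, _ = env.step(agent(obs))
        total_reward += reward
    return total_reward


total = run_episode(ToyLendingEnv(seed=0), threshold_agent)
print("total reward:", total)
```

The agent only ever sees observations and rewards, which is what lets different policies be compared on the same simulated problem.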

To explain how bias creeps into models, the researchers use the example of lending money on the basis of credit scores in their blog post. According to their analysis, the strategies or metrics used to decide whether an individual qualifies for a loan were unfair at times.

In their paper, titled Fairness is not static, they describe in detail how the simulation experiments were carried out. They divided the agents in the environment into three types:

  • a static agent that implements a naïve, one-shot classification strategy.
  • a robust agent that implements a similar one-shot policy, but uses a robust classification algorithm.
  • a continuous agent that gathers an initial set of unmanipulated applicants, then continuously retrains a non-robust classifier based on the subsequent manipulated scores and labels that it observes.

The researchers believe the continuous agent is a reasonable model of deployed machine learning systems.
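The difference between a one-shot agent and a continuously retrained one can be sketched in a few lines. This is only an illustration under assumptions: the class names, the midpoint rule in `fit_threshold` and the sample data are made up, and the robust agent is left out because the article does not say which algorithm it uses.

```python
def fit_threshold(scores, labels):
    """Assumed toy classifier: midpoint between the mean score of each class."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


class StaticAgent:
    """Fits once on the initial, unmanipulated applicants and never retrains."""

    def __init__(self, scores, labels):
        self.threshold = fit_threshold(scores, labels)

    def act(self, score):
        return 1 if score >= self.threshold else 0

    def observe(self, score, label):
        pass  # one-shot policy: later observations are ignored


class ContinuousAgent(StaticAgent):
    """Refits the same non-robust classifier after every new observation."""

    def __init__(self, scores, labels):
        self.scores, self.labels = list(scores), list(labels)
        super().__init__(self.scores, self.labels)

    def observe(self, score, label):
        self.scores.append(score)
        self.labels.append(label)
        self.threshold = fit_threshold(self.scores, self.labels)


# Made-up initial data: two applicants who default (0), two who repay (1).
initial_scores = [0.2, 0.4, 0.6, 0.8]
initial_labels = [0, 0, 1, 1]

static = StaticAgent(initial_scores, initial_labels)
continuous = ContinuousAgent(initial_scores, initial_labels)

# Later applicants inflate their scores to 0.7 but still default.
for agent in (static, continuous):
    agent.observe(0.7, 0)
    agent.observe(0.7, 0)

print("static threshold:", static.threshold)
print("continuous threshold:", continuous.threshold)
```

In this toy setting the static agent keeps its original threshold, while the continuous agent moves its threshold upward once it sees manipulated scores paired with defaults.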

Using the gym, the Google team found that in the lending experiment, the equal opportunity agent (EO agent) overlends to the disadvantaged group (which initially has a lower average credit score) by sometimes applying a lower threshold for that group than the max reward agent would apply.

This causes the credit scores of the disadvantaged group to decrease more than those of the other group, resulting in a wider credit score gap between the groups than in the simulations with the max reward agent.

Depending on whether the indicator of welfare is the credit score or the total loans received, it could be argued that the EO agent is either better or more detrimental for the disadvantaged group than the max reward agent.
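The trade-off between the two welfare indicators can be made concrete with a small worked example. This is a sketch under assumed dynamics, not the researchers' simulation: `simulate_group`, the repayment probabilities and the rule that a repaid loan raises a score by 1 while a default lowers it by 1 are all made up for illustration.

```python
def simulate_group(repay_probs, threshold):
    """Return (loans granted, expected total score change) for one group.

    Assumed dynamics: a repaid loan moves the borrower's score by +1,
    a default moves it by -1, and rejected applicants are unaffected.
    """
    approved = [p for p in repay_probs if p >= threshold]
    expected_change = sum(p * 1 + (1 - p) * -1 for p in approved)
    return len(approved), expected_change


# Made-up repayment probabilities for applicants in one group.
group = [0.2, 0.3, 0.4, 0.6, 0.7]

strict = simulate_group(group, threshold=0.5)   # stands in for a higher threshold
lenient = simulate_group(group, threshold=0.3)  # stands in for a lowered threshold

print("strict  (loans, expected score change):", strict)
print("lenient (loans, expected score change):", lenient)
```

Under these assumptions the lower threshold grants more loans to the group but yields a smaller expected gain in its credit scores, so which policy looks better depends on the indicator chosen.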

They also found that equal opportunity constraints, which enforce an equalised true positive rate (TPR, the share of actual positives that are correctly identified) between groups at each step, do not equalise TPR in aggregate over the simulation.
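A short arithmetic example shows how this can happen. The counts below are made up: both groups have the same TPR at every step, but because the number of actual positives in each group shifts between steps, the aggregate TPRs diverge.

```python
def tpr(true_positives, actual_positives):
    # True positive rate: share of actual positives that were correctly identified.
    return true_positives / actual_positives


def aggregate_tpr(steps):
    # Pool the counts over all steps before dividing.
    total_pos = sum(pos for pos, _ in steps)
    total_tp = sum(tp for _, tp in steps)
    return total_tp / total_pos


# Made-up counts per step: (actual positives, true positives).
group_a = [(100, 50), (10, 9)]
group_b = [(10, 5), (100, 90)]

per_step_a = [tpr(tp, pos) for pos, tp in group_a]
per_step_b = [tpr(tp, pos) for pos, tp in group_b]

print("per-step TPR, group A:", per_step_a)
print("per-step TPR, group B:", per_step_b)
print("aggregate TPR, group A:", aggregate_tpr(group_a))
print("aggregate TPR, group B:", aggregate_tpr(group_b))
```

Here the per-step rates are 0.5 and 0.9 for both groups, yet the pooled rates are 59/110 for group A and 95/110 for group B.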

This indicates that the equality of opportunity metric is difficult to interpret when the underlying population is evolving, and suggests that more careful analysis is necessary to ensure that the ML system is having the desired effects.

Many existing tools for evaluating fairness concerns don’t work well on large-scale datasets and models.

Here is a list of tools that promote ML fairness:

Google’s Fairness Indicators

Fairness Indicators is built on top of TensorFlow Model Analysis, a component of TensorFlow Extended (TFX) that can be used to investigate and visualise model performance. Fairness Indicators can also be accessed in TensorBoard for evaluating other real-time metrics. 

Microsoft’s Fairlearn

The Fairlearn project seeks to enable anyone involved in the development of artificial intelligence systems to assess their system’s fairness and mitigate the observed unfairness. The Fairlearn repository contains a Python package and Jupyter notebooks with examples of usage.

IBM’s AI Fairness 360 

The AI Fairness 360 Python package includes a comprehensive set of metrics for datasets and models to test for biases and is designed to translate algorithmic research from the lab into the actual practice of domains as wide-ranging as finance, human capital management, healthcare, and education. 

These tools make it possible to investigate the performance of models and their underlying biases, and even to visualise the results. Google’s Fairness Indicators, for example, integrates with the What-If Tool, which loads specific data points and helps with counterfactual analysis.


Ram Sagar
I have a master's degree in Robotics and I write about machine learning advancements.
