
Beginner’s Guide To Explainable AI: Hands-On Introduction To What-If Tool


Explainable AI, or XAI for short, is a domain concerned with bringing transparency to the decision-making of complex machine learning models and algorithms.

In this article, we will take a look at one such tool built to make AI explainable.

A simple way to understand this concept is to compare the decision-making process of humans with that of machines. How do we humans come to a decision? We make decisions all the time, ranging from small, insignificant ones like what outfit to wear for an event, to highly complex, risk-laden ones such as investments or loan approvals.

At the end of the day, however, we can justify our choices to ourselves and to others by explaining how we came to a particular decision. It is not that simple in the case of AI. In machine learning, that explaining part was essentially nonexistent, at least until the concept of Explainable AI came along.

The explainability of a machine learning algorithm, that is, the answer to the question of how the algorithm came to a particular decision, is completely hidden from the users. Such a model is called a black-box model, and that is what most algorithms and neural networks are.

“Without transparency there is very little trust”

The objective behind Explainable AI is to bring as much transparency as possible to a machine learning model by answering questions that are hidden deep within the complexity of an algorithm or a neural network.

Why is it significant?

Why transparency? Because without it, we have very little insight into how the model actually behaves. Besides, XAI can improve and simplify the use of complex algorithms by making them far more interactive and understandable.

Consider a simple example where a CNN classifies an image as either a dog or a cat. The algorithm makes decisions based on the features that it extracts. Now, how do we find out exactly what pushes the algorithm towards one prediction or the other?

One way is to try out different values for a feature until we reach a point where the model’s decision changes. This helps us better understand the model’s decision making and identify the significance of each value that passes through the network. This is one small aspect of XAI.

Another interesting, if comical, aspect of XAI is that it may even serve as a safeguard against a possible AI apocalypse, since we might be able to steer a machine towards good decisions rather than decisions like exterminating humans.

In one of our previous articles, we listed down 8 Explainable AI frameworks that are driving a new paradigm for transparency in AI.

The What-If Tool

An Explainable AI tool from Google, the What-If Tool is just what it sounds like: it is intended to question the decisions made by an algorithm. Questions like “What if I change a particular data point?” or “What if I used a different feature, how would that affect the model’s outcome?” are contemplated here.

Answering these questions would normally mean writing code that is complex and specific to a given model. The What-If Tool makes this process effortless. It provides answers to a wide variety of what-if scenarios without the need to write any code, making it easier for a larger set of users, including non-programmers, to understand an algorithm better.

The tool is a feature of the open-source TensorBoard web application and offers an interactive visual interface for exploring model behaviour.

Hands-On Experience With The What-If Tool

We will use the What-If Tool (WIT) to compare two different models and their predictions on the same data. The following code is inspired by the WIT Demo For Model Comparison.

Setting Up

For this hands-on section, we will use data from MachineHack’s Predicting Food Delivery Time – Hackathon by IMS Proschool.

Download the Participant’s data and upload it to a directory in your Google Drive.

Open your Google Drive and create a new Google Colab notebook. Mount your Google Drive by running the following code block and authorising your account.

from google.colab import drive

drive.mount("/GD")

Now we are all set to go!

Installing & Importing Necessary Libraries 
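
The original install and import cell is not reproduced here, so the following is a minimal sketch of the typical setup for this walkthrough (the package and import names are the standard witwidget ones, but treat the details as assumptions):

# Install the What-If Tool notebook widget (recent Colab images already ship with it).
!pip install witwidget

import numpy as np
import pandas as pd
import tensorflow as tf

from witwidget.notebook.visualization import WitConfigBuilder, WitWidget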

Loading The Dataset And Splitting Into Training And Validation Sets

train = pd.read_csv("/GD/My Drive/Colab Notebooks/DataSets/train.csv")

from sklearn.model_selection import train_test_split

train, val = train_test_split(train, test_size = 0.1, random_state = 2)

Note: The dataset has already been cleaned and prepared to an extent.

Let’s take a look at the dataset:
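
For instance, a quick peek at the first few rows:

train.head()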

Data Preprocessing 

Now that we have our dataset loaded, we will preprocess the data into a format that is acceptable to the TensorFlow estimators. The TensorFlow Estimator is a high-level API that allows us to work with pre-implemented models. Estimators require the data in a specific format and use feature columns and input functions to create specifications for the model input.

In the following code block, we will implement functions to perform the following operations:

  • A function to create a TensorFlow feature spec from the data frame and columns specified.
  • A function to create simple numeric and categorical feature columns from the feature spec and a list of columns from that spec to use.
  • A function to parse the tf.Example protos into features for the input function.
  • A function to convert a data frame into a list of tf.Example protos.
  • A function to encode the label column and return the classes.
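
The article’s original cell is not shown here, so the snippet below is a condensed sketch of these helpers, adapted from the public WIT model-comparison demo; the function names and exact details are assumptions rather than the article’s own code.

def create_feature_spec(df, columns):
    # Map each dataframe column to a TensorFlow feature spec entry based on its dtype.
    feature_spec = {}
    for col in columns:
        if df[col].dtype in (np.int64, np.int32):
            feature_spec[col] = tf.io.FixedLenFeature(shape=(), dtype=tf.int64)
        elif df[col].dtype in (np.float64, np.float32):
            feature_spec[col] = tf.io.FixedLenFeature(shape=(), dtype=tf.float32)
        else:
            feature_spec[col] = tf.io.FixedLenFeature(shape=(), dtype=tf.string)
    return feature_spec

def create_feature_columns(df, columns, feature_spec):
    # Numeric columns for numbers; vocabulary-list columns for strings, wrapped as
    # indicator columns so that both the linear and the DNN estimator can consume them.
    cols = []
    for col in columns:
        if feature_spec[col].dtype in (tf.int64, tf.float32):
            cols.append(tf.feature_column.numeric_column(col))
        else:
            cols.append(tf.feature_column.indicator_column(
                tf.feature_column.categorical_column_with_vocabulary_list(
                    col, list(df[col].unique()))))
    return cols

def parse_tf_example(example_proto, label, feature_spec):
    # Parse serialised tf.Example protos into a (features, label) pair.
    parsed = tf.io.parse_example(example_proto, feature_spec)
    target = parsed.pop(label)
    return parsed, target

def tfexamples_input_fn(examples, feature_spec, label, num_epochs=None, batch_size=64):
    # Input function for the estimators, built from serialised tf.Example protos.
    serialized = [ex.SerializeToString() for ex in examples]
    dataset = tf.data.Dataset.from_tensor_slices(serialized)
    dataset = dataset.batch(batch_size)
    dataset = dataset.map(lambda ex: parse_tf_example(ex, label, feature_spec))
    return dataset.repeat(num_epochs)

def df_to_examples(df, columns):
    # Convert every dataframe row into a tf.train.Example proto.
    examples = []
    for _, row in df.iterrows():
        example = tf.train.Example()
        for col in columns:
            if df[col].dtype in (np.int64, np.int32):
                example.features.feature[col].int64_list.value.append(int(row[col]))
            elif df[col].dtype in (np.float64, np.float32):
                example.features.feature[col].float_list.value.append(float(row[col]))
            elif row[col] == row[col]:  # skip NaNs in string columns
                example.features.feature[col].bytes_list.value.append(str(row[col]).encode('utf-8'))
        examples.append(example)
    return examples

def encode_label(df, label_column):
    # Encode the label column as integer class indices and return the class names.
    classes = sorted(df[label_column].unique().tolist())
    df[label_column] = df[label_column].apply(lambda v: classes.index(v))
    return classes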

Now let’s go ahead with the actual preprocessing. We will gather some required parameters from the dataset.
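
The cell below is a sketch of what this step looks like; the label column name Delivery_Time comes from the hackathon data, but treat the rest as assumptions.

label_column = 'Delivery_Time'

# Encode the target as class indices and keep the class names for WIT.
classes = encode_label(train, label_column)

# Every remaining column is used as an input feature.
input_features = [col for col in train.columns if col != label_column]
features_and_labels = input_features + [label_column]

print(classes)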

Let’s have a look at the output: 

Now let’s transform the entire dataset.
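
A sketch of this transformation, using the helpers defined above:

# Convert the training dataframe into tf.Example protos and build the feature spec.
train_examples = df_to_examples(train, features_and_labels)
feature_spec = create_feature_spec(train, features_and_labels)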

Let’s see what the data looks like:
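
For instance, by printing the first converted proto:

print(train_examples[0])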

Creating Classifiers And Training

We have our input data in the right format, now let’s build a Linear classifier and DNN classifier using TensorFlow estimator.
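
The following is a rough sketch of the two estimators, assuming the helpers and variables defined above; the hidden-layer sizes and number of training steps are arbitrary choices for illustration, not the article’s original settings.

import functools

num_steps = 2000  # training steps; an arbitrary choice for this sketch

# Input function and feature columns shared by both models.
train_inpf = functools.partial(tfexamples_input_fn, train_examples, feature_spec, label_column)
feature_columns = create_feature_columns(train, input_features, feature_spec)

# Model 1: a linear classifier.
linear_model = tf.estimator.LinearClassifier(
    feature_columns=feature_columns, n_classes=len(classes))
linear_model.train(train_inpf, steps=num_steps)

# Model 2: a deep neural network (DNN) classifier.
dnn_model = tf.estimator.DNNClassifier(
    feature_columns=feature_columns, hidden_units=[128, 64, 32],
    n_classes=len(classes))
dnn_model.train(train_inpf, steps=num_steps)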

Preprocessing The Validation Data
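
This step mirrors the training preprocessing; a minimal sketch:

# Encode the validation labels with the class order learned from the training set.
val[label_column] = val[label_column].apply(lambda v: classes.index(v))

# Convert the validation rows into tf.Example protos for the What-If Tool.
val_examples = df_to_examples(val, features_and_labels)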

Configuring And Launching WIT Tool Box
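
The configuration below is a sketch based on the standard witwidget API; the number of data points loaded into the tool and the widget height are arbitrary choices.

num_datapoints = 1000  # how many validation examples to load into the tool

config_builder = (WitConfigBuilder(val_examples[:num_datapoints])
                  .set_estimator_and_feature_spec(linear_model, feature_spec)
                  .set_compare_estimator_and_feature_spec(dnn_model, feature_spec)
                  .set_label_vocab(classes))

WitWidget(config_builder, height=800)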

Comparing Models With The What-If Tool

On executing the above code block, a beautiful and interactive toolbox similar to TensorBoard will be rendered in the output cell, as shown in the image below.

Let’s concentrate on the right side, where we see a colourful scatter plot. Above the plot there is a neatly laid out toolbar that lets us customize the plots.

Let’s make a simple plot between the inference scores of model 1 and model 2 along the x and y-axis respectively. We will colour and label the data points by the Delivery_Time.

The plot would look something like this:

  • The plots can be generated between any 2 features effortlessly just by selecting them from the drop-down.
  • Now click on any of the data points in the plot and the entire left section of the toolbox lights up. Let’s check out a cool feature.
  • We will use the trained models to understand how the classification decision changes when we change the value of a feature.
  • Click on a data point; in the left bar, all the characteristics of the data point will be listed along with their values.
  • Change the value of any feature and click on the Run Inference button. In the below example we will change the Average Cost of a data point from 300 to 200.

Based on the predicted probability, the data point will be moved to a different location in the plot as shown below.

The bottom-left corner of the toolbox displays some interesting figures about the probability score of each class for the data point, for both models, and shows how the change has affected the decision making.

The data point plot also supports Partial dependence plots.

Another cool feature is the Performance and Fairness dashboard that displays various metrics of the model. It also allows us to set different probability thresholds for classification and monitor the changes. 

The features tab displays a description of each of the features in the dataset, including the predicted labels and scores. It shows statistics such as count, missing percentage, mean, standard deviation, minimum, median and maximum for each of the features.

Complete Code

In A Nutshell

Explainable AI is an expanding field. Organizations have already started working in this domain, and there are various tools available today to make AI explainable and understandable. With tools like WIT, machine learning will become more accessible even to non-programmers.
