Explainable AI, or XAI for short, is a field concerned with making the decisions of complex machine learning models and algorithms transparent.
In this article, we will take a look at one tool that is built for the purpose of making AI explainable.
A simple way to understand this concept is to compare the decision-making process of humans with that of machines. How do we humans come to a decision? We make decisions all the time, from small, insignificant ones such as what outfit to wear to an event, to highly complex ones that involve risk, such as investments or loan approvals.
At the end of the day, we can justify a choice to ourselves and to others by explaining how we arrived at it. It is not that simple in the case of AI. In machine learning, the explaining part is largely missing, or at least it was until the concept of Explainable AI came along.
How a machine learning algorithm came to a particular decision is usually hidden from its users. Such a model is called a black box model, and that is what most complex algorithms and neural networks are.
“Without transparency there is very little trust”
The objective of Explainable AI is to bring as much transparency as possible to a machine learning model by answering questions whose answers are hidden deep within the complexity of an algorithm or a neural network.
Why is it significant?
Why transparency? Because without transparency, we have very little idea of how the model actually behaves. Besides, XAI can improve and simplify the use of complex algorithms by making them more interactive and understandable.
Consider a simple example where a CNN classifies an image as either a dog or a cat. The algorithm makes its decision based on the features that it extracts. How do we find out exactly where the algorithm's decision tips from one class to the other?
We may do this by trying out different values for a feature until we reach the point at which the decision changes. This helps us better understand the model's decision making and identify the significance of each value that passes through the network. This is one small aspect of XAI.
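The idea of trying different values until the decision changes can be sketched in a few lines of Python. This is only an illustration: `toy_dog_score` is a hypothetical hand-written scoring function standing in for a trained classifier (it is not a CNN), and the feature names `ear_length` and `whisker_density` are invented for the example.

```python
import math

def toy_dog_score(features):
    """Hypothetical stand-in for a trained model: returns a made-up P(dog)."""
    # Hand-picked weights, for illustration only.
    z = 1.5 * features["ear_length"] - 2.0 * features["whisker_density"] + 0.4
    return 1.0 / (1.0 + math.exp(-z))

def find_decision_flip(model, example, feature, candidates, threshold=0.5):
    """Return the first candidate value of `feature` that changes the predicted class, or None."""
    original_class = model(example) >= threshold
    for value in candidates:
        edited = dict(example, **{feature: value})  # copy with a single feature changed
        if (model(edited) >= threshold) != original_class:
            return value
    return None

example = {"ear_length": 1.0, "whisker_density": 0.2}
candidates = [i / 10 for i in range(2, 22)]  # 0.2, 0.3, ..., 2.1
flip_value = find_decision_flip(toy_dog_score, example, "whisker_density", candidates)
print("The predicted class changes at whisker_density =", flip_value)
```

The value returned is the smallest tested change that moves the example across the decision boundary, which gives a rough sense of how sensitive the prediction is to that feature.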
Another interesting, if comical, aspect of XAI is that it may even help prevent a possible AI apocalypse: if we understand how a machine decides, we might be able to steer it towards good decisions rather than decisions like exterminating humans.
The What-If Tool
An Explainable AI tool from Google called the What-If Tool is just what it sounds like: it is intended to question the decisions made by an algorithm. Questions like “What if I change a particular data point?” or “What if I used a different feature? How will these changes affect the outcome of the model?” are explored here.
Answering these questions would normally mean writing code that is complex and specific to a given model. The What-If Tool makes this process much easier. It provides answers to a wide variety of what-if scenarios without the need to write code, making it easier for a larger set of users, including non-programmers, to understand an algorithm better.
The tool is a feature of the open-source TensorBoard web application and offers an interactive visual interface for exploring the model.
Hands-On Experience With The What-If Tool
We will use the What-If Tool (WIT) to compare two different models and their predictions on the same data. The following code was inspired by WIT Demo For Model Comparison.
For this hands-on section, we will use data from MachineHack’s Predicting Food Delivery Time – Hackathon by IMS Proschool.
Download the Participant’s data and upload it to a directory in your Google Drive.
Open Google Drive and create a new Google Colab notebook. Mount your Google Drive by running the following code block and authorising your account.
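from google.colab import drive
drive.mount("/GD")  # mount point assumed from the "/GD/My Drive/..." path used when loading the data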
Now we are all set to go!
Installing & Importing Necessary Libraries
Loading The Dataset And Splitting Into Training And Validation Sets
import pandas as pd
from sklearn.model_selection import train_test_split
# Load the training data, then hold out 10% of the rows for validation.
train = pd.read_csv("/GD/My Drive/Colab Notebooks/DataSets/train.csv")
train, val = train_test_split(train, test_size=0.1, random_state=2)
Note: The dataset has already been cleaned and prepared to an extent.
Let’s take a look at the dataset:
Now that we have our dataset loaded, we will preprocess the data into a format that TensorFlow estimators accept. The TensorFlow Estimator API is a high-level API that allows us to work with pre-implemented models. Estimators require the data in a specific format and use feature columns and input functions to specify the model input.
In the following code block we will implement functions to perform the following operations:
- A function to create a TensorFlow feature spec from the data frame and columns specified.
- A function to create simple numeric and categorical feature columns from the feature spec and a list of columns from that spec to use.
- A function to parse the tf.Example protos into features for the input function.
- A function to convert a data frame into a list of tf.Example protos.
- A function to encode the label column and return the classes.
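Two of the steps in this list, encoding the label column and deciding which columns are numeric and which are categorical, can be illustrated without TensorFlow. The sketch below uses plain Python dictionaries instead of a data frame and does not build real feature specs or tf.Example protos. The helper names `encode_labels` and `split_column_kinds` are hypothetical, and the sample rows (including the `Location` and `Rating` columns and all values) are made up for the example.

```python
def encode_labels(rows, label_column):
    """Replace string labels with integer ids and return (encoded_rows, classes)."""
    classes = sorted({row[label_column] for row in rows})
    class_to_id = {name: index for index, name in enumerate(classes)}
    # Build new row dicts so the input rows are left untouched.
    encoded = [dict(row, **{label_column: class_to_id[row[label_column]]}) for row in rows]
    return encoded, classes

def split_column_kinds(rows, columns):
    """Sort columns into numeric and categorical based on the type of their values."""
    numeric, categorical = [], []
    for column in columns:
        values = [row[column] for row in rows if row[column] is not None]
        is_numeric = bool(values) and all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
        )
        (numeric if is_numeric else categorical).append(column)
    return numeric, categorical

# Made-up sample rows, for illustration only.
rows = [
    {"Location": "A", "Average_Cost": 300, "Rating": 4.1, "Delivery_Time": "30 minutes"},
    {"Location": "B", "Average_Cost": 200, "Rating": 3.8, "Delivery_Time": "45 minutes"},
    {"Location": "A", "Average_Cost": 450, "Rating": 4.4, "Delivery_Time": "30 minutes"},
]
encoded_rows, classes = encode_labels(rows, "Delivery_Time")
numeric, categorical = split_column_kinds(rows, ["Location", "Average_Cost", "Rating"])
print("classes:", classes)
print("numeric:", numeric, "categorical:", categorical)
```

In a TensorFlow pipeline, the numeric and categorical lists would typically decide which kind of feature column each input gets, and the list of classes would later serve as the label vocabulary.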
Now let’s go ahead with the actual preprocessing. We will gather some required parameters from the dataset.
Let’s have a look at the output:
Now let’s transform the entire dataset.
Let’s see what the data looks like:
Creating Classifiers And Training
We have our input data in the right format. Now let’s build a linear classifier and a DNN classifier using TensorFlow estimators.
Preprocessing The Validation Data
Configuring And Launching WIT Tool Box
Comparing Models With The What-If Tool
On executing the above code block, an interactive toolbox similar to TensorBoard's will be served in the output cell, as shown in the image below.
Let’s concentrate on the right side, where we see a colourful scatter plot. Above the plot there is a neatly laid out toolbar that lets us customize the plots.
Let’s make a simple plot of the inference scores of model 1 and model 2 on the x-axis and y-axis respectively. We will colour and label the data points by Delivery_Time.
The plot would look something like this:
- A plot can be generated between any two features just by selecting them from the drop-downs.
- Now click any of the data points in the plot and the entire left section of the toolbox lights up. Let’s check out a cool feature.
- We will use the trained models to understand how the classification decision changes when we change the value of a feature.
- Click on a data point. On the left bar, all the features of the data point will be listed along with their values.
- Change the value of any feature and click on the Run Inference button. In the example below we will change the Average Cost of a data point from 300 to 200.
Based on the predicted probability, the data point will be moved to a different location in the plot as shown below.
The bottom-left corner of the toolbox displays the probability score of each class for the data point under both models, and shows how the change has affected the decision.
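What the tool does here, re-running inference on an edited copy of a data point and comparing the scores of both models, can be sketched as follows. This is not the What-If Tool's own code. `model_1` and `model_2` are hypothetical scoring functions with invented weights, standing in for the trained linear and DNN classifiers, and the `Rating` feature and its value are assumptions made for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def model_1(example):
    """Hypothetical model 1: made-up score that depends on Average_Cost only."""
    return sigmoid(0.01 * (example["Average_Cost"] - 250))

def model_2(example):
    """Hypothetical model 2: made-up score that also looks at Rating."""
    return sigmoid(0.02 * (example["Average_Cost"] - 250) + 0.5 * (example["Rating"] - 4.0))

def compare_edit(models, example, feature, new_value):
    """Return {model_name: (score_before, score_after)} for a single-feature edit."""
    edited = dict(example, **{feature: new_value})  # the original example is not modified
    return {name: (model(example), model(edited)) for name, model in models.items()}

models = {"model_1": model_1, "model_2": model_2}
datapoint = {"Average_Cost": 300, "Rating": 4.0}
result = compare_edit(models, datapoint, "Average_Cost", 200)
for name, (before, after) in result.items():
    print(f"{name}: {before:.3f} -> {after:.3f} (change {after - before:+.3f})")
```

Comparing the size of the change across the two models shows which one leans more heavily on the edited feature.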
The same view also supports partial dependence plots, which show how a model's prediction changes as a single feature is varied.
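For readers unfamiliar with partial dependence, the textbook computation is short: force one feature to each value on a grid, score every example with that value substituted, and average the scores. The sketch below follows that definition and is not taken from the What-If Tool. `toy_score` is a hypothetical model, and the data set is made up.

```python
def partial_dependence(model, dataset, feature, grid):
    """Average model output over the dataset with `feature` forced to each grid value."""
    curve = []
    for value in grid:
        scores = [model(dict(example, **{feature: value})) for example in dataset]
        curve.append((value, sum(scores) / len(scores)))
    return curve

def toy_score(example):
    """Hypothetical model with invented weights, for illustration only."""
    return 0.001 * example["Average_Cost"] + 0.1 * example["Rating"]

# Made-up examples.
dataset = [
    {"Average_Cost": 150, "Rating": 3.0},
    {"Average_Cost": 300, "Rating": 4.0},
    {"Average_Cost": 500, "Rating": 5.0},
]
grid = [100, 200, 300, 400]
curve = partial_dependence(toy_score, dataset, "Average_Cost", grid)
for value, mean_score in curve:
    print(f"Average_Cost={value}: mean score {mean_score:.3f}")
```

Plotting the grid values against the averaged scores gives the partial dependence curve for that feature.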
Another cool feature is the Performance and Fairness dashboard that displays various metrics of the model. It also allows us to set different probability thresholds for classification and monitor the changes.
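The effect of moving a probability threshold can be reproduced by hand for a binary case. The sketch below is a simplified illustration with made-up scores and labels, not output from the models in this article, and `metrics_at_threshold` is a hypothetical helper rather than part of the tool.

```python
def metrics_at_threshold(scores, labels, threshold):
    """Compute accuracy, precision and recall when scores >= threshold count as positive."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return {
        "accuracy": (tp + tn) / len(labels),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }

# Made-up predicted probabilities and true labels.
scores = [0.1, 0.4, 0.35, 0.8, 0.65, 0.9]
labels = [0, 0, 1, 1, 1, 1]
sweep = {t: metrics_at_threshold(scores, labels, t) for t in (0.3, 0.5, 0.7)}
for threshold, metrics in sweep.items():
    print(threshold, metrics)
```

On this toy data, lowering the threshold raises recall at the cost of precision, which is the kind of trade-off the dashboard lets us explore interactively.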
The features tab displays a description of each of the features in the dataset, including the predicted labels and scores. It shows statistics such as count, missing percentage, mean, standard deviation, minimum, median and maximum for each of the features.
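The same statistics can be computed for a single numeric column with Python's standard library. The sketch below is illustrative only: the values are made up, `describe_numeric` is a hypothetical helper, and it uses the population standard deviation, which may not match the convention the tool uses.

```python
import statistics

def describe_numeric(values):
    """Summarise one numeric column; None marks a missing value."""
    present = [v for v in values if v is not None]
    return {
        "count": len(present),
        "missing_pct": 100.0 * (len(values) - len(present)) / len(values),
        "mean": statistics.mean(present),
        "std_dev": statistics.pstdev(present),  # population standard deviation
        "min": min(present),
        "median": statistics.median(present),
        "max": max(present),
    }

# Made-up column with one missing entry.
average_cost = [200, 300, None, 400, 100]
summary = describe_numeric(average_cost)
for name, value in summary.items():
    print(f"{name}: {value}")
```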
In A Nutshell
Explainable AI is an expanding field. Organizations have already started working in this domain, and various tools are available today to make AI explainable and understandable. With tools like WIT, machine learning will become more accessible, even to non-programmers.