How to Visualize and Debug Machine Learning Models using ELI5?

This post shows how to examine why a model makes a particular prediction and which of its learned parameters it relies on.

Machine learning models are widely treated as black boxes, despite their massive adoption. It is therefore essential to understand how particular predictions are made and which of its learned parameters a model relies on. Models are usually assessed using evaluation metrics on a given test dataset. Real-world data, on the other hand, is frequently different, so the evaluation metric may not accurately reflect the product’s purpose.

In addition to such metrics, evaluating individual predictions and their justifications is a practical way to find and fix a model’s weaknesses. In this article, we will discuss debugging and visualizing machine learning algorithms using ELI5. ELI5 is a Python package for visualizing and debugging various machine learning models through a unified API. The major points to be covered in this article are given below.

Table of Contents

  1. Explainability and Interpretability in Machine Learning
  2. ELI5 (Explain Like I’m 5) 
    • XGBoost with ELI5
    • Example of Keras Implementation
  3. Advantages and Usage of ELI5

Now, let us start with understanding explainability and interpretability. 

Explainability and Interpretability in Machine Learning

The terms explainability and interpretability are frequently used in machine learning and artificial intelligence. Even though they are extremely similar, it’s worth exploring the differences, if only to demonstrate how difficult things can get once you start looking into machine learning systems. Interpretability is the extent to which cause and effect can be observed within a system. To put it another way, it refers to your ability to predict what will happen in response to a change in input or computational parameters. It’s the ability to look at an algorithm and see what is going on there.

Explainability, meanwhile, refers to how well the internal mechanics of a machine learning or deep learning system can be communicated in human terms. It’s easy to overlook the subtle distinction from interpretability, but think of it this way: interpretability is about being able to understand the mechanics without necessarily knowing why, whereas explainability is the ability to explain what is happening in detail. Simple models (like linear or logistic regression) can be used to explain findings for a sample dataset. Often, however, these models are insufficient, and we must turn to deep learning models, which deliver great performance but remain a mystery to the majority of data science practitioners. Machine learning models are currently used to make a variety of critical decisions, in areas including fraud detection, credit rating, self-driving, and patient examination.

Improving the interpretability and explainability of models is therefore a crucial part of most development work, and a skill that can set a practitioner apart. By understanding how algorithms work, we can address the issues and goals of the problem statement correctly.
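To make the idea concrete, here is a minimal sketch of why a linear model counts as interpretable: its raw score is a bias plus a sum of per-feature contributions (weight times feature value), so any single prediction can be broken down exactly. The weights, bias, feature names, and the helper `explain_linear` are made-up illustrative values, not fitted to any dataset.

```python
import math

# Hypothetical logistic-regression parameters (illustrative values only).
WEIGHTS = {"mean_radius": -1.2, "mean_texture": -0.4, "mean_smoothness": 0.8}
BIAS = 0.5


def explain_linear(sample):
    """Return (probability, contributions) for one sample.

    Each contribution is weight * feature value, so the contributions
    plus the bias add up exactly to the model's raw score (log-odds).
    """
    contributions = {name: WEIGHTS[name] * sample[name] for name in WEIGHTS}
    score = BIAS + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-score))
    return probability, contributions


sample = {"mean_radius": 1.0, "mean_texture": 0.5, "mean_smoothness": -0.5}
probability, contributions = explain_linear(sample)

# Print features ordered by how strongly they moved the score.
for name, value in sorted(contributions.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:>16}: {value:+.2f}")
print(f"{'<BIAS>':>16}: {BIAS:+.2f}")
print(f"P(class 1) = {probability:.3f}")
```

A deep network offers no such direct decomposition, which is why dedicated explanation tools are needed for it.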

ELI5 (Explain Like I’m 5) 

ELI5 is a Python toolkit that uses a uniform API to visualize and debug diverse machine learning models. It supports many scikit-learn estimators (objects that expose the fit() and predict() methods). It includes built-in support for numerous ML frameworks and allows you to explain white-box models (Linear Regression, Decision Trees) as well as black-box models (Keras, XGBoost, LightGBM). It is applicable to both regression and classification models.

Now we are going to see how ELI5 interprets and explains a model using its eli5.show_weights and eli5.show_prediction functions. The practical demo is divided into two parts. First, we will interpret and explain an XGBoost model. After that, we will do the same for a Keras application.

XGBoost with ELI5
! pip install eli5
from xgboost import XGBClassifier
import eli5
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
import pandas as pd
import numpy as np

The dataset we are using is scikit-learn’s built-in breast cancer dataset. We load it into a pandas data frame with proper column headers, because ELI5 retrieves feature names from the model when it builds its explanations.

data = load_breast_cancer()
df = pd.DataFrame(data.data)
df.columns = data.feature_names
df['target'] = data.target

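Split the data and build a classifier:

# hold out a test set; the 80/20 split and random_state are assumed values
x_train, x_test, y_train, y_test = train_test_split(df.drop(columns='target'), df['target'], test_size=0.2, random_state=42)
model = XGBClassifier()
model.fit(x_train, y_train)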

Now we need just two ELI5 functions, as shown below.

eli5.show_weights(model, top=30)
eli5.explain_prediction_xgboost(model,x_test.iloc[0])

The left side shows the weights assigned to each feature, and the right side shows the prediction for one instance.

The first table shows the weights XGBoost assigned to each feature based on the training data. The second shows, for one particular instance, how much each feature contributed to reaching a probability of 0.981 for class 1.
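For tree ensembles such as XGBoost, ELI5's documentation describes obtaining per-instance contributions by following decision paths: every node carries a score, and the change in score from a parent to the child actually taken is credited to the feature the parent splits on. The sketch below applies that idea to one hand-written toy tree. The tree, its node scores, the feature values, and the helper `path_contributions` are all hypothetical; this is not ELI5's implementation and the numbers do not come from the trained model.

```python
# A toy regression tree written as nested dicts. Every node stores the
# score it would output if the path stopped there ("value"); internal
# nodes also store the split. All numbers are hypothetical.
TOY_TREE = {
    "value": 0.10,
    "feature": "worst_radius", "threshold": 16.0,
    "left": {
        "value": 0.60,
        "feature": "mean_texture", "threshold": 20.0,
        "left": {"value": 0.90},
        "right": {"value": 0.20},
    },
    "right": {"value": -0.70},
}


def path_contributions(tree, sample):
    """Walk the decision path and credit each score change to the split feature."""
    contributions = {"<BIAS>": tree["value"]}  # root score acts as the bias
    node = tree
    while "feature" in node:
        feature = node["feature"]
        branch = "left" if sample[feature] < node["threshold"] else "right"
        child = node[branch]
        delta = child["value"] - node["value"]
        contributions[feature] = contributions.get(feature, 0.0) + delta
        node = child
    return node["value"], contributions


sample = {"worst_radius": 14.2, "mean_texture": 17.5}
leaf_score, contributions = path_contributions(TOY_TREE, sample)
for name, value in contributions.items():
    print(f"{name:>14}: {value:+.2f}")
```

In an ensemble the same bookkeeping is summed over all trees, so the contributions plus the bias add up to the model's raw score before it is converted to a probability.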

Next, we will look at the same kind of interpretation for a Keras application.

ELI5 with Keras Implementation

If we have a model that takes an image as input and returns class scores (probabilities that a specific object is present in the image), we can use ELI5 to see what was in the image that caused the model to predict a specific class score.

For the Keras demo, we use a pre-trained VGG16 network and interpret its prediction for an arbitrary image.

from tensorflow.keras.applications import VGG16
import keras

# VGG16 with its ImageNet classification head (1000 classes)
vgg16 = VGG16(include_top=True, weights='imagenet', classes=1000)

# load the image at the 224x224 input size VGG16 expects
im = keras.preprocessing.image.load_img('/content/HAL-TEDBF-Fighter-Jet-With-Vikrant-Aircraft-Carrier-Art.jpg', target_size=(224, 224))
doc = keras.preprocessing.image.img_to_array(im)
# add a batch dimension: (224, 224, 3) -> (1, 224, 224, 3)
doc = np.expand_dims(doc, axis=0)
# visualize the image (before preprocessing changes the pixel values)
keras.preprocessing.image.array_to_img(doc[0])
# preprocess the whole batch; passing doc[0] would drop the batch dimension
doc = keras.applications.vgg16.preprocess_input(doc)
# explain
eli5.show_prediction(vgg16, doc)

As you can see, ELI5 highlights the regions of the image that VGG16 relied on when predicting the class.
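ELI5's Keras image explanations are based on Grad-CAM: the feature maps of a convolutional layer are weighted by the average gradient of the class score with respect to each map, summed, and passed through a ReLU to give a coarse heatmap. The sketch below reproduces only that arithmetic with NumPy on random stand-in arrays. The shapes, the random data, and the function name `grad_cam_heatmap` are assumptions for illustration; a real run would take the activations and gradients from the network itself.

```python
import numpy as np


def grad_cam_heatmap(activations, gradients):
    """Combine conv-layer activations and gradients into a Grad-CAM heatmap.

    Both inputs have shape (height, width, channels).
    """
    # One weight per channel: the spatial average of that channel's gradient.
    channel_weights = gradients.mean(axis=(0, 1))
    # Weighted sum of the feature maps over the channel axis.
    heatmap = np.tensordot(activations, channel_weights, axes=([2], [0]))
    # ReLU keeps only regions that push the class score up.
    heatmap = np.maximum(heatmap, 0.0)
    # Scale to [0, 1] for display, guarding against an all-zero map.
    peak = heatmap.max()
    return heatmap / peak if peak > 0 else heatmap


rng = np.random.default_rng(0)
# Stand-ins for a 14x14x512 conv-layer output and the matching gradients.
activations = rng.random((14, 14, 512))
gradients = rng.normal(size=(14, 14, 512))

heatmap = grad_cam_heatmap(activations, gradients)
print(heatmap.shape, float(heatmap.min()), float(heatmap.max()))
```

The low-resolution map is then resized to the input image and overlaid on it, which produces the kind of highlighted picture shown by eli5.show_prediction.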

Advantages and Usage of ELI5

ELI5 builds on existing functions and produces well-formatted results. It also allows code to be reused across different machine learning frameworks, and it deals with a slew of minor inconsistencies between them.

ELI5 can be used to inspect basic model parameters and to figure out how a model behaves on a global scale. It can also be used to examine specific predictions made by a single model, as well as the decisions behind them.

Conclusion

We generally try many models and algorithms for a problem and choose the one that performs better than the others. Evaluating each such model in practice is a tedious task that can slow down development. By mastering ELI5 for a variety of algorithms, we can more easily choose the model that best suits our task. In this article, we have seen how interpretability and explainability play an important role.

Vijaysinh Lendave
Vijaysinh is an enthusiast in machine learning and deep learning. He is skilled in ML algorithms, data manipulation, handling and visualization, model building.
