
A guide to explaining feature importance in neural networks using SHAP

SHAP values (SHapley Additive exPlanations) are a great tool for understanding complex neural network models as well as other machine learning models such as decision trees and random forests. Basically, they show you visually which features are important for making predictions. In this article, we will understand SHAP values, see why they are an important tool for interpreting neural network models, and, in the end, implement SHAP values to interpret a neural network. The major points to be covered in this article are listed below.

Table of contents

  1. Introduction to SHAP value
  2. Importance of SHAP values
  3. Calculating SHAP values of Neural networks

Let’s first understand the features of SHAP.

Introduction to SHAP value

SHAP is a real breakthrough tool in machine learning interpretation. SHAP values work on both regression and classification problems, and on many kinds of machine learning models: logistic regression, SVMs, tree-based models, and deep learning models like neural networks.

In a regression problem, even if the features are correlated, SHAP values can still assign feature importance correctly. Hence every ML developer should have this tool in their skillset for presenting model outcomes.
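
As a quick illustration, here is a minimal sketch (my addition, using synthetic data and a tree model, not anything from this article) of applying SHAP to a regression problem with two correlated features:

#minimal sketch (my addition): SHAP on a tree-based regressor with correlated features
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

rng = np.random.RandomState(0)
x1 = rng.normal(size=500)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=500)   # x2 is strongly correlated with x1
X_demo = np.column_stack([x1, x2])
y_demo = 3 * x1 + rng.normal(scale=0.1, size=500)   # only x1 truly drives the target

reg = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_demo, y_demo)
explainer = shap.TreeExplainer(reg)   # fast, exact explainer for tree ensembles
sv = explainer.shap_values(X_demo)
shap.summary_plot(sv, X_demo, feature_names=['x1', 'x2'])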

Importance of SHAP values

After implementing a machine learning model, the next step is to analyze it. SHAP values help identify which features are important and which are not by plotting graphs. SHAP became a popular tool in a very short period of time because, earlier, interpretations were available only in tabular form, which made results tricky to read; with a visual representation of feature importance, we can grasp the result at first glance.

Calculating SHAP values of Neural networks

In this section we are going to implement a neural network and then calculate the SHAP values.

First of all, install the shap package into your environment.

(Note: the implementation was done in Google Colab.)

!pip install shap

Load the essential libraries that will help us implement the neural network, plot graphs, and perform computations.

#load libraries
import tensorflow as tf
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
import sklearn
import shap

plt.style.use('fivethirtyeight')

# This sets a common size for all the figures we will draw.
plt.rcParams['figure.figsize'] = [10, 7]

Now we will import our data, taken from MachineHack’s weekend hackathon, into a pandas DataFrame. Our target variable is “IsUnderRisk”, where 1 means under risk and 0 means not under risk.

df = pd.read_csv('/content/Train.csv')
df
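
As a quick sanity check (my addition, not in the original code), you can glance at the first few rows and the class balance of the target:

#quick look at the data and the target distribution (my addition)
df.head()
df['IsUnderRisk'].value_counts()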

The data has no issues; I deliberately chose this dataset because preprocessing is not our focus. So we directly convert the DataFrame into a NumPy array so that it can be passed to the neural network.

#convert the data into an array
dataset = df.values
dataset

Select the X (features) and y (target) values.

X = dataset[:, 0:7]
y = dataset[:, 7]

Store all the feature names in a list and save it in the “features” variable.

features = ['City', 'Location_Score', 'Internal_Audit_Score',
            'External_Audit_Score', 'Fin_Score', 'Loss_score', 'Past_Results']

Standardize the values (zero mean, unit variance).

#standardize the data
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_scale = scaler.fit_transform(X)
X_scale

Split the data into training and testing sets in an 80:20 ratio.

#split the data 80% training and 20% testing
X_train, X_test, y_train, y_test = train_test_split(X_scale, y, test_size=0.2, random_state = 4)

Model building and compiling

#Build the model
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu', input_shape=(7,)),  # 7 input features
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')  # 1 output unit because it's binary classification
])
#Compile the model
model.compile(
    loss = tf.keras.losses.binary_crossentropy,
    optimizer = tf.keras.optimizers.Adam(learning_rate=0.02),
    metrics = [
        tf.keras.metrics.BinaryAccuracy(name='accuracy'),
        tf.keras.metrics.Precision(name='precision'),
        tf.keras.metrics.Recall(name='recall')
    ]
)
hist = model.fit(X_train, y_train, epochs=100)
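
Before interpreting the model, it is a good idea to check how it performs on the held-out test set (this evaluation step is my addition and is not part of the original walkthrough):

#evaluate loss, accuracy, precision and recall on the test set (my addition)
model.evaluate(X_test, y_test)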

Now we will start calculating the SHAP values. First we define an explainer, and in the second line we calculate the SHAP values. Note that the SHAP value calculation with KernelExplainer is extremely slow, so be aware that it can take a lot of time on larger datasets; my test data has only 109 rows, so I don’t have to worry. After running the code below, you will see a progress bar.

e = shap.KernelExplainer(model, X_train)
shap_values = e.shap_values(X_test)
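
Because KernelExplainer re-evaluates the model many times for every row it explains, a common trick is to summarize the background data before building the explainer. The sketch below is my addition (the choice of 50 clusters is arbitrary); it uses shap.kmeans to shrink the background set, which usually speeds things up considerably:

#optional speed-up (my addition): summarize the background data with k-means
background = shap.kmeans(X_train, 50)
e_fast = shap.KernelExplainer(model, background)
shap_values_fast = e_fast.shap_values(X_test)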

We can use the Shapley values to interpret our model.

shap.initjs()
# visualize the first prediction's explanation with a force plot
shap.force_plot(e.expected_value[0], shap_values[0][0], features=X_test[0], feature_names=features)

The force plot shows how each feature pushes this prediction higher or lower. ‘External_Audit_Score’, ‘Internal_Audit_Score’ and ‘Fin_Score’ are the biggest contributors to this prediction.
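
To see this kind of explanation for every row at once, the force plot can also be drawn for the whole test set (this stacked view is my addition; it rotates each individual explanation and lines them up side by side):

#stacked force plot over all test rows (my addition)
shap.force_plot(e.expected_value[0], shap_values[0], X_test, feature_names=features)

Next, the summary plot gives an overview of feature importance across the whole test set.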

shap.summary_plot(shap_values[0], X_test, feature_names=features)

The colour bar on the side, from high to low, indicates the value of each feature, while the x-axis represents the risk: the positive side pushes the prediction towards risk and the negative side towards no risk. In other words, the negative side corresponds to class 0 and the positive side to class 1.

So a low value of “Internal_Audit_Score” indicates no risk, and a higher value indicates risk. Notice, however, that for “Location_Score” a high value indicates no risk, and a low value indicates risk. The features are sorted by their significance, and we can see that “Internal_Audit_Score” is the most important feature.
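
If you just want the ranking without the per-point detail, the summary plot can also be drawn as a bar chart of the mean absolute SHAP value per feature (my addition, reusing the shap_values computed above):

#bar chart of mean(|SHAP value|) per feature (my addition)
shap.summary_plot(shap_values[0], X_test, feature_names=features, plot_type='bar')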

Final words

We started with an introduction to SHAP values and then understood why this tool is so important for interpreting ML models. In the end, we saw in practice how SHAP values make interpreting ML models much easier.

References

  1. Classify structured data with feature columns
  2. SHAP official GitHub repository
  3. Link for the above codes