SHAP (SHapley Additive exPlanations) values are a useful tool for understanding complex neural network models as well as other machine learning models such as decision trees and random forests. In essence, they show you visually which features matter most when the model makes a prediction. In this article, we will look at what SHAP values are, why they are an important tool for interpreting neural network models, and finally compute SHAP values to interpret a neural network. The major points to be covered in this article are listed below.
Table of contents
- Introduction to SHAP value
- Importance of SHAP values
- Calculating SHAP values of Neural networks
Let’s first understand the features of SHAP.
Introduction to SHAP value
SHAP is a real breakthrough in machine learning interpretation. It works on both regression and classification problems, and on many kinds of machine learning models: logistic regression, SVMs, tree-based models, and deep learning models such as neural networks.
In a regression problem, SHAP values still assign each feature a share of the prediction when the features are correlated, although strong correlation can make the individual attributions harder to interpret. Every ML developer should have this tool in their skill set for explaining model outcomes.
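To make the idea of "a share of the prediction" concrete, here is a minimal sketch of the underlying Shapley value computation. It enumerates every coalition of features, which is only practical for a handful of features; the shap library uses approximations instead. The toy model, the input `x`, and the choice of replacing absent features with a baseline of 0 are all assumptions made for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values by enumerating every coalition of features.

    value_fn takes a frozenset of feature indices and returns the model
    output when only those features are "present".
    """
    players = range(n_features)
    phi = [0.0] * n_features
    for i in players:
        others = [j for j in players if j != i]
        for size in range(len(others) + 1):
            # Weight of a coalition of this size in the Shapley formula.
            weight = (
                factorial(size) * factorial(n_features - size - 1)
                / factorial(n_features)
            )
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i when added to coalition s.
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


# Hypothetical input row and toy model (not the article's neural network).
x = [1.0, 2.0, 3.0]


def toy_value(present):
    # Absent features are replaced by a baseline value of 0.
    v = [x[j] if j in present else 0.0 for j in range(3)]
    # Feature 0 acts on its own; features 1 and 2 interact.
    return 2 * v[0] + v[1] * v[2]


phi = shapley_values(toy_value, 3)
print(phi)
```

Feature 0 receives its own additive effect (about 2), and the interaction term of 6 is split evenly between features 1 and 2 (about 3 each). The three values add up to the difference between the full prediction (8) and the baseline prediction (0), which is the additivity that SHAP plots rely on.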
Importance of SHAP values
After training a machine learning model, the next step is to analyze it. SHAP values help you see, through plots, which features matter and which contribute little. SHAP became popular very quickly because feature importance was previously reported mostly in tabular form, which made the results tedious to read; with a visual representation of feature importance, the result is clear at first glance.
Calculating SHAP values of Neural networks
In this section we will build a neural network and then calculate its SHAP values.
First of all, install the shap package into your environment.
(Note: the implementation was done in Google Colab.)
!pip install shap
Load the essential libraries for building the neural network, plotting graphs, and numerical computation.
#load libraries
import tensorflow as tf
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')
import seaborn as sns
import pandas as pd
import numpy as np
import sklearn
import shap

# This sets a common size for all the figures we will draw.
plt.rcParams['figure.figsize'] = [10, 7]
Now we will import our data, taken from MachineHack’s Weekend Hackathon, into a pandas DataFrame. Our target variable is “IsUnderRisk”, where 1 means under risk and 0 means not under risk.
df = pd.read_csv('/content/Train.csv')
df
The data needs no cleaning; I deliberately chose this dataset because preprocessing is not our focus. So I convert the DataFrame directly into an array so that it can be passed to the neural network.
#convert the data into array
dataset = df.values
dataset
Select X and y values
X = dataset[:, 0:7]
y = dataset[:, 7]
Store all feature names in an array and save it into the “features” variable
features = ['City', 'Location_Score', 'Internal_Audit_Score', 'External_Audit_Score', 'Fin_Score', 'Loss_score', 'Past_Results' ]
Standardize the feature values so that each column has zero mean and unit variance
#process the data
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_scale = scaler.fit_transform(X)
X_scale
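For readers who want to see what this scaling step does to the numbers, the following is a rough pure-Python sketch of the same transformation with StandardScaler's default settings: subtract the column mean and divide by the population standard deviation. The helper name `standardize_columns` and the toy rows are made up for illustration and are not taken from Train.csv.

```python
from statistics import fmean, pstdev


def standardize_columns(rows):
    """Scale each column to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    # Population standard deviation (divide by n). A constant column has a
    # spread of 0, so fall back to 1.0 to avoid dividing by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Hypothetical toy rows: one varying column and one constant column.
toy = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
scaled = standardize_columns(toy)
print(scaled)
```

The first column ends up centered on zero with unit spread, and the constant column becomes all zeros. Because the explainer later works on the scaled array, the feature values shown in the SHAP plots are in these standardized units rather than the original ones.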
Split the data into training and test sets in an 80:20 ratio
#split the data 80% training and 20% testing
X_train, X_test, y_train, y_test = train_test_split(X_scale, y, test_size=0.2, random_state=4)
Model building and compiling
#Build the model
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')  # 1 because it's a binary classification
])
#Compile the model
model.compile(
    loss=tf.keras.losses.binary_crossentropy,
    # `learning_rate` replaces the deprecated `lr` argument
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.02),
    metrics=[
        tf.keras.metrics.BinaryAccuracy(name='accuracy'),
        tf.keras.metrics.Precision(name='precision'),
        tf.keras.metrics.Recall(name='recall')
    ]
)
hist = model.fit(X_train, y_train, epochs=100)
Now we will start calculating the SHAP values. The first line defines an explainer, and the second line computes the SHAP values. Note that computing SHAP values with KernelExplainer is extremely slow, so be prepared for it to take a long time; my test data has only 109 rows, so I don’t have to worry. After running the code below you will see a progress bar.
e = shap.KernelExplainer(model, X_train)
shap_values = e.shap_values(X_test)
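Much of the running time comes from the background data: for every row it explains, KernelExplainer evaluates the model on many perturbed copies of the background set, so passing the whole training set is costly. One common way to speed things up is to pass a smaller random subset as background instead. The sketch below only picks the row indices; the helper name `pick_background_rows` and the row counts are hypothetical.

```python
import random


def pick_background_rows(n_rows, k, seed=0):
    """Return k distinct, sorted row indices chosen uniformly at random."""
    k = min(k, n_rows)  # never ask for more rows than exist
    rng = random.Random(seed)  # fixed seed so the same rows are picked every run
    return sorted(rng.sample(range(n_rows), k))


# 500 is a placeholder row count; use len(X_train) in the notebook.
idx = pick_background_rows(n_rows=500, k=100)
print(len(idx), idx[:5])
```

In the notebook this would be used as `background = X_train[idx]` followed by `shap.KernelExplainer(model, background)`. The shap package also ships `shap.sample` and `shap.kmeans` helpers for summarizing the background data, which may be more convenient. A smaller background makes the explanation faster but somewhat noisier, so the subset size is a trade-off.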
We can now use the Shapley values to interpret our model.
shap.initjs()
# visualize the first prediction's explanation with a force plot
shap.force_plot(e.expected_value[0], shap_values[0][0], feature_names=features)
The ‘force_plot’ shows how each feature pushes the model output for this prediction up or down. ‘External_Audit_Score’, ‘Internal_Audit_Score’, and ‘Fin_Score’ are the biggest contributors to the prediction.
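A force plot is a picture of a simple sum: the base value (the explainer's expected output) plus the SHAP contribution of every feature gives the model output for that row. The sketch below shows that arithmetic and how the largest contributors can be read off. The base value and contributions are made-up numbers chosen only for illustration, not the values computed in the notebook, and both helper names are hypothetical.

```python
def reconstruct_output(base_value, contributions):
    """Base value plus all SHAP contributions gives the explained output."""
    return base_value + sum(contributions)


def top_contributors(names, contributions, k=3):
    """Feature names with the k largest absolute contributions."""
    ranked = sorted(
        zip(names, contributions), key=lambda pair: abs(pair[1]), reverse=True
    )
    return [name for name, _ in ranked[:k]]


features = ['City', 'Location_Score', 'Internal_Audit_Score',
            'External_Audit_Score', 'Fin_Score', 'Loss_score', 'Past_Results']

# Hypothetical numbers for a single row (NOT taken from the real model).
example_base = 0.40
example_contribs = [0.01, -0.03, 0.12, 0.20, -0.08, 0.02, 0.00]

print(reconstruct_output(example_base, example_contribs))
print(top_contributors(features, example_contribs))
```

With the arrays from the notebook, the same check would compare `e.expected_value[0] + shap_values[0][0].sum()` against the model's prediction for the first test row; the two should agree closely if the explainer has used enough samples.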
shap.summary_plot(shap_values[0], X_test, feature_names=features)
The color bar on the side, running from high to low, indicates the value of the feature, and the x-axis shows the effect on risk: the positive side pushes the prediction toward risk and the negative side pushes it toward no risk. In other words, the negative side corresponds to class 0 and the positive side to class 1.
So a low value of “Internal_Audit_Score” indicates no risk, and a higher value indicates risk. “Location_Score” behaves the opposite way: a high value indicates no risk, and a low value indicates risk. The features are sorted by their importance to the model's predictions. We can see that “Internal_Audit_Score” is the most important feature.
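The ordering of features in the summary plot can also be reproduced as plain numbers. As far as I know, the plot ranks features by the overall magnitude of their SHAP values across all rows, which gives the same order as ranking by mean absolute SHAP value. Here is a minimal sketch of that ranking; the feature names and the small matrix are hypothetical and the helper name `mean_abs_importance` is my own.

```python
def mean_abs_importance(shap_matrix, names):
    """Rank features by their mean absolute SHAP value across rows."""
    n_rows = len(shap_matrix)
    scores = [
        sum(abs(row[j]) for row in shap_matrix) / n_rows
        for j in range(len(names))
    ]
    # Largest average magnitude first.
    return sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)


# Hypothetical SHAP matrix: 3 rows (samples) by 3 columns (features).
names = ["feature_a", "feature_b", "feature_c"]
toy_shap = [
    [0.10, -0.30, 0.00],
    [-0.20, 0.40, 0.05],
    [0.10, -0.20, -0.05],
]

ranking = mean_abs_importance(toy_shap, names)
for name, score in ranking:
    print(name, round(score, 4))
```

Note that the sign is discarded on purpose: a feature that pushes some predictions strongly up and others strongly down is still important. In the notebook, the same function could be called as `mean_abs_importance(shap_values[0], features)` to get the ranking as a list instead of a plot.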
Final words
We started with an introduction to SHAP values and then saw why this tool is so important for interpreting ML models. Finally, we saw in practice how SHAP values make interpreting ML models much easier.