A Primer To Explainable and Interpretable Deep Learning

To be able to look at an algorithm and say, “yep, I can see what’s happening here.”

One of the biggest challenges in the data science industry is the black box debate and the resulting lack of trust in algorithms. In his talk titled “Explainable and Interpretable Deep Learning” at DevCon 2021, Dipyaman Sanyal, Head, Academics & Learning at Hero Vired, discussed emerging solutions to the black box problem.

Dipyaman Sanyal holds an MS and a PhD in Economics and is the co-founder of Drop Math. In a career spanning more than 15 years, he has received several honours, including being named among the 40 under 40 in India in Data Science in 2019. Sanyal is the only academic to have led three top-ranked analytics programs in India: UChicago-IBM-Jigsaw, Northwestern-Bridge and IMT, Ghaziabad.

The Black Box problem

On one side of the debate are people who wonder how they can trust a model or an algorithm to make decisions for them without knowing what is inside the little black box. On the other side are people who argue that it is like trusting a doctor to perform surgery. “Why is it that we are holding AI on a different benchmark than humans?” Sanyal asked.

Demand for Explainable AI (XAI) is growing, especially among decision-makers who do not trust AI completely. Surveys have shown that two-thirds of AI projects don’t go beyond the pilot phase because there is no trust in the black-box model. For Sanyal, the next leap in the AI industry will come only after a significant amount of explainability has been built into AI systems.

The saving grace

Governments have also increased regulation of the data science field to prevent potential misuse of crypto, GANs, deepfakes, etc. Two concepts offer a way forward here.

The first is interpretability: being able to interpret the situation at hand and predict what will happen; to look at an algorithm and say, “yep, I can see what’s happening here.” Interpretability means being able to discern the mechanics of a model, answering ‘how’ it works without necessarily knowing ‘why’.

The second is explainability, which goes one step further and explains the internal mechanics of a deep learning system in human terms. It explains what is happening, and why, every step of the way.

Recent advances in AI have allowed algorithms to change and develop quickly, making it even more difficult to interpret what underlies these models. In addition, DNNs are inherently vulnerable to adversarial inputs, which leads to unpredictable model behaviour. While this has driven a rush in deep learning research and implementation, explanatory methods have remained rudimentary by comparison.

Prevailing XML techniques 

Explainable machine learning (XML) techniques remove the need to unravel the bits and pieces of more complex algorithms. These include:

  • Feature importance – Shows how much each feature contributes to the model’s predictions.
  • LIME: Local Interpretable Model-Agnostic Explanations – Learns an interpretable model locally around a prediction; it is model agnostic.
  • SHAP: SHapley Additive exPlanations – Provides the Shapley value for each feature, calculated by taking conditional expectations into account.
  • PDP: Partial Dependence Plots – Shows the marginal effect of a feature on the predicted outcome.
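
To make the Shapley value idea concrete, here is a minimal sketch using only the Python standard library. It is not the SHAP library's algorithm: the model `toy_model`, the input `x` and the `baseline` are hypothetical, and features outside a coalition are simply replaced by a baseline value, which is a simplification of the conditional expectations mentioned above. Each feature's value is its weighted average marginal contribution over all coalitions of the other features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for a single input x (feasible for few features only)."""
    n = len(x)

    def value(coalition):
        # Model output when only the features in `coalition` keep their real
        # values; all others are set to the baseline (an assumption of this sketch).
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                total += weight * (value(members | {i}) - value(members))
        phi.append(total)
    return phi


def toy_model(features):
    # Hypothetical scoring function with one interaction term (a * b).
    a, b, c = features
    return 2.0 * a + 1.0 * b - 3.0 * c + a * b


x = [1.0, 2.0, 0.5]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The values add up to the difference between the prediction for `x` and the prediction for the baseline, and the interaction term is split evenly between the two features involved. The cost grows exponentially with the number of features, which is why practical tools rely on approximations.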

XDL Techniques

Explainable deep learning (XDL) techniques are the next step towards unravelling deep learning models:

  • Gradient-Based Approaches – these use the gradient of the model’s scoring function with respect to its input: the higher the gradient for a given feature, the more sensitive the scoring function is to a change in that input. Two methods fall under this approach:
  1. DeconvNets: Deconvolutional Networks – a method that approximately projects the activations of an intermediate hidden layer back to the input.
  2. Guided Backpropagation – the method combines vanilla backpropagation at ReLUs with DeconvNets.
  • Axiomatic approach – this approach sets out formal notions or properties that define what explainability or relevance is, and checks whether the neurons satisfy them in order to count as ‘relevant’. This is used for:
  1. Layer-wise Relevance Propagation – This is used to understand neural networks, including LSTMs. It redistributes the prediction backwards through the network using local redistribution rules until a relevance score is assigned to each input variable. At every layer, the total relevance is conserved.
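
The redistribution and conservation properties of Layer-wise Relevance Propagation can be sketched on a tiny network. This is an illustrative sketch, not the implementation shown in the talk: it assumes a two-layer ReLU network without biases, uses the basic rule with a small stabiliser `eps`, and all weights and inputs (`W1`, `W2`, `x`) are made up.

```python
def dense(inputs, weights):
    # weights[i][j] connects input i to output j; biases are omitted on purpose
    # so that relevance is conserved exactly (up to the stabiliser).
    return [sum(a * row[j] for a, row in zip(inputs, weights))
            for j in range(len(weights[0]))]


def relu(values):
    return [max(0.0, v) for v in values]


def lrp_dense(inputs, weights, relevance_out, eps=1e-9):
    """Redistribute the relevance of a dense layer's outputs onto its inputs."""
    z = dense(inputs, weights)
    relevance_in = [0.0] * len(inputs)
    for j, r_j in enumerate(relevance_out):
        # The stabiliser keeps the division safe when z[j] is zero.
        denom = z[j] + (eps if z[j] >= 0 else -eps)
        for i, a in enumerate(inputs):
            # Input i receives the share of r_j proportional to a * w_ij.
            relevance_in[i] += a * weights[i][j] / denom * r_j
    return relevance_in


# Hypothetical network: 3 inputs -> 2 hidden ReLU units -> 1 output.
W1 = [[1.0, 1.0], [0.5, 0.5], [0.5, 1.0]]
W2 = [[2.0], [1.0]]
x = [1.0, 2.0, -1.0]

hidden = relu(dense(x, W1))
output = dense(hidden, W2)

# Start with the prediction itself as the relevance, then walk backwards.
relevance_hidden = lrp_dense(hidden, W2, output)
relevance_input = lrp_dense(x, W1, relevance_hidden)
print(relevance_input, sum(relevance_input), output[0])
```

At each layer the relevance scores sum to the network output, which is the conservation property described above.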

Source: DevCon 2021
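
Returning to the gradient-based methods listed above, the difference between vanilla backpropagation, DeconvNets and Guided Backpropagation comes down to how a gradient signal is passed back through a ReLU. The sketch below shows the three rules side by side; the function name `relu_backward` and the sample values are assumptions made for illustration.

```python
def relu_backward(forward_inputs, upstream_grads, rule):
    """Pass a gradient signal back through a ReLU under one of three rules."""
    result = []
    for z, g in zip(forward_inputs, upstream_grads):
        if rule == "vanilla":
            keep = z > 0            # the unit was active in the forward pass
        elif rule == "deconvnet":
            keep = g > 0            # the signal coming from above is positive
        elif rule == "guided":
            keep = z > 0 and g > 0  # both conditions must hold
        else:
            raise ValueError(f"unknown rule: {rule}")
        result.append(g if keep else 0.0)
    return result


# Hypothetical pre-activations and upstream gradients for four units.
z = [1.0, -1.0, 2.0, -0.5]
g = [0.5, 0.3, -0.2, -0.7]

vanilla = relu_backward(z, g, "vanilla")
deconvnet = relu_backward(z, g, "deconvnet")
guided = relu_backward(z, g, "guided")
print(vanilla, deconvnet, guided)
```

Guided Backpropagation keeps only the entries that survive both of the other rules, which is why it is described as a combination of the two.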

  • DeepLIFT – This approach is useful for the tricky areas of deep learning, such as digging into feature selection inside an algorithm. It works through a form of backpropagation: the method takes the output and pulls it apart by ‘reading’ the different neurons that contributed to it. Each neuron’s contribution is scored by comparing its activation with its activation on a reference input.
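
A minimal sketch of this idea follows, assuming a small bias-free ReLU network and DeepLIFT's rescale rule. The weights, the input `x` and the `reference` input are hypothetical, and the helper names are not taken from any library.

```python
def forward(inputs, w_hidden, w_out):
    # w_hidden[i][j]: input i -> hidden unit j; w_out[j]: hidden unit j -> output.
    z = [sum(v * row[j] for v, row in zip(inputs, w_hidden))
         for j in range(len(w_out))]
    return sum(max(0.0, zj) * wj for zj, wj in zip(z, w_out)), z


def deeplift_contributions(x, reference, w_hidden, w_out):
    """Contribution of each input to f(x) - f(reference), using the rescale rule."""
    _, z = forward(x, w_hidden, w_out)
    _, z_ref = forward(reference, w_hidden, w_out)
    contributions = [0.0] * len(x)
    for j, (zj, zrj) in enumerate(zip(z, z_ref)):
        delta_z = zj - zrj
        delta_a = max(0.0, zj) - max(0.0, zrj)
        if abs(delta_z) > 1e-12:
            # Ratio of the change in activation to the change in pre-activation.
            multiplier = delta_a / delta_z
        else:
            # Fall back to the ordinary ReLU gradient when nothing changed.
            multiplier = 1.0 if zj > 0 else 0.0
        for i in range(len(x)):
            delta_x = x[i] - reference[i]
            contributions[i] += delta_x * w_hidden[i][j] * multiplier * w_out[j]
    return contributions


w_hidden = [[1.0, -2.0], [0.5, 0.5]]
w_out = [2.0, 3.0]
x = [1.0, 2.0]
reference = [-1.0, 2.0]

contributions = deeplift_contributions(x, reference, w_hidden, w_out)
f_x, _ = forward(x, w_hidden, w_out)
f_ref, _ = forward(reference, w_hidden, w_out)
print(contributions, f_x - f_ref)
```

The contributions add up to the difference between the output for `x` and the output for the reference. The second hidden unit is switched off for `x` but active for the reference, so its plain gradient at `x` would be zero, yet it still receives a non-zero multiplier here.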


  • CAMs – Class Activation Maps replace the fully connected layers at the end of a CNN with a global average pooling (GAP) layer, which reduces the output of the last convolutional layer to a one-dimensional tensor, followed by a single linear layer that feeds the softmax. They are used to make CNNs more interpretable.

In the architecture, the GAP layer reduces each feature map into a single scalar. 
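
The relationship between the GAP layer and the resulting map can be sketched with plain Python lists. The feature maps and class weights below are invented for illustration, and the linear layer is assumed to have no bias: the map for a class is the weighted sum of the final feature maps, using that class's weights from the linear layer.

```python
def global_average_pool(feature_maps):
    # Reduce each H x W feature map to a single scalar.
    return [sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
            for fmap in feature_maps]


def class_activation_map(feature_maps, class_weights):
    # Weighted sum of the feature maps at every spatial position.
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    return [[sum(w * fmap[r][c] for w, fmap in zip(class_weights, feature_maps))
             for c in range(width)]
            for r in range(height)]


# Two hypothetical 2 x 2 feature maps from the last convolutional layer.
feature_maps = [
    [[1.0, 0.0], [0.0, 3.0]],
    [[0.0, 2.0], [2.0, 0.0]],
]
class_weights = [0.5, -1.0]  # linear-layer weights for one class (assumed)

pooled = global_average_pool(feature_maps)
class_score = sum(w * p for w, p in zip(class_weights, pooled))
cam = class_activation_map(feature_maps, class_weights)
cam_mean = sum(sum(row) for row in cam) / 4
print(pooled, class_score, cam)
```

Averaging the map over all positions gives back the class score before the softmax, so the map shows which locations pushed that score up or down.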


“Financial structures favour structural models that are easy to interpret by people,” Sanyal noted. “While deep learning algorithms have made great progress, further work is needed to attain appropriate perception accuracy and resilience based on numerous sensors.”
