Explainability is one of the most challenging tasks in ensuring AI transparency. Explainable AI refers to the tools and techniques used to understand the decision-making process of an AI system. The more complex the algorithm, the harder it is to understand how an AI/ML model has arrived at a decision. Explainable AI tries to tackle the black-box nature of such systems and build trust and transparency.
A recent study published by researchers at Carnegie Mellon University and Google AI came up with a method to formalise the ‘value’ of explanations. In this article, we try to understand this method’s advantages over existing frameworks.
Evaluating The Model
The proposal is to calculate the ‘value’ of an explanation method using a student-teacher paradigm that measures the extent to which the teacher’s explanations improve a student’s learning. The student should be able to simulate the teacher model on unseen samples for which explanations are not available.
For instance, let us consider a teacher model trained on emails to predict whether an email is spam or not. Once the teacher is trained and teaches the student, the student should be able to mimic the teacher exactly. It doesn’t matter if the teacher’s predictions are wrong. The student’s goal here is not to make correct predictions but to exactly mimic the teacher.
Two cases are evaluated. In the first case, the teacher provides only two parameters — input and output. In our example, the teacher will provide the email and tell the student if it’s spam or not. In the second scenario, the teacher provides explanations for its decisions along with the input and output. In that case, the student should be able to mimic the teacher better. If the student cannot, then the teacher model is not doing well with its explanations.
The approach calculates the percentage of the student’s answers that match the teacher’s — with and without explanations. The difference between these percentages is the evaluation metric for the explanation method. This evaluation is done on entirely new examples.
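The metric above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the authors’ code: the function names, the toy student callables, and the held-out email set are all assumptions made for the example.

```python
# Illustrative sketch of the evaluation metric: the "value" of an
# explanation method is the gap in simulation accuracy between a student
# trained with explanations and one trained without, measured on fresh
# examples the student has never seen explained.

def simulation_accuracy(student_predict, teacher_labels, emails):
    """Fraction of held-out emails where the student matches the teacher."""
    matches = sum(
        student_predict(email) == label
        for email, label in zip(emails, teacher_labels)
    )
    return matches / len(emails)

def explanation_value(student_with, student_without, teacher_labels, emails):
    """Difference in simulation accuracy; higher means better explanations."""
    acc_with = simulation_accuracy(student_with, teacher_labels, emails)
    acc_without = simulation_accuracy(student_without, teacher_labels, emails)
    return acc_with - acc_without
```

A positive gap suggests the explanations genuinely helped the student simulate the teacher; a gap near zero suggests they added little.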
“One important difference between this model and the existing ones is that the students are not provided with explanations at the time of testing,” said Danish Pruthi, author of the study and doctoral candidate at CMU.
“This prevents teachers from leaking the answers to the students, corrupting the simulation task. Another thing here to consider is that the student and teacher can have different processes. That does not matter as the goal of the approach is to find out how well the student mimics the teacher model,” he added.
It is essential to understand how the students are trained with explanations. There are two approaches – Multitask Learning and Attention Regularisation.
In Multitask Learning, when the student learns from the teacher, the student has to accomplish two tasks. It should predict answers as well as explanations that match those from the teacher. Here the hope is that generating explanations provides an advantage for the student to mimic the teacher.
In our example, the teacher could mark an email as spam because it spotted the word ‘sale’. In this case, the student, along with marking the email as spam, should also produce the same explanation as the teacher. Here, the processes of the student and the teacher are similar.
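The multitask objective can be sketched as a combined penalty on the two tasks. This is an assumption about the shape of the setup, not the paper’s exact loss: treating an explanation as a set of highlighted words, scoring explanation agreement by set overlap, and the `alpha` weighting are all choices made for illustration.

```python
# Sketch of a multitask objective: the student is penalised both for
# disagreeing with the teacher's label and for producing a different
# explanation (modelled here as a set of highlighted words).

def multitask_loss(student_label, teacher_label,
                   student_explanation, teacher_explanation, alpha=0.5):
    """Weighted sum of a label-matching and an explanation-matching penalty."""
    label_loss = 0.0 if student_label == teacher_label else 1.0
    # Explanation penalty: 1 minus the Jaccard overlap of highlighted words.
    union = student_explanation | teacher_explanation
    overlap = (len(student_explanation & teacher_explanation) / len(union)
               if union else 1.0)
    explanation_loss = 1.0 - overlap
    return alpha * label_loss + (1 - alpha) * explanation_loss
```

In the spam example, a student that marks the email as spam but highlights ‘offer’ instead of ‘sale’ still pays a penalty, which is what pushes its process towards the teacher’s.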
In Attention Regularisation, the student observes where the teacher’s attention lies, that is, which parts of the input the teacher considers relevant before making its decision. The student is then trained to pay attention to those same areas. In this approach, the student does not output anything new; it simply matches its attention to the teacher’s.
For instance, in our example, if the teacher indicates that it looks at the title and watches for exclamation marks when marking an email as spam, then the student also pays attention to the exclamation marks.
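One common way to implement such a penalty, sketched here as an assumption rather than the paper’s implementation, is to add a KL-divergence term that pulls the student’s attention distribution over tokens towards the teacher’s.

```python
import math

# Sketch of an attention-regularisation penalty: the divergence between
# the teacher's and student's per-token attention weights is added to the
# student's training loss, so the student learns to attend where the
# teacher attends.

def attention_penalty(student_attn, teacher_attn, eps=1e-9):
    """KL(teacher || student) over per-token attention weights.

    Both inputs are sequences of non-negative weights summing to 1;
    eps guards against log-of-zero for tokens with zero weight.
    """
    return sum(
        t * math.log((t + eps) / (s + eps))
        for t, s in zip(teacher_attn, student_attn)
    )
```

The penalty is zero when the two distributions agree and grows as the student attends to different tokens than the teacher, for example the email body instead of the exclamation marks in the title.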
“These are two different methods, and both of them work very well for explanations,” said Pruthi, “Other approaches make people look at the explanations, and that is clearly labour intensive and expensive. We don’t require any people in the loop for our method.
“There are also other problems of having people do it because people try to match the explanation to what their intuitions say about the task, rather than what the model is actually focusing on.”