“AI models do not need to be interpretable to be useful.” Nigam Shah, Stanford
Interpretability in machine learning dates back to the 1990s, when it was referred to as neither “interpretability” nor “explainability”. Interpretable and explainable machine learning techniques emerged from the need to design intelligible machine learning systems and to understand and explain the predictions made by opaque models such as deep neural networks.
In general, the ML community has yet to agree on a definition of explainability or interpretability. Sometimes it is even called understandability. Some define interpretability as “the ability to explain or to present in understandable terms to a human”. According to experts, interpretability depends on the domain of application and the target audience. Therefore, a one-size-fits-all definition might be infeasible or unnecessary. When the concepts themselves are used interchangeably, would it be wise to sacrifice the usability of a model because it cannot be fully comprehended? Where does one draw the line?
Despite deep learning’s popularity, many organisations are still comfortable using logistic regression, support vector machines and other conventional methods. Though model-agnostic techniques can be used for traditional models, they are considered overkill for explaining kernel-based ML models. They can also be computationally expensive and can lead to poorly approximated explanations.
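To make the cost argument concrete, here is a minimal sketch of one common model-agnostic technique, permutation feature importance. The article does not name a specific method, so this choice is an assumption, and the names `permutation_importance`, `black_box` and `WEIGHTS` are hypothetical. The sketch treats a simple linear model as a black box and counts how many times the model has to be called, compared with reading the coefficients directly.

```python
import random


def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Mean increase in squared error when one feature column is shuffled."""
    rng = random.Random(seed)

    def mse(rows):
        return sum((predict(row) - target) ** 2 for row, target in zip(rows, y)) / len(y)

    baseline = mse(X)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle only column j, keeping every other feature in place.
            column = [row[j] for row in X]
            rng.shuffle(column)
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(shuffled) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Hypothetical "traditional" model: a plain weighted sum of three features.
WEIGHTS = [3.0, 0.0, -1.0]
call_count = {"n": 0}


def weighted_sum(row):
    return sum(w * x for w, x in zip(WEIGHTS, row))


def black_box(row):
    # Same model, but wrapped so that every prediction request is counted.
    call_count["n"] += 1
    return weighted_sum(row)


# Synthetic data, generated only for this illustration.
data_rng = random.Random(42)
X = [[data_rng.uniform(-1, 1) for _ in range(3)] for _ in range(200)]
y = [weighted_sum(row) for row in X]

scores = permutation_importance(black_box, X, y)
print("permutation importances:", scores)
print("model calls needed:", call_count["n"])
print("coefficients, read directly:", WEIGHTS)
```

In this toy setup the method needs one pass over the data for the baseline plus one pass per feature per repeat, which comes to 3,200 model calls for 200 rows, three features and five repeats. The ranking it produces only restates what the three coefficients already show, which is the sense in which such techniques can be overkill for simple models.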
Stanford’s Nigam Shah, in a recent interview, touched on why explainability may not always be necessary, drawing a comparison with the drugs doctors routinely prescribe. “We don’t fully know how most of them really work. But we still use them because we have convinced ourselves via randomized control trials that they are beneficial,” said Shah.
Explainability In Its Many Forms
Image credits: Stanford HAI blog
For any organisation, explainability becomes an issue when clients or other stakeholders come into the picture. The explanations offered to stakeholders fall into two categories:
- Explanations that can be used as a one-off sanity check or shown to other stakeholders as the reasoning behind a particular prediction.
- Explanations that can be used to garner feedback from the stakeholder on how the model ought to be updated to better align with their intuition.
It is generally believed that explainable methodologies can have broader advantages as they can be communicated to a wider audience and not just the immediate stakeholders. These methodologies help share the insights across the organisation without the need for a specialist in every scenario.
According to Shah, there are three main types of AI interpretability:
- Engineers’ explainability focuses on how a model works.
- Causal explainability deals with the “whys and hows” of the model input and output.
- Trust-inducing explainability provides the information required to trust a model and confidently deploy it.
So, it is important to know what type of explainability a data science team is targeting. That said, there is a chance that a use case might be a mix of all three. Such trade-offs and overlaps present a bundle of paradoxes to a decision-maker.
As a system grows more sophisticated and complete, it becomes less understandable. “As a model grows more realistic, it becomes more difficult to understand,” said David Hauser at the recently concluded machine learning developers conference. According to Hauser, clients want the model to be both understandable and realistic. This is another paradox a data scientist has to live with. He also stressed that understandable solutions give up some accuracy. For instance, network pruning is one such technique that takes a hit on accuracy. The moment non-linearities or interactions are introduced, the answers become less intuitive.
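The point about interactions can be illustrated with a toy sketch. The two models below and their coefficients are hypothetical and chosen only for illustration. In the additive model, the effect of `x1` is a single number. Once an interaction term is added, the effect of `x1` depends on the value of `x2`, so “how much does `x1` matter?” no longer has one answer.

```python
def additive(x1, x2):
    # Each feature contributes independently of the other.
    return 2.0 * x1 + 3.0 * x2


def with_interaction(x1, x2):
    # Same model plus a term in which the two features act together.
    return 2.0 * x1 + 3.0 * x2 + 4.0 * x1 * x2


def effect_of_x1(model, x2, step=1.0):
    """Change in output when x1 moves from 0 to `step`, holding x2 fixed."""
    return model(step, x2) - model(0.0, x2)


x2_values = (-1.0, 0.0, 1.0)
additive_effects = [effect_of_x1(additive, x2) for x2 in x2_values]
interaction_effects = [effect_of_x1(with_interaction, x2) for x2 in x2_values]

print("additive model:   ", additive_effects)     # [2.0, 2.0, 2.0]
print("with interaction: ", interaction_effects)  # [-2.0, 2.0, 6.0]
```

With the interaction term, the same change in `x1` lowers the output in one case and raises it in another, which is why explanations of richer models tend to need more context to stay faithful.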
“Do you, as a user, care how the weather is predicted, and what the causal explanation is, as long as you know a day ahead if it is going to rain and the forecast is correct?”
We live in a world with an abundance of tools and services. Making the right choice leads to another paradox: Fredkin’s paradox, which states that the more similar two alternatives seem, the harder it is to choose between them and the more time and effort the decision requires.
Stanford professor Shah has also emphasised the trust paradox. According to him, explanations aren’t always necessary. Worse, they sometimes lead people to rely on a model even when it is wrong. According to Shah, what engineers need from interpretability might not coincide with what model users need, since users focus on causality and trust. Furthermore, explanations can make it harder to work out what one really needs.
In his interview with Stanford HAI, Shah shared:
- AI models do not need to be interpretable to be useful.
- Doctors at Stanford prescribe drugs on a routine basis, without fully knowing how most of them really work.
- In health care, where AI models rarely lead to fully automated decision making, an explanation may or may not be useful.
- If it is too late for the clinician to intervene, what good are the explanations?
- But AI used for job interviews, bail, loans, health care programs or housing absolutely requires a causal explanation.
One of the vital purposes of explanations is to improve ML engineers’ understanding of their models so that they can refine them and improve performance. But since machine learning models are “dual-use”, explanations and other tools could also enable malicious users to increase the capabilities and performance of undesirable systems.
There is no denying that explanations allow model refinement. And, going forward, apart from debugging and auditing models, organisations are looking at data privacy through the lens of explainability. In medical diagnosis or credit card risk estimation, making models more explainable cannot come at the cost of privacy. Thus, sensitive information is another hurdle for explainability.