Accountability is required for any decision-making tool in an organization. Machine learning models are already being used to automate time-consuming administrative tasks and to make complex business decisions. To keep the model and the business decisions it drives secure and reliable, scientists and engineers must understand the inner mechanics of their models, which are commonly treated as a black box. That is changing, as tools such as ELI5 are now available to inspect the inner mechanics of a model. In this article, we’ll look at how to explain the inner workings of transformer-based language models using a toolbox called ECCO. The main points to be covered in this article are listed below.
Table Of Contents
- What Is Explainability?
- How is Explainability Important?
- Difference Between Interpretability and Explainability
- How ECCO Can Explain the Inner Workings of Transformers
- Hands-On Implementation with ECCO
Let’s start the discussion by understanding the explainability of machine learning models.
What is Explainability?
Explainability in machine learning refers to the process of explaining a machine learning model’s decision to a human. The term “model explainability” refers to the ability of a human to understand an algorithm’s decision or output. It’s the process of deciphering the reasoning behind a machine learning model’s decisions and outcomes. With ‘black box’ machine learning models, which develop and learn directly from data without human supervision or guidance, this is an important concept to understand.
A human developer would traditionally write the code for a system or model. With machine learning, the system instead evolves from the data: the algorithm improves its ability to perform a specific task or action by learning from examples. Because the underlying functionality of the machine learning model was developed by the system itself, it can be difficult to understand why the system made a particular decision once it is deployed.
Machine learning models are used to classify new data or predict trends by learning relationships between input and output data. The model identifies these patterns and relationships within the dataset. This means that the deployed model will make decisions based on patterns and relationships that human developers may not be aware of. The explainability process helps human specialists understand the reasoning behind a decision. After that, the model can be explained to non-technical stakeholders.
Machine learning explainability can be achieved using a variety of tools and techniques that vary in approach and machine learning model type. Traditional machine learning models may be simpler to comprehend and explain, but more complex models, such as deep neural networks, can be extremely difficult to grasp.
How is Explainability Important?
When machine learning has a negative impact on business profits, it earns a bad reputation. This is frequently the result of a misalignment between the data science and business teams. Explainability helps in a few areas here:
Understanding how your models make decisions reveals previously unknown vulnerabilities and flaws. Control is simple with these insights. When applied across all models in production, the ability to quickly identify and correct mistakes in low-risk situations adds up.
In high-risk industries like healthcare and finance, trust is critical. Before ML solutions can be used and trusted, all stakeholders must have a thorough understanding of what the model does. If you claim that your model is better at making decisions and detecting patterns than humans, you must be able to back it up with evidence. Experts in the field are understandably skeptical of any technology that claims to be able to see more than they can.
When a model makes a bad or rogue decision, it’s critical to understand the factors that led to that decision, as well as who is to blame for the failure, in order to avoid similar issues in the future. Data science teams can use explainability to give organizations more control over AI tools.
Difference Between Interpretability and Explainability
The terms explainability and interpretability are frequently used interchangeably in the disciplines of machine learning and artificial intelligence. While they are very similar, it is instructive to note the distinctions, if only to get a sense of how tough things may become as you advance deeper into machine learning systems.
The degree to which a cause and effect may be observed inside a system is referred to as interpretability. To put it another way, it’s your capacity to predict what will happen if the input or computational parameters are changed.
Explainability, on the other hand, relates to how well a machine learning or deep learning system’s internal mechanics can be articulated in human terms. It’s easy to miss the subtle contrast between interpretability and explainability, but consider this: interpretability is the ability to comprehend the mechanics without necessarily knowing why. Explainability is the ability to explain in depth what is happening.
How ECCO Can Explain the Inner Workings of Transformers
Many recent advances in NLP have been powered by the transformer architecture, yet we still have a limited understanding of why Transformer-based NLP models have been so successful. To improve the transparency of Transformer-based language models, ECCO, an open-source library for the explainability of Transformer-based NLP models, was created.
ECCO offers tools and interactive explorable explanations to help examine and build intuition about the following:
- Input saliency, which visualizes the importance of each token in a given sentence.
- Hidden state evolution, which is applied across all layers of a model to determine the role of each layer.
- Neuron activations, which tell us how a group of neurons spikes or responds while making a prediction.
- Non-negative matrix factorization of neuron activations, which uncovers underlying patterns of neuron firings and reveals firing patterns tied to linguistic properties of the input tokens.
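To make the non-negative matrix factorization idea more concrete, here is a minimal pure-Python sketch. It is not ECCO's implementation: the `nmf` helper, the multiplicative-update rule chosen here, and the toy `activations` matrix (4 neurons by 6 tokens) are all assumptions made for illustration. The sketch factors a non-negative activation matrix into a small number of components, each of which groups neurons that tend to fire on the same tokens.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared reconstruction error between V and W @ H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, n_components, n_iter=200, seed=0, eps=1e-9):
    """Factor V (neurons x tokens) into W (neurons x components) and
    H (components x tokens) using multiplicative updates."""
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    # Strictly positive random start so every entry can be updated.
    w = [[rng.random() + 0.1 for _ in range(n_components)] for _ in range(n_rows)]
    h = [[rng.random() + 0.1 for _ in range(n_cols)] for _ in range(n_components)]
    initial_error = frobenius_error(v, w, h)
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        numer = matmul(wt, v)
        denom = matmul(matmul(wt, w), h)
        h = [[h[i][j] * numer[i][j] / (denom[i][j] + eps)
              for j in range(n_cols)] for i in range(n_components)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        numer = matmul(v, ht)
        denom = matmul(w, matmul(h, ht))
        w = [[w[i][j] * numer[i][j] / (denom[i][j] + eps)
              for j in range(n_components)] for i in range(n_rows)]
    return w, h, initial_error


# Hypothetical activations: neurons 0-1 fire on tokens 0-2,
# neurons 2-3 fire on tokens 3-5.
activations = [
    [1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [2.0, 2.0, 2.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 3.0, 3.0, 3.0],
]

w, h, initial_error = nmf(activations, n_components=2)
final_error = frobenius_error(activations, w, h)
print("error before:", round(initial_error, 4), "after:", round(final_error, 4))
for component in h:
    print([round(x, 2) for x in component])
```

Each row of `h` shows how strongly one component responds to each token, which is the kind of per-token firing pattern that the factorization is meant to surface. On real models the activation matrix is far larger, so a library implementation would normally be used instead of this hand-written loop.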
Hands-On Implementation with ECCO
In this section, we will look at how ECCO can be used to understand the workings of various transformer models as they predict sequence-based output. Mainly, we’ll see how weights are distributed at the final layer while predicting the next token, and we will also analyze all layers of the selected model.
To start with ECCO, install it using pip (the leading ! runs the command from a notebook cell):
!pip install ecco
Also make sure you have PyTorch installed.
First, we will generate a single token by passing an arbitrary string to the model. We use DistilGPT2, a smaller distilled version of GPT-2, since GPT-2 style models are trained to predict the next token in a sequence. The code below shows how to load the pre-trained model and use it for prediction. The generate method takes an input sequence, and we can additionally control how many tokens the model generates by passing generate=<some number>.
While initializing the pre-trained model, we set activations=True so that the firing status of all neurons is captured.
import ecco

# Load the model
lm = ecco.from_pretrained('distilgpt2', activations=True)
Now we’ll generate a token using the generate method.
text = " Coincidentally i saw the"

# Generate a token
output = lm.generate(text)
From this call, the tokens at positions 6 and 5 are generated as "first" and "of", respectively.
The model has a total of 6 decoder layers, and the last layer is the decision layer where the appropriate token is chosen.
Now we will observe the state of the last layer and see the top 15 tokens that the model considered. Here we look at position (token) 6, using the output.layer_predictions method as shown below. Layers are indexed from 0, so layer=5 refers to the last of the six layers.
output.layer_predictions(position=6, layer=5, topk=15)
As we can see, the token "first" comes out on top with the highest score among the candidates.
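To give a feel for what a top-k view of a layer involves, the sketch below scores a hidden state against every token in a tiny vocabulary and keeps the highest-probability candidates. This is a simplified illustration and not ECCO's code: the four-word `vocab`, the two-dimensional `output_embeddings`, the `hidden_state` values, and the `top_k_tokens` helper are all hypothetical, and the assumption is that scores come from a dot product with an output embedding followed by a softmax.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_tokens(hidden_state, output_embeddings, vocab, k):
    """Score each vocabulary token against a hidden state and
    return the k most probable (token, probability) pairs."""
    logits = [sum(h * w for h, w in zip(hidden_state, row))
              for row in output_embeddings]
    probs = softmax(logits)
    ranked = sorted(zip(vocab, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical toy values, chosen only for illustration.
vocab = ["first", "the", "than", "of"]
output_embeddings = [
    [2.0, 0.0],   # "first"
    [1.0, 1.0],   # "the"
    [0.0, 1.0],   # "than"
    [-1.0, 0.5],  # "of"
]
hidden_state = [1.5, 0.2]

top = top_k_tokens(hidden_state, output_embeddings, vocab, k=3)
for token, prob in top:
    print(f"{token!r}: {prob:.3f}")
```

With these toy numbers, "first" has the largest dot product with the hidden state, so it leads the list, followed by "the" and "than". Running the same scoring on the hidden state of an earlier layer is what lets us compare how the candidate list changes from layer to layer.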
Similarly, we can check how different tokens rank at the output layer. This can be done by explicitly passing the token IDs to the rankings_watch method. The token IDs themselves can be obtained easily from the tokenizer of the pre-trained model that we selected initially.
# generating the token IDs
lm.tokenizer(" the first than")
Below are the generated token IDs.
Now we’ll supply these IDs to see the rankings.
output.rankings_watch(watch=[262, 717, 621], position=6)
At the decision layer, we can see that the token "first" achieves rank 1 and the rest are not even close to it. Thus we can say the model has correctly identified the next token and assigned appropriate weights to the possible tokens.
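The idea of watching how specific tokens rank at each layer can be sketched in a few lines of plain Python. This is an assumption-laden toy and not how ECCO computes its plot: the `token_rank` and `rankings_by_layer` helpers, the five-token vocabulary, and the per-layer scores are made up for illustration, and the watched IDs 0, 1, and 2 are not the real tokenizer IDs used above.

```python
def token_rank(logits, token_id):
    """1-based rank of token_id when scores are sorted in descending order."""
    target = logits[token_id]
    return 1 + sum(1 for score in logits if score > target)


def rankings_by_layer(layer_logits, watch):
    """For every layer, return the rank of each watched token ID."""
    return [[token_rank(logits, token_id) for token_id in watch]
            for logits in layer_logits]


# Hypothetical scores over a 5-token vocabulary at 3 successive layers.
layer_logits = [
    [0.1, 0.9, 0.3, 0.5, 0.2],  # early layer
    [0.6, 0.8, 0.1, 0.4, 0.2],  # middle layer
    [1.5, 0.7, 0.2, 0.3, 0.1],  # final (decision) layer
]
watch = [0, 1, 2]

ranks = rankings_by_layer(layer_logits, watch)
for layer_index, layer_ranks in enumerate(ranks):
    print(f"layer {layer_index}: {dict(zip(watch, layer_ranks))}")
```

In this toy run, token 0 climbs from rank 5 in the early layer to rank 1 in the final layer, which mirrors the kind of pattern to look for in the real plot: the chosen token rising to the top as the layers progress.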
In this article, we have seen what model explainability is and how important it is when deploying a model to production. As language models become more common, tools are needed to aid in debugging models, explaining their behavior, and developing intuitions about their inner mechanics. Ecco is one such tool that combines ease of use, interactive visual explorables, and a variety of model explainability methods. This article focused on the explainability of ML models and gave a glimpse of the ECCO toolbox.