Developers trying to fine-tune LLMs often encounter a plethora of issues. An experiment by Jonathan Whitaker and Jeremy Howard from fast.ai highlighted a rather under-scrutinised problem with LLMs: overconfidence, which should not be confused with the widely discussed problem of LLM hallucination.
Overconfidence is when the model insists on information it picked up from its training data even when that information is wrong for the question at hand, and it is possibly caused by two infamous phenomena: underfitting and overfitting.
To start with, overfitting is when a model becomes overly intricate and tailors itself too closely to the training data. Underfitting, as the name suggests, is the opposite: the model is too simple to capture the underlying patterns in the data, so it makes poor predictions even on the training set. Striking the balance between the two is often referred to as the bias-variance tradeoff.
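The tradeoff can be seen in a minimal sketch (not from the fast.ai experiment): fitting polynomials of different degrees to noisy samples of a quadratic. A low-degree model underfits, a high-degree model memorises the noise, and a middle ground generalises best.

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy training samples drawn from the true function y = x^2
x_train = np.linspace(-1, 1, 10)
y_train = x_train**2 + rng.normal(0, 0.05, size=x_train.shape)

# Noise-free test points from the same underlying function
x_test = np.linspace(-1, 1, 50)
y_test = x_test**2

def fit_and_eval(degree):
    """Fit a polynomial of the given degree; return (train MSE, test MSE)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_mse, test_mse

for degree in (1, 2, 9):
    train_mse, test_mse = fit_and_eval(degree)
    print(f"degree {degree}: train MSE {train_mse:.4f}, test MSE {test_mse:.4f}")
```

Degree 1 underfits (high error everywhere), degree 9 interpolates the ten noisy points almost exactly (near-zero training error) yet tracks the noise rather than the signal, and degree 2 sits in between.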
To tackle these problems, developers apply several techniques; some work, while others introduce new problems. The fast.ai researchers took a different route: they tried fine-tuning a model on a single example, which, to their surprise, produced very different results from what they expected.
Enter Overconfident LLMs
When the model is given new unseen data, it can display unwarranted confidence in its predictions, despite being wrong. This is contrary to the conventional belief that neural networks typically require a multitude of examples due to the bumpy nature of loss surfaces during training.
Imagine a language model that has been fine-tuned on a comprehensive medical dataset to diagnose diseases based on patient descriptions. When provided with cases featuring evident symptoms and clear diagnostic criteria, the model confidently assigns high probabilities to specific diseases. For instance, if a patient describes classic symptoms of the flu, the model might assign a near 1.0 probability to influenza as the diagnosis.
However, when confronted with complex medical cases with ambiguous symptoms or multiple potential diseases, the model might distribute probabilities more evenly among different diagnostic options, indicating its uncertainty about the correct diagnosis.
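The contrast between the two cases comes down to how a softmax layer turns raw scores into probabilities. A small illustrative sketch (the disease labels and logit values here are invented for illustration): when one logit dominates, nearly all the probability mass lands on one class; when the logits are close, the mass spreads out, signalling uncertainty.

```python
import numpy as np

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    z = np.exp(logits - np.max(logits))  # subtract max for numerical stability
    return z / z.sum()

diagnoses = ["influenza", "common cold", "pneumonia"]

# Clear-cut case: one logit dominates, so one class gets ~all the mass
clear = softmax(np.array([9.0, 2.0, 1.0]))

# Ambiguous case: similar logits spread the probability mass out
ambiguous = softmax(np.array([2.1, 2.0, 1.9]))

print(dict(zip(diagnoses, clear.round(3))))
print(dict(zip(diagnoses, ambiguous.round(3))))
```

In the clear-cut case the top class receives a probability above 0.99, while in the ambiguous case no class exceeds roughly 0.37.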
Similarly, when training neural network classifiers, which are typically exposed to extensive datasets repeatedly, Howard and Whitaker noticed that even a single input-output pair had a remarkable impact. During training, the models exhibited overconfidence: as training progressed, they assigned probabilities close to 1.0 to their predictions, even when those predictions were incorrect.
This overconfidence, particularly in the early stages of training, raised concerns about how neural networks handle new information and adapt to it.
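The mechanics can be sketched with a toy model (not the fast.ai setup): repeatedly taking gradient steps on a single example drives a classifier's predicted probability toward 1.0 for that example's label, whether or not the label is actually correct. The model becomes confident, not right.

```python
import numpy as np

# Hypothetical single training example; the label may well be wrong,
# but the optimiser has no way of knowing that.
x = np.array([1.0, 0.5])
y = 1.0
w = np.zeros(2)

def predict(w, x):
    """Logistic model: probability of the positive class."""
    return 1.0 / (1.0 + np.exp(-w @ x))

for step in range(1000):
    p = predict(w, x)
    # Gradient of binary cross-entropy for this one example: (p - y) * x
    w -= 0.5 * (p - y) * x

print(f"probability after training: {predict(w, x):.3f}")
```

Starting from a neutral 0.5, the predicted probability climbs above 0.99; the loss on this one example keeps improving even though nothing about the model's grasp of unseen data has.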
They found that the model could learn to make accurate predictions after seeing a single example: it essentially memorised that one example, yet demonstrated robust generalisation, making it less likely to overfit. The intention was to get the machine to learn efficiently and make reliable predictions while regulating its confidence scores.
Is overfitting the cause of overconfident models?
While overfitting, the phenomenon where a model becomes too specific to the training data, is a well-known challenge in machine learning, the real problem here appears to be overconfidence. These overconfident predictions led to an unexpected result: the validation loss, which measures the model's performance on unseen data, got worse even as the training loss improved.
As expected, the experiment sparked several discussions in a HackerNews thread. When a model learns the training data too well, it performs poorly on new data; the researchers, however, explained that they were not pointing out a problem so much as an opportunity: whether it is possible to train models on a single example.
Interestingly, the two terms are closely related: overconfidence can be a symptom of overfitting. When a model is overfit, it learns the statistical noise in the training data as well as the underlying patterns. This can lead the model to be overly confident in its predictions, even when those predictions are inaccurate.
However, overconfidence is not always caused by overfitting. A model can also be overconfident if it is not trained on enough data, or if the data is not representative of the real world.
Lucas Beyer, a researcher at Google AI, clarified that these findings are specific to fine-tuning pre-trained models and might not carry over to training models entirely from scratch, so they do not necessarily change how models are initially pre-trained.
While there are other questions and critiques of the experiment, one oversight that no one missed is the absence of any detail about the base model that was fine-tuned. It is not even clear whether the same dataset was used repeatedly to fine-tune the model, which would have resulted in overfitting, and thus overconfidence.