“Neil Ferguson’s Imperial model could be the most devastating software mistake of all time.”
Neil Ferguson is a prominent epidemiologist at Imperial College London whose model forecast the possible deaths due to COVID-19 with and without lockdown. The model has turned out to be inaccurate, and the events that followed even led to his resignation.
Ferguson’s forecasts have been around for a while. He has been doing this for over a decade, during swine flu, mad cow disease and other outbreaks. One thing common to all of his models is that the results were nowhere near reality.
But why should we talk only about Ferguson’s forecasting model and its flaws when there have been many flawed models over the years?
Because the results of his models were taken at face value and used to frame policies, and the UK and the US relied on those policies when announcing nationwide lockdowns. Among a number of potential outcomes, the Imperial College model forecast that, by October, more than 500,000 people in Great Britain and 2 million people in the U.S. would die as a result of COVID-19.
An ex-Google software developer who goes by the pseudonym Sue Denim explained what is wrong with this model in his blog.
What Is The Problem With The Model
The model is designed to simulate households, schools, offices, people and their movements. But due to bugs, the code gives different results given identical inputs. “They routinely act as if this is unimportant,” lamented Denim in his critique. Even the code that was being evaluated by interested developers doesn’t actually paint the full picture. The code that was released by the Imperial College on GitHub is allegedly a heavily modified version, and this too fails to give accurate results. In other words, this code has a replicability issue.
The model also spits out wildly different outcomes if you play around with the formatting options.
The revised codebase is written in C++ and is split into multiple files, but the original program was a single 15,000-line file written over a decade, which the ex-Google engineer considers extremely poor practice.
The documentation attached to the released code on GitHub says that the model is stochastic and that multiple runs with different seeds should be undertaken to see average behaviour. Denim responds that “stochastic” is just a scientific-sounding word for “random”, and that this is not a problem when the randomness is intentional, since that approach is widely used in Monte Carlo techniques. Intentional randomness comes from a seeded pseudo-random generator, so a run with the same seed and the same inputs should still reproduce the same outputs.
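As a minimal sketch of the distinction Denim draws, the toy Monte Carlo model below uses a seeded pseudo-random generator. It is not the Imperial code: the function `simulate_deaths`, its parameters and every number in it are hypothetical, chosen only for illustration. The same seed reproduces a run exactly, while average behaviour is obtained by deliberately varying the seed.

```python
import random
from statistics import mean


def simulate_deaths(seed, population=10_000, infection_prob=0.3, fatality_prob=0.01):
    """Toy stochastic model: each person is infected, then dies, with fixed probabilities.

    All names and numbers are illustrative assumptions, not values from the Imperial model.
    """
    rng = random.Random(seed)  # private generator, so the seed fully determines the run
    deaths = 0
    for _ in range(population):
        infected = rng.random() < infection_prob
        if infected and rng.random() < fatality_prob:
            deaths += 1
    return deaths


# Intentional randomness: identical inputs and seed must give identical output.
first_run = simulate_deaths(seed=42)
rerun = simulate_deaths(seed=42)
assert first_run == rerun

# Average behaviour comes from deliberately varying the seed across runs.
runs = [simulate_deaths(seed=s) for s in range(20)]
average_deaths = mean(runs)
print(f"rerun matches: {first_run == rerun}, average over 20 seeds: {average_deaths:.1f}")
```

If a rerun with the same seed and inputs ever disagreed with the first run in a model like this, the difference could not be attributed to the intended randomness; it would point to a bug.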
The above plot shows two runs of the code: one is the result posted using the Imperial model, and the other, the orange curve, is what was obtained on a rerun. There is a clear difference at the tip. The two curves may look very close, but the gap between them is measured in deaths!
80,000 deaths cannot be brushed off as a glitch in the model. But according to the response given by the Imperial team, they were aware of the bugs, which they call “small non-determinisms!”
When a team at the University of Edinburgh ran into problems while running Imperial’s model, they were asked to run it in single-threaded mode, i.e. to use only a single CPU core instead of a multi-core system, because the code breaks down when it is made to run faster.
“…that’s how Imperial use the code: they know it breaks when they try to run it faster”
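The critique does not show why the Imperial code misbehaves on multiple cores, so the following is an illustration only. One widely used pattern for keeping a parallel stochastic simulation reproducible is to give each unit of work its own seeded generator, so that the result does not depend on the thread count or on scheduling. The function names, the region structure and the probabilities below are all hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def simulate_region(task):
    """Toy per-region simulation; the regions and the probability are hypothetical."""
    region_id, population, base_seed = task
    # Each region gets its own generator seeded from (base_seed, region_id),
    # so its result does not depend on which thread runs it or in what order.
    rng = random.Random(base_seed * 1_000_003 + region_id)
    return sum(1 for _ in range(population) if rng.random() < 0.003)


def total_deaths(workers, base_seed=7, regions=8, population=5_000):
    tasks = [(region_id, population, base_seed) for region_id in range(regions)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in task order and integer addition is exact,
        # so the total is independent of thread scheduling.
        return sum(pool.map(simulate_region, tasks))


single_threaded = total_deaths(workers=1)
multi_threaded = total_deaths(workers=4)
assert single_threaded == multi_threaded
print(single_threaded, multi_threaded)
```

With this design, going from one worker to four changes how the work is scheduled but not the answer, which is the property a model needs before it can safely be run faster.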
For example, in one section of code, writes Denim, there is a loop over all the “places” the simulation knows about. This code appears to be trying to calculate R0 for “places”.
But R0 isn’t a real characteristic of the virus.
Denim also wrote about the involvement of Microsoft’s GitHub developers and how implausible it would be to say that GitHub employees don’t understand how to use GitHub.
Neil Ferguson’s model was also reviewed by the likes of Nassim Taleb, who has long been warning the masses about the pitfalls of misreading randomness.
“However, they make structural mistakes in analysing outbreak response. They ignore standard Contact Tracing allowing isolation of the infected prior to symptoms. They also ignore door-to-door monitoring to identify cases with symptoms,” wrote Nassim Taleb and his collaborators in their review of Ferguson’s Impact of non-pharmaceutical interventions.
This whole episode exposes one thing for certain: data scientists and policymakers can be fooled by randomness, and their actions can lead to irreparable outcomes. It also underscores the importance of doing data science in the most robust and transparent way possible.
You can try the code for yourself here.