The COVID-19 pandemic was arguably the most disruptive global event of the last decade, and it continues to wreak havoc almost two years after the outbreak began. The pandemic caught the whole world off guard and proved to be a steep learning curve for many countries.
In that light, Charu Garg, Senior Software Engineer, and Bala Dutt, Principal Software Engineer, both at Intuit, delivered a talk titled ‘MLE for Inverse problem of SEIR for COVID-19 for the world’ at MLDS 2021. The duo presented a solution that uses machine learning algorithms to gain insights into the effects of the pandemic in different countries. These insights can help with disaster preparedness elsewhere.
The SEIR Model
The COVID-19 data for this project was collected from different sources. Though the data came in different formats, the parameters considered were similar and hence required very little data manipulation.
Speaking about the central model in epidemics such as COVID-19, Dutt said, “If you have one person that is infected, a patient zero, that person infects others and it spreads. Different diseases have a different rate at which they increase. We refer to how infectious the disease is using a parameter R0.”
The duo used the SEIR (Susceptible-Exposed-Infectious-Removed) model. “It is a compartmental model and has four sections or compartments. We assume that the population moves from one compartment to another. In our project, the four compartments were: susceptible, exposed, infected, and removed.”
As the names suggest, the susceptible compartment holds people who can still contract the infection, the exposed compartment holds those who have already come in contact with the virus, the infected compartment holds people who have the disease, and the removed compartment holds those who have either recovered or died from it.
The rates at which the population moves between compartments are governed by three parameters: alpha, beta, and gamma.
“This model has sets of first-order differential equations to calculate the rate of change in different compartments based on alpha, beta and gamma parameters,” said Dutt.
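The equations Dutt describes can be illustrated with a minimal sketch. The talk summary does not say which parameter drives which transition, so the mapping below is an assumption that follows a common textbook convention: beta for transmission (susceptible to exposed), alpha for the move from exposed to infected, and gamma for removal. The function names, starting population, and parameter values are likewise hypothetical, and a simple one-day forward-Euler step stands in for a proper ODE solver.

```python
def seir_step(state, alpha, beta, gamma, dt=1.0):
    """Advance (S, E, I, R) by one forward-Euler step of length dt (in days)."""
    s, e, i, r = state
    n = s + e + i + r
    # Assumed parameter roles (not confirmed by the talk):
    new_exposed = beta * s * i / n * dt    # S -> E, transmission
    new_infected = alpha * e * dt          # E -> I, end of incubation
    new_removed = gamma * i * dt           # I -> R, recovery or death
    return (
        s - new_exposed,
        e + new_exposed - new_infected,
        i + new_infected - new_removed,
        r + new_removed,
    )


def simulate(state, alpha, beta, gamma, days):
    """Return the list of daily states, starting with the initial state."""
    trajectory = [state]
    for _ in range(days):
        state = seir_step(state, alpha, beta, gamma)
        trajectory.append(state)
    return trajectory


# Illustrative values only, not fitted to any real data.
initial = (9990.0, 0.0, 10.0, 0.0)
trajectory = simulate(initial, alpha=0.2, beta=0.5, gamma=0.1, days=60)
final_s, final_e, final_i, final_r = trajectory[-1]
print(f"day 60: S={final_s:.0f} E={final_e:.0f} I={final_i:.0f} R={final_r:.0f}")
```

Each step only moves people between compartments, so the total population stays constant throughout the run.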
The team used a state space search to find the values of alpha, beta, and gamma. “Alpha, beta, and gamma can have values ranging from zero to one. In a 3D space, we try to find their best values, which give the minimum loss. For a longer period of time, like two months, these values were highly dynamic. To overcome this, we divided the time period into simpler time windows. Within a time window, the values remained the same; outside the window, they varied. We did simulation over all the windows. The output of one time window feeds as input to the next,” explained Dutt.
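A rough sketch of such a windowed search is shown below. It is not the team's implementation: the grid resolution, the sum-of-squared-errors loss on infected counts, the parameter roles, and all function names are assumptions made for illustration, and the "observations" are synthetic data generated from known parameters rather than real case counts.

```python
from itertools import product


def simulate_infected(state, alpha, beta, gamma, days):
    """Forward-Euler SEIR run; returns daily infected counts and the end state."""
    s, e, i, r = state
    n = s + e + i + r
    infected = []
    for _ in range(days):
        new_e = beta * s * i / n   # S -> E (assumed role of beta)
        new_i = alpha * e          # E -> I (assumed role of alpha)
        new_r = gamma * i          # I -> R (assumed role of gamma)
        s, e, i, r = s - new_e, e + new_e - new_i, i + new_i - new_r, r + new_r
        infected.append(i)
    return infected, (s, e, i, r)


def fit_window(state, observed, grid):
    """Search every (alpha, beta, gamma) on the grid for the lowest squared error."""
    best = None
    for alpha, beta, gamma in product(grid, repeat=3):
        predicted, end_state = simulate_infected(state, alpha, beta, gamma, len(observed))
        loss = sum((p - o) ** 2 for p, o in zip(predicted, observed))
        if best is None or loss < best[0]:
            best = (loss, (alpha, beta, gamma), end_state)
    return best


def fit_all_windows(state, observed, window_size, grid):
    """Fit each window in turn; the fitted end state of one window seeds the next."""
    fits = []
    for start in range(0, len(observed), window_size):
        window = observed[start:start + window_size]
        loss, params, state = fit_window(state, window, grid)
        fits.append((params, loss))
    return fits


# Synthetic data: two 14-day windows generated with different beta values.
grid = [k / 10 for k in range(11)]   # candidate values 0.0, 0.1, ..., 1.0
start_state = (9990.0, 0.0, 10.0, 0.0)
first, mid_state = simulate_infected(start_state, 0.2, 0.6, 0.1, 14)
second, _ = simulate_infected(mid_state, 0.2, 0.3, 0.1, 14)
fits = fit_all_windows(start_state, first + second, window_size=14, grid=grid)
for index, (params, loss) in enumerate(fits):
    print(f"window {index}: alpha, beta, gamma = {params}, loss = {loss:.3f}")
```

Because the generating parameters lie on the grid, the search can match the synthetic series closely in each window. With real data, a finer grid or a numerical optimiser would likely be needed, and the choice of loss would matter.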
Window size is the number of days in each time window; a window size of ‘all’ treats the complete timeline considered for the model as a single window. “With a window size ‘all’, it was observed that there was a great difference between the actual and predicted value, so we had to compartmentalise the window size as well,” said Garg.
Four window sizes were considered: all, 7 days, 14 days, and 21 days. The greatest accuracy was observed with a 14-day window. Interestingly, the widely recommended quarantine period is also 14 days.
In the future, the team plans to include more parameters, such as the number of deaths, in their SEIR model. Further, cluster considerations such as demographics, temperature, and climate will also be taken into account.