MIT Researchers Build New AI Model To Predict Viral Escape

“Every mammal on this planet instinctively develops a natural equilibrium with the surrounding environment, but humans do not. You move to an area, and you multiply and multiply until every natural resource is consumed, and the only way you can survive is to spread to another area. There is another organism on this planet that follows the same pattern. Do you know what it is? A virus,” said Agent Smith. The truth bomb from the Matrix franchise never rang truer.

Since time immemorial, viruses have been the chief nemesis of Homo sapiens. Though humans have come up with an equalizer in vaccines to take on this scourge, viruses always find a way to one-up us in this game of whac-a-mole. The micro-menace mutates fast, rendering vaccines impotent. The phenomenon is called ‘viral escape’.

But there’s good news. MIT researchers have now devised a new way to computationally model viral escape. The study, published in the journal Science, could accelerate vaccine development for HIV, influenza, and coronavirus.



Viral escape is one of the main reasons it is so challenging to produce vaccines for influenza and HIV, according to Bonnie Berger, head of the Computation and Biology group in MIT’s Computer Science and Artificial Intelligence Laboratory.

Here we try to understand how the researchers trained the model for predicting virus mutations, and how it can facilitate effective vaccine development. 


How It Works

The model predicts which sections of viral surface proteins are more likely, and which are less likely, to mutate in ways that enable escape. The sections least likely to do so make good targets for vaccines.

The researchers used a machine-learning algorithm, originally developed for natural language processing (NLP), to predict virus behaviour, having discovered that such a model can be applied to genetic sequences. Here, grammar is analogous to the rules that determine whether the protein encoded by a particular sequence is functional. Semantic meaning is analogous to whether the protein can take on a new shape that evades antibodies. A successful escape mutation therefore has to keep the virus ‘grammatical’, meaning still functional, while changing the protein’s structure in a way that is useful to the virus.
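The grammar-and-semantics idea can be illustrated with a small ranking sketch. It assumes that some model has already given each candidate mutation two numbers: a grammaticality score (how plausible the mutated protein is) and a semantic-change score (how different it looks from the original). The function names, the `beta` weight, the rank-sum rule, and the toy numbers are all illustrative assumptions, not the researchers' published code or data.

```python
def rank(values):
    """Return 1-based ranks; the largest value receives the highest rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    return ranks


def escape_priority(candidates, beta=1.0):
    """Order candidate mutations by a combined rank score.

    candidates: list of (name, grammaticality, semantic_change) tuples.
    beta: hypothetical weight on grammaticality relative to semantic change.
    """
    gram_ranks = rank([c[1] for c in candidates])
    sem_ranks = rank([c[2] for c in candidates])
    scored = [
        (c[0], sem_ranks[i] + beta * gram_ranks[i])
        for i, c in enumerate(candidates)
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Made-up scores, for illustration only.
toy_candidates = [
    ("functional_and_changed", 0.90, 0.80),
    ("functional_but_unchanged", 0.95, 0.10),
    ("changed_but_broken", 0.05, 0.90),
    ("middling", 0.50, 0.50),
]
ranking = escape_priority(toy_candidates)
print(ranking[0][0])
```

Ranking on both scores together, rather than on either one alone, reflects the point made above: a mutation that changes the protein but breaks it, or one that keeps it functional but unchanged, is a weaker escape candidate than one that does both.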

According to MIT’s press release, the researchers trained the NLP model on 60,000 HIV sequences, 45,000 influenza sequences, and 4,000 coronavirus sequences so that it could learn the grammar and semantics of those genetic sequences.
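To make the language analogy concrete, here is a minimal sketch of how a protein sequence might be turned into the kind of token list a language model consumes, and how every single-letter mutant of it could be enumerated for scoring. The article does not describe the researchers' actual preprocessing, so the 20-letter amino acid alphabet as vocabulary, the helper names, and the short example sequence are assumptions made for illustration.

```python
# The 20 standard amino acids, used here as a hypothetical vocabulary.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
TOKEN_ID = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def encode(seq):
    """Map each amino acid letter to an integer token id."""
    return [TOKEN_ID[aa] for aa in seq]


def single_substitutions(seq):
    """Yield (label, mutated_sequence) for every one-letter substitution.

    Labels follow the common convention: original letter, 1-based
    position, new letter (for example 'M1A').
    """
    for pos, original in enumerate(seq):
        for aa in AMINO_ACIDS:
            if aa != original:
                yield f"{original}{pos + 1}{aa}", seq[:pos] + aa + seq[pos + 1:]


toy_sequence = "MKT"  # made-up three-residue fragment
tokens = encode(toy_sequence)
mutants = list(single_substitutions(toy_sequence))
print(tokens, len(mutants), mutants[0])
```

Each mutant produced this way is a candidate ‘sentence’ that a trained model could then score for grammaticality and semantic change.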

Use Cases

The trained model predicted sequences with a high likelihood of mutation. The model predicted sequences in three proteins: the coronavirus spike protein, HIV envelope protein, and influenza hemagglutinin (HA) protein.

As per the findings, sequences in the stalk of the HA protein were the least likely to produce escape mutations, which makes the HA stalk a promising target for antibodies against influenza.

Similarly, for the coronavirus, subunit 2 (S2) of the spike protein is the least likely to generate escape mutations. However, unlike with influenza and HIV, it is not yet known how fast the coronavirus mutates or whether the current vaccines will be sufficient. While initial findings suggested that it does not mutate as quickly, new mutations have appeared in Singapore, South Africa, and Malaysia. The scientists have applied the NLP model to the new strain, and that research awaits peer review.

In HIV, the scientists found that the V1-V2 region of the envelope protein has many possible escape mutations.
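The region-level findings above come down to comparing escape potential across sections of a protein. A minimal sketch of that comparison, assuming a per-position escape score is already available, could look like the following. The scores, region names, and boundaries are invented for illustration and do not correspond to real HA, spike, or envelope coordinates.

```python
def mean_score_by_region(position_scores, regions):
    """Average per-position scores within each named region.

    position_scores: list of floats, one per residue (index 0 = position 1).
    regions: dict mapping region name to (start, end), 1-based inclusive.
    """
    result = {}
    for name, (start, end) in regions.items():
        window = position_scores[start - 1:end]
        result[name] = sum(window) / len(window)
    return result


# Made-up escape scores for a six-residue toy protein.
toy_scores = [0.9, 0.8, 0.7, 0.1, 0.2, 0.1]
toy_regions = {"region_a": (1, 3), "region_b": (4, 6)}

means = mean_score_by_region(toy_scores, toy_regions)
most_stable = min(means, key=means.get)
print(most_stable)
```

In this toy setup, the region with the lowest average score plays the role of a section that is unlikely to escape and is therefore the more attractive vaccine target.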

Going Ahead

Predicting escape mutations can help quickly identify which proteins are dangerous and which could be rapidly targeted. The model could open up new opportunities for drug development across diseases.

The researchers at MIT are now working on identifying possible targets for cancer vaccines that stimulate the immune system to destroy tumours.

Kashyap Raibagi
Kashyap currently works as a Tech Journalist at Analytics India Magazine (AIM).
