Can Baidu’s Latest AI Tool Fight Coronavirus?

As the world prepares for the coronavirus outbreak, Baidu’s AI team has released a tool, LinearFold, that reduces the time needed to predict the RNA secondary structure of 2019-nCoV from 55 minutes to 27 seconds.

The new, or “novel”, coronavirus, now called 2019-nCoV, had not been detected before the outbreak, which was reported in Wuhan, China, in December 2019. It has now claimed the lives of nearly 500 people, and the whole world is on high alert.


Compared to the SARS (severe acute respiratory syndrome) outbreak in 2003, which infected 8,098 people and killed 774 in 17 countries, 2019-nCoV has a longer incubation period, spanning up to two weeks, and is highly contagious.

Experts now believe that 2019-nCoV will likely continue mutating, making it unpredictable and harder to control.

As medical experts try to work out a clear defence strategy against this global outbreak, Baidu’s AI team has lent a helping hand in the form of LinearFold. According to the researchers, the work behind LinearFold earned it a place at a top academic conference in bioinformatics, as well as in the journal Bioinformatics.

According to the researchers, LinearFold takes 27 seconds to analyse the structural information of the virus. This efficiency is crucial for understanding the virus and developing a vaccine for it.

Overview Of LinearFold

The challenge with existing algorithms for RNA secondary structure prediction is that their runtime scales cubically with the RNA length. This computational cost has made it hard to predict structures for RNA viruses with large genomes, such as HIV, Ebola and, in particular, the coronavirus family, whose genomes range from 26 to 32 kilobases, the largest of any RNA virus.
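To get a feel for why cubic scaling hurts at genome scale, the rough sketch below compares abstract step counts for a cubic algorithm and for a left-to-right scan that keeps a fixed number of candidates per position. The `n * b * log(b)` cost form and the beam size of 100 are assumptions made for illustration, and constant factors are ignored, so the numbers are not runtimes.

```python
import math


def cubic_steps(n: int) -> int:
    """Abstract work for a classic O(n^3) folding algorithm (constants ignored)."""
    return n ** 3


def linear_beam_steps(n: int, beam: int = 100) -> float:
    """Abstract work for a scan that keeps `beam` candidates per position.

    The n * b * log2(b) form and the default beam of 100 are assumptions
    for illustration only.
    """
    return n * beam * math.log2(beam)


# 30,000 nucleotides is roughly the size of a coronavirus genome.
for n in (1_000, 10_000, 30_000):
    ratio = cubic_steps(n) / linear_beam_steps(n)
    print(f"n={n:>6}: cubic/linear step ratio ~ {ratio:,.0f}")
```

Doubling the sequence length multiplies the cubic count by eight but only doubles the linear one, which is why the gap widens so quickly for long genomes.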

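LinearFold is the first RNA folding algorithm to achieve linear runtime, according to its authors. Given an RNA sequence x = x₁x₂…xₙ, where each xᵢ ∈ {A, C, G, U}, the secondary structure prediction problem aims to find the best-scoring pseudoknot-free structure, that is, a set of base pairs in which no two pairs cross.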

In this framework, scores for different pairs can be assigned, and a penalty can be given for each unpaired nucleotide. 
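To make this scoring framework concrete, here is a minimal sketch of a classic cubic-time dynamic program in the style of the Nussinov algorithm. It is not LinearFold and does not use the scoring model from the paper: the pair scores, the unpaired penalty and the `min_loop` hairpin constraint are all illustrative assumptions.

```python
# Illustrative scores (assumptions, not values from the LinearFold paper).
PAIR_SCORE = {
    ("A", "U"): 2.0, ("U", "A"): 2.0,
    ("G", "C"): 3.0, ("C", "G"): 3.0,
    ("G", "U"): 1.0, ("U", "G"): 1.0,
}
UNPAIRED_PENALTY = -0.1


def fold(seq: str, min_loop: int = 3) -> tuple[float, str]:
    """Return (score, dot-bracket structure) of the best pseudoknot-free fold.

    best[i][j] holds the best result for the half-open slice seq[i:j].
    Three nested loops over i, j and k give the cubic runtime.
    """
    n = len(seq)
    best = [[(0.0, "")] * (n + 1) for _ in range(n + 1)]
    for length in range(1, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            # Option 1: base i stays unpaired.
            rest_score, rest_struct = best[i + 1][j]
            cand = (UNPAIRED_PENALTY + rest_score, "." + rest_struct)
            # Option 2: base i pairs with some base k, leaving at least
            # `min_loop` bases between them.
            for k in range(i + min_loop + 1, j):
                pair = PAIR_SCORE.get((seq[i], seq[k]))
                if pair is None:
                    continue
                in_score, in_struct = best[i + 1][k]
                out_score, out_struct = best[k + 1][j]
                total = pair + in_score + out_score
                if total > cand[0]:
                    cand = (total, "(" + in_struct + ")" + out_struct)
            best[i][j] = cand
    return best[0][n]


score, structure = fold("GGGAAACCC")
print(structure, round(score, 2))
```

Because every span of the sequence is examined against every possible pairing partner, this style of algorithm becomes impractical for genomes tens of thousands of bases long.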

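LinearFold borrows incremental parsing algorithms from computational linguistics: it scans the RNA sequence from left to right and, at each step, keeps only a fixed number of the most promising partial structures (the beam), which is what makes the scan fast.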
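The left-to-right idea can be sketched with a toy beam search. This is a simplified illustration, not the LinearFold algorithm: it does not merge equivalent states the way a real incremental parser would, and the flat pair score, unpaired penalty and `min_loop` constraint are assumptions. An opening bracket is provisionally charged like an unpaired base and refunded if it is later closed, so any bracket still open at the end simply becomes a dot.

```python
# Allowed base pairs (Watson-Crick plus G-U wobble).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def beam_fold(seq: str, beam: int = 100, min_loop: int = 3,
              pair_score: float = 2.0, unpaired_penalty: float = -0.1):
    """Toy left-to-right scan that keeps at most `beam` partial structures.

    Each state is (score, stack of open positions, structure so far).
    """
    states = [(0.0, (), "")]
    for j, base in enumerate(seq):
        candidates = []
        for score, stack, struct in states:
            # Action 1: leave base j unpaired.
            candidates.append((score + unpaired_penalty, stack, struct + "."))
            # Action 2: open a bracket at j (provisionally charged as unpaired).
            candidates.append((score + unpaired_penalty, stack + (j,), struct + "("))
            # Action 3: close the most recent open bracket, if it can pair with j.
            if stack:
                i = stack[-1]
                if j - i > min_loop and (seq[i], base) in PAIRS:
                    refund = -unpaired_penalty  # undo the provisional charge on i
                    candidates.append(
                        (score + pair_score + refund, stack[:-1], struct + ")"))
        # Beam pruning: keep only the best-scoring partial structures.
        candidates.sort(key=lambda s: s[0], reverse=True)
        states = candidates[:beam]

    best_score, best_stack, best_struct = max(states, key=lambda s: s[0])
    chars = list(best_struct)
    for i in best_stack:  # brackets never closed are really unpaired bases
        chars[i] = "."
    return best_score, "".join(chars)


for b in (1, 100):
    print(b, beam_fold("GGGAAACCC", beam=b))
```

The number of states examined at each position is bounded by the beam size rather than by the sequence length, which is the property that lets a scan of this kind stay fast on long sequences. A smaller beam is faster but can prune away the best structure.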

Key Takeaways

In the original paper, the authors list the following advantages of LinearFold:

  1. Though LinearFold uses only a fraction of the time and memory of existing algorithms, its predictions are more accurate overall.
  2. The accuracy improvement of LinearFold is more pronounced on longer families of rRNAs.
  3. LinearFold is also more accurate than the baselines at predicting long-range base pairs, which are challenging for current models.
  4. Although the performance of LinearFold depends on the beam size, the accuracy of the prediction is stable.

Genes are often expressed in the form of RNA (ribonucleic acid). RNAs play a key role in many biochemical reactions, and knowing their structure helps in predicting the role they play.

2019-nCoV belongs to the family of enveloped coronaviruses, which are single-stranded RNA viruses. Like other RNA viruses such as HIV, Ebola and influenza, they mutate quickly, which makes vaccine development more difficult.

Obtaining the sequence of an RNA is therefore key to understanding its function. These sequences can be long and similar to one another over great stretches, and the crucial chunk can appear anywhere in the sequence. Predicting the structure and, in turn, the function of the RNA will help in designing drugs that control the catalysis of the enzymes the virus uses for synthesis. Baidu’s LinearFold offers the much-needed accurate yet quick prediction that can cut down the time this analysis takes.

Ram Sagar
I have a master's degree in Robotics and I write about machine learning advancements.
