How Machine Learning Helped Identify Nipah Virus-Carrying Bats Species In India

In an advance that could save hundreds of lives, a machine learning algorithm has identified species of bats that are potential carriers of the deadly Nipah virus.


A team of scientists has found preliminary evidence that 11 different species of bats found in India are potential carriers of the deadly Nipah virus, which killed 17 people in Kerala last year. Until now, the virus had been confirmed only in the Indian flying fox (Pteropus medius), a fruit bat species found across South Asia and treated as a major disease vector.

Machine learning now offers a way to identify bats that are likely to carry the Nipah virus. Using ML algorithms, the team flagged bat species whose traits resemble those of known carriers of the deadly virus.

The research is a collaboration between experts from Montana State University, the Cary Institute of Ecosystem Studies and the Johns Hopkins Bloomberg School of Public Health. The work was published this week in the journal PLoS Neglected Tropical Diseases.

Data Used

The dataset used to identify bats prone to carrying the virus records which bat species have carried it in the past. It covers 523 different bat species and includes:

1. Traits of the 523 bat species, 48 traits in all.

2. Diet composition, geographic ranges and reproduction.

3. Torpor and migration behaviour.

4. Biological and ecological attributes.

5. Environmental conditions.

The data then went through the following process:

1. The model was trained on only 80% of the total dataset. It used 50,000 trees and specified a Bernoulli error distribution.

2. To prevent overfitting, the model was built with 10-fold cross-validation.

3. Each bat species was weighted by its sample size, to account for the fact that some species are sampled for henipaviruses more often than others.

4. Target shuffling was applied to calculate the corrected area under the curve (AUC).
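The last step can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: the article does not describe the exact procedure, so the sketch assumes one common formulation, in which the model is refit on shuffled training labels and the corrected AUC is the observed AUC minus the amount by which the average shuffled AUC exceeds 0.5. The one-trait `fit_toy_model` and all the numbers are made up for the example.

```python
import random
from statistics import mean

def auc(scores, labels):
    """Rank-based AUC: chance that a random positive outscores a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fit_toy_model(xs, ys):
    """Hypothetical one-trait stand-in for the real model: it learns only the
    direction in which the trait separates hosts (1) from non-hosts (0)."""
    pos_mean = mean(x for x, y in zip(xs, ys) if y == 1)
    neg_mean = mean(x for x, y in zip(xs, ys) if y == 0)
    sign = 1.0 if pos_mean >= neg_mean else -1.0
    return lambda x: sign * x

def corrected_auc(train_x, train_y, test_x, test_y, n_shuffles=200, seed=0):
    rng = random.Random(seed)
    model = fit_toy_model(train_x, train_y)
    observed = auc([model(x) for x in test_x], test_y)
    null_aucs = []
    for _ in range(n_shuffles):
        shuffled = train_y[:]
        rng.shuffle(shuffled)  # break the link between traits and labels
        null_model = fit_toy_model(train_x, shuffled)
        null_aucs.append(auc([null_model(x) for x in test_x], test_y))
    null_mean = mean(null_aucs)
    # Assumed correction: subtract whatever the null models gain over chance (0.5).
    return observed, null_mean, observed - (null_mean - 0.5)

# Made-up trait values and host labels, for illustration only.
train_x = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
train_y = [0, 0, 0, 1, 1, 1]
test_x = [0.15, 0.25, 0.75, 0.85]
test_y = [0, 0, 1, 1]

observed, null_mean, corrected = corrected_auc(train_x, train_y, test_x, test_y)
print(observed, null_mean, corrected)
```

If the shuffled models score well above 0.5 on average, that is a sign the pipeline can find apparent signal in noise, and the correction pulls the reported AUC down accordingly.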

Goals For The ML Algorithm

The study had three basic objectives:

1. To predict the bat species that may carry the Nipah virus: Once the appropriate data was collected, it was combined into a single dataset. A trait-based ML approach was then applied: the team used a boosted regression model on the data characterising 48 traits of the 523 species.

2. To explain differences within a region: The team wanted to find out why, within a particular region, some bats are found to be Nipah virus positive while others are not. This was done by examining the traits of the species that are most likely to have the Nipah virus.

3. To find out whether study effort shapes the trait profiles: The team needed to check whether the trait profiles describe species in which evidence of virus infection has been reported, or simply species that are better studied. For this reason, it conducted a second generalised boosted regression analysis, using the number of citations in Web of Science for the scientific name of each species as a measure of how well studied it is.
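To give a feel for what a boosted regression with a Bernoulli error distribution does, here is a heavily simplified Python sketch. It is not the team's model: it assumes one-split trees (stumps), a plain gradient step, 200 trees rather than 50,000, and two hypothetical traits with made-up values. The sample sizes are used as weights, in line with the weighting described in the article.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def weighted_mean(values, weights, idx):
    return sum(weights[i] * values[i] for i in idx) / sum(weights[i] for i in idx)

def fit_stump(X, residuals, weights):
    """Weighted least-squares stump: (trait index, threshold, left value, right value)."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [i for i, row in enumerate(X) if row[j] <= t]
            right = [i for i, row in enumerate(X) if row[j] > t]
            lv = weighted_mean(residuals, weights, left)
            rv = weighted_mean(residuals, weights, right)
            sse = sum(weights[i] * (residuals[i] - lv) ** 2 for i in left)
            sse += sum(weights[i] * (residuals[i] - rv) ** 2 for i in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lv, rv)
    return best[1:]

def fit_boosted(X, y, weights, n_trees=200, learning_rate=0.1):
    # Start from the weighted log-odds of being a host.
    p = sum(w * t for w, t in zip(weights, y)) / sum(weights)
    f0 = math.log(p / (1 - p))
    F = [f0] * len(X)
    stumps = []
    for _ in range(n_trees):
        # Gradient of the Bernoulli log-likelihood: label minus predicted probability.
        residuals = [yi - sigmoid(fi) for yi, fi in zip(y, F)]
        j, t, lv, rv = fit_stump(X, residuals, weights)
        stumps.append((j, t, lv, rv))
        F = [fi + learning_rate * (lv if row[j] <= t else rv) for fi, row in zip(F, X)]
    return f0, learning_rate, stumps

def predict_proba(model, row):
    f0, lr, stumps = model
    return sigmoid(f0 + sum(lr * (lv if row[j] <= t else rv) for j, t, lv, rv in stumps))

# Hypothetical traits: [body mass in grams, litters per year]. All values are made up.
X = [[12.0, 1], [15.0, 1], [300.0, 2], [450.0, 2], [20.0, 2], [500.0, 1]]
y = [0, 0, 1, 1, 0, 1]                    # 1 = evidence of infection reported
sample_sizes = [10, 50, 200, 30, 5, 120]  # heavily sampled species count for more

model = fit_boosted(X, y, sample_sizes)
print(predict_proba(model, [400.0, 2]), predict_proba(model, [14.0, 1]))
```

Each new stump is fitted to what the current ensemble still gets wrong, which is why many small trees can add up to a flexible model.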

Uses Of The Algorithm

The ML algorithm did the following:

  • Identified the known Nipah-positive bats with an accuracy of 83%.
  • Identified 6 bat species whose traits make them possible hosts of the virus. Four of these 6 species occur in India, and 2 of those are found in Kerala.
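Both results come from reading the model's predicted probabilities in two ways: scoring them against species whose status is known, and flagging high-scoring species that are not yet known hosts. A minimal Python sketch, assuming a 0.5 cut-off and using placeholder species names and made-up probabilities (none of them from the study):

```python
def accuracy(probabilities, labels, threshold=0.5):
    """Share of species whose thresholded prediction matches the known label."""
    predictions = [1 if p >= threshold else 0 for p in probabilities]
    return sum(pred == y for pred, y in zip(predictions, labels)) / len(labels)

def flag_candidates(predicted, known_hosts, threshold=0.5):
    """Species scored at or above the threshold that are not already known hosts,
    ordered from most to least likely."""
    names = [n for n, p in predicted.items() if p >= threshold and n not in known_hosts]
    return sorted(names, key=lambda n: -predicted[n])

# Placeholder data, for illustration only.
acc = accuracy([0.9, 0.8, 0.3, 0.2, 0.6, 0.1], [1, 1, 1, 0, 0, 0])
predicted = {"species_A": 0.9, "species_B": 0.7, "species_C": 0.2, "species_D": 0.85}
candidates = flag_candidates(predicted, known_hosts={"species_A"})
print(acc, candidates)
```

The flagged species are candidates for surveillance, not confirmed carriers.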

Every record was classified by species, country of sampling, diagnostic method, sample size, and sampling and reporting method.

The Way Ahead

With advances in ML techniques, we can now combat critical diseases. ML is used for predicting cardiac arrest, discovering new drugs, recognising cancerous tissue, identifying patterns in patient symptoms, and much more. With this new research, preventive measures can also be taken to keep the Nipah virus from spreading.

Copyright Analytics India Magazine Pvt Ltd
