AWS Launches ML-Enabled Search Capabilities For COVID-19 Dataset


With data around COVID-19 growing by the day, getting the most out of it has become a challenge. Amazon Web Services (AWS) has tried to streamline this vast pool of information with the launch of CORD-19 Search. Powered by machine learning (ML), this search website enables users to quickly comb through thousands of research papers and other material using natural language questions.

The underlying dataset began as a repository of 24,000 scientific and research sources on COVID-19 and has since almost doubled to 47,000 papers.


How AWS’ CORD-19 Search Can Help

The scientific community is drawing on a vast body of work to study the novel coronavirus and to find ways to detect and treat COVID-19. The sheer volume makes it difficult for researchers to analyse the material and extract key insights.

How can CORD-19 Search help? The website lets researchers navigate this body of work to find information that is verified and up to date. Its simple search interface accepts questions in natural language and returns answers along with the supporting source documents. CORD-19 Search also offers evidence-based topics on relevant subjects, including transmission and risk factors. Scientists can use these features to get their queries answered quickly and apply the results to advance their research.

How CORD-19 Search Was Built

The original dataset is enhanced by a natural language processing service that uses ML to extract relevant medical information, such as treatment and timeline, from unstructured content. The extracted information is then mapped to COVID-19-related topics using a multi-label classification model and inference.

The data is then indexed in ML-powered Amazon Kendra, which provides robust natural-language query capabilities and makes it easier to find associated articles. AWS has been applying ML to the CORD-19 dataset to speed up discovery in the effort to slow the spread of the coronavirus and eventually contain it.
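To give a rough sense of what querying a Kendra index with a natural-language question might look like, here is a minimal Python sketch. It is not AWS' actual CORD-19 Search code: it assumes the boto3 SDK and configured AWS credentials, the index ID is a hypothetical placeholder, and the helper names `summarize_results` and `ask` are made up for illustration.

```python
from typing import Any, Dict, List


def summarize_results(response: Dict[str, Any]) -> List[Dict[str, str]]:
    """Flatten a Kendra query response into simple title/excerpt/uri records."""
    summaries = []
    for item in response.get("ResultItems", []):
        summaries.append({
            "type": item.get("Type", ""),  # e.g. "ANSWER" or "DOCUMENT"
            "title": item.get("DocumentTitle", {}).get("Text", ""),
            "excerpt": item.get("DocumentExcerpt", {}).get("Text", ""),
            "uri": item.get("DocumentURI", ""),
        })
    return summaries


def ask(index_id: str, question: str) -> List[Dict[str, str]]:
    """Send a natural-language question to a Kendra index (needs AWS access)."""
    import boto3  # imported here so summarize_results works without the SDK

    kendra = boto3.client("kendra")
    response = kendra.query(IndexId=index_id, QueryText=question)
    return summarize_results(response)


if __name__ == "__main__":
    # "YOUR-INDEX-ID" is a placeholder; replace it with a real index ID.
    for hit in ask("YOUR-INDEX-ID", "What are the risk factors for COVID-19?"):
        print(hit["type"], "|", hit["title"], "|", hit["uri"])
```

Each result carries the source document's title and URI alongside the excerpt, which mirrors the article's point that answers come with supporting source documents.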

Anu Thomas
Anu is a writer who stews in existential angst and actively seeks what’s broken. Lover of avant-garde films and BoJack Horseman fan theories, she has previously worked for Economic Times.
