Release Of COVID-19 Datahub And A Call To Action With AI

In an official press release today, the White House announced the release of the COVID-19 Open Research Dataset (CORD-19). The dataset is the result of the combined efforts of researchers and leaders from the Allen Institute for AI, the Chan Zuckerberg Initiative (CZI), Microsoft and top medical organisations.

The CORD-19 dataset is the most extensive machine-readable coronavirus literature collection available, with over 29,000 articles, more than 13,000 of which have full text.

How CORD-19 Was Made

Microsoft’s web-scale literature curation tools were used to curate these thousands of articles, while the Allen AI team transformed the content into machine-readable form, making the corpus ready for analysis and study.

Researchers are encouraged to submit the text-mining tools they have developed, along with their insights, in response to the White House’s call to action, which can be accessed via the Kaggle platform. These tools will be openly available to researchers around the world through Kaggle.

The researchers recognised the need for sharing vital information across scientific and medical communities to accelerate the response to the coronavirus pandemic. The new COVID-19 Open Research Dataset is designed to help researchers worldwide to access important information faster.

Since it is difficult to go through more than 20,000 articles manually to draw insights, Kaggle has decided to upload the machine-readable versions of those articles, where they can be accessed by its community of 4 million data scientists.
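As a rough illustration of what "machine-readable" makes possible, the sketch below counts keyword occurrences across parsed articles. It assumes each article is a JSON object with `paper_id`, `metadata.title`, `abstract` and `body_text` fields; that layout, the helper names and the two sample records are assumptions made for this example, not details taken from the announcement.

```python
import json
from typing import Dict, List


def article_text(article: Dict) -> str:
    """Join the title, abstract and body paragraphs of one parsed article."""
    parts = [article.get("metadata", {}).get("title", "")]
    parts += [p.get("text", "") for p in article.get("abstract", [])]
    parts += [p.get("text", "") for p in article.get("body_text", [])]
    return " ".join(part for part in parts if part)


def count_keyword(articles: List[Dict], keyword: str) -> Dict[str, int]:
    """Map paper_id -> number of case-insensitive occurrences of keyword."""
    needle = keyword.lower()
    hits = {}
    for article in articles:
        n = article_text(article).lower().count(needle)
        if n:
            hits[article["paper_id"]] = n
    return hits


# Two made-up records standing in for JSON files read from disk.
raw_records = [
    '{"paper_id": "a1", "metadata": {"title": "Incubation period estimates"},'
    ' "abstract": [{"text": "We estimate the incubation period."}],'
    ' "body_text": [{"text": "Data were collected from reports."}]}',
    '{"paper_id": "a2", "metadata": {"title": "Protein structure survey"},'
    ' "abstract": [], "body_text": [{"text": "No epidemiology here."}]}',
]
sample = [json.loads(record) for record in raw_records]

print(count_keyword(sample, "incubation"))
```

A plain substring count like this is only a starting point; real text-mining submissions would likely use tokenisation, ranking or embeddings instead.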

Download all relevant material here:

Here is a list of other resources and platforms that can help fight COVID-19 using algorithms:

MachineHack’s Challenge To Keep Track Of COVID-19

The objective of the hackathon is to forecast COVID-19 on various metrics (confirmed cases, recovered cases and deaths) for the subsequent day, using the historical data available on a given date.

The dataset will be updated daily at 00:00 UTC with the prevailing values of the target variables. The published data is dynamic: each day's figures are added in a new column, and the data in the rows may also change based on reported changes to the COVID-19 outbreak in various parts of the world.
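To make the task concrete, here is a minimal sketch of a naive next-day baseline over a table with one column per day. The table layout, the region name, the figures and the `forecast_next_day` helper are all hypothetical, and this is a simple extrapolation, not the hackathon's scoring method or a recommended model.

```python
from typing import Dict, List, Tuple


def forecast_next_day(history: List[float], window: int = 3) -> float:
    """Naive baseline: last value plus the mean daily change over the last `window` days."""
    if not history:
        raise ValueError("history must contain at least one value")
    if len(history) == 1:
        return float(history[0])
    recent = history[-(window + 1):]
    deltas = [later - earlier for earlier, later in zip(recent, recent[1:])]
    return history[-1] + sum(deltas) / len(deltas)


# Hypothetical wide-format table: one row per (region, metric), one column per day.
dates = ["2020-03-14", "2020-03-15", "2020-03-16", "2020-03-17"]
table: Dict[Tuple[str, str], List[float]] = {
    ("Region A", "confirmed"): [100, 120, 150, 190],
    ("Region A", "recovered"): [10, 12, 15, 19],
}

forecasts = {key: forecast_next_day(values) for key, values in table.items()}
for (region, metric), value in forecasts.items():
    print(f"{region} {metric}: forecast for the day after {dates[-1]} = {value}")
```

Because past rows can be revised, a forecast like this would need to be recomputed from the full table each day rather than updated incrementally.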

Check here.


bioRxiv

Currently, bioRxiv has a repository of 539 articles related to COVID-19. bioRxiv is a free online archive of unpublished preprints in the life sciences, operated by Cold Spring Harbor Laboratory, a not-for-profit research and educational institution.

Check here.


Nature Research

To support urgent research to combat the ongoing outbreak of COVID-19, caused by the novel coronavirus SARS-CoV-2, the editorial teams at Nature Research have curated a collection of relevant articles. The collection includes research into the basic biology of coronavirus infection, its detection, treatment and evolution, as well as coverage of current events. Nature Research has said that these articles will remain free to access for as long as the outbreak remains a public health emergency.

Check here.

Database of Chest X-Ray By University Of Montreal

This GitHub repo, maintained by a doctoral student at the University of Montreal, hosts a constantly updated database of COVID-19 cases with chest X-ray or CT images.

All images and data will be released publicly in this GitHub repo.

Check here for more details.

WHO Database

This is the World Health Organisation’s (WHO) database of publications on coronavirus disease (COVID-19).

Check here.


GISAID

GISAID has a database of influenza virus sequences and epidemiological data associated with human viruses, as well as geographical and species-specific data associated with avian and other animal viruses. This data is useful for researchers studying how viruses evolve under different factors, and it can help in developing faster approaches to preventing future pandemics.

Check here.

Protein Data Bank

The Protein Data Bank (PDB) is a repository of three-dimensional protein structures. DeepMind has also shared AlphaFold’s computationally predicted structures of proteins associated with the virus. Sharing of this kind is aimed at enabling researchers to rapidly develop tests for this novel pathogen. Other labs have shared experimentally determined and computationally predicted structures of some of the viral proteins, and still others have shared epidemiological data.

Check here.

Copyright Analytics India Magazine Pvt Ltd