Google Working On COVID-19 Public Dataset Program To Make Data Freely Accessible

To help analysts, researchers and data scientists in the effort to combat COVID-19, Google is launching a hosted repository of public datasets that will be free to access through its COVID-19 Public Dataset Program. Data scientists and analysts will also be able to use BigQuery ML to train advanced ML models on this data directly inside BigQuery at no additional cost. The datasets include data from the Johns Hopkins Center for Systems Science and Engineering (JHU CSSE), Global Health Data from the World Bank, and OpenStreetMap.

Data plays a critical role in researching, studying and combating health emergencies and global crises. Free access to datasets lets analysts work with that data at cloud scale, an essential part of the research process, particularly for those working to combat COVID-19.

According to Sam Skillman, head of engineering at Descartes Labs, “Making COVID-19 data open and available in BigQuery will indeed be a boon for data scientists and researchers of the field.” Skillman added that making queries free, in particular, will allow greater participation, and that the ability to share results and analysis quickly with colleagues and the public will accelerate the shared understanding of how the virus is spreading.

These datasets have been created to remove barriers and give analysts quick and easy access to critical information. The program also eliminates the need to search for and onboard large data files. Researchers can access the datasets through the Google Cloud Console, where each comes with a description of the data and sample queries to help advance research. The program will remain in effect until September 15, 2020. All data included in the program will be public and freely available, strictly for educational and research purposes.

Matteo Chinazzi, an associate research scientist at Northeastern University, stated that developing data-driven models to combat the spread of COVID-19 is critical. His team at Northeastern University has been working intensively to model and better understand the spread of the outbreak. Chinazzi said that making the COVID-19 datasets open and available to researchers in BigQuery will help the team better understand, study and analyse the impact of the disease.

Google also said it has policies and measures in place to handle the datasets in accordance with widely recognised patient privacy and data security policies. “We on the Google Cloud team sincerely hope that the COVID-19 Public Dataset Program will enable better and faster research to combat the spread of this disease,” officials said.

Sejuti Das
Sejuti currently works as Associate Editor at Analytics India Magazine (AIM).