To help analysts, researchers, and data scientists combat COVID-19, Google is hosting a repository of public datasets that will be free to access through its COVID-19 Public Dataset Program. Data scientists and analysts will also be able to use BigQuery ML to train advanced ML models on this data inside BigQuery at no additional cost. The repository includes datasets such as those from the Johns Hopkins Center for Systems Science and Engineering, the Global Health Data from the World Bank, and OpenStreetMap data.
Data has always played a critical role in researching, studying, and combating health emergencies and global crises. Free access to datasets lets analysts work with that data at cloud scale, which is an essential part of the research process, particularly for those working to combat COVID-19.
Sam Skillman, head of engineering at Descartes Labs, said, “Making COVID-19 data open and available in BigQuery will indeed be a boon for data scientists and researchers of the field.” Skillman added that free queries in particular will allow greater participation, and that the ability to share results and analysis quickly with colleagues and the public will accelerate the shared understanding of how the virus is spreading.
The program is designed to remove barriers and give analysts quick and easy access to critical information. It also eliminates the need to search for and onboard large data files. Researchers can access the datasets through the Google Cloud console. The datasets come with descriptions of the data and sample queries to help advance research. The program is in effect until September 15, 2020. All data included in the program will be public and freely available, strictly for educational and research purposes.
Matteo Chinazzi, an associate research scientist at Northeastern University, stated that developing data-driven models to combat the spread of COVID-19 is critical. His team at Northeastern University has been working intensively to model and better understand the spread of the outbreak. Chinazzi said that making the COVID-19 datasets open and available to researchers in BigQuery will help the team better understand, study, and analyse the impact of the disease.
Google also gave assurances that it has policies and measures in place to handle the datasets in accordance with widely recognised patient privacy and data security policies. “We on the Google Cloud team sincerely hope that the COVID-19 Public Dataset Program will enable better and faster research to combat the spread of this disease,” said officials.