Google has recently integrated Dataset Search into its main search results. The integration lets users reach statistical datasets directly from the Google search box. The goal is to make datasets “easy to find and access”, just like any other information on the internet. Previously, this function was available only on a separate site called ‘Dataset Search’.
Google’s announcement is also a step towards strengthening its search engine. With its rival, Microsoft Bing, slowly catching up in the AI search engine race, Google’s “Dataset Search” offers a kind of credibility that chatbot-powered search engines cannot provide.
Dataset Search is an existing dataset search engine that indexes over 45 million datasets from more than 13,000 websites. These datasets come from academic repositories, government websites and other repositories. They span a wide range of topics and can be used for research and analysis.
To allow easy discovery of statistical content, Google has made it simpler for users to access all the information in one place. The top three datasets are shown to users, with an option to access “more datasets”.
Dataset Search indexes datasets whose web pages describe them using schema.org structured data. A collaborative project between Google, Microsoft, Yahoo and Yandex, schema.org is a shared vocabulary for adding structured data to web pages, which helps search engines interpret the content more accurately.
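To make the idea of schema.org markup concrete, here is a minimal sketch of how a publisher might describe a dataset as JSON-LD, one of the formats commonly used to embed schema.org data in a page. The function names, the example dataset and the example.org URL are hypothetical and used only for illustration; the property names (`name`, `description`, `url`, `keywords`) come from the schema.org `Dataset` type. This is not Google’s own tooling, and it does not guarantee that a page will be indexed.

```python
import json


def build_dataset_jsonld(name, description, url, keywords=()):
    """Return a schema.org Dataset description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "keywords": list(keywords),
    }


def to_script_tag(jsonld):
    """Wrap a JSON-LD dictionary in the script tag used to embed it in HTML."""
    body = json.dumps(jsonld, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Hypothetical dataset, for illustration only.
example = build_dataset_jsonld(
    name="Example reservoir levels",
    description="Hypothetical weekly reservoir storage levels.",
    url="https://example.org/datasets/reservoir-levels",
    keywords=["reservoir", "water storage"],
)

print(to_script_tag(example))
```

The printed snippet would be placed in the HTML of the page that hosts the dataset, where a crawler can read the description alongside the human-readable content.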
Google said in its announcement that research material should be freely accessible to everyone, as it helps people in fields such as scientific research, business analysis and public policy. The company also wants to align itself with the US policy of making “federally funded research” freely available.
The results are not yet consistent. As shown in the above screenshot, a search for “India reservoir level data” does not list any datasets; they appear only when the query is “India reservoir level dataset”. A search for “USA reservoir level data”, however, does show the datasets. This could reflect either the data available on the internet or a dataset integration in the Google search box that does not yet work seamlessly for data from other countries.