Google’s Dataset Search Now Integrated with Google Search

Dataset Search is now integrated into Google Search to allow seamless access to statistical data.

Google has recently integrated Dataset Search into Google Search. The integration lets users reach statistical datasets directly from the Google search box. The goal is to make datasets “easy to find and access”, just like any other information on the internet. Previously, this function was available only on a separate site called ‘Dataset Search’.

Google’s announcement is also a step towards strengthening its search engine against competitors. With its rival, Microsoft Bing, slowly catching up in the AI search engine race, Dataset Search gives Google a source of credibility that chatbot-powered search engines cannot provide.

Dataset Search is an existing search engine for datasets that indexes over 45 million datasets from more than 13,000 websites. The datasets come from academic repositories, government websites and other repositories. They span a wide range of topics and can be used for research and analysis.


To allow easy discovery of statistical content, Google has made it simpler for users to access all the information in one place. The top three datasets are shown in the results, with an option to access “more datasets”.

Dataset Search indexes datasets whose web pages describe them with schema.org structured data. A collaborative project founded by Google, Microsoft, Yahoo and Yandex, schema.org is a shared vocabulary for adding structured data to web pages, which helps search engines interpret the content more accurately.
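
For readers who publish data, here is a minimal sketch of what such structured data can look like. It builds a schema.org `Dataset` description and wraps it in the JSON-LD script tag that a web page would embed. The helper names `build_dataset_jsonld` and `to_script_tag`, the dataset name and the example.org URL are assumptions made for illustration and do not come from Google’s announcement. The properties used (`name`, `description`, `url`, `keywords`, `license`) are part of the schema.org vocabulary.

```python
import json


def build_dataset_jsonld(name, description, url, keywords=None, license_url=None):
    """Return a schema.org Dataset description as a plain dict (hypothetical helper)."""
    doc = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
    }
    # Optional properties are added only when a value is supplied.
    if keywords:
        doc["keywords"] = list(keywords)
    if license_url:
        doc["license"] = license_url
    return doc


def to_script_tag(doc):
    """Serialize the dict as a JSON-LD script tag for embedding in an HTML page."""
    payload = json.dumps(doc, indent=2)
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Illustrative values only; this is not a real dataset.
example = build_dataset_jsonld(
    name="Example reservoir levels",
    description="Hypothetical weekly reservoir storage levels, for illustration only.",
    url="https://example.org/datasets/reservoir-levels",
    keywords=["reservoir", "water storage"],
    license_url="https://creativecommons.org/publicdomain/zero/1.0/",
)
print(to_script_tag(example))
```

Whether a page marked up this way actually appears in the dataset results depends on Google’s own crawling and quality criteria, which this sketch does not model.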





Google said in its announcement that research material should be freely accessible to everyone, as it helps people across fields including scientific research, business analysis and public policy. The company also wants to align itself with the United States’ policy of making “federally funded research” freely available.

The results are not consistent. As shown in the above screenshot, if a user types “India reservoir level data”, the datasets are not listed unless the query is changed to “India reservoir level dataset”. However, a search for “USA reservoir level data” does show the datasets. This could reflect either the data available on the internet or the possibility that the integration in the Google search box does not yet work seamlessly for data from other countries.


Vandana Nair
As a rare breed of engineering, MBA, and journalism graduate, I bring a unique combination of technical know-how, business acumen, and storytelling skills to the table. My insatiable curiosity for all things startups, businesses, and AI technologies ensures that I'll always bring a fresh and insightful perspective to my reporting.
