What is Open Data and where can you find it?

In a recent development, Analyze Boston, the city of Boston’s new cloud-hosted open-data website, set out to make the city’s data usable and accessible to everyone. Powered by OpenGov’s CKAN-based Open Data platform, it lets users interact with data that is now open.
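To give a feel for what "interacting" with a CKAN-based portal can look like, here is a minimal sketch in Python. It assumes the portal exposes the standard CKAN Action API (`package_search`) under `https://data.boston.gov`; that base URL, the helper names and the sample response are illustrative assumptions, not details from the portal itself.

```python
import json
from typing import List
from urllib.parse import urlencode

# Assumption: the portal serves the standard CKAN Action API at this base URL.
BASE_URL = "https://data.boston.gov"


def build_search_url(query: str, rows: int = 5, base_url: str = BASE_URL) -> str:
    """Build a CKAN `package_search` URL for a free-text dataset query."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url}/api/3/action/package_search?{params}"


def dataset_titles(response_text: str) -> List[str]:
    """Extract dataset titles from a CKAN `package_search` JSON response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        return []
    return [pkg.get("title", "") for pkg in payload["result"]["results"]]


# Hand-written sample in the shape CKAN returns; this is not real portal data.
SAMPLE_RESPONSE = json.dumps({
    "success": True,
    "result": {"count": 2, "results": [
        {"name": "example-dataset-a", "title": "Example Dataset A"},
        {"name": "example-dataset-b", "title": "Example Dataset B"},
    ]},
})

if __name__ == "__main__":
    print(build_search_url("street lights"))
    print(dataset_titles(SAMPLE_RESPONSE))
```

To query a live portal, the URL from `build_search_url` could be passed to `urllib.request.urlopen` and the response body handed to `dataset_titles`; the sketch keeps the network call out so that it runs offline.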

This and many other instances of open data have been on the rise over the last decade, driven by the surge in use of the internet and the World Wide Web. Data has become a currency of growth, and there has been no better time than today to unlock its potential for innovation and productivity.

This growing importance of data brings “open data” into the picture, with many for-profit and nonprofit organizations letting users draw on their data to build new innovations. Let’s take a quick look at what open data is and where you can find it.

What is Open Data?

Data that is freely available, that anyone can reuse and redistribute, and that is not restricted by copyright, patents or other means of control is open data. But does it come free of cost and unlicensed? “In our opinion, Open Data is not necessarily “Free Data” – it is simply data that is easily accessible to any potential user, with no artificial barriers”, said Eugene Osovetsky, WebServius Founder & CTO, on Quora.

“Open” and “free” are two different things: users can use open data in any way they want, but it may not be free of cost. Any data in a machine-readable format that is licensed, but whose license permits users to use it in any way they want, is open data.

Osovetsky further opined that for data to be open it should be available in standard rather than proprietary formats, should be available on demand through technologies such as APIs, and, for premium data feeds, may be offered on a pay-per-use model. “A modern REST API with self-service signup capabilities, selling some valuable data at $0.001 per record, is more “open” than data that’s sitting on something like Data.gov in a 50GB archived file in some obscure proprietary format – even though the latter is free”, he added.
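To put the pay-per-use figure in perspective, the short Python sketch below works out what a given number of records would cost at the $0.001 per record quoted above. The function name and the record counts are illustrative assumptions; only the per-record price comes from the quote.

```python
from decimal import Decimal

# Price taken from the quote above: $0.001 per record.
PRICE_PER_RECORD = Decimal("0.001")


def pay_per_use_cost(records: int, price_per_record: Decimal = PRICE_PER_RECORD) -> Decimal:
    """Return the total cost in dollars for fetching `records` records."""
    if records < 0:
        raise ValueError("records must be non-negative")
    return records * price_per_record


if __name__ == "__main__":
    # Illustrative record counts, not figures from any real data feed.
    for n in (1_000, 50_000, 1_000_000):
        print(f"{n:>9,} records -> ${pay_per_use_cost(n):,.2f}")
```

`Decimal` is used instead of floating point so that small per-record prices add up exactly. At this price, a thousand records cost one dollar and a million records cost a thousand dollars.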

“Open can apply to information from any source and about any topic. Anyone can release their data under an open license for free use by and benefit to the public,” says Laura James, CEO of the Open Knowledge Foundation, in a blog post.

The concept of open data in science dates back to the formation of the World Data Center system in the 1950s, when the International Council for Science established several World Data Centers to minimize the risk of data loss and to maximize data accessibility. Since then, many sources have emerged that provide open data either for free or at some cost.

What can you do with open data?

Technology giants like Google (Knowledge Graph – Semantic Search) and IBM (Watson – AI) take advantage of free, open data repositories to develop their products, and so do other companies.

If open data is put to proper use and analyzed well, businesses can reap immense benefits. It allows them to develop new applications, improve products and services, make more effective and intelligent moves through data analytics, and contribute to the wider economy. It also holds huge potential and opportunity for entrepreneurs.

In government, open data can bring more transparency by keeping tabs on how public money is spent. Likewise, data that the government collects from universities can help the Department of Education build calculator-style tools that help parents and students make more informed financial decisions about their education.

Beyond that, open weather data can power early warning systems for environmental disasters or help consumers understand their personal impact on the environment. Weatherbase and Wunderground are examples of such open data resources.

In a nutshell, open data can enable the creation of tools that improve consumer choice and citizen decision-making.

There are also professional tools, such as Open Data Manager, that help in exploring, analyzing and managing open data.

Where can you find Open Data?

Open data gained prominence when President Obama signed the Memorandum on Transparency and Open Government in 2009. In the US, at the federal level, open data facilitated the creation of USASpending.gov, a set of online tools for exploring the federal budget. Since then, seventy-five countries have joined the Open Government Partnership.

Several government websites are dedicated to distributing a portion of the data they collect as open data. While open government data is the most widespread, several private organizations are also shaping up as open data sources.

For instance, Data.gov lists a total of 40 US states and 46 US cities and counties that have websites providing open data. The United Nations has an open data website that publishes statistical data from Member States and UN agencies, and the World Bank publishes a range of statistical data relating to developing countries.

Socrata is another good place to explore government-related data. They have some visualization tools that make exploring the data easier.

Open data is also available from city-specific government portals such as San Francisco Data; from data aggregators such as Programmable Web, which lets users explore APIs; from Infochimps, which offers thousands of public and proprietary data sets for download and API access; and from Data Market, which has data related to economics, healthcare, food and agriculture, among many others.

Google Public Data Explorer houses a lot of data from sources such as the World Development Indicators, while Google BigQuery Public Datasets lists a special group of public datasets that users can access and integrate into their applications.

Other resources are Kaggle Datasets, Amazon Datasets, Reddit Datasets, Linkeddata, Datahub, Data World and more.

IBM Open Data Platform

IBM, along with other big data industry players, announced a new initiative called the Open Data Platform (ODP) in February 2015 to promote collaboration and innovation around big data technologies. ODP focuses on innovating around the Apache Hadoop open source core, to help the ecosystem grow and to enable solutions on a standardized Open Data Platform.

The idea behind ODP is to bring new big data solutions to market faster by making it easier for ecosystem vendors to enable and test them on a well-defined, common Hadoop core platform.

“The common platform defines a specific set of common core Apache Hadoop components and versions. With all the partners testing and certifying to this same common core, customers can test, deploy and gain value from their big data environments more quickly knowing that a set of big data solutions all align to the same common Open Data Platform”, said IBM in its blog post.

Where to find Open Data in India

While India’s population is huge, a lack of digital records hinders the adoption of big data and analytics by the government itself. However, the Government of India (GOI) is working towards adopting big data systems and collecting digital records to help the big data and analytics field grow in India.

One such GOI effort towards an open data policy is the Open Government Data Platform, set up under the Department of Information and Technology (DIT) to encourage the sharing of information between departments and across ministries. Aiming to increase transparency in the functioning of the government, it lets Government of India ministries, departments and their organizations publish the datasets, documents, services, tools and applications they collect for public use.

A few other resources where open data can be found in India are:

Challenges of Open Data: How Secure is it?

While open data has created room to experiment with newer technologies, it is not immune to its own set of challenges, with privacy and security being the major ones. What kind of data is being put out there? While data sharing is necessary in some form or other, every effort should be made to reduce the risk of individuals being re-identified without their express consent.
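One common way to reason about re-identification risk before publishing a dataset is a k-anonymity check: every combination of "quasi-identifier" values (such as postcode and birth year) should be shared by at least k records. The article does not prescribe this technique; the Python sketch below is only an illustration, and the column names and sample rows are made up.

```python
from collections import Counter
from typing import Dict, List, Sequence


def smallest_group_size(rows: List[Dict[str, str]], quasi_identifiers: Sequence[str]) -> int:
    """Size of the smallest group of rows sharing the same values for all
    quasi-identifier columns (0 for an empty dataset)."""
    groups = Counter(tuple(row[col] for col in quasi_identifiers) for row in rows)
    return min(groups.values(), default=0)


def is_k_anonymous(rows: List[Dict[str, str]], quasi_identifiers: Sequence[str], k: int) -> bool:
    """True if every quasi-identifier combination appears at least k times."""
    return smallest_group_size(rows, quasi_identifiers) >= k


# Made-up rows for illustration only; not taken from any real dataset.
SAMPLE_ROWS = [
    {"zip": "02118", "birth_year": "1980", "complaint": "noise"},
    {"zip": "02118", "birth_year": "1980", "complaint": "pothole"},
    {"zip": "02118", "birth_year": "1975", "complaint": "graffiti"},
]

if __name__ == "__main__":
    print(smallest_group_size(SAMPLE_ROWS, ["zip", "birth_year"]))
    print(is_k_anonymous(SAMPLE_ROWS, ["zip", "birth_year"], k=2))
    print(is_k_anonymous(SAMPLE_ROWS, ["zip"], k=2))
```

In the sample, one person is the only record with their postcode and birth year, so releasing both columns would single them out. Dropping or coarsening the birth year makes every group larger. A check like this is a first screen and does not replace a proper privacy review.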

Guidelines need to be in place to ensure a safe data exchange process. Additionally, organizations providing open data need a layer of legal and commercial negotiation to define what data will be shared and for what purpose.

It is necessary to adhere to a legal framework that protects not only the privacy of individuals but also the rights of participants in open innovation using data.

Srishti Deoras
Srishti currently works as Associate Editor at Analytics India Magazine. When not covering analytics news or editing and writing articles, she can be found reading or capturing thoughts in pictures.
