
What is Open Data and where can you find it?


In a recent development, Boston’s new cloud-hosted open-data website, Analyze Boston, has made its data usable and accessible to users. Powered by OpenGov’s CKAN-based Open Data platform, it lets users interact with data that is now open.

This and many more instances of open data have been on the rise over the last decade, given the surge in use of the internet and the World Wide Web. Data has become a currency of growth, and there has been no better time than today to unlock the potential that data holds for innovation and productivity.

The increased importance of data brings “Open Data” into the picture, with many for-profit and nonprofit organizations allowing users to access their data to drive further innovation. Let’s get a quick insight into what open data is and where you can find it.

What is Open Data?

Data that is freely available, can be reused and redistributed by anyone, and is not subject to copyright, patents or any other means of control is open data. But does it come free of cost and unlicensed? “In our opinion, Open Data is not necessarily “Free Data” – it is simply data that is easily accessible to any potential user, with no artificial barriers”, said Eugene Osovetsky, WebServius Founder & CTO, on Quora.

“Open” and “Free” are two different things: users may use the data in any way they want, but the data may not necessarily be free of cost. Any data in a machine-readable format that is licensed, but whose license permits users to use it in any way they want, is open data.

Osovetsky further opined that for data to be open, it should be available in standard rather than proprietary formats, available on demand through technologies like APIs, and offered on a pay-per-use model for premium data feeds. “A modern REST API with self-service signup capabilities, selling some valuable data at $0.001 per record, is more “open” than data that’s sitting on something like Data.gov in a 50GB archived file in some obscure proprietary format – even though the latter is free”, he added.
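To make the idea of on-demand API access concrete, here is a minimal Python sketch of how a CKAN-style portal, such as the one behind Analyze Boston, can be queried. It assumes the portal exposes CKAN's standard Action API endpoint `package_search`; the base URL, the search term and the sample response are illustrative assumptions, and the helper names `build_search_url` and `dataset_titles` are hypothetical. No network call is made here, so the parsing step runs against a hand-written response shaped like CKAN's JSON.

```python
import json
from urllib.parse import urlencode

# Assumed portal address, used for illustration only.
BASE_URL = "https://data.boston.gov"


def build_search_url(base_url, query, rows=5):
    """Build a CKAN Action API package_search URL for a free-text query."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url}/api/3/action/package_search?{params}"


def dataset_titles(response_text):
    """Return the dataset titles found in a package_search JSON response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        return []
    return [d.get("title", "") for d in payload["result"]["results"]]


# Hand-written sample shaped like a CKAN response; not real portal data.
SAMPLE_RESPONSE = json.dumps({
    "success": True,
    "result": {
        "count": 2,
        "results": [
            {"title": "Example dataset A"},
            {"title": "Example dataset B"},
        ],
    },
})

search_url = build_search_url(BASE_URL, "311 requests")
titles = dataset_titles(SAMPLE_RESPONSE)
print(search_url)
print(titles)
```

In practice, the URL would be fetched with an HTTP client and the response body passed to `dataset_titles`. The point is that a standard, machine-readable interface lets a user pull only the records they need instead of downloading a large archive.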

“Open can apply to information from any source and about any topic. Anyone can release their data under an open license for free use by and benefit to the public,” says Laura James, CEO of the Open Knowledge Foundation, in a blog post.

The concept of open data in science dates back to the formation of the World Data Center system in the 1950s. The International Council for Science later established several World Data Centers to minimize the risk of data loss and maximize data accessibility. Since then, many sources have provided open data, either free or at some cost.

What can you do with open data?

Technology giants like Google (Knowledge Graph – Semantic Search) and IBM (Watson – AI) take advantage of free, open data repositories to develop their products. So do other companies.

If open data is put to proper use and analysis, businesses can reap substantial benefits. It enables the development of new applications, the improvement of products and services, economic growth, and more effective and intelligent decisions through data analytics, among much else. It also holds huge potential and opportunity for entrepreneurs.

In government, it can bring more transparency by keeping tabs on how public money is spent. Likewise, data that the government collects from universities can help turn the Department of Education into a calculator of sorts, helping parents and students make more informed financial decisions about their education.

Not just that, open data about weather can provide an early warning system for environmental disasters or help consumers understand their personal impact on the environment. Weatherbase and Wunderground are such open data resources.

In a nutshell, open data can enable the creation of tools that improve consumer choice and citizen decision-making.

Professional tools such as Open Data Manager are available to help explore, analyze and manage open data.

Where can you find Open Data?

Open data gained prominence when President Obama signed the Memorandum on Transparency and Open Government in 2009. In the US, at the federal level, open data facilitated the creation of USASpending.gov, a set of online tools for exploring the federal budget. Since then, seventy-five countries have signed on to the Open Government Partnership.

There are several government websites dedicated to distributing a portion of the data they collect as open data. While open government data is widespread, several private organizations are also shaping up as open data sources.

For instance, Data.gov lists the open data sites of 40 US states and 46 US cities and counties. The United Nations has an open data website that publishes statistical data from Member States and UN agencies, and the World Bank publishes a range of statistical data relating to developing countries.

Socrata is another good place to explore government-related data. They have some visualization tools that make exploring the data easier.

Open data is also available from city governments, such as San Francisco Data; from data aggregators such as Programmable Web, which lets users explore APIs; from Infochimps, which offers thousands of public and proprietary data sets for download and API access; and from Data Market, which has data related to economics, healthcare, food and agriculture, and more.

Google Public Data Explorer houses a lot of data from the World Development Indicators, while Google BigQuery Public Datasets lists a special group of public datasets that users can access and integrate into their applications.

Other resources are Kaggle Datasets, Amazon Datasets, Reddit Datasets, Linkeddata, Datahub, Data World and more.

IBM Open Data Platform

IBM, along with other big data industry players, announced a new initiative called the Open Data Platform (ODP) in February 2015 to promote collaboration and innovation in big data technologies. ODP focuses on innovating around the Apache Hadoop open source core to facilitate the growth of the ecosystem and enable solutions on a standardized Open Data Platform.

The idea behind ODP is to bring new big data solutions to market faster by making it easier for ecosystem vendors to enable and test on a well-defined common Hadoop core platform.

“The common platform defines a specific set of common core Apache Hadoop components and versions. With all the partners testing and certifying to this same common core, customers can test, deploy and gain value from their big data environments more quickly knowing that a set of big data solutions all align to the same common Open Data Platform”, said IBM in its blogpost.

Where to find Open Data in India

While India’s population is huge, a lack of digital records is the primary hindrance to the adoption of big data and analytics by the government itself. However, the Government of India (GOI) is working towards adopting big data systems and collecting digital records to help the big data and analytics field grow in India.

One such GOI effort towards an open data policy is the Open Government Data Platform, under the Department of Information Technology (DIT), which encourages sharing information between departments and across ministries. Aiming to increase transparency in the functioning of government, it can be used by Government of India ministries, departments and their organizations to publish the datasets, documents, services, tools and applications they have collected for public use.

A few other resources where open data can be found in India are:

Challenges of Open Data: How Secure is it?

While open data has given people a chance to experiment with new and growing technologies, it is not immune to its own set of challenges, with privacy and security complexities being the major ones. What kind of data is being put out there? While data sharing is necessary in some form or other, every effort should be made to reduce the risk of individuals being re-identified without their express consent.
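One common way to gauge re-identification risk before publishing a dataset is a k-anonymity style check, which is offered here as an illustration rather than a method the sources above prescribe. The Python sketch below counts how many records share each combination of quasi-identifiers (attributes that could be linked to a person when combined); a smallest group of size 1 means at least one record is unique on those attributes. The records, the column names and the helper `smallest_group_size` are all hypothetical.

```python
from collections import Counter


def smallest_group_size(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the
    same values for the given quasi-identifier columns (0 if no records)."""
    groups = Counter(
        tuple(record[column] for column in quasi_identifiers)
        for record in records
    )
    return min(groups.values()) if groups else 0


# Made-up records for illustration; not drawn from any real dataset.
records = [
    {"zip": "02118", "age_band": "30-39", "complaint": "noise"},
    {"zip": "02118", "age_band": "30-39", "complaint": "pothole"},
    {"zip": "02121", "age_band": "60-69", "complaint": "lighting"},
]

k = smallest_group_size(records, ["zip", "age_band"])
print(k)
```

Here the third record is the only one in its zip code and age band, so the check flags it as unique. A publisher could respond by generalizing the columns further, for example by using wider age bands, or by withholding such rows. A check like this is only one input into a broader privacy review.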

Various guidelines need to be in place to ensure a safe data exchange process. Additionally, organizations providing open data need a layer of legal and commercial negotiation to define what data will be shared and for what purpose.

It is necessary to adhere to a legal framework that protects not only the privacy of individuals but also the rights of participants to open innovation using data.


Srishti Deoras

Srishti currently works as Associate Editor at Analytics India Magazine. When not covering analytics news or editing and writing articles, she can be found reading or capturing thoughts in pictures.