Solving problems is the core task of a data scientist. Finding datasets to train and test your models is usually one of the first steps, before you even start building a model. Sometimes, getting access to a dataset can also spark an idea for a machine learning model built on that data.
Lately, the Government of India has been making much of its data publicly available. There are probably many problems that the government hopes developers and scientists will help fix. Solving anything for India based on Indian data is an arduous task as well as a great responsibility.
We have compiled a list of 10 publicly available Indian government datasets that you can access for free and use for data analysis and machine learning models:
Set up by the National Informatics Centre (NIC), OGD Platform India runs in compliance with the Open Data Policy of India. The website contains a database of government-owned systems that is accessible for information and is also compiled in a machine-readable format. In addition, it publishes datasets, tools, documents, and applications that are available for public use. The community portal consists of infographics, blogs, and visualisations that anyone can contribute after verification.
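Because the data is published in a machine-readable format, a downloaded file can be read directly by a script. The following is a minimal sketch, assuming the export is a CSV file with a header row. The column names (`state`, `year`, `value`) and the sample rows are made-up placeholders, not real government figures, and the helper names `load_records` and `numeric_column` are hypothetical.

```python
import csv
import io


def load_records(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


def numeric_column(records, column):
    """Return one column as floats, skipping blank or non-numeric cells."""
    values = []
    for record in records:
        cell = (record.get(column) or "").strip()
        try:
            values.append(float(cell))
        except ValueError:
            # Blank or non-numeric cells are common in public datasets; skip them.
            continue
    return values


# Made-up placeholder rows standing in for a downloaded CSV export.
SAMPLE_CSV = """state,year,value
State A,2020,10.5
State B,2020,
State C,2020,7.5
"""

records = load_records(SAMPLE_CSV)
values = numeric_column(records, "value")
print(len(records), "rows,", len(values), "usable values")
```

For a real file, the same functions could be fed the contents of the downloaded CSV, for example `load_records(open(path, encoding="utf-8").read())`, with the column name replaced by one from that file's header.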
The Reserve Bank of India (RBI) website has macroeconomic data with indicators on the state of the Indian economy, including financial sector data such as banking revenues and transactions, savings, incomes, employment, and more that can be used for analysis. The RBI set up the website to support proper management and transparency of the nation's economy. Researchers and analysts can access the data for projects related to the financial sector.
Run under the Ministry of Home Affairs, this website offers visualisations, tables, and data on medical certification of death, along with the National Population Register (NPR), the National Linguistic Survey and more. The data is available for researchers to study the microeconomic characteristics of India's population, with graphics that include maps.
The Indian Geo-Platform developed by ISRO lets users download its 2D and 3D maps, with details such as the environment and other natural features. The data can be useful for planning urban and rural projects and for monitoring forests, water resources, and agriculture. The government encourages researchers to use the platform to develop management services and eGovernance use cases.
Along with updates on the COVID-19 guidelines, the MOHFW site holds health information about different hospitals and states. The Government of India encourages researchers and scientists to use the data to build applications related to healthcare.
Developed to provide a single window of access to information and services about all government activities and entities, the National Portal of India was designed through a collaboration between the National Informatics Centre and the Ministry of Electronics & Information Technology. The website has sections on topics such as rural development, tourism, and fisheries, among others.
Released by the Wildlife Institute of India (WII), the dataset covers 4,591 specimens housed at the WII herbarium. Nearly all of these specimens are digitised and available through the GBIF network, which lets researchers use them quickly in data analytics applications. Researchers and scientists regularly use the data to monitor and help preserve wildlife species.
India.AI has a collection of datasets covering weather, transport, deaths, signboards, traffic, and pricing, among others, that can be used for data analysis. Along with data for health and agriculture, it also hosts advertising, computing, Twitter, and other datasets for commercial projects. Developers can also contribute datasets after verification by the primary providers.
The Department of Economic Affairs has launched a dedicated page on its website to display data and statistics about the financial and economic state of the country. The page has information about external debt, central government borrowing, the monthly economic report, and a national summary data page.
Managed by the Department of Food & Public Distribution, NFSC holds state-wise data about food, nutrition and several other government initiatives. It also has information about fair price shops and the distribution and usage of ration cards, along with regular updates about their status and numbers. Apart from data, the website also provides a grievance page where citizens can file complaints about the food security system.