Data institutions have become more prominent in recent years. In 2020, the global pandemic showed us that tackling global challenges needs access to data from across public, private and not-for-profit organisations. Data institutions collect and share datasets at a worldwide scale, and they are playing a vital role in helping governments, businesses and communities respond to the pandemic.
Access to accurate data can help us tackle the major challenges we face, from the earlier discovery of disease to reducing pollution in urban spaces. In the fight against COVID-19, we have seen a community of data scientists, researchers and engineers explore innovative ways to analyse and use data to produce hourly predictions.
Data institutions are not an outcome of the recent health crisis. In fact, many essential data institutions have been around for a long time. One of the first was UK Biobank, which has been doing extraordinary work since 2006. It was established to steward genetic data and samples and to make them available under specific conditions for health research and development.
A biobank comprises biological samples and data that have been collected for research. Biobanks are large-scale resources that link biological data with health and other data, including death and cancer registries. The data is available as an open-access resource to all those undertaking health-related research for the public good. It has provided an extraordinary resource to further our understanding of a whole range of diseases, from cancer to dementia, and to improve the UK’s health outcomes.
UK Biobank (UKB) was in the news recently when it implemented rapid dynamic linkage with Public Health England’s Second Generation Surveillance System. This enabled UKB to provide a regular feed of new COVID-19 test results, allowing practical research into the epidemiological and human genetic risk factors of severe infection.
Data institutions also combine or link data from various sources and provide insights and other services back to those that have contributed data. In the maritime sector, HiLo takes data produced by around 3,500 ships worldwide to deliver vital risk and safety analyses associated with lifeboat accidents, engine room fires and other incidents. Another example is OpenCorporates, which consolidates important open data on companies, factories and funding and makes it accessible.
Data Institutions Manage The Infrastructure Of Open Data
Data institutions are organisations that create and maintain data infrastructure, which is a critical part of making data available for tackling global challenges. Anyone can access, use and share this data, which increases both transparency and the potential for innovation. Many data institutions are emerging now, and if we’re to unlock the real social, economic and environmental potential of data, we’re going to need more of them.
Developing and managing common data infrastructure for a sector or field, such as by registering identifiers or issuing open standards, is critical. In the UK, Open Banking Limited was founded in 2016 to develop standards and guidelines to stimulate competition and innovation in the retail banking sector.
Recently, the Indian government published its national data strategy for consultation. It outlines the government’s intention to support the nation by increasing the availability of data and the skills and knowledge needed to use it responsibly. It has pledged to address the barriers to data sharing so that we can better understand the world around us. The strategy needs to emphasise the importance of an open and trustworthy data infrastructure and the role of data institutions in making that happen.
Solving Complex Issues By Sharing Data
By learning the distinct parts of a system and how they interact, organisations can better explain how the system as a whole might be changed. But addressing difficulties such as switching to renewable energy or improving healthcare outcomes needs collaborative and open datasets. To change a system, experts first have to understand it, using data analytics and AI models.
For example, the Open Data Institute’s Data Ecosystem Mapping tool promotes systems thinking by encouraging people to learn how data, insights and information flow between companies by mapping out these complex relationships. A data ecosystem consists of data infrastructure and the people, communities and businesses that benefit from the value it creates.
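To make the mapping idea concrete, here is a minimal sketch in Python of how an ecosystem map could be recorded as a list of actors, their roles and the flows between them. It is not the Open Data Institute's tool or format. The actor names, role labels and flow descriptions are hypothetical, loosely modelled on the HiLo example in this article, and the `Flow` class and helper functions are invented purely for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """One arrow on the map: something of value moving between two actors."""
    source: str  # actor that provides the data or insight
    target: str  # actor that receives it
    what: str    # short description of what is exchanged


# Hypothetical actors and roles (assumptions for illustration only).
ROLES = {
    "Shipping companies": "data contributor",
    "HiLo": "data steward",
    "Fleet safety teams": "data user",
}

# Hypothetical flows between those actors.
FLOWS = [
    Flow("Shipping companies", "HiLo", "data produced by ships"),
    Flow("HiLo", "Fleet safety teams", "risk and safety analyses"),
]


def actors_by_role(roles):
    """Group actor names under the role each one plays."""
    grouped = defaultdict(list)
    for actor, role in roles.items():
        grouped[role].append(actor)
    return dict(grouped)


def receives_from(flows, actor):
    """Return the sorted names of actors that send something to `actor`."""
    return sorted(flow.source for flow in flows if flow.target == actor)


def describe(flows, roles):
    """Render each flow as a readable line that includes both actors' roles."""
    return [
        f"{f.source} ({roles[f.source]}) -> {f.target} ({roles[f.target]}): {f.what}"
        for f in flows
    ]


if __name__ == "__main__":
    for line in describe(FLOWS, ROLES):
        print(line)
```

Even a list this small makes it easy to ask who depends on whom: `receives_from(FLOWS, "HiLo")` names the contributors the steward relies on, and `actors_by_role(ROLES)` shows which roles are filled and which are missing.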
Sharing data with rivals can be counterintuitive for companies, but various examples show the value and efficacy of using a data ecosystem to tackle a common challenge. For example, HiLo Maritime Risk Management is a not-for-profit joint industry enterprise that applies a predictive mathematical model to improve industry safety. The insights generated guide partners’ operations so they can avoid likely high-impact incidents.
“Transparency and an organisation’s ability to explain the AI algorithms builds trust. It also enables teams to monitor how decisions are made, and if necessary address failings, bias or problems in the system. For instance, major healthcare problems like the ongoing pandemic can only be solved if scientists, charities and pharmaceutical companies work together, and sharing data can be a vital part of these collaborations,” wrote Leigh Dodds, Director of Delivery at the Open Data Institute, in a blog post.
According to experts, many economic sectors could benefit from institutions that steward and provide access to data. Take wildlife conservation, for example. In 2019, the Open Data Institute investigated using a ‘data trust’ – a type of data institution – to empower academics and conservationists to share data with app developers to help stop the illegal international wildlife trade.
The Open Data Institute (ODI) works with other organisations to develop the next generation of data institutions. It is also exploring a new partnership with Microsoft, with a commitment to creating 20 new data collaborations and institutions by 2022. The ODI has also worked with the Greater London Authority to investigate how data trusts could offer better city services, and with WRAP to evaluate how data trusts could assist with the mission to tackle global food waste.
“Creating a data ecosystem map helps to understand how data creates value. It identifies the data, data stewards and data users; the different roles they play; and the relationships between them,” Dodds wrote.
But data trusts are not the only way to manage access to data. ODI’s research found that there is tremendous demand from private, public and third sector organisations in nations around the world to improve the sharing of data. There is also a family of data institutions emerging to help people and communities take a more active role in stewarding data about themselves. These include data co-ops, data coalitions, data unions and data trusts.
Data Institutions Are Bound By Governance To Be Transparent
Data institutions’ mission involves stewarding data on behalf of others, often towards public, educational or charitable purposes. But data institutions need the right rules and mechanisms to govern the ethical and fair use of data. Ensuring that data is available to address these challenges, in ways that don’t harm people and communities, needs responsible data stewardship.
This, in turn, needs appropriate technical infrastructure and pipelines, and a diverse set of people, organisations and communities involved in working with the data and creating an open and collaborative ecosystem. The great thing about an open and collaborative data institution is that checks and balances are built into every aspect of its processes, from technical to governance. Freeing up data and making it available in the most effective ways needs an understanding of challenges that are not merely technical but also relate to governance and standards.
UK Biobank, for example, has an equitable and transparent access policy. Its data samples and operations are run through a coordinating centre hosted by the University of Manchester, and an independent ethics and governance council advises the board and funders. The council also publishes public reports on UK Biobank’s conformance to an ethics and governance framework.
Vishal Chawla is a senior tech journalist at Analytics India Magazine and writes about AI, data analytics, cybersecurity, cloud computing, and blockchain. Vishal also hosts AIM's video podcast, Simulated Reality, featuring tech leaders, AI experts, and innovative startups of India.