Dark data refers to the data that an enterprise collects through its internal processes but does not use as input to its analytics models. While it may sound implausible, dark data comes to exist within a company for multiple reasons and is often the result of siloed thinking.
For example, a CCTV camera set up by a major retail chain is primarily there to detect instances of shoplifting and raise alerts. However, it also captures many other data points of interest, such as shopper behaviour and the effectiveness of a shelf display, which can provide valuable insights for planning and designing the store. It is highly likely that the company is not using some or all of these “extraneous” data points in its analytics, for instance to build recommendation systems. When this happens, these data points enter the realm of dark data for the enterprise, signifying a lost opportunity.
According to IDC, 90% of unstructured data is never analyzed. Such data is known as dark data. Gartner defines dark data as “the information assets organizations collect, process, and store during regular business activities, but generally fail to use for other purposes.”
You need a sound strategy to uncover the potentially valuable use cases within your dark data. But first, you need to identify the sources that are creating the stream of dark data within your company. Once you have identified the data sources and use cases, you can create a plan, project your expected ROI, and then begin working towards the results you want from analytics.
Where does Dark Data come from?
Organizations acquire dark data through various operational sources. In some cases, they are not even aware that they are collecting it. In other cases, they report that their current business intelligence system is inadequate for analyzing it. Most successful organizations, however, regularly report that they have audited their dark data and use it as fuel for insightful decision making. Dark data is often in an unstructured format, which throws up a variety of challenges, mainly because unstructured data is complicated to categorize.
How does it come to exist in the first place?
- High Throughput of Data Collection – Data is often collected faster than it can be analyzed. In TDWI's 2018 report, What it Takes to be Data-Driven, 88% of surveyed respondents said that one of the original analytics tools, the spreadsheet, is used to drive decision-making at their companies. Spreadsheets are not equipped to analyze the massive volume of data that enterprises generate, collect, and need to process. When an analytics tool cannot integrate with all sources of data, process them, and translate them into useful business insights, many data points become dark data.
- Functional Silos – Departments and functions fail to collaborate and share data assets for historical or cultural reasons. This may also happen when there is no consensus on the expected benefits or no strong mandate from the top, which limits the benefits derived from the processes. The silos become a mindset, creating barriers to innovation. A recent American Management Association (AMA) survey revealed that 83 percent of executives said that silos exist in their companies, 97 percent think they have a negative effect, and 31 percent believe that silos have enormously destructive consequences. Silos create organizational blindness, which begets dark data.
- Low Cost of Storage – Because storage costs have plummeted, companies are now storing vast amounts of data that they may not be using at present, in the hope that it will be useful in the future. They do so without a clear plan or coherent strategy for using this data as input to their existing analytics models.
- Regulatory Reasons – With rising instances of data misuse and scandals such as the Facebook-Cambridge Analytica data scandal, companies now face intense scrutiny from regulatory authorities over their due diligence in safeguarding data, especially sensitive consumer data. Regulations such as GDPR, CCPA, and the UK's Data Protection Act 2018 are making it mandatory for companies to store specific data and metadata needed for regulatory audits. Because of this, extra data is being collected. However, because there is no business driver behind storing this data, not all of it is used strategically to deliver business benefits.
- Technology Limitations – Organizations are often unwilling to invest in analyzing large volumes of unstructured data. The analysis may be too complicated or expensive, or they may lack the technical expertise to process the data and map it to actual business objectives.
What are the pitfalls of Dark Data for enterprises?
Dark data can contain potentially useful information such as survey responses, consumer behaviour and reviews, call centre transcripts, and user behaviour. When organizations do not have a trustworthy framework in place to streamline data collection, they risk missing out on valuable insights that the data might offer, and this can translate directly into lost revenue or loss of competitive advantage.
For example, if the customer service team is not sharing ticket data with the sales team, the sales team might not be able to prioritize its calls and follow-ups accordingly. Or, if user feedback is not being collected across all social channels, the company risks losing out on valuable insights, which might result in user churn.
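To make the ticket-sharing example concrete, here is a minimal Python sketch of what the sales team could do once it has access to support ticket data. The account names, the ticket fields, and the `prioritize_followups` helper are all hypothetical and exist only for illustration. Whether accounts with many open tickets should be called first or held back is a business decision; the sketch only shows that the signal becomes usable once the data is shared.

```python
from collections import Counter

# Hypothetical support tickets exported from the customer service system.
tickets = [
    {"account": "Acme", "status": "open"},
    {"account": "Acme", "status": "open"},
    {"account": "Globex", "status": "closed"},
    {"account": "Initech", "status": "open"},
]

# Hypothetical follow-up list owned by the sales team.
sales_followups = ["Globex", "Initech", "Acme"]


def prioritize_followups(followups, tickets):
    """Order accounts so that those with the most open tickets come first."""
    open_counts = Counter(
        t["account"] for t in tickets if t["status"] == "open"
    )
    # Counter returns 0 for accounts with no open tickets.
    # sorted() is stable, so ties keep their original order.
    return sorted(followups, key=lambda account: -open_counts[account])


prioritized = prioritize_followups(sales_followups, tickets)
print(prioritized)
```

Without the ticket data, the sales team only sees the follow-up list in its original order and has no way to tell which accounts are currently struggling.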
How to overcome this?
Your data could be telling you all kinds of things. But first you need to identify the questions you want answered and map them to the right data sources. Each goal requires multiple technologies to deliver accurate and actionable insight.
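One simple way to start mapping questions to data sources is a small inventory check. The Python sketch below is illustrative only: the source names, the business questions, and the `audit` helper are assumptions, not part of any particular tool. It compares what the company collects with what its questions actually need, flagging sources that no question uses (dark data candidates) and sources that are needed but not collected (gaps).

```python
# Hypothetical inventory of data sources the company already collects.
collected_sources = {
    "cctv_footage",
    "pos_transactions",
    "support_tickets",
    "web_logs",
}

# Hypothetical business questions mapped to the sources needed to answer them.
question_sources = {
    "Which shelf displays convert best?": {"cctv_footage", "pos_transactions"},
    "Which customers are likely to churn?": {"support_tickets", "social_feedback"},
}


def audit(collected, questions):
    """Return (dark_candidates, gaps) as sorted lists of source names."""
    # Union of every source that at least one question depends on.
    needed = set().union(*questions.values())
    dark_candidates = collected - needed  # collected, but no question uses it
    gaps = needed - collected             # needed, but not collected
    return sorted(dark_candidates), sorted(gaps)


dark_candidates, gaps = audit(collected_sources, question_sources)
print("Dark data candidates:", dark_candidates)
print("Collection gaps:", gaps)
```

In this made-up inventory, the web logs are collected but feed no question, while social feedback is needed but never collected. A real audit would involve far more sources and people, but the same two lists are a reasonable starting point for the conversation.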
Identify champions of analytics within each of your functions who will take up the initiative to find and tap into the dark data within those functions. That will help you design an effective business intelligence programme within your enterprise.
Understand how your workforce is using the tools and processes, and what your current state is. Set up a robust governance mechanism and draw up a roadmap that defines your to-be state and shows how to get there.
(Disclosure: I work for Polestar Solutions)