
Why Data Science Struggles, a counterview – Effective Data Science Series: 3 of 5


Take an honest look at the great data science experiment that has played out worldwide, and you find that the promises made by the advocates of data science are far from what has actually been delivered. Once data scientists were hired and asked to deliver, the results were underwhelming, to say the least.

So why have the data scientists failed to deliver? There are a host of reasons.

A good starting point for explaining what is going on is to look at where data scientists have aimed their efforts. Fig 1 shows where those efforts have gone.

Data scientists have concentrated their efforts primarily in two places: the world of structured data and the world of machine-generated data. From a strategic standpoint, this seems like a reasonable thing to do. There is a wealth of business value in structured data. But there is a problem here: analytical processing in the world of structured systems is already well-traveled ground. Business analysts have been looking at structured data for years. It is like looking for gold in California. In 1849 gold was easily found in California. Today all of the easily found gold is gone.

Fig 2 shows this phenomenon.

So there is plenty of business value in structured data. But very little new, exciting opportunity is left to be found there.

The other place where data scientists are spending a lot of their capital is machine-generated data. Fig 3 shows the effort to find business value in machine-generated data.

There is a lot of promise in looking at machine-generated data. First, the data is, for the most part, virgin: it has never before been examined. Second, the structure of the data is uniform, or at least reasonably uniform. Both of these factors make machine-generated data very promising.

But there are some very serious drawbacks to looking for business value here. Some of the drawbacks are:

  • The data is hard to find. There is so much of it that interesting values “hide” behind a mountain of other data.
  • The data is hard to find. In order to have business value, the data must be captured and compared to other data. Finding these relationships is hard to do, especially in light of the sheer volume of data that has to be manipulated.
  • The data is hard to find. The technology used to manage large volumes of data is optimized for managing the data, not for analyzing it.
  • When interesting data is found, it is operational, not strategic. Unfortunately, operational data is less useful than strategic data.

But the final reason why data science struggles so much in this arena is that in many cases there just is not that much business value to be found in the first place.

No wonder the results achieved by data scientists have been so underwhelming. In the world of structured data, most of the good results have already been found. And in the world of machine-generated data, interesting data is hard to find, if it is there at all. And if it is there, it is operational, not strategic.

PS: The story was written using a keyboard.

William Inmon

William H. Inmon (born 1945) is an American computer scientist, recognized by many as the father of the data warehouse. Bill Inmon wrote the first book, held the first conference (with Arnie Barnett), wrote the first magazine column, and was the first to offer classes in data warehousing. He also created the accepted definition of a data warehouse: a subject-oriented, nonvolatile, integrated, time-variant collection of data in support of management's decisions.