Dr Sheng-Chuan Wu of Franz Inc. Talks About Why Data Is Grossly Underestimated


Data analytics is much more efficient than traditional business intelligence solutions because of the quantum jump in computing power in recent times, as well as the pervasiveness of Big Data. Applying a machine learning algorithm to learn abstract patterns from data and interpreting its results may take up less than 20 percent of the total effort. Why, then, do companies so often fail to integrate data properly and doom a promising analytics project?

Dr Sheng-Chuan Wu, Vice President at Franz Inc., spoke along these lines at Cypher 2017, India’s most exciting analytics summit.

Representing Data:

Resource Description Framework (RDF), said Dr Wu, can be used to represent any form of information in the world.

“In my wide experience and with all the consulting projects that I have worked upon, I still haven’t found a single case or example that I could not model with the simple RDF representation,” said Dr Wu. RDF is a totally schemaless system — you can add more information, you can do whatever you want — it is very flexible, he added.
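To make the schemaless idea concrete, here is a minimal, library-free sketch of RDF-style subject-predicate-object triples in Python. It is not how Dr Wu or any RDF store implements things; the `ex:` identifiers, the sample facts and the `match` helper are all hypothetical and exist only for illustration.

```python
# Library-free sketch of RDF-style triples: (subject, predicate, object).
# All identifiers (ex:Alice, ex:purchased, ...) are made up for illustration.

graph = {
    ("ex:Alice", "rdf:type", "ex:Customer"),
    ("ex:Alice", "ex:name", "Alice"),
}

# Schemaless: a new kind of fact is just another triple.
# No table definition or migration is needed to start recording orders.
graph.add(("ex:Alice", "ex:purchased", "ex:Order42"))
graph.add(("ex:Order42", "ex:total", "129.50"))


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Everything known about Alice, including the fact added later.
facts_about_alice = match(graph, s="ex:Alice")
# What did Alice purchase?
purchases = match(graph, s="ex:Alice", p="ex:purchased")
```

In this sketch, adding the order facts did not require changing any existing structure, which is the flexibility the paragraph above describes.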

But according to Dr Wu, there are many naïve views of data analytics. One of the most common among them is the gross underestimation of the effort required to prepare data for analysis, for example, the ETL and integration of data from heterogeneous sources.
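The integration effort mentioned above can be sketched in a few lines. The example below assumes two hypothetical sources that describe the same machine with different field names, identifier formats and units; the record layouts, the `ex:` predicates and the helper functions are invented for illustration and are not taken from the talk.

```python
# Sketch: integrating two heterogeneous sources into one set of triples.
# Both record layouts below are hypothetical.

maintenance_rows = [{"machine_id": "M-7", "last_service": "2017-03-01"}]
sensor_rows = [{"device": "m7", "temp_f": 212.0}]


def canonical_id(raw_id):
    """Normalise differently formatted IDs ("M-7", "m7") to one subject."""
    key = "".join(ch for ch in raw_id.lower() if ch.isalnum())
    return "ex:machine/" + key


def fahrenheit_to_celsius(temp_f):
    """Convert to the single unit used in the integrated data."""
    return (temp_f - 32.0) * 5.0 / 9.0


def maintenance_to_triples(row):
    subject = canonical_id(row["machine_id"])
    return {(subject, "ex:lastService", row["last_service"])}


def sensor_to_triples(row):
    subject = canonical_id(row["device"])
    temp_c = round(fahrenheit_to_celsius(row["temp_f"]), 1)
    return {(subject, "ex:temperatureC", temp_c)}


# Transform each source, then load everything into one collection.
integrated = set()
for row in maintenance_rows:
    integrated |= maintenance_to_triples(row)
for row in sensor_rows:
    integrated |= sensor_to_triples(row)

# Facts from both sources now hang off the same subject.
facts_about_m7 = {t for t in integrated if t[0] == "ex:machine/m7"}
```

Even in this toy case, most of the code deals with reconciling identifiers and units rather than with analysis, which mirrors the point about underestimated preparation effort.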

AlphaGo:

With the great fanfare around AlphaGo beating the world’s number 1 Go player, Deep Learning (artificial neural networks coupled with massive GPU power) has become the face of AI and machine learning. However, while Deep Learning may work wonderfully with balanced data sets such as Go games and images, it is not as effective on many other machine learning tasks. There are perhaps more than 10 major machine learning algorithms (with many derivatives), each of which may be good at certain problems but ineffective on others.

Industry 2.0:

“The machines these days have all the sensory information. When you get the sensory information, you must do something about it,” said Dr Wu. Typically, the goal is to prevent problems, improve product quality or reduce energy consumption. All of these use the same kind of machine learning algorithms, either to predict outcomes or to improve them.

The failure of IBM Watson at the University of Texas MD Anderson Cancer Center offers a few takeaways. Machine learning technologies are currently the subject of a mega-hype by vendors such as IBM and consulting firms such as PwC. Boards therefore need to ask searching and tough questions right from the start of an AI project.

Dr Wu summarised the talk by saying that for companies and projects to be successful in the real world, they would require an effective data integration approach. “If you don’t have good data integration, that’s what kills you… Not your algorithm, not your fancy natural language processors…none of that. It is the difficult, dirty work of data integration,” he said.


Prajakta Hebbar

Prajakta is a Writer/Editor/Social Media diva. Lover of all that is 'quaint', her favourite things include dogs, Starbucks, butter popcorn, Jane Austen novels and neo-noir films. She has previously worked for HuffPost, CNN IBN, The Indian Express and Bose.