How to build better data products

According to a survey, less than 40% of organisations are managing data as a business asset.

Neeraj Gehani, Product Director at dunnhumby, made a case for having a product mindset at the fourth edition of the Machine Learning Developers Summit (MLDS) during his session titled ‘Emergence of Data Products.’ He unpacked the reasons for data products becoming increasingly important, types of data products, the framework for building data products and the unique challenges in building data products. 

Data products

“Value delivery is about personalisation. From a competitive advantage perspective, it is about customer retention, profitability, or using the data to create a competitive edge around the business. And from a strategic differentiation perspective, companies have unique data assets, unique algorithms or business models, and all these things come together to create the flavor of data products,” said Gehani.

There are three types of data products: raw data, aggregated data and data from ML models. Raw data is transactional data meant for internal use, for example the data stored in Google Cloud, Amazon S3 or Microsoft Azure. It is built and maintained by data engineers. The next level of evolution is aggregated data, used to build dashboards that offer insights or reports for decision making. Such datasets are also internal and are typically built and maintained by business analysts with support from developers. Finally, automated platforms such as Optimizely produce data in an automated fashion for testing, model telemetry and so on. Such platforms are built and maintained by data scientists and machine learning engineers.
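The step from raw data to aggregated data can be illustrated with a minimal Python sketch. The store names, field names and amounts here are invented for illustration and do not come from the session; the point is only to show transactional rows being rolled up into the kind of per-store summary a dashboard would display.

```python
from collections import defaultdict

# Hypothetical raw transactional records, as a data engineer might store them.
raw_transactions = [
    {"store": "north", "sku": "milk", "amount": 2.5},
    {"store": "north", "sku": "bread", "amount": 4.0},
    {"store": "south", "sku": "milk", "amount": 1.5},
    {"store": "south", "sku": "eggs", "amount": 3.0},
]

def aggregate_by_store(transactions):
    """Roll raw rows up into per-store revenue and order counts."""
    totals = defaultdict(lambda: {"revenue": 0.0, "orders": 0})
    for row in transactions:
        totals[row["store"]]["revenue"] += row["amount"]
        totals[row["store"]]["orders"] += 1
    return dict(totals)

# The aggregated view an analyst would feed into a dashboard.
dashboard_view = aggregate_by_store(raw_transactions)
for store, stats in sorted(dashboard_view.items()):
    print(store, stats["revenue"], stats["orders"])
```

The raw rows stay untouched; the aggregated view is a derived dataset that can be rebuilt whenever new transactions arrive.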

Of late, vendors have been releasing a wide range of tools, spanning infrastructure, machine learning, analytics, enterprise applications and security.

“The developer community is heavily invested in learning the relevant skills to capitalise on the demand in the market. There are a lot of people undertaking courses in data analysis, SQL, Python, and ML. In fact, from May 2019 to May 2020, there has been a 300% increase in total enrolments for machine learning courses. But, according to a survey, less than 40% of organisations are managing data as a business asset and creating a data-driven organisation,” said Gehani.

The disconnect is palpable. Vendors are rolling out data products and training the community. But when it comes to creating a data-driven organisation, forging a data-driven culture, or managing data as a business asset, the numbers are not encouraging. Something is preventing businesses from getting the value they expect from their datasets.

Better RoI

It is essential to build a rapport with business teams, rather than work in silos. From a solutioning perspective, data scientists, developers and machine learning engineers need to start operating with a product mindset. 

Once the developers have built a model and a dashboard, it is essential to keep iterating on the product to keep it relevant. This requires a product mindset: thinking through the value chain end to end. “I think this is something our community is missing. It starts from defining business problems at a very high level in terms of what are the objectives, what are the use cases? What is the benchmarking success criteria?” said Gehani.

Even once data scientists have defined the business problem, they still need to understand the data: check whether the right data is available, identify gaps in quality, clean the data, and handle tasks such as missing value imputation, labelling and feature engineering. Then, for predictive products where model development matters, the work moves on to selecting modelling techniques, building models and evaluating them.
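The steps above can be sketched end to end in a few lines of Python. This is a toy illustration only, not the framework presented in the session: the records, the field names, the threshold "model" and the 0.8 success criterion are all assumptions made for the example, and the model is scored on the same rows it was fitted on for brevity.

```python
from statistics import mean

# Hypothetical customer records; None marks a missing value.
records = [
    {"visits": 4, "basket": 52.0, "churned": 0},
    {"visits": 1, "basket": None, "churned": 1},
    {"visits": 5, "basket": 61.0, "churned": 0},
    {"visits": None, "basket": 18.0, "churned": 1},
    {"visits": 3, "basket": 40.0, "churned": 0},
    {"visits": 1, "basket": 15.0, "churned": 1},
]
FIELDS = ["visits", "basket"]
SUCCESS_CRITERION = 0.8  # assumed benchmark agreed with the business team

def missing_report(rows, fields):
    """Data-quality check: count missing values per field."""
    return {f: sum(1 for r in rows if r[f] is None) for f in fields}

def impute_mean(rows, fields):
    """Missing value imputation: fill gaps with the mean of observed values."""
    filled = [dict(r) for r in rows]
    for f in fields:
        fill = mean(r[f] for r in rows if r[f] is not None)
        for r in filled:
            if r[f] is None:
                r[f] = fill
    return filled

def add_features(rows):
    """Feature engineering: total spend = visits * basket value."""
    return [{**r, "spend": r["visits"] * r["basket"]} for r in rows]

def fit_threshold(rows):
    """Toy model: midpoint between the mean spend of churners and stayers."""
    churn = mean(r["spend"] for r in rows if r["churned"] == 1)
    stay = mean(r["spend"] for r in rows if r["churned"] == 0)
    return (churn + stay) / 2

def accuracy(rows, threshold):
    """Evaluation: share of rows where 'spend below threshold' matches churn."""
    hits = sum(1 for r in rows if (r["spend"] < threshold) == bool(r["churned"]))
    return hits / len(rows)

gaps = missing_report(records, FIELDS)
clean = add_features(impute_mean(records, FIELDS))
threshold = fit_threshold(clean)
score = accuracy(clean, threshold)
meets_target = score >= SUCCESS_CRITERION
print(gaps, round(score, 2), meets_target)
```

Tying the final check back to a success criterion agreed up front is the part that reflects the product mindset: the model is judged against the business objective, not only against its own metrics.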

“The point I’m trying to make is that as a community, I think we are very, very siloed. So we have people who are focused on understanding data, and making sure that data is set up for success. They are focused on model development, but people who think end-to-end value chain are limited,” said Gehani. “So just building a science model alone will not help. You have to think through this end-to-end and think about all the areas where integrations will happen.”

Meeta Ramnani
Meeta’s interest lies in finding out real practical applications of technology. At AIM, she writes stories that question the new inventions and the need to develop them. She believes that technology has and will continue to change the world very fast and that it is no more ‘cool’ to be ‘old-school’. If people don’t update themselves with the technology, they will surely be left behind.
