Will Data Engineers Take Over Machine Learning Work In The Future, Surpassing Data Scientists?

Over the years, there has been a lot of hype around machine learning, and there is enormous interest in data science as a career. But what about the future of data science in general? Job titles are fragile; the responsibilities that last are the ones that demonstrably add value to the business.

To judge whether data science will remain a lucrative career, it is important to analyse where the value lies. Does it lie in model development and production, or is data science shifting towards empowering business users? In this article, we look at how data engineering skills offer an advantage in deploying ML models.

At a time when most data science models don’t go into production, the data science skillset alone may not be sustainable unless better engineering techniques are embedded in ML pipelines. Models that never reach production simply don’t add any business value.

Experts say that data scientists should focus their efforts on providing value through work other than deploying machine learning in production. As automation becomes mainstream, this could eventually push data scientists towards acquiring more engineering-focused skills.

“Data scientists do not have a sustainable advantage over engineers in producing models of sufficient quality which can be leveraged in production,” said Eduardo Arino de la Rubia, a renowned data scientist and machine learning expert, during his presentation at the 2020 RStudio Conference.

Data Engineer Vs Data Scientist

Analysts say machine learning engineers are likely to take over the ML work that data scientists currently do and to build off-the-shelf ML tools such as AutoML, reducing the need for data scientists to perform ML tasks.

“There is more of engineering which data scientists need to learn when it comes to data science. Data scientists have been dedicating their time on learning algorithms and tools for machine learning, which is great, but what if they have not learnt the engineering aspect. This is why in the market, there are new roles such as ML engineer, AI engineer or a data engineer,” said Lavi Nigam, Data Scientist at Gartner, at AIM’s Plugin event.

Data engineers develop, test and maintain architectures and align them with business needs. They also build data processes, work with programming languages and tools, and prepare data for predictive and prescriptive modelling. Most importantly, they have the software and technical knowledge needed to embed machine learning into systems, which many data scientists lack.
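To make the idea of embedding a model into a system more concrete, here is a minimal sketch in plain Python. It is an illustration only, not a method described by the speakers quoted here: the `Reading` type, the function names and the mean-threshold "model" are all hypothetical stand-ins. It shows data preparation, training and prediction chained into one repeatable unit that a service could call.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, List, Optional


# Hypothetical record type; the field names are illustrative only.
@dataclass
class Reading:
    sensor_id: str
    value: Optional[float]


def clean(rows: List[Reading]) -> List[Reading]:
    """Data preparation step: drop rows with missing values."""
    return [r for r in rows if r.value is not None]


def fit_threshold(rows: List[Reading]) -> float:
    """Training step: a stand-in 'model' that learns the mean value."""
    return mean(r.value for r in rows)


def make_predictor(threshold: float) -> Callable[[Reading], bool]:
    """Wrap the learned parameter in a function a service could call."""
    def predict(reading: Reading) -> bool:
        # Missing inputs are handled explicitly instead of crashing.
        return reading.value is not None and reading.value > threshold
    return predict


def build_pipeline(raw_rows: List[Reading]) -> Callable[[Reading], bool]:
    """Chain preparation and training into one repeatable unit."""
    prepared = clean(raw_rows)
    if not prepared:
        raise ValueError("no usable rows after cleaning")
    return make_predictor(fit_threshold(prepared))


history = [Reading("a", 1.0), Reading("b", None), Reading("c", 3.0)]
predict = build_pipeline(history)
print(predict(Reading("d", 2.5)))
```

The modelling step here is trivial on purpose. Most of the code deals with input cleaning, failure handling and packaging, which is the kind of work the article attributes to engineers.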

In this context, Lavi Nigam said, “Engineering will play the most important role in the new age ML pipelines in future. That’s why you should focus on learning more on the engineering aspect of data science, which also helps data scientists prepare for automation.” 

Going Forward

If businesses are going to automate model production and focus more on engineering, how can data science bring value?

Tasks such as exploratory data analysis (EDA) and feature engineering will be hard to automate fully, so traditional statistical modelling will remain critical. Developing models for small samples will continue to add value; the tooling may simply become easier to use and more tightly integrated with data engineering.

According to experts, outside of model production, data science will continue to be important in understanding the business problem and the data, formulating the best approach, drawing inferences and explaining the results to stakeholders. Data science adds value by making every part of the business data-informed. That is a competitive advantage, and it is one area beyond model production that businesses need. Data scientists can add further value through negotiation, communication and business influence.

Vishal Chawla
Vishal Chawla is a senior tech journalist at Analytics India Magazine and writes about AI, data analytics, cybersecurity, cloud computing, and blockchain. Vishal also hosts AIM's video podcast, Simulated Reality, featuring tech leaders, AI experts, and innovative startups of India.
