Over the years, there has been a lot of hype around machine learning, and there is enormous interest in data science as a career. But what about the future of data science in general? Job titles are fragile; what lasts are the job responsibilities that prove they add value to the business.
To judge whether data science will remain a lucrative career, it is important to analyse where its value lies. Does the value lie in model development and production, or is data science shifting towards empowering business users? In this article, we look at how data engineering offers an advantage in the future of deploying ML models.
At a time when most data science models don’t go into production, the data science skillset alone may not be sustainable unless better engineering techniques are embedded in ML pipelines. Models that never reach production simply don’t add any business value.
Experts say that data scientists should focus their efforts on providing value through work other than deploying machine learning in production. As automation becomes mainstream, this could eventually lead to a shift towards acquiring more engineering-focused skills.
“Data scientists do not have a sustainable advantage over engineers in producing models of sufficient quality which can be leveraged in production,” said Eduardo Arino de la Rubia, a data scientist and machine learning expert, during his presentation at the 2020 RStudio Conference.
Data Engineer Vs Data Scientist
Analysts say machine learning engineers are likely to take over the ML work that data scientists currently do and to create off-the-shelf ML tools such as AutoML, reducing the need for data scientists to perform ML tasks.
“There is more of engineering which data scientists need to learn when it comes to data science. Data scientists have been dedicating their time on learning algorithms and tools for machine learning, which is great, but what if they have not learnt the engineering aspect. This is why in the market, there are new roles such as ML engineer, AI engineer or a data engineer,” said Lavi Nigam, Data Scientist at Gartner, at AIM’s Plugin event.
Data engineers develop, test, and maintain data architectures and align them with business needs. They also build data processes with programming languages and tools, and prepare data for predictive and prescriptive modelling. Most importantly, they have the software and other technical knowledge needed to embed machine learning into systems, which many data scientists lack.
In this context, Lavi Nigam said, “Engineering will play the most important role in the new age ML pipelines in future. That’s why you should focus on learning more on the engineering aspect of data science, which also helps data scientists prepare for automation.”
Going Forward
If businesses are going to automate model production and focus more on engineering, how can data science bring value?
Tasks such as exploratory data analysis (EDA) and feature engineering will be hard to fully automate, so traditional statistical modelling will remain critical. Developing models in the small-sample limit will continue to add value; the tooling can simply become easier to use and more integrated with data engineering.
According to experts, outside of model production, data science will continue to be important in understanding the business problem and the data, formulating the best approach, drawing inferences and explaining the results to stakeholders. Data science adds value by making every part of the business data-informed. That is a competitive advantage, and it is one area beyond model production where businesses need data science. It can add further value through negotiation, communication and business influence.