What pisses off data scientists the most 

For machine learning engineers, the biggest roadblock to getting models to production is access to compute resources.

Almost half of the work of data scientists involved in machine learning production consists of re-coding models from Python/R to another language or vice versa. According to Anaconda’s State of Data Science 2021 report, re-coding models was the biggest roadblock to production for those involved in infrastructure.

The survey, which involved more than 4,000 respondents from 140 countries, also found that meeting IT security standards is the top blocker for data engineers, DevOps, product managers, and system admins. For machine learning engineers, the biggest roadblock to getting models to production is access to compute resources.

From working in silos to cleaning massive datasets on legacy infrastructure, the most promising job of the century seems to be losing its charm. There are many reasons why data scientists are unhappy or decide to quit, but there are also many things that just piss them off.

Too much pressure and no scope for experimentation

Data scientists are expected to work magic with data and single-handedly solve all the issues of the company and its sales. This means long working hours, extremely short deadlines, and no scope for error. It is not a healthy environment for the ‘scientist’ to grow and mature.

Employers don’t understand what the data scientist is doing

According to the Anaconda survey, 25% of respondents said that a lack of data literacy among decision-makers at their organisation limited their team’s ability to impact business decisions. This lack of data literacy at the executive level ultimately hurts the ability to make data-driven business decisions. Data scientists think they are brought in to write smart machine learning algorithms and create analytic reports, but many times the company just needs a chart to present in board meetings each day. This gap leads to frustration on both sides: the company doesn’t see value being delivered ‘quickly enough’, and the data scientist becomes unhappy in the role.

End up becoming a PM

Many data scientists are hired into companies that have no setup for analysis at all. The new hire then has to recruit more people and build the infrastructure from scratch. This makes them more of a product manager, and they don’t really get to work with data and build models. The issue crops up because many companies fail to hire senior, experienced data practitioners before hiring juniors. It is the perfect recipe for an unhappy relationship for both parties.

Past the peak of inflated expectations

  • Data Scientist is the sexiest job of the 21st century – Harvard Business Review.
  • Data science is one of the top professions – LinkedIn and Glassdoor
  • The average data scientist salary in the US is between $117K and $120K, way more than that of an experienced software developer

Many of the half-baked, inflated benefits supposedly attached to being a data scientist have simply fallen away. Employees have come to realise that the glorious data science career they signed up for doesn’t actually exist, and that it is just another job.

Other departments not being data-centric

For data to actually show results, data scientists and the different departments of a company need to collaborate closely. Strangely, most of the time data scientists work alone in silos. What data scientists find most difficult is politics. Every team across the organisation needs to be data-centric and collect data properly for algorithms to work. That is not achievable overnight, and many departments feel that when data scientists ask for data, they are ‘after them.’ With so much politics in corporate culture, it is also difficult for data teams to put policies in place.

Legacy infrastructure

Many companies still rely on legacy systems and do not have proper machine learning tools. New data scientists often come from academia and have worked on platforms like Kaggle, GitHub, and other open-source projects. They want to work on high-end projects, but in reality they spend a large part of their time making sense of the data. The Anaconda report states that machine learning engineers find access to compute resources to be the most significant roadblock to deploying models to production.

Sorting garbage data

A large part of a data scientist’s job is monotonous and consists of cleaning and processing raw data. Almost 80 per cent of their time is spent doing that. For many companies, the work in fact has to start with digitising data from handwritten records or files.

Source: Anaconda State of Data Science Report 2021

Not appreciated as much as non-technical managers

Picture the sales manager taking all the credit for cracking amazing deals throughout the quarter, deals made possible by analysis that a team of fewer than five data scientists spent months working on. The other side of it is non-technical executives making assumptions about a data scientist’s skills if they are not an expert in even one of Spark, Hive, Pig, Hadoop, SQL, MySQL, Neo4j, Python, R, TensorFlow, Scala, PyTorch, A/B testing, NLP, or anything machine learning. Frustrating, isn’t it?

Explaining being burnt out

After the long working hours, digging through data, and managing office politics, people still find it hard to believe that a data scientist can be burnt out in the ‘sexiest’ job of the century.

Meeta Ramnani

Meeta’s interest lies in finding out real practical applications of technology. At AIM, she writes stories that question the new inventions and the need to develop them. She believes that technology has and will continue to change the world very fast and that it is no more ‘cool’ to be ‘old-school’. If people don’t update themselves with the technology, they will surely be left behind.