Data science is a domain where working from home depends on specific conditions: the type of projects and tasks, access to tools, staff engagement, connectivity, and collaboration with the rest of the team and company. Any of these factors can itself become a source of problems that keep a data science team from being productive and efficient.
While you may think that not much has to change and that data science professionals can work smoothly from their homes, that may not be the case at all. According to AIM Research, 34% of analytics professionals have reported a negative impact on their productivity due to work-from-home scenarios. The fact of the matter is that many teams are new to working remotely and may face a host of unforeseen challenges.
For most data teams, the shift to remote work was abrupt and forced, and it has led to many challenges. Most data resides on on-premises servers, which is one of the critical obstacles that has pushed companies into forced cloud migrations.
Now that companies have built fully distributed remote data teams, identity and access management policies have been put in place to protect sensitive data. Such policies can make it hard for data scientists to work smoothly and to experiment creatively with data.
One challenge arises when the access policies applied to data scientists are complex and layered. Data may sit behind firewalls for cybersecurity reasons. Given the many tools data scientists use, it can be tough for the IT team to work out a smart, secure way to handle permissions and data access once everyone is remote.
Collaboration & Communication Challenges
To keep a data scientist productive at home or in another remote location, there need to be supportive data science processes and tools with a focus on collaboration and reproducibility. Ad-hoc communication can be the most challenging part, so it is crucial to have collaboration tools. Away from the normal collaborative work environment, team members are more at risk of feeling lonely or distracted. Remote team members can easily lose track of the big picture and feel isolated, disconnected or left out.
Before work from home became the norm, data scientists could sit in one room, so all the discussion about the work and the subtle aspects of the tasks at hand took place in person. Data scientists probably did not think much about documenting everything they did. Now, remote collaboration requires everyone to document every little aspect of their work. In a company where documentation has been terrible for years, the shift to thorough documentation can be frustrating and takes a lot of work. Without it, if the pipeline breaks, data scientists working in teams would have no idea what to do next.
Working from home denies data scientists their most potent collaboration techniques. Without the ability to walk over to someone’s desk to ask a question or get help with some code, it can be challenging to solve problems.
The many dependencies within an ML pipeline make it necessary to store all of its components, so that all the features are available both offline and online for deployment. As a result, data scientists should document even the smallest details of the pipeline: the models utilised, their schedules, and most of what surrounds them. The difficulty is that this can be time-consuming, especially for those who are not used to it.
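One way to keep such documentation from going stale is to record the pipeline in a small machine-readable manifest that lives next to the code. The following is only a minimal sketch under stated assumptions: the `PipelineStep` fields, the step names, the owners, and the schedules are all hypothetical and would need to be adapted to a team's actual pipeline.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class PipelineStep:
    """One documented step of a pipeline (all fields are illustrative)."""
    name: str
    owner: str
    schedule: str
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)
    notes: str = ""


def external_inputs(steps):
    """Return inputs that no step produces, i.e. data that must come
    from outside the pipeline and should be documented as a source."""
    produced = {out for step in steps for out in step.outputs}
    consumed = {inp for step in steps for inp in step.inputs}
    return sorted(consumed - produced)


# Hypothetical two-step pipeline used purely for illustration.
steps = [
    PipelineStep("build_features", "alice", "daily 02:00",
                 inputs=["raw_events"], outputs=["features"]),
    PipelineStep("train_model", "bob", "weekly",
                 inputs=["features"], outputs=["model"],
                 notes="Retrain only after build_features has succeeded."),
]

# Serialise the manifest so it can be committed alongside the code.
manifest = json.dumps([asdict(step) for step in steps], indent=2)
print(manifest)
print("External inputs to document:", external_inputs(steps))
```

Because the manifest is plain JSON, it can be reviewed in the same way as code, and a teammate who has to fix a broken pipeline can see who owns each step and what it depends on.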
Earlier, analysts could talk through a Tableau problem over email. But when everyone is distributed across different places, that is no longer enough. Now, even the dashboards and reports built in visualisation tools need to be documented for better efficiency. The challenge can be more significant when there are ad-hoc requests to re-evaluate and redesign processes and documentation.
Data scientists often install their own versions of libraries and development tools, so code developed on one data scientist’s laptop may not work for a team member who has different package versions installed. Teams may need to use a version control system, write comments, and pull others into their discussions and online communications.
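A version mismatch of this kind can be caught early by comparing each laptop against a shared list of pinned versions. The sketch below is illustrative only: the function name `find_version_mismatches`, the package pins, and the sample laptop environment are assumptions made for this example, not a prescribed workflow. When no environment is passed in, it falls back to the standard library's `importlib.metadata` to read the versions actually installed.

```python
from importlib import metadata


def find_version_mismatches(pinned, installed=None):
    """Return {package: (expected, found)} for every package whose
    version differs from the team's pin. `found` is None when the
    package is missing. If `installed` is None, the versions are
    looked up in the current Python environment."""
    mismatches = {}
    for name, expected in pinned.items():
        if installed is not None:
            found = installed.get(name)
        else:
            try:
                found = metadata.version(name)
            except metadata.PackageNotFoundError:
                found = None
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


# Hypothetical pins agreed on by the team and kept in version control.
team_pins = {"pandas": "2.1.0", "numpy": "1.26.0"}

# Hypothetical state of one team member's laptop.
laptop = {"pandas": "2.1.0", "numpy": "1.24.3"}

print(find_version_mismatches(team_pins, laptop))
# prints {'numpy': ('1.26.0', '1.24.3')}
```

Calling `find_version_mismatches(team_pins)` without a second argument checks the environment the script runs in, which makes it easy to run as a quick check before sharing code.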
Vishal Chawla is a senior tech journalist at Analytics India Magazine and writes about AI, data analytics, cybersecurity, cloud computing, and blockchain. Vishal also hosts AIM's video podcast, Simulated Reality, featuring tech leaders, AI experts, and innovative startups of India. Reach out at firstname.lastname@example.org