Behind The Code: Why This Data Scientist Does Not Believe In Having A Favourite Algorithm

For our weekly developer column ‘Behind The Code’, we get in touch with some of the brilliant minds from the developer community in India and take a look at their journey, from the way they work to the tools they use. This week, we got a chance to interact with Saurabh Choudhary, the Data Science Lead at Uber R&D, Bangalore. Saurabh gave us a clear insight into the data science domain and talked about his journey in the industry.

The Onset

An electrical engineering grad from the Delhi College of Engineering with an MBA in strategy and marketing from the Indian School of Business, Saurabh found his way into the data space in 2006 when BI was the hot new thing and SAS/SPSS ruled the roost.

Saurabh started his journey by working with telecom operators based out of Europe and Southeast Asia, helping them set prices for services, manage customer churn, and understand price elasticity. The interaction between data and real-world business impact fascinated Choudhary so much that he chose to formalise what he had learned and broaden his expertise.

Today, as the Data Science Lead for Uber’s Bangalore R&D centre, he helps shape the strategy of both Uber and the site for tapping into Bangalore’s rich data science talent ecosystem. Saurabh also works on end-to-end problems at Uber’s scale.

“Data is integral to how we make decisions at an organization, and having a full-stack team across product, data science, engineering, research and design co-located within the same centre lets us take the insights we obtain that much further,” Saurabh further added.

Saurabh’s ML Preferences

While he walked us through his data science journey, we asked him about his preferences when it comes to ML frameworks, programming languages, cloud platforms and ML tools. He said he started out using MATLAB to run numerical optimisations. Eventually, as the open-source ecosystem evolved, he started using both R and Python. However, Saurabh settled on Python in the end simply because of how easy the code is to read.

In terms of cloud platform, Saurabh has used AWS in the past, but right now at Uber, the entire data science team has its own internal tools to build and scale models.

Talking about machine learning algorithms, Saurabh said that at Uber, the choice of an ML algorithm is driven by business need. He and his team have used a smattering of algorithms such as BERT, XGB and state-space models to tackle problems across agent automation, customer behaviour modelling and geospatial analysis, to name a few.

“I consciously try not to have preferences for ML algorithms because once you have a hammer, everything starts looking like a nail,” Saurabh added.

Tools And Codes

Saurabh prefers to spend his time familiarizing himself with the data, especially when he is working with new sources. In this scenario, he uses simple visualization libraries such as seaborn for a lot of bivariate analysis and just to see the data. He also said that he uses tools such as pandas-profiling to generate quick summaries.
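A first look of this kind could be sketched as follows. This is a minimal illustration and not Saurabh's actual workflow: the `trips` table and its columns are invented for the example, and pandas' built-in `describe` and `corr` stand in for the fuller report a profiling tool would generate.

```python
import matplotlib

matplotlib.use("Agg")  # draw off-screen so the sketch also runs without a display

import pandas as pd
import seaborn as sns

# Hypothetical trip records, invented purely for illustration.
trips = pd.DataFrame(
    {
        "distance_km": [1.2, 3.4, 5.1, 7.8, 2.5, 9.3, 4.4, 6.0],
        "fare": [60, 110, 150, 230, 85, 270, 140, 180],
        "city": ["A", "A", "B", "B", "A", "B", "A", "B"],
    }
)

# Quick summary of the numeric columns: count, mean, spread and quartiles.
summary = trips.describe()

# Pairwise linear correlation between the two numeric columns.
correlation = trips[["distance_km", "fare"]].corr()

# Bivariate view: one variable against another, coloured by a third.
ax = sns.scatterplot(data=trips, x="distance_km", y="fare", hue="city")
```

In a notebook the scatter plot renders inline; the point is only to get a feel for how two variables move together before any modelling starts.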

When asked which coding language he thinks is important, he said, “Personally, I believe that the language wars have become meaningless. The community has evolved dramatically over the last few years and pretty much all major languages are now converging to nearly the same place. The only real choice, in my opinion, is stylistic.”

He also emphasised that it is important to be able to think algorithmically. The ability to anticipate the pipeline and build towards it is also key to being effective as a data scientist.

The Learning Phase

Coming to the learning phase, Saurabh said that as an early-stage practitioner, impostor syndrome was real for him. This was before MOOCs were well established, and other learning resources were thin as well. “There was a constant desire for using the ‘best’ algorithm,” said Saurabh.

The data science lead also said that the hardest part for him was learning that there are no silver bullets. He realised that while specialised technical knowledge adds value, it does not tell you how to structure a problem and connect the dots to business impact.

However, the amount of content available today is massive. He believes that anyone considering whether to invest further in the space should take the Machine Learning course by Andrew Ng (Machine Learning, Stanford). For a more detailed study, he highly recommends Pattern Recognition by Bishop, especially as a reference book. “Other than that, the Linear Algebra lectures by Prof. Gilbert Strang are phenomenal,” Saurabh added.

Talking about something that he has learned recently, Saurabh said, “I’m trying to learn experimental causal inference at the moment. At Uber, we’re trying to understand the effectiveness of a few customer interventions that have low opt-in rates. Therefore, typical experimental methods like A/B tests are not very useful and we need a causal inference method to assess impact. So far it’s been a fascinating study.”
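To see why low opt-in rates are awkward for a plain A/B comparison, consider the rough simulation below. It is only a sketch under stated assumptions, not the method Uber uses: every number, the `simulate` and `estimate` helpers, and the choice of the Wald (instrumental-variable) estimator are ours, picked as one common way to recover the effect on customers who actually opt in.

```python
import random


def simulate(n=200_000, opt_in_rate=0.05, true_effect=10.0, seed=0):
    """Generate (offered, treated, outcome) rows for a randomised offer."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        offered = rng.random() < 0.5            # the offer itself is randomised
        opts_in = rng.random() < opt_in_rate    # would accept the offer if shown it
        treated = offered and opts_in
        # Assumption: customers who opt in are more engaged to begin with (+5).
        baseline = rng.gauss(100.0, 5.0) + (5.0 if opts_in else 0.0)
        outcome = baseline + (true_effect if treated else 0.0)
        rows.append((offered, treated, outcome))
    return rows


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def estimate(rows):
    # Naive comparison: treated vs untreated, ignoring who chose to opt in.
    naive = mean(y for _, t, y in rows if t) - mean(y for _, t, y in rows if not t)
    # Intent-to-treat: offered vs not offered, diluted by the low opt-in rate.
    itt = mean(y for o, _, y in rows if o) - mean(y for o, _, y in rows if not o)
    # Difference in treatment uptake between the two randomised arms.
    uptake = mean(t for o, t, _ in rows if o) - mean(t for o, t, _ in rows if not o)
    # Wald estimator: effect on those who opt in = ITT / uptake difference.
    return naive, itt, itt / uptake


naive, itt, effect_on_opt_ins = estimate(simulate())
```

In this setup the naive comparison overstates the effect because opt-in customers differ from everyone else, while the intent-to-treat difference is diluted to a small fraction of the true effect. Dividing by the uptake difference rescales it, at the cost of a much noisier estimate.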

Here Are Some Pieces Of Advice For Budding Data Scientists

Start with the Basics: Understand fundamental calculus, statistics, core concepts such as gradient descent, kernels, etc.

Temper expectations: Most DS professionals spend 80% or more of their time just getting the data into the right shape. Then, if the problem demands it, maybe a model is created. Often, simpler solutions are better at creating business value.

There is “Science” in Data Science: Experimentation is an integral part of a Data Scientist’s role. I would assert that understanding how to craft experiments and evaluate them is more important to a DS than being able to implement a Convolutional Neural Network (CNN) to detect cat faces.
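As an illustration of the basics mentioned in the first piece of advice, gradient descent fits in a few lines of plain Python. The sketch below is our own example rather than anything from the interview: it fits a straight line to made-up points by repeatedly stepping against the gradient of the mean squared error, with a learning rate and step count chosen by hand.

```python
def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y ~ w * x + b by minimising the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction error for every point under the current parameters.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error with respect to w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Step downhill, scaled by the learning rate.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Made-up points that lie exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = gradient_descent(xs, ys)
```

Too large a learning rate makes the updates diverge and too small a one makes them crawl, which is the kind of intuition that is easier to build on a toy problem than on a production model.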

Plans For The Next Few Years

As a data science professional, Saurabh would like to devote some time to learning reinforcement learning and genetic algorithms, because every day at Uber he sees great applications for them when developing simulations.

Harshajit Sarmah
Harshajit is a writer / blogger / vlogger. A passionate music lover whose talents range from dance to video making to cooking. Football runs in his blood. Like literally! He is also a self-proclaimed technician and likes repairing and fixing stuff. When he is not writing or making videos, you can find him reading books/blogs or watching videos that motivate him or teach him new things.
