Analytics India Magazine got in touch with Saurabh Jha, Director – Data Science at Dell, for our weekly column My Journey In Data Science. Saurabh has over 13 years of experience across the data engineering and data science domains, having worked for TCS, IBM Global Business Services, Capgemini, Teradata Professional Services, PwC, Dell, and more.
Saurabh Jha completed his bachelor of engineering in Industrial Production from Visvesvaraya Technological University in 2007. Although his degree had little to do with computer science, he became interested in data-related skills when one of his cousins introduced him to data warehousing and data mining while he was in eleventh grade. “My cousin got a job in Bangalore in SPSS. I had no clue what this field was, but I consider this to be the turning point,” says Saurabh.
However, because his engineering major was not related to computer science, he was somewhat concerned about his career. Saurabh therefore decided to learn computer science skills during his first-semester break, and he picked up new BI skills in every semester break after that. His cousin guided him through RDBMS and data analysis. This deepened his interest in a data-related career, as he found the field to be very close to math and physics.
When Saurabh returned to college, he started presenting papers on business intelligence. Throughout college, he kept learning about data warehousing and the design and architecture of scalable data platforms. In his final year, he participated in the Microsoft Academic Project Program (MSDN), where he presented ideas around HR analytics. On getting selected, he worked on a real-world project: attrition analysis for SME industries.
Transition From BI To Data Science
Saurabh’s proficiency in data analysis got him a job at TCS as a business intelligence specialist. “I got a chance to work on a TCS patented solution on healthcare analytics as my first project, reporting to North India head of TCS BI practice directly. The first job helped me to explore further in the space of information modelling and data visualisation. I tried to learn everything possible right from business intelligence to data engineering. I used to lock myself in the TCS training room and learn various tools and techniques,” explains Saurabh.
For the next six years, Saurabh focused on several data engineering and data science practices and was involved in designing large-scale enterprise data warehouses. Since 2014, he has worked in the analytics space, and he is currently developing intelligent systems using ML and DL.
Because he put so much effort into consolidating and aggregating data in the data warehouse, Saurabh was always curious about how the organisation actually used that data. This curiosity drove him towards data science and the search for insights in data.
In parallel, Saurabh was working on a couple of social problems, such as economic inequality across countries and its impact on society, and digital transparency in legal systems. While reading Capital in the Twenty-First Century by Thomas Piketty, he was dejected to learn that the poor were getting poorer and the rich were getting richer. The author used many mathematical models to sketch out the reality of inequality in the world. Saurabh concluded that data science techniques were the only way to solve such problems at scale and create a social impact. All of this weighed heavily on his decision to move into the data science domain.
In addition, Saurabh was working on a recommendation engine that could connect the right aspirant with the right company. There is often a mismatch between a fresher's skills and what he or she ends up doing professionally. For instance, a good Java developer might end up doing testing, and vice versa. Though the idea never fully materialised, such visions of solving hard problems laid the foundation for Saurabh's journey in data science.
Before moving into analytics, Saurabh was doing well as a senior data architect and had the opportunity to move to the US, but he took the risk of joining a startup in India. He worked hard and devised effective strategies to pick up mathematics and ML skills over time, gradually transitioning into data science.
To learn data science, Saurabh relied entirely on MOOCs, books, and blogs. He believes Andrew Ng's ML and DL courses are among the finest available to any learner. “Initially, I struggled a lot, and the going was tough, but Andrew Ng made it look simple in his lectures. Eventually, I was able to pick up ideas with ease,” explains Saurabh. For books, Saurabh suggests Python Machine Learning by Sebastian Raschka, Hands-on ML by Géron, and An Introduction to Statistical Learning from Springer. Among blogs, he recommends Chris Olah's and distill.pub.
He also met experienced data scientists, read stories of people who had successfully transitioned to data science, and joined study groups to keep himself motivated. For deep learning, however, he mostly relied on research papers. “Reading papers frequently is the best way to keep yourself updated with the progress and stay informed with the developments in the data science landscape,” says Saurabh. “Although the speed at which papers are published is too frequent and hard to catch up with all the ideas, being connected to researchers on Twitter assists in coming across the most impactful papers in the domain.”
One has to be curious and hungry to learn new things in the ever-changing data science landscape. Saurabh stresses that a philosophy of unlearning and learning makes it more likely that you will catch up quickly with this rapidly evolving field.
Best And Worst Data Science Experience
Initially, Saurabh made the mistake of moving too fast and trying to learn everything at once, from mathematics to ML and problem-solving. He later realised that one needs to be patient and look to create value incrementally. “By moving quickly, I wasted a lot of my time. Had I been a little slower, I would have covered more distance in the long run with clarity and also most importantly, avoiding burnout,” believes Saurabh.
While talking about challenging experiences in data science, Saurabh recalled a project in which he struggled to gather data because the business process was very complicated and the data were scattered. To make matters worse, the team did not have a data dictionary either. “It was very hard for a data scientist to produce results in an environment where you are not supported with right inputs and dedicated environments to run scalable data science experiments,” explains Saurabh. “I feel people in management function tend to misunderstand data science and feel like throwing algorithms to some sample of data to produce results,” he adds.
Saurabh has also had numerous successes in his data science journey. He has built an AI team from scratch twice in his career, for different organisations, and more recently his patent in deep learning was approved. He has also been part of many successful projects, ranging from customer segmentation and recommendation engines to time series modelling and intent classification.
Job Responsibilities At Dell
Currently, as a director of data science, Saurabh oversees customer and financial services, which are spread across 17 countries with a headcount of 4,500. He is responsible for defining and executing the data science roadmap to drive the automation of business processes by developing intelligence through AI. As an executive, he is striving to bring AI research into the mainstream and drive digital transformation solutions. As a subject matter expert in Dell’s AI research group, he collaborates and partners with senior executive leadership across a broad range of relevant stakeholders to drive program strategies and requirements.
Apart from setting the overall vision for the AI strategy, he also hires experienced data scientists. He looks for problem-solving skills, mathematical thinking, programming skills, an understanding of computing, the ability to solve algorithmic problems, data munging, knowledge of data structures and the analysis of algorithms, storytelling, and theoretical ML and DL knowledge. In addition, the desire and hunger to learn and explore new ideas, the courage to take on unsolved problems, and the ability to deal with failure are equally important in applicants.
Advice To Aspirants
Saurabh said data science is a multidisciplinary domain: one needs to catch up with computer science, mathematics, and ML/DL, which cannot be done in a day. Unlike domains such as Java development or web development, data science sees new developments regularly. It is therefore a lifelong process of learning, and Saurabh believes that results will be in the learner's favour if one continues the journey with patience. Data science is a journey and not a destination.
As a piece of advice, Saurabh also suggests that aspirants solve real-world unsolved AI problems, code every day, play with messy data, and learn tools such as Python, SQL, and ML/DL frameworks. He also stresses the importance of being open to picking up new technologies in the open-source domain. “Aspirants and practitioners should also learn what goes under the ML/DL algorithm to develop a strong foundation of the subject, develop business acumen, read and implement research papers, be curious to create new opportunity to further develop the ideas of algorithms, and most importantly stay humble and remain focused,” concludes Saurabh.