For our weekly developer column ‘Behind The Code’, we talk to members of the developer community in India about their journey so far, the way they work, and the tools they use. This week, we spoke with Tarun Shrivas, Data Scientist and Instructor at Data Science Dojo (DSD).
Shrivas joined Data Science Dojo about a year ago. His work at DSD is distributed across advanced analytics-based consulting projects for external clients, internal user traffic and user behaviour analytics projects, and data science training.
Shrivas is an Electrical Engineering graduate from Jamia Millia Islamia, New Delhi, and holds a Masters in Business Analytics (MSBA) from Seattle University. Before going to the US, he worked in the marketing research and consulting industry. His work involved analysing brand and consumer research data, drawing insights, and recommending a way forward to business clients. “I have worked in different other industries as well including heavy engineering, EPC, and education industries,” said Shrivas.
Talking about the challenges in his data science journey, Shrivas said that his prior experience in research and consulting had made him aware of some of the challenges data scientists might come across. “I would say that the biggest challenge is to communicate the output of your work to the end client where one would often come across a business audience,” said Shrivas. He emphasised that data scientists should run their model’s output or data visualisation past a business audience within their own organisations.
Furthermore, he said that another challenging area is identifying which problem to attack first. He believes this is especially true when a data scientist is responsible for improving the internal metrics of their own organisation’s business.
Tools And Code
When asked about his job at DSD, Shrivas said that data science training is the core business vertical for Data Science Dojo, which is why he spends most of his time working with the common and popular machine learning algorithms. However, for the company’s internal analytics work and for the consulting business, he often uses XGBoost, Support Vector Machines, and Latent Dirichlet Allocation.
On tools and code, Shrivas said that when he was working in the brand and consumer research and consulting industry, he used tools such as SPSS, Stata, and the good old MS Excel. However, while studying for his master’s in data science, he started using R, Python, and SQL extensively.
When asked which programming language he considers important and which he personally prefers, the DSD data scientist said, “I would say if you are looking for a career in data science you should get familiar with R and Python as most of the data science projects use them.” Apart from that, he also emphasised the importance of SQL and of working with SQL databases such as PostgreSQL. He also believes that a working knowledge of distributed data platforms such as Hadoop and Hive would be an added advantage when starting out in the data science industry. “Talking about my preference, I like to code in R but honestly there is no clear winner or loser between R and Python,” Shrivas added. “If I have to work on something that is compute-heavy, I also prefer Azure ML Studio.”
Advice For Aspiring Data Science Professionals
According to Shrivas, starting with free learning resources is always a great option. A quick Google search will usually turn up lectures or tutorials on your area of interest. Here are some of the resources that Shrivas strongly recommends:
- ISLR Textbook
- Statistical Learning on YouTube
- Forecasting: Principles and Practice
- Data Science Dojo on YouTube
- Statistics How-To
Shrivas further said that anyone who is unable to follow a regimen and complete the course material through self-paced learning should seriously consider going back to the classroom. He believes that attending a data science bootcamp will help a lot. Bootcamps and coding workshops cost relatively little, take the least amount of time away from your work schedule, and are considered to be the best way to kick-start your journey. “However, you need to be very honest and clear about how much investment, in terms of time as well as money, you are ready to make and what are you expecting out of that investment,” said Shrivas.
The Future Direction
As a data scientist and an educator, Shrivas enjoys sharing his knowledge and experience with others through blogs, tutorials, or an actual classroom setting, and he wants to continue doing so.
Apart from that, he is also looking for ways to take data science to a much larger audience. He said that people tend to see data science as a geek’s domain and assume that unless you have a PhD in statistics or are a computer science engineer, data science is not your cup of tea. “However, as a company each one of us at Data Science Dojo truly believes in ‘Data Science for Everyone’ irrespective of current functional expertise,” said Shrivas. Shrivas and the entire DSD team are planning to come up with a platform that reduces the cost of data science learning to the minimum possible.
“As long as you have time, we’ll make the content available to you, free of cost. We want to take the data science learning and training to the bottom of the pyramid,” Shrivas said in conclusion.