
Life of a Data Scientist in an Internet Company


There are many articles about "What wonders analytics can do for the business" and about data scientist jobs, such as "How to be a data scientist", "How to hire data scientists" etc. But there is limited information about what data scientists exactly do and what their lives are like in an internet company. I think I can cover this topic genuinely, since I have worked as a data scientist for many years in various industries like Banking, BPO, Defence etc. and currently lead a team of data scientists at InfoEdge, which is the parent company of various popular portals like Naukri.Com, JeevanSathi.Com, 99Acres.Com, Shiksha.Com etc. In this article, I will cover what makes the data science problem in an internet company unique in terms of data and user behaviour, why this problem excites a data scientist like me, and how a data scientist spends his or her days building scalable, real-time, personalized solutions to the various challenges of the industry.

Uniqueness of Data Science Problem In An Internet Company

  • Huge Amount of Data

All three V's of data (Volume, Variety and Velocity) exist in the internet domain. An internet company like Naukri (which has more than 70% market share in the jobs category in India) generates a large amount of data every day in all varieties: structured, semi-structured and unstructured. Every day, Naukri records data on many millions of searches, page views, applies, registrations etc.

  • User Behavior

It is always a challenge to engage an internet user and bring him or her back, especially on mobile, where conventional ways of engaging, such as searching, are difficult. Every user is interested only in services and products that are personalized to him or her. Nobody will engage with the site if we show the same generic or segment-specific products and services to everyone.

Why Data Science Problem in Internet Domain Excites Data Scientists

For a data scientist, a large amount of data is the prerequisite to play with. Our job as data scientists is to torture the data until it confesses. Data scientists love challenges, and it excites us even more if the data is large and of varied types like unstructured, semi-structured etc.

The challenges of the internet domain provide opportunities to test and learn various advanced techniques (Machine Learning, Deep Learning, NLP, Semantic Technologies etc.) and technologies like Hadoop, NoSQL (MongoDB), Neo4j etc. We at Naukri leverage most of these techniques and technologies to build scalable and accurate real-time personalized recommendation engines, notification systems, and semantic search and alert systems.

A Typical Day of Data Scientists in Internet Domain

I can share what a typical day looks like for data scientists at InfoEdge.

The day starts with looking at the numbers from previous experiments, followed by a discussion of how to improve them further. Then we build new features by processing the large profile and behavior data. Examples include how people move from one location to another while switching jobs, whether there is any industry or functional bias across different experience groups, and how important particular skills and roles are for an individual and whether that differs from the population or segment. Next, we rebuild the model with the new set of features and test its performance in the Testbed. Once the model shows gains in the Testbed, we productionize the code and integrate it into the respective live systems (real-time recommendation engines, alerts, semantic search etc.). To test and learn from a new feature, we always run an A/B test and evaluate its performance. Once the experiment is successful, we roll it out to all users and replicate it in other applications. The pace of experiments is very fast. In a single week, we can build features and see their performance, since most of the systems are online and we can evaluate performance very quickly.
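As a rough illustration of the A/B evaluation step described above, here is a minimal sketch of a two-proportion z-test on apply rates. The function name, the choice of metric (applies per user shown recommendations) and the numbers in the usage lines are all hypothetical; the article does not say which statistical test or metrics the team actually uses.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare variant B against control A.

    conv_* are conversion counts (e.g. applies), n_* are users exposed.
    Returns (relative_lift, z_score, two_sided_p_value).
    Hypothetical helper for illustration only.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("both groups need at least one user")

    p_a = conv_a / n_a
    p_b = conv_b / n_b
    if p_a == 0:
        raise ValueError("control rate is zero, relative lift is undefined")

    # Pooled conversion rate under the null hypothesis of no difference.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero, test is undefined")

    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    lift = (p_b - p_a) / p_a
    return lift, z, p_value


# Made-up counts: control vs. a model rebuilt with new features.
lift, z, p_value = two_proportion_z_test(500, 10_000, 560, 10_000)
print(f"lift={lift:.1%} z={z:.2f} p={p_value:.3f}")
```

A check like this only says whether the observed difference is likely to be noise; the decision to roll a feature out to everyone would normally also weigh the size of the lift and how long the test has run.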

After the experiments succeed, it is time to have fun. Going for a team outing (bowling, movies, lunch, gupshup and pulling each other's legs) is always fun. Having samosa and jalebi at the nearby sweet shop is a favorite of ours.

We also have frequent technical discussions about new tools and emerging techniques to keep ourselves up to date in the industry. We often run POCs and pilots of emerging techniques.


We at InfoEdge feel proud not just of contributing to the growth of the business but also of making the lives of so many users better by helping them get desirable jobs (Naukri.Com), matching them with the right life partners (JeevanSathi.Com), helping them find the right properties (99Acres.Com) and guiding them to the right education choices (Shiksha.Com).


Manish Gupta

Dr. Manish Gupta is an advanced analytics professional with more than 14 years of experience in building and leading Data Science, Analytics and BI teams. He holds a Ph.D. from the Dept. of Mathematics, IIT Delhi, in the area of data mining and machine learning, with over 15 research/technical publications in leading international journals and conferences and 1 US patent. He is currently working as Senior Vice President-Analytics at InfoEdge, which is the parent company of various popular portals like Naukri.Com, JeevanSathi.Com, 99Acres.Com, Shiksha.Com etc.