Life of a Data Scientist in an Internet Company

There are so many articles about “What wonders analytics can do for the business” and about data scientist jobs, such as “How to be a data scientist” and “How to hire data scientists”. But there is limited information about what data scientists actually do and what their lives are like in an Internet company. I think I can cover this topic genuinely, since I have worked as a data scientist for many years in industries such as banking, BPO and defence, and I currently lead a team of data scientists at InfoEdge, the parent company of popular portals such as Naukri.Com, JeevanSathi.Com, 99Acres.Com and Shiksha.Com. In this article, I will cover what makes the data science problem in an Internet company unique in terms of data and user behaviour, why this problem excites a data scientist like me, and how a data scientist spends his days building scalable, real-time, personalized solutions to the industry’s challenges.

Uniqueness of Data Science Problem In An Internet Company

  • Huge Amount of Data

All three V’s of data, i.e. Volume, Variety and Velocity, exist in the Internet domain. An Internet company like Naukri (which has more than 70% market share in the jobs category in India) generates a huge amount of data every day in all varieties: structured, semi-structured and unstructured. Every day, Naukri records data on many millions of searches, page views, applies, registrations etc.


  • User Behavior

It’s always a challenge to engage an Internet user and bring him back, especially on mobile, where conventional ways of engaging, like searching, are difficult. Every user is interested only in services and products that are personalized to him/her. Nobody will engage with the site if we show generic, or merely segment-specific, products and services to everyone.

Why Data Science Problem in Internet Domain Excites Data Scientists

For a data scientist, a large amount of data is the prerequisite to play with. Our job as data scientists is to torture the data until it confesses. Data scientists love challenges, and it excites us even more when the data is large and of varied types, such as unstructured and semi-structured.

The challenges of the Internet domain provide opportunities to test and learn various advanced techniques (Machine Learning, Deep Learning, NLP, Semantic Technologies etc.) and technologies such as Hadoop and NoSQL databases like MongoDB and Neo4j. We at Naukri leverage most of these techniques and technologies to build scalable and accurate real-time personalized recommendation engines, notification systems, and semantic search and alert systems.

A Typical Day of Data Scientists in Internet Domain

Let me share a typical day of the data scientists at InfoEdge.

The day starts with looking at the numbers from previous experiments, followed by a discussion on how to improve them further. Then we build new features by processing the large profile and behaviour data: for example, how people move from one location to another when switching jobs, whether there is any industry or functional bias across different experience groups, and how important particular skills and roles are for an individual and whether that differs from the population or segment. We rebuild the model with the new set of features and test its performance in the Testbed. Once the model shows gains in the Testbed, we productionize the code and integrate it into the respective live systems (real-time recommendation engines, alerts, semantic search etc.). To test and learn from a new feature, we always run an A/B test and evaluate its performance. Once the experiment is successful, we roll it out to all users and replicate it in other applications. The pace of experiments is very fast. In a single week we can build features and see their performance, since most of the systems are online and results can be evaluated very quickly.
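To make the A/B step concrete, here is a minimal sketch of how the apply rates of a control group and a variant group could be compared with a two-proportion z-test. It is only an illustration and not our production evaluation code: the function name `two_proportion_z_test` and the traffic and apply counts are made up for the example, and a real evaluation would also look at sample size planning and more than one metric.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of control (A) and variant (B).

    Returns (relative_lift, z, p_value) where p_value is two-sided.
    All names here are hypothetical, chosen for this illustration.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("both groups need at least one user")

    rate_a = conv_a / n_a
    rate_b = conv_b / n_b

    # Pooled rate under the null hypothesis that A and B convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0 or rate_a == 0:
        raise ValueError("test is undefined for all-or-nothing conversion")

    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution:
    # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    relative_lift = (rate_b - rate_a) / rate_a
    return relative_lift, z, p_value


if __name__ == "__main__":
    # Made-up numbers: 10,000 users per group, 500 vs 560 applies.
    lift, z, p = two_proportion_z_test(500, 10_000, 560, 10_000)
    print(f"lift={lift:.1%}  z={z:.2f}  p={p:.3f}")
```

A check like this is one way to decide whether a gain seen in an experiment is large enough, relative to the noise, to justify rolling the feature out to all users.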

After the experiments succeed, it’s time to have fun. Going on team outings (bowling, movies, lunch, gupshup and pulling each other’s legs) is always fun. Having samosa and jalebi at the nearby sweet shop is a favourite of ours.

We also have frequent technical discussions about new tools and emerging techniques to keep ourselves up to date with the industry. We often run POCs and pilots of emerging techniques.

We at InfoEdge feel proud not just of contributing to the growth of the business but also of making the lives of so many users better by helping them get desirable jobs (Naukri.Com), matching them with the right life partners (JeevanSathi.Com), shortlisting the right properties (99Acres.Com) and making the right education choices (Shiksha.Com).

Manish Gupta
Dr. Manish Gupta is an advanced analytics professional with more than 14 years of experience in building and leading Data Science, Analytics and BI teams. He holds a Ph.D. from the Dept. of Mathematics, IIT Delhi, in the area of data mining and machine learning, with over 15 research/technical publications in leading international journals and conferences and one US patent. He is currently working as Senior Vice President-Analytics at InfoEdge, the parent company of popular portals such as Naukri.Com, JeevanSathi.Com, 99Acres.Com and Shiksha.Com.
