MachineHack Winners: How A Data Scientist & An Analyst Secured Leaderboard Rank In MH Predicting The Restaurant Food Cost Hackathon

Dikshant Agarwal and Saurabh Kumar

MachineHack recently concluded its Predicting Restaurant Food Cost Hackathon. Analytics India Magazine talked to the hackathon's leaderboard rank holders to learn about their data science journeys and how they solved the problem.

Dikshant Agarwal

Journey In Data Science:

Presently working as a data scientist at a fintech startup, Dikshant started his career as a product designer in a robotics company. A year later, he enrolled in a liberal arts program called the Young India Fellowship at Ashoka University. A few months before graduating, he had to choose which industry he wanted to work in. “I loved tech and the dynamic nature of it,” he said. After a few discussions with his engineering seniors and friends, he decided to give data science a shot.


He taught himself through MOOCs like Andrew Ng's ML course and books like Introduction to Machine Learning with Python by Andreas C. Müller. He also spent a significant amount of time learning the basics of Python programming. Once he had the hang of the basics, he started picking diverse projects from online sources like Kaggle and slowly grew comfortable with the analytical mindset and the data science approach. Recently, Dikshant started taking part in hackathons as a way to explore how different industries use data, and to try novel methods and approaches for modelling their problem statements.

How did he solve this MachineHack problem:


Dikshant said that the hackathon dataset was particularly interesting because it contained restaurant data in a very raw form. He started by exploring how the categorical values of the available features correlated with cost. This initial exploration helped him understand how to clean and transform the data further before modelling. He spent a significant amount of time cleaning the data and, subsequently, testing the performance of different models. He also tried wrangling the time data but could not extract enough information from it to use in his final model. He finished by tuning his algorithms and stacking them together. For this problem, he used a stacked version of Random Forest, XGBoost, Gradient Boosting and LightGBM. Here is the code on GitHub that Dikshant used for the hackathon.

Experience on MachineHack:

It was his first time participating in a MachineHack hackathon, and he said it was great to see the spontaneity and enthusiasm of his fellow data scientists. Dikshant said, “The data was also, as mentioned before, quite ‘raw’ and interesting to explore and make sense of. It seemed like a really good sample to model the original data and it definitely flexed my data exploration and munging skills. Really excited about what MachineHack offers next!”

Saurabh Kumar

Journey In Data Science:

Saurabh Kumar is a Group Lead working on Financial Surveillance Analytics at Ameriprise Financial Services Inc. An avid data scientist, he first became interested in the subject in 2014, when he learnt about the random forest algorithm and how it performed in classification tasks compared with traditional classifiers. Since then, he has kept up his curiosity and his learning by competing on various hackathon platforms. He was inspired and overwhelmed by the ability of ML algorithms to solve a variety of real-world problems.

How did he solve this MachineHack problem:

This challenge had many unstructured features, such as cuisines and time. He used TF-IDF to create features out of them. According to Saurabh, there were few raw features, so he created many interaction features, which helped his model identify hidden signals in the data. While modelling, he took the log transform of the target y and fitted the model on the transformed variable, which helped reduce the variance of the residuals. Finally, he used LightGBM (LGBM) regression as his model. Here is the code on GitHub that Saurabh used for the hackathon.
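Two of these steps, TF-IDF features and the log-transformed target, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Saurabh's code: the cuisine strings and costs are made up, the tfidf helper is hypothetical and uses one common smoothed weighting, and the model itself is left out. In practice a library implementation such as scikit-learn's TfidfVectorizer would be the usual choice.

```python
import math
from collections import Counter


def tfidf(documents):
    """Return one {token: weight} dict per comma-separated document."""
    tokenized = [
        [part.strip().lower() for part in doc.split(",") if part.strip()]
        for doc in documents
    ]
    n_docs = len(tokenized)
    # Document frequency: in how many rows each token appears.
    doc_freq = Counter(token for tokens in tokenized for token in set(tokens))
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        rows.append({
            # Term frequency times a smoothed inverse document frequency.
            token: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[token])) + 1)
            for token, count in counts.items()
        })
    return rows


# Made-up cuisine strings, not rows from the hackathon dataset.
cuisines = [
    "North Indian, Chinese",
    "Chinese, Fast Food",
    "North Indian, Mughlai, Chinese",
]
rows = tfidf(cuisines)

# Made-up, right-skewed costs: one expensive restaurant dominates the raw scale.
costs = [200.0, 300.0, 500.0, 1200.0, 8000.0]
log_costs = [math.log1p(c) for c in costs]     # a model would be fitted on these
restored = [math.expm1(v) for v in log_costs]  # predictions are mapped back this way
```

In the first row, "north indian" receives a larger weight than "chinese" because "chinese" appears in every row and so carries less information. On the log scale the costs span under 4 units instead of several thousand, which is why errors on expensive restaurants no longer dominate the fit.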

Experience on MachineHack:

Saurabh has taken part in several MachineHack hackathons in the past and secured top ranks on their leaderboards. The “Predict The Data Scientists Salary In India Hackathon” and the “Who Let The Dogs Out: Pets Breed Classification hackathon” are two of them. Talking about his experience on MachineHack, Saurabh says, “I love Machine Hack platform, you guys post interesting problems and now the competition has increased here so it is fun to compete with some of the top minds in data science.”

Disha Misal
Found her way to Data Science and AI through her fascination for Technology. Likes to read, watch football and has an enormous amount of affection for Astrophysics.
