MachineHack Winners: How These Data Science Enthusiasts Solved ‘Predicting Food Delivery Time’ Hackathon

Amal Nair

MachineHack recently concluded the 19th successful edition of its Data Science hackathons by announcing the winners of Predicting Food Delivery Time – Hackathon by IMS Proschool.

The hackathon, sponsored by IMS Proschool, provided the data science aspirants with an interesting problem which deals with predicting how long a restaurant would take to deliver food when ordered online. It was well-received by the Data Science and Machine learning community with active participation from over 320 participants and close to 1,500 registrations. 

Out of the 320 participants, Mohammed Abdul Qavi, Krishna Priya and Thanish Batcha won the first, second and third places respectively on the hackathon leaderboard. Analytics India Magazine introduces you to the winners and their approach to the solution.



#1: Mohammed Abdul Qavi

A Senior Data Scientist at ADP, Mohammed Abdul Qavi solves various problems in the HCM domain. He started his career working on basic statistical models and his interest in Mathematics drew him into the Data Science Space. Abdul Qavi earned his Masters in Industrial Engineering and Operations Research (IEOR) from IIT Bombay in 2013. 

He spends time learning and perfecting his Data Science skills through MOOCs and by reading articles from various sources like Analytics India Magazine, Medium, Kaggle, LinkedIn etc.


Approach To Solving The Problem 

Qavi explains his approach to solving the problem as follows:

Steps:

1. Performed basic EDA and created an initial baseline model using logistic regression.

2. Cleaned the numerical features such as Rating, Votes, Reviews, Average cost and Minimum order. Trained a LightGBM model and was surprised to see that, in these initial steps, it did not perform as well as logistic regression.

3. Created a few useful derived features such as the ratio of Minimum order to cost, the number of restaurants in a given location and the number of branches of a given restaurant.

4. Determined the set of unique cuisines across all restaurants and created various cuisine features in a one hot encoded way.

5. The location information contained multiple details such as area and city, but these details were not consistent, so created a City feature and mapped each Location to its city.

6. Ensembled various models using geometric and harmonic means. Tried ensembling models such as XGBoost, LightGBM, random forest and logistic regression. Finally selected Random Forest and LightGBM and tuned them for the final submission.
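Steps 3 to 5 above can be sketched roughly as follows. This is a minimal illustration with pandas on toy data, not Qavi's actual code: the column names, the sample rows and the `city_keywords` lookup are all assumptions made for the example.

```python
import pandas as pd

# Toy data; the column names and values are hypothetical stand-ins.
df = pd.DataFrame({
    "Restaurant": ["R1", "R1", "R2", "R3"],
    "Location": ["Area A, Pune", "Area B, Pune", "Area A, Pune", "Area C, Delhi"],
    "Cuisines": ["North Indian, Chinese", "North Indian",
                 "Chinese, Fast Food", "South Indian"],
    "Average_Cost": [200.0, 200.0, 100.0, 150.0],
    "Minimum_Order": [50.0, 50.0, 50.0, 99.0],
})

# Step 3: derived features.
df["MinOrder_to_Cost"] = df["Minimum_Order"] / df["Average_Cost"]
# Number of distinct restaurants sharing the same location string.
df["Restaurants_in_Location"] = (
    df.groupby("Location")["Restaurant"].transform("nunique")
)
# Number of distinct locations (branches) per restaurant.
df["Branches"] = df.groupby("Restaurant")["Location"].transform("nunique")

# Step 4: one indicator column per unique cuisine.
cuisine_flags = df["Cuisines"].str.get_dummies(sep=", ").add_prefix("Cuisine_")
df = pd.concat([df, cuisine_flags], axis=1)

# Step 5: map inconsistent location strings to a city via a keyword lookup
# (the lookup table is an assumption; a real one would be built from EDA).
city_keywords = {"pune": "Pune", "delhi": "Delhi"}

def to_city(location: str) -> str:
    lowered = location.lower()
    for keyword, city in city_keywords.items():
        if keyword in lowered:
            return city
    return "Other"

df["City"] = df["Location"].map(to_city)
```

A keyword lookup is only one way to handle inconsistent location strings; the write-up does not say how the mapping was actually built.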

The final submission was based on an ensemble of the three models below.

1. Arithmetic mean of probabilities from LightGBM and Random Forest

2. Harmonic mean of probabilities from LightGBM and Random Forest

3. Arithmetic mean of probabilities from LightGBM and Random Forest (with changed hyperparameters)
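The three blends can be illustrated with a small NumPy sketch. The probability arrays below are made-up stand-ins for model outputs, and two choices are assumptions of this example rather than details from the write-up: renormalizing the harmonic mean so each row sums to 1, and combining the three blends with a plain average.

```python
import numpy as np

def arithmetic_mean(prob_list):
    """Element-wise arithmetic mean of several probability matrices."""
    return np.mean(prob_list, axis=0)

def harmonic_mean(prob_list, eps=1e-12):
    """Element-wise harmonic mean; values are clipped to avoid division by zero."""
    stacked = np.clip(np.asarray(prob_list, dtype=float), eps, None)
    return len(stacked) / np.sum(1.0 / stacked, axis=0)

def normalize_rows(probs):
    """Rescale each row so that the class probabilities sum to 1."""
    return probs / probs.sum(axis=1, keepdims=True)

# Made-up class probabilities (2 samples, 3 classes).
p_lgbm = np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]])
p_rf = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
# Stand-ins for the same two models refit with different hyperparameters.
p_lgbm_alt = np.array([[0.7, 0.2, 0.1], [0.3, 0.4, 0.3]])
p_rf_alt = np.array([[0.5, 0.3, 0.2], [0.2, 0.3, 0.5]])

blend_1 = arithmetic_mean([p_lgbm, p_rf])
blend_2 = normalize_rows(harmonic_mean([p_lgbm, p_rf]))
blend_3 = arithmetic_mean([p_lgbm_alt, p_rf_alt])

# Assumption: the three blends are averaged to get the final probabilities.
final = arithmetic_mean([blend_1, blend_2, blend_3])
pred = final.argmax(axis=1)  # index of the most probable class per sample
```

The harmonic mean penalizes a class more strongly than the arithmetic mean when either model assigns it a low probability, which is one reason to mix the two kinds of blend.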

“EDA is the key to generating good features. We might improve our score with better algorithms but with better features we start to compete,” he adds as a tip to the Data Science community.

“MachineHack is an amazing platform for data scientists to practice, learn, participate and win exciting prizes. I recommend MachineHack to various fresh graduates who are interested to solve various ML problems across industries. The articles on Analytics India Magazine are very informative and should definitely be followed. I always have amazing experience talking to the MachineHack organizers and my concerns related to the hackathon are addressed in quick time. Keep up the good work and looking forward to future competitions.” – Qavi shared his opinion about MachineHack.

Get the complete code here.

#2: Krishna Priya

Krishna Priya is a final-year undergraduate student at IIT Roorkee. While trying to figure out his career path based on his skills and interests, he stumbled upon Data Science, a profession which he admired for its mixture of mathematics, business, coding and research. He enjoys researching the domain knowledge needed to tackle a problem. He acquired the knowledge and skills to step into Data Science through MOOCs, books and blogs, and by participating in numerous hackathons.

Approach To Solving The Problem 

Krishna Priya describes his approach as follows.

The solution was mainly based on feature engineering and data cleaning. In the preprocessing step, the Cuisine column was transformed into specific food types such as North Indian and South Indian. The Location column was split into two new features, locality and city. Interaction features were created between numerical variables. Feature selection was performed using Recursive Feature Elimination (RFE) and a feature importance plot. The best performance was obtained with 5-fold stratified cross-validation with no shuffling.
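The interaction features, RFE and unshuffled 5-fold stratified cross-validation described above could be wired together along these lines with scikit-learn. This is a sketch on synthetic data: the Random Forest estimator, the number of features kept and the accuracy metric are assumptions for illustration, since the write-up does not name the model or the metric.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic multi-class data standing in for the cleaned numerical features.
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=4, n_classes=3, random_state=0
)

# Pairwise interaction terms between the numerical variables.
interactions = PolynomialFeatures(
    degree=2, interaction_only=True, include_bias=False
)

# Recursive Feature Elimination driven by tree-based feature importances.
selector = RFE(
    RandomForestClassifier(n_estimators=50, random_state=0),
    n_features_to_select=8,
    step=2,
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
pipeline = make_pipeline(interactions, selector, model)

# 5-fold stratified cross-validation without shuffling.
cv = StratifiedKFold(n_splits=5, shuffle=False)
scores = cross_val_score(pipeline, X, y, cv=cv, scoring="accuracy")
print(scores.mean())
```

Placing the selector inside the pipeline means feature selection is refit on each training fold, so the held-out fold does not influence which features are kept.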

“It was my first hackathon on MachineHack. I had never solved a food sector problem statement so the research work behind the problem was interesting for me. Hoping to work on a variety of problems in coming months as this is just the beginning.” – Krishna Priya said.

Get the complete code here.

#3: Thanish Batcha

Thanish Batcha did his Bachelors in Electronics and Communications in 2013 and started his career as a Mainframe software engineer. 

After a year of working as a Software Engineer, Thanish started exploring the world of Data. He took online courses on Data Science and Machine Learning after work, which led him to his Data Science career. Since then he has been working closely with data in a wide variety of domains including Healthcare and Telecom, and extensively in the automobile industry. He has made hackathons his hobby and considers them the best means of learning and acquiring skills.

“My overall experience has been good. I have enjoyed working on this problem and the MachineHack platform,” Thanish added on his MachineHack experience.

Approach To Solving The Problem

Thanish explains his approach in the following steps:

1. Replace the following placeholder values with -999 in these features:

  • [‘-’, ‘Opening Soon’, ‘NEW’] in Rating
  • [‘-’] in Votes
  • [‘-’] in Reviews

2. Feature Engineering:

  • Create separate columns for each of cuisines in the Cuisine feature. 
  • For each of the restaurants create average & median features of their ratings, reviews, votes. 
  • Extract the city names from the Location feature. 
  • Create separate features as restaurant type from the Rating feature. 
  • Determine the total branches of each of the restaurants.

3. Label encode the categorical features

4. Using different combinations of the features, create an ensemble of two Random Forest models by taking their mean.
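The steps above can be sketched as follows with pandas and scikit-learn. This is an illustration on toy data, not Thanish's code. Several details are assumptions: the sample rows, the `Delivery_Class` target and its labels, the reading of "restaurant type" as a flag for non-numeric ratings, the two feature subsets, and averaging predicted probabilities as the way of "taking mean".

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder

# Toy data; all values and the target column are hypothetical.
train = pd.DataFrame({
    "Rating": ["4.1", "-", "NEW", "3.5", "Opening Soon", "3.9", "4.4", "-"],
    "Votes": ["120", "-", "5", "40", "-", "66", "300", "-"],
    "Reviews": ["30", "-", "2", "11", "-", "20", "95", "-"],
    "City": ["Pune", "Delhi", "Pune", "Delhi", "Pune", "Delhi", "Pune", "Delhi"],
    "Delivery_Class": ["short", "long", "short", "long",
                       "long", "short", "short", "long"],
})

placeholders = {
    "Rating": ["-", "Opening Soon", "NEW"],
    "Votes": ["-"],
    "Reviews": ["-"],
}

# Step 2 (one interpretation): keep the non-numeric rating text as a type.
is_placeholder = train["Rating"].isin(placeholders["Rating"])
train["Rating_Type"] = train["Rating"].where(is_placeholder, "rated")

# Step 1: replace placeholder strings with -999 and convert to numbers.
for col, values in placeholders.items():
    train[col] = train[col].mask(train[col].isin(values), "-999").astype(float)

# Step 3: label encode the categorical features.
for col in ["City", "Rating_Type"]:
    train[col] = LabelEncoder().fit_transform(train[col])

# Step 4: two Random Forests on different feature combinations, averaged.
feature_sets = [["Rating", "Votes", "City"],
                ["Rating", "Reviews", "Rating_Type"]]
y = train["Delivery_Class"]
probas = []
for seed, cols in enumerate(feature_sets):
    rf = RandomForestClassifier(n_estimators=50, random_state=seed)
    rf.fit(train[cols], y)
    # Predicting on the training rows only to keep the sketch short.
    probas.append(rf.predict_proba(train[cols]))

mean_proba = np.mean(probas, axis=0)
pred = rf.classes_[mean_proba.argmax(axis=1)]
```

A sentinel such as -999 works with tree-based models because they can split it off from the valid range; it would distort a linear model.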

Get the complete code here.

Copyright Analytics India Magazine Pvt Ltd