Meet The MachineHack Champions Who Cracked The ‘Video Game Sales Prediction’ Hackathon

MachineHack successfully concluded its tenth instalment of the weekend hackathon series last Monday. The Video Game Sales Prediction hackathon was greatly welcomed by data science enthusiasts with over 350 registrations and active participation from close to 200 practitioners.

Out of the 230 competitors, three topped our leaderboard. In this article, we will introduce you to the winners and describe the approach they took to solve the problem.

#1| G Mothy

G Mothy is a final year student of Computer Engineering at Army Institute of Technology, Pune. His data science journey started with his internship at IIT Madras under the guidance of IIT professors. 

Since then, he has never looked back and has been exploring different areas of data science with the help of his seniors and through Kaggle and other platforms that host hackathons. He likes trying various types of hackathons to experiment and acquire new skills.

Approach To Solving The Problem 

Mothy explains his approach briefly as follows:

On exploring the data, I found that most of the features were categorical in nature. I started by trying out different encoding techniques and applying CatBoost with default parameters.

1. Encoding techniques did not improve the model as expected

2. A single model with a single encoding technique was not enough to capture the insights in the data

3. Since CatBoost can handle categorical features internally, without the need for external encoding, it was chosen as the first model

4. With label encoding, LGB and XGB models were trained using Stratified K-Fold with 10 and 20 folds respectively

5. An ensemble of the above three models was used as the final model

“The competitions organised by MachineHack are good for beginners to try and learn the concepts in a competitive environment. Overall experience was mostly filled with learnings, and I would love to explore and participate in more such challenges in future.” – he said of his experience.

Get the complete code here.

#2| Kanika Miglani

Kanika first came across mathematical modelling during her graduate programme at Delhi University. But her proper introduction to data science happened at IIT Tirupati, where she is currently enrolled in a Master's in Statistics and Mathematics. Thanks to the near-perfect course structure and her eagerness to learn how to make better decisions using data, she developed a keen interest in machine learning and data science.

“I just enjoyed solving problems using analytical approaches and wanted to incorporate technology into my work,” Kanika said.

She used the extra time she had during the pandemic to take several online courses on Python and machine learning.

Her curiosity to learn and the support from her friends pushed her to work on a variety of problems.

“It has been really exciting for me, and I understand that this is just the beginning, so I still am on my way to explore the field and work on myself to get better at it,” Kanika said.

Approach To Solving The Problem 

Kanika explains her approach briefly as follows:

I started with data pre-processing. My data cleanup involved checking whether the data I was given made sense and correcting any illogical values once I had adequate information.

I combined the train and test sets to make sure that the model gets adequate exposure to the features. I used label encoding for Product IDs and one-hot encoding for the other categorical variables. I applied some polynomial transformations to the User and Critic Points and dropped the irrelevant information.

I merged a few categories that had very low frequency into their nearest neighbours and converted the variable Year into the number of years elapsed as of today.
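The pre-processing steps described above can be sketched with pandas as follows. This is an illustration under stated assumptions, not Kanika's code: the toy data, the column names, the MIN_COUNT threshold and the REFERENCE_YEAR constant are all hypothetical, and rare categories are lumped into a single "other" bucket as a simplification of merging them into their nearest neighbours.

```python
import pandas as pd

# Toy data; all column names and values are hypothetical.
train = pd.DataFrame({
    "product_id": ["a1", "b2", "c3", "d4"],
    "console": ["ps", "ps", "xbox", "wii"],
    "year": [2008, 2012, 2015, 2010],
    "critic_points": [2.0, 3.5, 1.0, 4.0],
    "user_points": [0.5, 0.2, 0.9, 0.4],
    "sales": [1.2, 0.8, 2.5, 0.3],
})
test = pd.DataFrame({
    "product_id": ["e5", "a1"],
    "console": ["xbox", "ps"],
    "year": [2011, 2016],
    "critic_points": [2.5, 3.0],
    "user_points": [0.3, 0.6],
})

REFERENCE_YEAR = 2020  # fixed stand-in for "today" so the output is reproducible
MIN_COUNT = 2          # categories seen fewer times than this count as rare

# Combine train and test so both are encoded consistently.
full = pd.concat([train.drop(columns="sales"), test], keys=["train", "test"])

# Lump rare categories into one bucket (simplification, see the text above).
freq = full["console"].map(full["console"].value_counts())
full.loc[freq < MIN_COUNT, "console"] = "other"

# Label-encode the product ID as integer codes.
full["product_id"] = full["product_id"].astype("category").cat.codes

# Replace the release year with the number of years elapsed.
full["game_age"] = REFERENCE_YEAR - full["year"]
full = full.drop(columns="year")

# Simple polynomial terms on the two score columns.
full["critic_points_sq"] = full["critic_points"] ** 2
full["critic_x_user"] = full["critic_points"] * full["user_points"]

# One-hot encode the remaining categorical column.
full = pd.get_dummies(full, columns=["console"])

# Split back into train and test feature matrices.
X_train, X_test = full.loc["train"], full.loc["test"]
y_train = train["sales"]
print(X_train.columns.tolist())
```

Encoding the combined frame guarantees that train and test end up with the same columns, which is the practical benefit of combining the two sets before encoding.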

After trying various modelling approaches and automated ML tools such as PyCaret, I moved forward with the best options according to the evaluation metric used for the leaderboard, RMSE in this case.

I started experimenting with ensembling techniques such as stacking and blending of models. In this problem, my best score came from a Voting Regressor with different weights for the constituent models, which were mainly boosting algorithms.

I then kept evaluating the performance based on RMSE and checked if I could improve my model any further.

“I came to know about MachineHack from a friend. This was my first-ever hackathon on this platform, and I got to know about it when it was already live. I really liked working on MachineHack as the portal for submission and leaderboard helped in keeping track of my performance smoothly, and the limited number of submissions made me thoroughly analyze every model before I submit any solution. This made the hackathon a lot more competitive and pushed me to be sure about my solutions and not just try anything without using my brains. 

It is providing people with a platform to learn, perform and excel in the field of data science, and that is great. I am really excited about the upcoming events on MachineHack and look forward to leveraging this platform to enhance my skills and connect with like-minded people!” – she said of her MachineHack experience.

Get the complete code here.

#3| Caleb Emelike

Caleb is a Computer Science graduate who recently kicked off his data science career. He was introduced to the data science domain by his mentor, who gave him continuous guidance in choosing the right online courses. He has taken various online courses on data science and, alongside them, applied his knowledge by solving problems and participating in hackathons.

“So far I never regretted choosing data science as a career.” – he said.

Approach To Solving The Problem 

Caleb explains his approach briefly as follows:

First, I tried to understand the data through some basic exploration, which showed me which features would help. The feature engineering that gave me the biggest boost was groupby features: I grouped some features by others and computed the number of unique values, counts, means and standard deviations. For modelling, I used the CatBoost, LightGBM and XGBoost algorithms. Finally, a weighted average of CatBoost and XGBoost gave me my best score.
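Groupby features and a weighted average of predictions can be sketched as follows. The toy data, column names, helper functions and blend weights are hypothetical and only illustrate the technique; they are not taken from Caleb's solution.

```python
import numpy as np
import pandas as pd

# Toy data; column names and values are hypothetical.
df = pd.DataFrame({
    "publisher": ["A", "A", "B", "B", "B"],
    "console": ["ps", "xbox", "ps", "ps", "wii"],
    "critic_points": [2.0, 4.0, 1.0, 3.0, 5.0],
})


def add_groupby_features(frame, key, cat_col, num_col):
    """Attach per-group count, nunique, mean and std as new columns."""
    out = frame.copy()
    grouped = out.groupby(key)
    out[f"{key}_count"] = grouped[num_col].transform("count")
    out[f"{key}_{cat_col}_nunique"] = grouped[cat_col].transform("nunique")
    out[f"{key}_{num_col}_mean"] = grouped[num_col].transform("mean")
    out[f"{key}_{num_col}_std"] = grouped[num_col].transform("std")
    return out


features = add_groupby_features(df, "publisher", "console", "critic_points")
print(features)


def weighted_average(preds, weights):
    """Blend several prediction vectors using the given weights."""
    return np.average(np.vstack(preds), axis=0, weights=weights)


# Stand-ins for the predictions of two different models.
pred_a = np.array([1.0, 2.0, 3.0])
pred_b = np.array([3.0, 2.0, 1.0])
blend = weighted_average([pred_a, pred_b], weights=[0.6, 0.4])
print(blend)
```

Using transform rather than agg keeps each aggregate aligned with the original rows, so the new columns can be fed to a model directly.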

 

“MachineHack is a really good platform every data scientist should at least try out. One has to put in more work before (s)he gets an improvement.” – he said of his MachineHack experience.

Get the complete code here.

Check out new hackathons here.
