
Meet The MachineHack Champions Who Cracked The ‘Video Game Sales Prediction’ Hackathon


MachineHack successfully concluded its tenth instalment of the weekend hackathon series last Monday. The Video Game Sales Prediction hackathon was greatly welcomed by data science enthusiasts with over 350 registrations and active participation from close to 200 practitioners.

Out of the 230 competitors, three topped our leaderboard. In this article, we will introduce you to the winners and describe the approach they took to solve the problem.

#1| G Mothy

G Mothy is a final year student of Computer Engineering at Army Institute of Technology, Pune. His data science journey started with his internship at IIT Madras under the guidance of IIT professors. 

Since then, he has never looked back and has been exploring different areas of data science with the help of his seniors, using Kaggle and other platforms that host hackathons. He likes trying various types of hackathons to experiment and acquire new skills.

Approach To Solving The Problem 

Mothy explains his approach briefly as follows:

On exploring the data, I found that most of the features were categorical in nature. I started by trying out different encoding techniques and applying CatBoost with default parameters.

1. Encoding techniques did not improve the model as expected

2. Single model and single encoding techniques were not enough to capture the insights of the data

3. Since CatBoost can handle categorical features internally without any external encoding, it was chosen as the first model

4. With label encoding, LightGBM and XGBoost models were trained using stratified K-fold cross-validation with 10 and 20 folds respectively

5. Ensembles of the above three models were used as the final model
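The fold-and-ensemble idea in the steps above can be sketched roughly as follows. This is a minimal illustration, not Mothy's code: scikit-learn's GradientBoostingRegressor and RandomForestRegressor stand in for the LightGBM, XGBoost and CatBoost models, the data is synthetic, and binning the target into quintiles so that the folds can be stratified is an assumption (the article does not say how stratification was done). The helper names oof_predictions and rmse are hypothetical.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import StratifiedKFold


def oof_predictions(model_factory, X, y, n_splits, seed=0):
    """Out-of-fold predictions from K-fold splits stratified on target bins."""
    # StratifiedKFold needs discrete labels, so the continuous target is
    # binned into quintiles (an assumption, not taken from the article).
    edges = np.quantile(y, [0.2, 0.4, 0.6, 0.8])
    bins = np.digitize(y, edges)
    folds = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    oof = np.zeros(len(y))
    for train_idx, valid_idx in folds.split(X, bins):
        model = model_factory()  # fresh model for every fold
        model.fit(X[train_idx], y[train_idx])
        oof[valid_idx] = model.predict(X[valid_idx])
    return oof


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Synthetic regression data, purely for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = 2.0 * X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=300)

# Two stand-in models trained with 10 and 20 folds respectively.
oof_a = oof_predictions(
    lambda: GradientBoostingRegressor(n_estimators=50, random_state=0),
    X, y, n_splits=10,
)
oof_b = oof_predictions(
    lambda: RandomForestRegressor(n_estimators=50, random_state=0),
    X, y, n_splits=20,
)

# Simple ensemble: unweighted average of the models' predictions.
blend = (oof_a + oof_b) / 2
print(f"model A RMSE: {rmse(y, oof_a):.3f}")
print(f"model B RMSE: {rmse(y, oof_b):.3f}")
print(f"blend RMSE:   {rmse(y, blend):.3f}")
```

Scoring on out-of-fold predictions gives an estimate of each model's error on unseen rows, which is one way to judge whether an ensemble is worth submitting.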

“The competitions organised by MachineHack are good for beginners to try and learn the concepts in a competitive environment. Overall experience was mostly filled with learnings, and I would love to explore and participate in more such challenges in future.”- he shared his experience.

Get the complete code here.

#2| Kanika Miglani

Kanika first came across mathematical modelling during her graduate programme at Delhi University. But her proper introduction to data science happened at IIT Tirupati, where she is currently enrolled in the Masters in Statistics and Mathematics programme. Thanks to the near-perfect course structure and her eagerness to learn how to make better decisions using data, she developed a keen interest in machine learning and data science.

“I just enjoyed solving problems using analytical approaches and wanted to incorporate technology into my work,” Kanika said.

She started leveraging the extra time she got during the pandemic and did a bunch of online courses on Python and machine learning.

Her curiosity to learn and the support from her friends pushed her to work on a variety of problems.

“It has been really exciting for me, and I understand that this is just the beginning, so I still am on my way to explore the field and work on myself to get better at it,” Kanika said.

Approach To Solving The Problem 

Kanika explains her approach briefly as follows:

I started with data pre-processing. My data clean-up involved checking whether the data I was given made sense and correcting any illogical values once I had adequate information.

I combined the train and test sets to make sure that the model gets adequate exposure to the features. I used label encoding for Product IDs and one-hot encoding for the other categorical variables. I worked with the User and Critic Points using some polynomial transformations and dropped the irrelevant information.
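A rough pandas sketch of this kind of pre-processing is shown below. The column names (product_id, console, critic_points, user_points, sales) and the tiny data frames are hypothetical, and the degree-2 terms are only one possible reading of "polynomial transformations".

```python
import pandas as pd

# Hypothetical toy data; the real hackathon columns may differ.
train = pd.DataFrame({
    "product_id": ["p1", "p2", "p3"],
    "console": ["ps2", "x360", "ps2"],
    "critic_points": [2.0, 3.0, 1.5],
    "user_points": [1.0, 2.0, 4.0],
    "sales": [1.2, 0.4, 2.1],
})
test = pd.DataFrame({
    "product_id": ["p2", "p4"],
    "console": ["wii", "ps2"],
    "critic_points": [2.5, 3.5],
    "user_points": [0.5, 1.5],
})

# Stack train and test so categories from either set are encoded consistently.
full = pd.concat([train.drop(columns="sales"), test], keys=["train", "test"])

# Label-encode the ID column as integer category codes.
full["product_id"] = full["product_id"].astype("category").cat.codes

# Degree-2 polynomial terms for the two score columns (an assumed choice).
full["critic_points_sq"] = full["critic_points"] ** 2
full["user_points_sq"] = full["user_points"] ** 2
full["critic_x_user"] = full["critic_points"] * full["user_points"]

# One-hot encode the remaining categorical column.
full = pd.get_dummies(full, columns=["console"], dtype=int)

# Split back into train and test feature matrices.
X_train = full.loc["train"]
X_test = full.loc["test"]
y_train = train["sales"]
print(X_test)
```

Encoding the combined frame guarantees that both matrices end up with the same columns, even when a category (here the hypothetical "wii" console) appears only in the test set.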

I merged a few categories that had very low frequency into their nearest neighbours and converted the Year variable into the number of years elapsed as of today.
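The two transformations can be sketched as below. Note the simplification: rare categories are lumped into a single "Other" bucket rather than merged into their nearest neighbours, since the article does not say how neighbours were chosen. The column names, the min_count threshold and the fixed reference year are illustrative assumptions; a live pipeline might use the current year instead.

```python
import pandas as pd


def merge_rare(series, min_count, other="Other"):
    """Replace categories seen fewer than min_count times with one bucket."""
    counts = series.value_counts()
    rare = counts[counts < min_count].index
    return series.where(~series.isin(rare), other)


def years_since(year, reference_year):
    """Convert a calendar year into an age in years."""
    return reference_year - year


# Hypothetical toy data.
df = pd.DataFrame({
    "publisher": ["A", "A", "A", "B", "B", "C", "D"],
    "year": [2005, 2010, 2012, 1999, 2008, 2015, 2001],
})

df["publisher"] = merge_rare(df["publisher"], min_count=2)
# The reference year is passed explicitly to keep the example reproducible.
df["age"] = years_since(df["year"], reference_year=2020)
print(df)
```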

After trying various modelling approaches and automated ML tools like PyCaret, I moved ahead with the best options according to the evaluation metric used for the leaderboard, RMSE in this case.

I started experimenting with ensembling techniques like stacking and blending of models. In this problem, my best score was given by Voting Regressor with varied weights for various models, mainly boosting algorithms.
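For readers unfamiliar with weighted voting, here is a minimal sketch using scikit-learn's VotingRegressor. The base models, the weights and the synthetic data are placeholders chosen for illustration; the article does not list the models or weights Kanika actually used.

```python
import numpy as np
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    VotingRegressor,
)
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic regression data, purely for illustration.
rng = np.random.default_rng(42)
X = rng.normal(size=(400, 4))
y = X[:, 0] ** 2 + 3.0 * X[:, 1] + rng.normal(scale=0.2, size=400)
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Placeholder base models; the weights favour the boosting model.
estimators = [
    ("gbr", GradientBoostingRegressor(n_estimators=100, random_state=0)),
    ("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
    ("ridge", Ridge(alpha=1.0)),
]
weights = [0.6, 0.3, 0.1]

# The ensemble prediction is the weighted average of the base predictions.
voter = VotingRegressor(estimators=estimators, weights=weights)
voter.fit(X_train, y_train)
pred = voter.predict(X_valid)

rmse = float(np.sqrt(mean_squared_error(y_valid, pred)))
print(f"validation RMSE: {rmse:.3f}")
```

Different weight vectors can be compared on the same validation split, keeping the one with the lowest RMSE.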

I then kept evaluating the performance based on RMSE and checked if I could improve my model any further.

“I came to know about MachineHack from a friend. This was my first-ever hackathon on this platform, and I got to know about it when it was already live. I really liked working on MachineHack as the portal for submission and leaderboard helped in keeping track of my performance smoothly, and the limited number of submissions made me thoroughly analyze every model before I submit any solution. This made the hackathon a lot more competitive and pushed me to be sure about my solutions and not just try anything without using my brains. 

It is providing people with a platform to learn, perform and excel in the field of data science, and that is great. I am really excited about the upcoming events on MachineHack and look forward to leveraging this platform to enhance my skills and connect with like-minded people!” – She shared her MachineHack experience.

Get the complete code here.

#3| Caleb Emelike

Caleb is a Computer Science graduate who recently kicked off his data science career. He was introduced to the data science domain by his mentor, who gave him continuous guidance in choosing the right online courses. He has taken various online courses on data science and, alongside them, applied his knowledge by solving problems and participating in hackathons.

“So far I never regretted choosing data science as a career.” – he said.

Approach To Solving The Problem 

Caleb explains his approach briefly as follows:

First, I tried to understand the data through some basic exploration, which showed me which features would help. The major feature engineering that gave me a boost was groupby features: I grouped some features by others and computed the number of unique values, counts, means and standard deviations. For modelling, I used the CatBoost, LightGBM and XGBoost algorithms. Finally, a weighted average of CatBoost and XGBoost gave me my best score.
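Groupby features of this kind might look like the following pandas sketch. The helper add_group_features, the column names and the toy data are hypothetical, and filling the undefined standard deviation of single-row groups with 0 is an assumption.

```python
import pandas as pd


def add_group_features(df, key, value):
    """Attach per-group count, nunique, mean and std of value to every row."""
    stats = (
        df.groupby(key)[value]
        .agg(["count", "nunique", "mean", "std"])
        .add_prefix(f"{value}_by_{key}_")
    )
    out = df.join(stats, on=key)
    # The sample std is undefined for single-row groups; use 0 (an assumption).
    std_col = f"{value}_by_{key}_std"
    out[std_col] = out[std_col].fillna(0.0)
    return out


# Hypothetical toy data.
games = pd.DataFrame({
    "console": ["ps2", "ps2", "ps2", "wii", "wii", "x360"],
    "critic_points": [2.0, 4.0, 4.0, 1.0, 3.0, 5.0],
})

features = add_group_features(games, key="console", value="critic_points")
print(features)
```

Each row now carries summary statistics of its group, which lets a tree model compare an individual game with its console's typical score.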

 

“MachineHack is a really good platform every data scientist should at least try out. One has to put in more work before (s)he gets an improvement.” – he spoke about his MachineHack experience.

Get the complete code here.

Check out new hackathons here.

PS: The story was written using a keyboard.

Amal Nair

A Computer Science Engineer turned Data Scientist who is passionate about AI and all related technologies. Contact: amal.nair@analyticsindiamag.com