
Meet The MachineHack Champions Who Cracked The ‘Glass Quality Prediction’ Hackathon


MachineHack successfully concluded the sixth instalment of its weekend hackathon series last Monday. The Glass Quality Prediction hackathon drew an enthusiastic response from data science enthusiasts, with close to 400 registrations and active participation from over 240 practitioners.

Out of the 246 competitors, three topped our leaderboard. In this article, we will introduce you to the winners and describe the approach they took to solve the problem.

#1: Devesh Darshan

Currently in his second year of engineering at Birla Institute of Technology and Science, Pilani, Devesh first came across the term data science during his first year. Like many, he started his journey with Andrew Ng's popular Stanford University course. His curiosity led him to many other popular online courses as well. He started practising with simple datasets like Titanic and House Prices on Kaggle. He spends most of his time reading articles and blogs on Analytics India Magazine and Medium to learn new ML techniques.

Approach To Solving The Problem 

Devesh explains his approach as follows:

It was a rather simple approach from my side. First, I focused on feature engineering and making the distribution of the data points as normal as possible by applying transformations like log and square root. Then I split the data into training and validation sets and built some models. I cross-validated performance using stratified K-folds to get a better idea and observed that the ExtraTreesClassifier produced the best result. I then fine-tuned the model and later used a bagging ensemble technique to get a more accurate prediction.
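The pipeline Devesh describes can be sketched roughly as below. This is a minimal illustration on synthetic data, not his actual code: the dataset, column shapes, and hyperparameters are all assumptions, and "stratified K-folds" maps to scikit-learn's `StratifiedKFold`.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, BaggingClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Toy stand-in for the hackathon data; shapes and sizes are illustrative.
X, y = make_classification(n_samples=500, n_features=10, random_state=42)

# Normalise skewed feature distributions with a log transform
# (log1p handles zeros; square root is the other option he mentions).
X_t = np.log1p(np.abs(X))

# Stratified K-fold cross-validation to compare candidate models.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
base = ExtraTreesClassifier(n_estimators=300, random_state=42)
scores = cross_val_score(base, X_t, y, cv=cv, scoring="neg_log_loss")
print("ExtraTrees CV log loss:", -scores.mean())

# Bagging ensemble over the tuned base model for a more stable prediction.
bagged = BaggingClassifier(base, n_estimators=10, random_state=42)
bagged.fit(X_t, y)
```

The bagging step here wraps the fine-tuned ExtraTrees model, trading extra training time for lower prediction variance.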

“MachineHack is the best platform for new data scientists to practice and test their skills, as some of the problems stated are very beginner-friendly, unlike Kaggle or other platforms where the problems require experience and a better machine to implement the solution,” Devesh shared.

Get the complete code here.

#2: Vedant Thapa

Vedant is currently pursuing his Master’s Degree in Computer Science from Mithibai College, Mumbai. He first came across the term data science during the final year project of his Bachelor’s program. He was impressed by the applications of data science and started exploring more about it by reading blogs. As he grew more curious, he decided to take a course on Udemy. Later, he joined a data science program at GreyAtom School of data science, where he came across mentors who guided him in building a strong foundation with mathematics and statistics while focusing on practical aspects of data science. Since then, he has been honing and perfecting his data science skills through projects and hackathons.

Approach To Solving The Problem 

Vedant explains his approach briefly as follows.

  1. Started off with EDA and univariate visualisations of the independent variables in the training and testing sets. These visualisations suggested that the two sets were drawn from the same distribution.
  2. I also found a rule in the ‘x_component’ columns: the few instances where none of the ‘x_component’ columns was 1 could be directly classified as class 1 (the target).
  3. The ‘grade_A_component’ and ‘x_component’ columns were found to be one-hot-encoded features.
  4. I reversed the one-hot encoding in the ‘x_component’ columns and applied frequency encoding to the result.
  5. I engineered some features from the numerical independent variables using basic arithmetic operations like addition, multiplication, subtraction and division.
  6. I trained both linear and tree-based models; the tree-based models outperformed the linear ones by a significant margin. This was expected, as the linear relationship between the independent and dependent variables was poor.
  7. Finally, after trying out different bagging and boosting models like CatBoost, XGBoost and LightGBM, an ExtraTrees classifier with stratified 5-fold CV gave the lowest log loss and standard deviation.
  8. I applied the rule found during EDA to the final predictions and submitted them.
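Step 4 above, reversing the one-hot encoding and then frequency-encoding the recovered category, can be sketched as follows. The column names are assumptions based on the write-up, not the actual competition schema.

```python
import pandas as pd

# Toy frame with one-hot encoded 'x_component' columns (names assumed).
df = pd.DataFrame({
    "x_component_1": [1, 0, 0, 1, 0],
    "x_component_2": [0, 1, 0, 0, 0],
    "x_component_3": [0, 0, 1, 0, 1],
})
x_cols = ["x_component_1", "x_component_2", "x_component_3"]

# Reverse the one-hot encoding: recover a single categorical column
# by taking the name of the column that holds the 1 in each row.
df["x_component"] = df[x_cols].idxmax(axis=1).str.replace("x_component_", "")

# Frequency encoding: replace each category with its occurrence count.
freq = df["x_component"].value_counts()
df["x_component_freq"] = df["x_component"].map(freq)
print(df[["x_component", "x_component_freq"]])
```

Frequency encoding keeps the column numeric for tree models while preserving how common each category is, which is often more informative than an arbitrary label.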

“The MachineHack platform has been invaluable for my learning journey. Each and every hackathon ends up teaching something new. MachineHack has a lot of talented participants and healthy competition, which really pushes one to his/her limits in solving the given problem. Any concerns related to the hackathons are also addressed quickly by the organisers, which has helped me a lot as a beginner. Their articles on AIM are really inspiring and informative and definitely worth following. I intend to continue participating and honing my skills through this amazing platform.” – Vedant shared his experience.

Get the complete code here.

#3: G Mothy

G Mothy is a final-year student of Computer Engineering at the Army Institute of Technology, Pune. His data science journey started with an internship at IIT Madras under the guidance of IIT professors.

From then on, he never looked back and has been exploring different areas of data science with the help of his seniors and platforms like Kaggle and other sites that host hackathons. He likes exploring various types of hackathons to experiment and acquire new skills.

Approach To Solving The Problem 

He explains his approach briefly as follows:

On observing the data, the grade_A and x_component columns were found to be one-hot-encoded features. One of the grade_A features was a redundant dummy variable, so grade_A_component_1 was removed. In the x_component columns, however, this was not the case.

The integer part of pixel_area and log_area was the same, so pixel_area was removed. New features were created from xmax, xmin, ymin and ymax by applying some arithmetic operations.
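Arithmetic features derived from bounding-box coordinates like these typically look as follows. This is a hedged sketch: the exact combinations G Mothy used are not stated, so the derived columns here (width, height, area, aspect ratio) are illustrative guesses.

```python
import pandas as pd

# Toy bounding-box columns; the real data has xmax, xmin, ymax, ymin per row.
df = pd.DataFrame({
    "xmin": [10, 40], "xmax": [30, 90],
    "ymin": [5, 20],  "ymax": [25, 60],
})

# Simple arithmetic combinations of the coordinate columns.
df["width"] = df["xmax"] - df["xmin"]
df["height"] = df["ymax"] - df["ymin"]
df["bbox_area"] = df["width"] * df["height"]
df["aspect_ratio"] = df["width"] / df["height"]
print(df)
```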

On plotting count plots for the x_component features, a few clear classification conditions emerged, for example:

x_component == 1 -> class 1

With these features, an ExtraTreesClassifier gave the best local 5-fold cross-validation score on the log loss metric, after a few features were removed based on feature importance.
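A minimal sketch of that last step, assuming a synthetic dataset and a simple "keep features above mean importance" rule (the actual selection threshold is not stated in the write-up):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Toy data in place of the competition features.
X, y = make_classification(n_samples=400, n_features=12, n_informative=5,
                           random_state=0)

# Fit once to obtain feature importances.
model = ExtraTreesClassifier(n_estimators=200, random_state=0)
model.fit(X, y)

# Drop features below the mean importance (assumed threshold).
importances = model.feature_importances_
keep = importances > importances.mean()
X_sel = X[:, keep]

# Score the reduced feature set with stratified 5-fold CV on log loss.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(ExtraTreesClassifier(n_estimators=200, random_state=0),
                         X_sel, y, cv=cv, scoring="neg_log_loss")
print("5-fold log loss:", -scores.mean())
```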

“The competitions organised by MachineHack are good for beginners to try and learn the concepts in a competitive environment. The overall experience was filled with learning, and I would love to explore and participate in more such challenges in the future.” – he shared his experience.

Get the complete code here.

Check out new hackathons here.


Amal Nair

A Computer Science Engineer turned Data Scientist who is passionate about AI and all related technologies. Contact: amal.nair@analyticsindiamag.com