Meet This Week’s MachineHack Champions Who Cracked The ‘Message Polarity Prediction’ Hackathon

MachineHack concluded the third instalment of its weekend hackathon series this Monday. The Message Polarity Prediction hackathon was a great success, drawing close to 350 registrations and 190 active participants.

Out of the 190 competitors, three topped our leaderboard. In this article, we will introduce you to the winners and describe the approach they took to solve the problem.

#1: Chandrashekhar Kanduri

Chandrashekhar is a recent graduate in Information Technology. Introduced to data science by his college professors, he was curious about the new and emerging field. Being a math person, he could easily relate to the concepts that underlie data science. He then pursued a six-month course in data science, which changed his perception of data. Since then, he has been learning and participating in hackathons to improve his skills.

Approach To Solving The Problem 

Chandrashekhar explains his approach as follows:

The given data was all numerical. I first checked the correlations among the independent variables, which were very low, so I decided to include all the predictor variables. The given labels were imbalanced. To balance them, I used the random over-sampling technique from the imblearn package. Then I split the data set into train and test sets in an 80:20 ratio and built models with different algorithms such as Random Forest, XGBoost classifier, LightGBM and CatBoost. CatBoost gave high precision and recall. I tuned the hyperparameters using randomised search CV. One of the parameter combinations gave me the highest accuracy, and the final model was built using those parameters.
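The balancing and splitting steps described above can be sketched roughly as follows. This is a library-free illustration, not Chandrashekhar's code: the helper names `random_oversample` and `train_test_split_80_20` and the toy data are assumptions made for the example, and the over-sampler only mimics the idea behind imblearn's random over-sampling (duplicating minority-class rows at random).

```python
import random
from collections import Counter

def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels

def train_test_split_80_20(rows, labels, seed=0):
    """Shuffle the row indices and cut them 80:20 (hypothetical helper)."""
    rng = random.Random(seed)
    idx = list(range(len(rows)))
    rng.shuffle(idx)
    cut = int(0.8 * len(idx))
    train, test = idx[:cut], idx[cut:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in test], [labels[i] for i in test])

# Toy data invented for illustration: 8 rows of class 0, 2 rows of class 1.
X = [[float(i), float(i % 3)] for i in range(10)]
y = [0] * 8 + [1] * 2

X_bal, y_bal = random_oversample(X, y)
X_train, y_train, X_test, y_test = train_test_split_80_20(X_bal, y_bal)

print(Counter(y_bal))              # class counts after balancing
print(len(X_train), len(X_test))   # sizes of the two splits
```

The order here follows the description (balance first, then split). With that order, copies of the same row can land on both sides of the split, which tends to make the held-out score look better than it would on unseen data.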

“I have been participating in MachineHack hackathons, and it has improved my fundamentals. This is my sixth competition. I came close enough and missed a chance to win because of the parameter tuning. In this competition, I have spent much time on parameter tuning. I strongly believe MachineHack is providing a great platform for data science aspirants. The platform conducts hackathons on a variety of problems, and I met some cool people through MachineHack,” he said of his MachineHack experience.

Get the complete code here.

#2: Karan Juneja

Karan is an Electronics and Telecommunication Engineer from PICT, Pune. His data science journey grew out of his passion for and curiosity about robotics, which ultimately led him to data science and machine learning. He has since been acquiring new data science skills from free online resources as well as by participating in hackathons.

Approach To Solving The Problem 

Karan briefly explains his approach as follows:

The first observation was that the test and train data were normalised together. Also, the dataset was very small, and on top of that, the public leaderboard was being calculated on only 30% of the test data. So, I created a baseline solution and was able to get a very high score. Then, I used cross-validation to get another solution, which scored much lower on the leaderboard than the one built on a plain train-test split. I tried different random states to see if my score changed and observed that it varied a lot. This meant that the public score could not be trusted, so I had to trust the cross-validation score. Feature selection was skipped as it does not work well on a normalised dataset. Finally, I created a LightGBM model with a 5-fold CV, which turned out to give the best score.
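The reasoning above, that a single split on a small data set can swing with the random state while a 5-fold CV score uses every row for validation once, can be illustrated with a rough sketch. Everything here is an assumption made for the example: the data is synthetic, the helper names are hypothetical, and a tiny nearest-centroid classifier stands in for LightGBM so that the snippet needs only the standard library.

```python
import random
from statistics import mean

def kfold_indices(n, k=5, seed=0):
    """Shuffle the row indices and deal them into k validation folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_centroids(X, y):
    """Toy stand-in for a real model: one mean vector per class."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)
    return {label: [mean(col) for col in zip(*rows)]
            for label, rows in groups.items()}

def predict(centroids, row):
    """Return the label whose centroid is closest to the row."""
    def sq_dist(label):
        return sum((a - b) ** 2 for a, b in zip(row, centroids[label]))
    return min(centroids, key=sq_dist)

def fold_accuracy(X, y, val_idx):
    """Train on every row outside val_idx, score on the rows inside it."""
    held_out = set(val_idx)
    train_idx = [i for i in range(len(X)) if i not in held_out]
    model = fit_centroids([X[i] for i in train_idx], [y[i] for i in train_idx])
    hits = sum(predict(model, X[i]) == y[i] for i in val_idx)
    return hits / len(val_idx)

# Small synthetic data set, invented for illustration: two noisy classes.
rng = random.Random(42)
X = [[rng.gauss(label, 1.0), rng.gauss(label, 1.0)]
     for label in (0, 1) for _ in range(20)]
y = [label for label in (0, 1) for _ in range(20)]

# A single 80:20 split, repeated with different random states.
split_scores = [fold_accuracy(X, y, kfold_indices(len(X), seed=s)[0])
                for s in range(5)]

# 5-fold CV: every row is used for validation exactly once.
folds = kfold_indices(len(X), k=5, seed=0)
cv_scores = [fold_accuracy(X, y, fold) for fold in folds]

print("single-split scores:", split_scores)
print("5-fold CV mean:", mean(cv_scores))
```

Comparing the spread of the single-split scores with the CV mean gives a quick sense of how much a score computed on one small held-out slice can be trusted.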

“It’s an amazing platform. This was my third competition on this platform, and so far, my experience has been sublime,” said Karan.

Get the complete code here.

#3: Niranjan K

Niranjan is a 2017 Mechanical Engineering graduate. He started his career with a core MNC, which provides mechanical design solutions to its clients. He soon found his job to be redundant, so he started exploring new domains and got acquainted with data science and machine learning. As he explored this emerging field further, he found it to be in line with his interests. Having been inclined towards automation, he always wanted to be in a position where he could directly influence a business. He therefore quit his job to pursue a postgraduate program in data science at Great Lakes Institute of Management. Since then, he has been honing his programming and data science skills.

“Being from a non-programming background, I never doubted my ability to learn to code quickly. The key is to never stop learning,” he said.

Approach To Solving The Problem 

Niranjan explains his approach as follows:

  • As the features were normalised frequencies of words/emoji, there was very little room for direct feature engineering, and the number of records was far too small relative to the number of features (the curse of dimensionality).
  • Correlation plots suggested that there was very little multicollinearity among the features.
  • Histograms of each feature gave insight into the distribution of the data.
  • I worked in parallel on different models, one with XGBoost and LightGBM and another with CatBoost. Both were helpful in identifying and eliminating insignificant features that had been used in training the model.
  • The first model was an ensemble of XGBoost and LightGBM with weighted voting, and the final winning score was achieved by a second model using a CatBoost classifier with Bayesian bootstrap sampling.
  • Since the data labels were imbalanced, I used class weights to give more priority to the less frequent class.
  • I tried several other hyperparameter settings in CatBoost and also tried stacking CatBoost, XGBoost and LightGBM, as well as some dimensionality reduction techniques.
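Two of the ideas in the list above, class weights for imbalanced labels and weighted voting across models, can be sketched in a few lines. This is an illustration under stated assumptions, not Niranjan's code: the function names are hypothetical, the probabilities are made-up numbers standing in for the outputs of two trained models, and the weighting formula (number of samples divided by number of classes times the class count) is one common "balanced" heuristic rather than the one he necessarily used.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count), so that
    rarer classes get proportionally larger weights (hypothetical helper)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}

def weighted_soft_vote(prob_sets, model_weights):
    """Blend per-model class probabilities using the given model weights and
    return the winning class index for each row (hypothetical helper)."""
    total = sum(model_weights)
    predictions = []
    for per_model_rows in zip(*prob_sets):
        n_classes = len(per_model_rows[0])
        blended = [
            sum(w * probs[c] for w, probs in zip(model_weights, per_model_rows)) / total
            for c in range(n_classes)
        ]
        predictions.append(max(range(n_classes), key=lambda c: blended[c]))
    return predictions

# Imbalanced toy labels, invented for illustration: 8 of class 0, 2 of class 1.
labels = [0] * 8 + [1] * 2
class_weights = balanced_class_weights(labels)
print(class_weights)

# Made-up class probabilities from two models for the same two rows.
model_a = [[0.9, 0.1], [0.3, 0.7]]
model_b = [[0.6, 0.4], [0.8, 0.2]]

equal_vote = weighted_soft_vote([model_a, model_b], [1, 1])
weighted_vote = weighted_soft_vote([model_a, model_b], [2, 1])
print(equal_vote, weighted_vote)
```

In the second row the two models disagree, so the prediction depends on how much weight each model is given; that is the lever a weighted-voting ensemble exposes.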

“This was my first hackathon in machine learning. Although I had worked on some knowledge datasets from other data platforms, this experience of participating in the hackathon was wonderful, and I am looking forward to more such learning experiences from MachineHack,” he said of his MachineHack experience.

Get the complete code here.

PS: The story was written using a keyboard.

Amal Nair

A Computer Science Engineer turned Data Scientist who is passionate about AI and all related technologies. Contact: amal.nair@analyticsindiamag.com