
MachineHack Winners: How These Data Science Enthusiasts Solved The ‘Predict Books Price’ Hackathon


MachineHack recently concluded the 18th edition of its Machine Learning hackathons by announcing the winners of its Predict The Price Of Books challenge.

Shravan Kumar, Divyanshu Suri and Saurabh Kumar secured the first, second and third places respectively on the hackathon leaderboard. Analytics India Magazine introduces you to the winners and their approach to the solution.

#1: Shravan Kumar 

Shravan Kumar is a Senior Manager of Advanced Analytics at Novartis. Although he has worked in analytics for some time, he had long wanted to explore predictive analytics. He started with MOOCs on platforms such as Coursera, edX and Udacity to build the skills he needed. Having learned the essentials, he turned to online competitions to sharpen those skills. He is an active participant in hackathons conducted by MachineHack, Kaggle and other platforms.

Shravan’s Approach To Solving The Problem

Shravan explains his approach as follows:

Pre-processing steps:

  • Checked the number of rows, columns, data types of variables, missing values and observed word clouds of text data.
  • Applied basic text pre-processing, focusing on two columns: Title and Synopsis. Created meta features such as ‘number of words’, ‘number of unique words’ and ‘number of characters’.
  • Reviews and Ratings values are converted into numeric values.
  • The Edition variable was split into two variables, Edition Type and Date; Month was then extracted from Date.
  • Created features with TF-IDF and Count Vectorizer for both the Synopsis and Title variables.
  • More features were created using GloVe vectors for the Synopsis, Genre and Book Category values.
  • Label-encoded the ‘Title’, ‘Author’, ‘Synopsis’, ‘Genre’, ‘Year’ and ‘Month’ variables.
  • Converted the price of the book into Log(Price), because the log-transformed price is closer to a normal distribution.
  • Created count and mean encoded features for all categorical variables.
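
A few of the pre-processing steps above can be sketched in plain Python. This is a minimal illustration, not Shravan's code: the function names, the toy rows and the assumed Edition layout ('Paperback, 10 Mar 2016') are all hypothetical.

```python
import math
from collections import defaultdict

MONTHS = {"Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"}

def text_meta_features(text):
    """Simple counts describing a text field such as Title or Synopsis."""
    words = text.split()
    return {
        "num_words": len(words),
        "num_unique_words": len({w.lower() for w in words}),
        "num_chars": len(text),
    }

def split_edition(edition):
    """Split an Edition string into (edition_type, month).

    Assumes a hypothetical layout like 'Paperback, 10 Mar 2016'.
    """
    edition_type, _, date_part = edition.partition(",")
    # Take the first token that is a three-letter month abbreviation.
    month = next((t for t in date_part.split() if t in MONTHS), None)
    return edition_type.strip(), month

def count_and_mean_encode(categories, targets):
    """Map each category to (row count, mean target) over the given rows."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for category, target in zip(categories, targets):
        totals[category] += target
        counts[category] += 1
    return {c: (counts[c], totals[c] / counts[c]) for c in counts}

# Toy rows, invented for illustration only.
prices = [100.0, 400.0, 250.0]
genres = ["Fiction", "Fiction", "History"]

# Model the log of the price rather than the raw price.
log_prices = [math.log(p) for p in prices]
encoding = count_and_mean_encode(genres, log_prices)
```

When mean encoding is used for training, the means are usually computed on out-of-fold rows only, so that a row's own target does not leak into its feature.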

Model building steps:

  • Used 5-fold cross-validation with LightGBM as the algorithm and RMSE as the metric.
  • Experimented with hyperparameter tuning to achieve better scores, especially by changing the learning rate and seed values.
  • Choosing the right cross-validation technique and feature preparation helped me achieve 1st rank on the leaderboard.
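
The validation loop described above can be sketched as follows. To stay self-contained, this sketch swaps in a stand-in model that always predicts the training mean; Shravan used LightGBM, which would replace `fit_mean_model` here. All function names and the toy data are assumptions made for illustration.

```python
import math
import random

def kfold_indices(n_rows, n_splits=5, seed=42):
    """Shuffle row indices and deal them into n_splits validation folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_splits] for i in range(n_splits)]

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

def fit_mean_model(X_train, y_train):
    """Stand-in model (NOT LightGBM): always predicts the training mean."""
    mean = sum(y_train) / len(y_train)
    return lambda X: [mean for _ in X]

def cross_validate(X, y, fit, n_splits=5, seed=42):
    """Train on k-1 folds, score RMSE on the held-out fold, for each fold."""
    scores = []
    for valid_idx in kfold_indices(len(y), n_splits, seed):
        held_out = set(valid_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        predict = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        preds = predict([X[i] for i in valid_idx])
        scores.append(rmse([y[i] for i in valid_idx], preds))
    return scores

# Toy data, invented for illustration: the target is a log price.
X = [[i] for i in range(20)]
y = [math.log(100 + 10 * i) for i in range(20)]

scores = cross_validate(X, y, fit_mean_model)
mean_rmse = sum(scores) / len(scores)
```

Changing `seed` reshuffles the folds, which is one simple way to check how stable a score is before trusting a leaderboard gain.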

Click here to view the code.

“MachineHack is a great learning platform. The articles by Analytics India Magazine writers are very helpful and keep all our industry-relevant people updated with news in this industry. Truly MachineHack is one of the best hackathon organisers and data science knowledge portals in India,” he said. 

#2: Divyanshu Suri

Now a Senior Manager of Machine Learning at AXA XL, Divyanshu Suri is not new to MachineHack and has won multiple hackathons. He holds a Bachelor’s in Statistics from Delhi University and a Master’s in Applied Statistics from IIT Bombay. He discovered the real power of data science in his second job, at EXL Service, where he worked in insurance analytics. He then went on to participate in many online hackathons, gaining knowledge and improving his skills.

Now, as a Senior Manager, he applies predictive analytics to solve a variety of data science problems in commercial and speciality lines.

Divyanshu’s Approach To Solving The Problem

Divyanshu started the competition with exploratory analysis, trying to understand the data.

He then proceeded with data cleaning and feature engineering. To find the best-fitting model for the problem, he tried different algorithms, compared their performance and combined the better-performing models.

He built many models using different sets of variables, transformations, variable creation algorithms and ML algorithms, and finally used stacking to produce the final model.
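
Stacking, in general terms, trains a second-level model on the out-of-fold predictions of several base models. The sketch below shows that idea with two deliberately simple base models and a one-parameter blend as the meta-model. The article does not say which base models or meta-learner Divyanshu used, so every name and data value here is a hypothetical stand-in.

```python
def fit_global_mean(rows, y):
    """Base model A: predicts the overall training mean."""
    mean = sum(y) / len(y)
    return lambda row: mean

def fit_genre_mean(rows, y):
    """Base model B: predicts the training mean of the row's genre."""
    sums, counts = {}, {}
    for row, target in zip(rows, y):
        sums[row["genre"]] = sums.get(row["genre"], 0.0) + target
        counts[row["genre"]] = counts.get(row["genre"], 0) + 1
    fallback = sum(y) / len(y)  # used for genres unseen in training

    def predict(row):
        genre = row["genre"]
        return sums[genre] / counts[genre] if genre in counts else fallback

    return predict

def out_of_fold_predictions(rows, y, fit, n_splits=5):
    """Predict each row with a model that never saw that row in training."""
    oof = [0.0] * len(y)
    for k in range(n_splits):
        train_idx = [i for i in range(len(y)) if i % n_splits != k]
        predict = fit([rows[i] for i in train_idx], [y[i] for i in train_idx])
        for i in range(k, len(y), n_splits):
            oof[i] = predict(rows[i])
    return oof

def fit_blend_weight(pred_a, pred_b, y):
    """Meta-model: least-squares w for the blend w * a + (1 - w) * b."""
    num = sum((a - b) * (t - b) for a, b, t in zip(pred_a, pred_b, y))
    den = sum((a - b) ** 2 for a, b in zip(pred_a, pred_b))
    return num / den if den else 0.5

# Toy data, invented for illustration: the log price depends only on genre.
genre_names = ["Fiction", "History", "Science"]
genre_value = {"Fiction": 5.0, "History": 6.0, "Science": 7.0}
rows = [{"genre": genre_names[i % 3]} for i in range(15)]
y = [genre_value[row["genre"]] for row in rows]

oof_a = out_of_fold_predictions(rows, y, fit_global_mean)
oof_b = out_of_fold_predictions(rows, y, fit_genre_mean)
w = fit_blend_weight(oof_a, oof_b, y)
stacked = [w * a + (1 - w) * b for a, b in zip(oof_a, oof_b)]
```

On this toy data the genre model is exact, so the meta-model gives all the weight to it. With real models, fitting the blend on out-of-fold predictions keeps the meta-model from rewarding base models that merely memorised their training rows.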

Click here to view the code.

“MachineHack is a great platform to learn and apply new data science techniques and ML algorithms and improve your own skillset. It is also a great platform to compete with the other industry experts in the data science community,” he said.

#3: Saurabh Kumar

A skilled and experienced Data Scientist in a reputed firm, Kumar has shown his expertise multiple times by topping several hackathons at MachineHack.

Kumar’s interest in Data Science and Machine Learning grew out of a single algorithm. His personal experience with the Random Forest algorithm and its capabilities inspired him to pursue the field and advance his skills. Kumar said he is inspired and overwhelmed by the ability of ML algorithms to solve a variety of real-world problems.

Kumar’s Approach To Solving The Problem

Saurabh Kumar used basic feature engineering, traditional NLP techniques such as bag-of-words (BOW) and TF-IDF, and LightGBM to crack the hackathon.
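
For readers new to these text features, here is a minimal plain-Python sketch of bag-of-words counts and a textbook TF-IDF weighting. It is an illustration under simple assumptions (whitespace tokenisation, unsmoothed IDF), not Kumar's code, and the two example titles are chosen only for demonstration.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()

def bag_of_words(docs):
    """One term-count mapping per document."""
    return [Counter(tokenize(doc)) for doc in docs]

def tf_idf(docs):
    """Textbook TF-IDF: tf = count / doc length, idf = log(N / doc freq)."""
    bows = bag_of_words(docs)
    n_docs = len(docs)
    doc_freq = Counter()
    for bow in bows:
        doc_freq.update(bow.keys())  # each document counts a term once
    weights = []
    for bow in bows:
        length = sum(bow.values())
        weights.append({
            term: (count / length) * math.log(n_docs / doc_freq[term])
            for term, count in bow.items()
        })
    return weights

docs = ["a tale of two cities", "a brief history of time"]
bows = bag_of_words(docs)
weights = tf_idf(docs)
```

Terms that appear in every document, such as "a" and "of" here, get a weight of zero, while rarer terms such as "tale" are weighted up. Library implementations such as scikit-learn's `TfidfVectorizer` apply IDF smoothing and vector normalisation by default, so their values differ from this sketch.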

Click here to view the code.

“I have been active on the MachineHack platform since their first hackathon and really enjoy competing here. The MachineHack team is very cooperative and is willing to work on feedback,” he said.

Amal Nair

A Computer Science Engineer turned Data Scientist who is passionate about AI and all related technologies. Contact: amal.nair@analyticsindiamag.com