The Winners Of Weekend Hackathon -Tea Story at MachineHack

The Weekend Hackathon Edition #2 – The Last Hacker Standing Tea Story challenge concluded successfully on 19 August 2021. The challenge involved building a time series model that produces forecasts for 29 weeks. It drew 240+ participants, with 110+ solutions posted on the leaderboard.

Based on the leaderboard scores, we have the top four winners of the Tea Story Time Series Challenge, who will get free passes to the virtual Deep Learning DevCon 2021, to be held on 23-24 September 2021. Here, we look at the winners’ journeys, solution approaches and experiences at MachineHack.

First Rank – Vybhav Nath C A

Vybhav Nath is a final-year student at IIT Madras. He entered this field during his second year of college and started participating in MachineHack hackathons last year. He plans to take up a career in data science.

Approach

He says the problem was unique in the sense that many columns in the test set had a lot of null values, which made it a challenging task to solve. He restricted his preprocessing to imputation and replacing the 'N.S' entries. This was the first competition in which he didn’t use any ML model. Since many columns had null values, he interpolated the columns to get a fully populated test set. The final prediction was then simply the mean of the Price columns. He thinks this was a total “doosra” by the cool MachineHack team.
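The model-free approach described above can be sketched in a few lines of pandas. This is only an illustration, not Vybhav's actual code: the column names, the toy data and the choice of linear interpolation are assumptions.

```python
import pandas as pd


def predict_from_price_columns(test: pd.DataFrame, price_cols: list[str]) -> pd.Series:
    """Interpolate the price columns, then average them row by row."""
    # Non-numeric placeholders such as 'N.S' become NaN.
    prices = test[price_cols].apply(pd.to_numeric, errors="coerce")
    # Linear interpolation down each column; limit_direction="both" also
    # fills gaps at the very start or end with the nearest known value.
    filled = prices.interpolate(method="linear", limit_direction="both")
    # The prediction is just the row-wise mean of the filled price columns.
    return filled.mean(axis=1)


# Hypothetical toy test set with two price columns.
test = pd.DataFrame({
    "centre_a": [100, "N.S", 120],
    "centre_b": [None, 110, 130],
})
prediction = predict_from_price_columns(test, ["centre_a", "centre_b"])
```

No model is fitted here; the quality of the forecast rests entirely on how well the gaps are filled.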

Experience

He says, “I always participate in MH hackathons whenever possible. There are a wide variety of problems which test multiple areas. I also get to participate with many Professionals which I found to be a good pointer about where I stand among them.”

Check out his solution here

Second prize – Shubham Bharadwaj

Shubham has been working as a Data Scientist for about seven years and has worked on large datasets throughout that time. He started off with SQL, then moved to big data analytics and data engineering, and finally to data science. He is new to hackathons, though: this is only the fourth one he has participated in. He loves to solve complex problems.

Approach

The data provided was very raw: around 70 percent of the values in the test dataset were missing. From his point of view, finding the best imputation method was the backbone of this challenge.

Preprocessing steps followed:

1. Converting the columns to the correct data types.

2. Imputing the missing values. He tried various methods, such as filling the null values with the mean of each column, the mean of that row, and MICE, but the best was the KNN imputer with n_neighbors set to 3.
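The two preprocessing steps can be sketched with scikit-learn's KNNImputer. Only n_neighbors=3 comes from the description above; the helper name, the column names and the toy data are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer


def knn_impute(df: pd.DataFrame, n_neighbors: int = 3) -> pd.DataFrame:
    """Convert columns to numeric types, then fill gaps with KNN imputation."""
    # Step 1: correct data types (non-numeric strings become NaN).
    numeric = df.apply(pd.to_numeric, errors="coerce")
    # Step 2: each missing value is replaced by the average of that feature
    # over the n_neighbors most similar rows.
    # Note: by default KNNImputer drops columns that are entirely NaN,
    # so this sketch assumes every column has at least one observed value.
    imputer = KNNImputer(n_neighbors=n_neighbors)
    values = imputer.fit_transform(numeric)
    return pd.DataFrame(values, columns=numeric.columns, index=numeric.index)


# Hypothetical toy data.
raw = pd.DataFrame({
    "a": [1.0, 2.0, np.nan, 4.0, 5.0],
    "b": [2.0, np.nan, 6.0, 8.0, 10.0],
    "c": [1.5, 2.5, 3.5, np.nan, 5.5],
})
imputed = knn_impute(raw, n_neighbors=3)
```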

To remove outliers, he used the IQR (interquartile range), which helped reduce the mean squared error.
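A minimal sketch of IQR-based outlier filtering follows. The conventional multiplier of 1.5 and the decision to drop (rather than clip) the flagged values are assumptions; the article does not state which variant he used.

```python
import pandas as pd


def iqr_inlier_mask(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Return True for values inside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = series.quantile([0.25, 0.75])
    iqr = q3 - q1
    # Missing values are flagged False by between(), so impute first.
    return series.between(q1 - k * iqr, q3 + k * iqr)


# Hypothetical price series with one obvious outlier.
prices = pd.Series([100, 102, 98, 101, 99, 500])
mask = iqr_inlier_mask(prices)
cleaned = prices[mask]
```

Applied to a training target, the mask can be used to filter rows before fitting a model.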

The models he tried were logistic regression, XGBRegressor, ARIMA, TPOT and finally H2O AutoML, which yielded the best result.

Experience

Shubham says “I am new to the MachineHack family, and one thing is for sure that I am here to stay. It’s a great place, I have already learned so much. The datasets are of wide variety and the problem statements are unique, puzzling and complex. It’s a must for every aspiring and professional data scientist to upskill themselves.”

Check out his solution here. 

Third prize – Panshul Pant 

Panshul is a Computer Science and Engineering graduate. He has picked up data science mostly from online platforms like Coursera, HackerEarth and MachineHack, and by watching videos on YouTube. Going through articles on websites like Analytics India Magazine has also helped him in this journey. This problem was based on a time series, which made it unique, though he solved it using machine learning algorithms rather than traditional time series methods.

Approach 

There were certain string values like ‘N.S’ and ‘No sale’ in all the numerical columns, which he changed to null values before imputing all the nulls. He tried various ways to impute the NaNs, such as zero, mean, forward-fill and backward-fill methods. Of these, forward and backward filling performed significantly better. Exploring the data, he noticed that prices increased over the months and years, showing a trend. The target column’s values were also very closely related to the average of the prices in all the independent columns. He kept all the data, including the outliers, largely unchanged, as tree-based models are quite robust to outliers.
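The cleaning step described above might look like the following pandas sketch. The placeholder strings come from the description; the function name, column names and toy data are hypothetical, and applying forward fill before backward fill is an assumption.

```python
import pandas as pd


def clean_and_fill(df: pd.DataFrame, placeholders=("N.S", "No sale")) -> pd.DataFrame:
    """Turn placeholder strings into NaN, then forward- and backward-fill."""
    # Blank out the known placeholder strings.
    cleaned = df.mask(df.isin(list(placeholders)))
    # Convert to numeric; any other unexpected string raises an error
    # instead of being silently dropped.
    cleaned = cleaned.apply(pd.to_numeric)
    # Forward fill carries the last known price forward; backward fill
    # then covers any gaps left at the start of a column.
    return cleaned.ffill().bfill()


# Hypothetical toy data.
raw = pd.DataFrame({
    "x": [10, "N.S", 30],
    "y": ["No sale", 5, "N.S"],
})
filled = clean_and_fill(raw)
```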

As the prices were related to time, he also extracted time-based features, of which day of week proved useful. A feature holding the average of all the numerical columns was extremely useful for good predictions. He tried some aggregate-based features as well, but they were not of much help. For predictions he used tree-based models, namely LightGBM and XGBoost. Combining the two with a weighted average gave the best results.
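The feature engineering and blending described above can be sketched as follows. The column names, the extra month and year features and the 0.6/0.4 weights are assumptions. The two prediction arrays stand in for the outputs of fitted LightGBM and XGBoost models, which are not trained here.

```python
import numpy as np
import pandas as pd


def add_features(df: pd.DataFrame, date_col: str, price_cols: list[str]) -> pd.DataFrame:
    """Add time-based features and a row-wise average of the price columns."""
    out = df.copy()
    dates = pd.to_datetime(out[date_col])
    out["day_of_week"] = dates.dt.dayofweek  # Monday = 0
    out["month"] = dates.dt.month
    out["year"] = dates.dt.year
    out["price_mean"] = out[price_cols].mean(axis=1)
    return out


def blend(pred_a, pred_b, weight_a: float = 0.5) -> np.ndarray:
    """Weighted average of two models' predictions."""
    return weight_a * np.asarray(pred_a) + (1 - weight_a) * np.asarray(pred_b)


# Hypothetical toy data.
frame = pd.DataFrame({
    "date": ["2021-08-16", "2021-08-19"],
    "p1": [100.0, 120.0],
    "p2": [110.0, 140.0],
})
features = add_features(frame, "date", ["p1", "p2"])

# Stand-ins for predictions from two fitted models.
blended = blend([100.0, 200.0], [110.0, 190.0], weight_a=0.6)
```

In practice the blend weight would be tuned on a validation split rather than fixed by hand.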

Experience 

Panshul says “It was definitely a valuable experience. The challenges set up by the organisers are always exciting and unique. Participating in these challenges has helped me hone my skills in this domain.”

Check out his solution here

Fourth prize – Shweta Thakur

Shweta’s fascination with data science started when she realised how numbers can guide decision making. She completed a PGP-DSBA course from Great Learning. Even though her professional work does not involve data science, she loves to challenge herself by working on data science projects and participating in hackathons.

Approach 

Shweta says that the fact that it is a time series problem makes it unique. She observed trend and seasonality in the dataset, as well as high correlation between several variables. She didn’t treat the outliers, but tried handling the missing values with interpolation (linear, spline), ffill, bfill and replacement with other values from the dataset. Even though some of the features were not significant in predicting the target, removing them didn’t improve the RMSE. SARIMAX was the only model she tried.

Experience 

Shweta says, “It was a great experience to compete with people from different backgrounds and expertise.”

Check out her solution here.

Once again, join us in congratulating the winners of this exciting hackathon, who indeed were the “Last Hackers Standing” of Tea Story, Weekend Hackathon Edition #2. We will be back next week with the winning solutions of the ongoing challenge, the Soccer Fever Hackathon.

Copyright Analytics India Magazine Pvt Ltd