Last updated April 12, 2022

Meet the winners of the Machine Learning Hackathon by Swiss Re & MachineHack

Let’s take a look at the winners of the Swiss Re Machine Learning hackathon in collaboration with MachineHack.

Published on April 12, 2022

by Sreejani Bhattacharyya

Swiss Re, in collaboration with MachineHack, has concluded its Machine Learning Hackathon, held from March 11 to 28, in which data scientists and ML professionals predicted accident risk scores for unique postcodes. The end goal? To build a machine learning model that improves auto insurance pricing.

The hackathon drew more than 1,100 registrations and over 300 participants. Of those, the top five were invited to a solution showcase held on April 6. Amit Kalra, Managing Director, Swiss Re, and Jerry Gupta, Senior Vice President, Swiss Re, judged the top five entries: they engaged with the participants, reviewed their solutions and presentations, and provided comments and scores. From that emerged the top three winners!

Let’s take a look at the winners who impressed the judges with their analytics skills and took home highly coveted cash prizes and goodies.

Rank 01: Rahul Pednekar

Pednekar comes with over 19 years of work experience in IT, project management, software development, application support, software system design, and requirement study. He is passionate about new technologies, especially data science, AI and machine learning.

“My expertise lies in creating data visualisations to tell my data’s story & using feature engineering to add new features to give a human touch in the world of machine learning algorithms,” said Pednekar.

Method

Pednekar’s approach consisted of seven steps:

Exploratory data analysis (EDA)

For EDA, Pednekar analysed the dataset to examine the following:

Number of casualties by month
Number of casualties by day
Number of casualties by hour
Number of casualties by type of day
The most dangerous days to travel
Number of casualties by light conditions
Number of casualties by speed limit
The 20 most dangerous local authorities (highway), with the highest record of road accidents in the UK
The 20 safest local authorities (highway), with the lowest record of road accidents in the UK
Number of casualties by police force
Number of casualties by road conditions
Number of casualties by road type
Number of casualties by pedestrian crossing
Frequency of number of casualties
Frequency of number of vehicles involved
Distribution of vehicles involved in road accidents
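
Most of the breakdowns above come down to grouping accident records by one attribute and summing the casualties in each group. The following is a minimal sketch of that idea in plain Python, not Pednekar's actual code; the field names (`date`, `time`, `light_conditions`, `casualties`) and the sample records are hypothetical and chosen only for illustration.

```python
from collections import Counter
from datetime import datetime

# Hypothetical accident records; field names and values are illustrative only.
records = [
    {"date": "2019-01-05", "time": "08:30", "light_conditions": "Daylight", "casualties": 2},
    {"date": "2019-01-05", "time": "22:10", "light_conditions": "Darkness", "casualties": 1},
    {"date": "2019-02-11", "time": "08:05", "light_conditions": "Daylight", "casualties": 3},
]


def parse_timestamp(record):
    """Combine the assumed date and time fields into one datetime."""
    return datetime.strptime(f"{record['date']} {record['time']}", "%Y-%m-%d %H:%M")


def casualties_by(rows, key_func):
    """Sum casualties per group, where key_func maps a record to its group."""
    totals = Counter()
    for row in rows:
        totals[key_func(row)] += row["casualties"]
    return dict(totals)


by_month = casualties_by(records, lambda r: parse_timestamp(r).month)
by_weekday = casualties_by(records, lambda r: parse_timestamp(r).weekday())  # 0 = Monday
by_hour = casualties_by(records, lambda r: parse_timestamp(r).hour)
by_light = casualties_by(records, lambda r: r["light_conditions"])
```

The same `casualties_by` helper covers any of the other groupings in the list by swapping in a different key function.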

Null value imputation and creating various date-time related features

Image: Rahul Pednekar

Feature engineering on the training dataset and conversion of columns to the object type

Image: Rahul Pednekar

Merging of datasets and Feature engineering

Here, Pednekar merged the “Population” and “Road Network” datasets with the training data using a left join. He created Latitude and Longitude columns by extracting data from the “WKT” column in Roads_network.

He proceeded to

Drop columns with duplicate “postcode”
Drop these columns: “Rural Urban”, “WKT”, “roadFuncti”, “formOfWay”
Impute Null values with “999”

And added new features:

Density by the police force
Population by the police force
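
The merge and clean-up steps can be sketched as follows. This is an illustration under stated assumptions rather than the winning code: the postcodes and rows are made up, the WKT geometry is assumed to list coordinates as "longitude latitude" pairs, only the first pair is used, and the fill value is treated as the number 999. Plain Python stands in for a data-frame library so the example runs without third-party packages.

```python
import re

# Matches the first "x y" coordinate pair inside a WKT string.
COORD_PATTERN = re.compile(r"(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)")


def first_lon_lat(wkt):
    """Return (longitude, latitude) from the first coordinate pair, or None."""
    match = COORD_PATTERN.search(wkt or "")
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


def left_join(left_rows, right_rows, key):
    """Left-join two lists of dicts on key, keeping the first right-hand match."""
    lookup = {}
    for row in right_rows:
        lookup.setdefault(row[key], row)  # sketch simplification: first match wins
    # Every left row survives; unmatched rows simply gain no extra columns.
    return [{**lookup.get(row[key], {}), **row} for row in left_rows]


def impute_missing(rows, columns, fill_value=999):
    """Fill absent or None values in the given columns with fill_value."""
    filled = []
    for row in rows:
        new_row = dict(row)
        for col in columns:
            if new_row.get(col) is None:
                new_row[col] = fill_value
        filled.append(new_row)
    return filled


# Hypothetical sample data.
train = [
    {"postcode": "AB1 2CD", "casualties": 2},
    {"postcode": "ZZ9 9ZZ", "casualties": 1},
]
roads = [
    {"postcode": "AB1 2CD", "WKT": "LINESTRING (-0.1278 51.5074, -0.1300 51.5100)"},
    {"postcode": "AB1 2CD", "WKT": "LINESTRING (-0.2000 51.6000, -0.2100 51.6100)"},
]

for road in roads:
    coords = first_lon_lat(road["WKT"])
    road["Longitude"], road["Latitude"] = coords if coords else (None, None)

merged = left_join(train, roads, "postcode")
merged = [{k: v for k, v in row.items() if k != "WKT"} for row in merged]  # drop WKT
merged = impute_missing(merged, ["Longitude", "Latitude"])
```

The second training row has no match in the road data, so the left join keeps it and the imputation step fills its coordinates with 999.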

Modelling and Prediction

Pednekar completed the following steps:

Use K-Fold cross-validation with K=10.
Train the model using the CatBoost regressor with early stopping set to 100 iterations, and predict the number of casualties.
Final prediction: roll up the data at the postcode level and create a column named “accident_risk_index”.
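
The cross-validation and roll-up logic can be sketched like this. It is a simplified illustration, not the submitted solution: a placeholder `MeanRegressor` stands in for the gradient-boosted model, early stopping is not modelled, and the roll-up is assumed to be a per-postcode mean because the article does not say how the rows were aggregated. All function and class names here are hypothetical.

```python
import random
from collections import defaultdict
from statistics import mean


def k_fold_indices(n_rows, k=10, seed=0):
    """Yield (train_idx, valid_idx) pairs for shuffled K-fold cross-validation."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal, disjoint folds
    for i, valid in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, valid


class MeanRegressor:
    """Placeholder model that always predicts the training mean."""

    def fit(self, features, targets):
        self.prediction = mean(targets)
        return self

    def predict(self, features):
        return [self.prediction for _ in features]


def cross_validated_predictions(features, targets, k=10):
    """Return one out-of-fold prediction per row."""
    predictions = [None] * len(targets)
    for train_idx, valid_idx in k_fold_indices(len(targets), k):
        model = MeanRegressor().fit(
            [features[i] for i in train_idx], [targets[i] for i in train_idx]
        )
        fold_preds = model.predict([features[i] for i in valid_idx])
        for i, pred in zip(valid_idx, fold_preds):
            predictions[i] = pred
    return predictions


def roll_up_by_postcode(postcodes, predicted_casualties):
    """Aggregate row-level predictions to one value per postcode (assumed: mean)."""
    grouped = defaultdict(list)
    for postcode, value in zip(postcodes, predicted_casualties):
        grouped[postcode].append(value)
    return {postcode: mean(values) for postcode, values in grouped.items()}
```

Each row is predicted by a model that never saw it during training, which is what makes the out-of-fold scores a fair estimate of performance on unseen postcodes.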

Image: Rahul Pednekar

Pednekar thoroughly enjoyed participating in this hackathon. He said, “MachineHack team and the platform is amazing, and I would like to highly recommend the same to all data science practitioners. I would like to thank Machinehack for providing me with the opportunity to participate in various data science problem-solving challenges.”

Check the code here.

Rank 02: Sachin Yadav

Yadav’s data science journey started a couple of years back, and since then, he has been an active participant in hackathons conducted on different platforms. “Learning from fellow competitors and absorbing their ideas is the best part of any data science competition as it just widens the thinking scope for yourself and makes you better after each and every competition,” says Yadav.

Approach

Yadav first looked for ways to use the population and road network data in conjunction with the train and test data. Because a large number of values in the postcode field did not match, he decided against using the population and road network data.
He started with exploratory data analysis, which made it evident that the target followed a Poisson distribution.
The evaluation metric was MSE, but after comparing baseline scores for the MSE and Poisson learning objectives, Yadav decided to go ahead with the Poisson objective.
Hyperparameter tuning did not give Yadav good results, so he stuck with the baseline model parameters. Blending and stacking different models did not improve the score either.
Finally, he performed feature selection on the existing set of features and submitted a CatBoost model with three-fold cross-validation as his final submission.
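
To make the MSE-versus-Poisson comparison concrete, here is a small sketch of the quantities involved. It is not Yadav's code: the function names are hypothetical, and the variance-to-mean ratio is offered only as one common, rough check of whether a count target looks Poisson-like (a Poisson variable has equal mean and variance).

```python
import math
from statistics import mean, pvariance


def dispersion_ratio(counts):
    """Variance divided by mean; values near 1 are consistent with a Poisson target."""
    return pvariance(counts) / mean(counts)


def mean_squared_error(actual, predicted):
    """Average squared difference between actual and predicted values."""
    return mean([(a - p) ** 2 for a, p in zip(actual, predicted)])


def mean_poisson_deviance(actual, predicted):
    """Mean Poisson deviance; predictions must be strictly positive."""
    total = 0.0
    for a, p in zip(actual, predicted):
        # The a * log(a / p) term is defined as 0 when the observed count is 0.
        log_term = a * math.log(a / p) if a > 0 else 0.0
        total += 2.0 * (log_term - (a - p))
    return total / len(actual)
```

A model can be trained on the Poisson objective and still be scored with `mean_squared_error`, which matches the situation described above: the learning objective and the evaluation metric do not have to be the same function.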

“MachineHack competitions are unique and have a different business case in each of their hackathons. It gives a field wherein we can practice and learn new skills by applying them to a particular domain case. It builds confidence as to what would work and what would not in certain cases. I appreciate the hard work the team is putting in to host such competitions,” adds Yadav.

Check the code here.

Rank 03: Prudhvi Badri

Badri entered the data science field while pursuing a master’s in computer science at Utah State University in 2014. He took classes in statistics, Python programming and AI, and wrote a research paper on predicting malicious users in online social networks.

“After my education, I started to work as a data scientist for a fintech startup company and built models to predict loan default risk for customers. I am currently working as a senior data scientist for a website security company. In my role, I focus on building ML models to predict malicious internet traffic and block attacks on websites. I also mentor data scientists and help them build cool projects in this field,” said Badri.

Approach

Badri mainly focused on feature engineering to solve this problem. He created aggregated features (min, max, median, sum and others) by grouping on categorical columns such as Day_of_Week and Road_Type. He also built features from the population data, such as sex_ratio, male_ratio and female_ratio.
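
A minimal sketch of this style of feature engineering is shown below. It is illustrative only: the article does not say which numeric columns were aggregated or how the ratio features were defined, so the `value_col` argument and the ratio formulas (each sex as a share of the total, and males per female) are assumptions.

```python
from collections import defaultdict
from statistics import median


def group_aggregates(rows, group_col, value_col):
    """Compute min, max, median and sum of value_col within each group."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[group_col]].append(row[value_col])
    return {
        key: {"min": min(v), "max": max(v), "median": median(v), "sum": sum(v)}
        for key, v in grouped.items()
    }


def add_group_features(rows, group_col, value_col):
    """Attach each row's group statistics as new columns."""
    stats = group_aggregates(rows, group_col, value_col)
    enriched = []
    for row in rows:
        new_row = dict(row)
        for name, value in stats[row[group_col]].items():
            new_row[f"{group_col}_{value_col}_{name}"] = value
        enriched.append(new_row)
    return enriched


def population_ratios(male, female):
    """Assumed definitions of the population ratio features."""
    total = male + female
    return {
        "male_ratio": male / total,
        "female_ratio": female / total,
        "sex_ratio": male / female,
    }
```

Calling `add_group_features` once per pair of categorical and numeric columns is how a feature count can grow into the hundreds fairly quickly.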

He adds, “I have not used the roads dataset that has been provided as supplemental data. I created a total of 241 features and used ten-fold cross-validation to validate the model. Finally, for modelling, I used a weighted ensemble model of LightGBM and XGBoost.”
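
A weighted ensemble of two models reduces to a weighted average of their predictions. The sketch below shows that blending step only; the weight value is a placeholder, since the article does not report the weights Badri used, and the two input lists stand in for the outputs of the two boosted models.

```python
def weighted_ensemble(predictions_a, predictions_b, weight_a=0.5):
    """Blend two prediction lists; weight_a applies to the first, 1 - weight_a to the second."""
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must be between 0 and 1")
    if len(predictions_a) != len(predictions_b):
        raise ValueError("prediction lists must have the same length")
    return [
        weight_a * a + (1.0 - weight_a) * b
        for a, b in zip(predictions_a, predictions_b)
    ]
```

In practice the weight is usually chosen by comparing blended scores on the cross-validation folds rather than fixed in advance.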

Badri has been a member of MachineHack since 2020. “I am excited to participate in the competitions as they are unique and always help me learn about a new domain and let me try new approaches. I appreciate the transparency of the platform sharing the approaches of the top participants once the hackathon is finished. I learned a lot of new techniques and approaches from other members. I look forward to participating in more hackathons in the future on the MachineHack platform and encourage my friends and colleagues to participate too,” concluded Badri.

Check the code here.

The Swiss Re Machine Learning Hackathon, in collaboration with MachineHack, ended with a bang, with participants presenting out-of-the-box solutions to solve the problem in front of them. Such a high display of skills made the hackathon intensely competitive and fun and surely made the challenge a huge success!
