Swiss Re launches Machine Learning Hackathon to predict Accident Risk Score for unique postcodes

The top three winners stand a chance to win prizes worth INR 1.5 lakh.
Swiss Re, the world’s leading reinsurance organisation, in collaboration with MachineHack, is set to launch a Machine Learning Hackathon from March 11th to 28th to predict accident risk scores for unique postcodes. The top three winners stand a chance to win prizes worth INR 1.5 lakh.

With a presence across 25 countries, Swiss Re’s tech strategy harnesses data and technology to develop smarter, more innovative solutions for clients’ value chains.

Swiss Re applies fresh perspectives, knowledge and capital to anticipate and manage risk to create smarter solutions. Swiss Re’s Global Business Solutions Center (BSC) in Bangalore has more than 1,300 professionals leveraging experience, expertise and out-of-the-box thinking to create new business opportunities.

The hackathon starts on March 11, 2022, at 6:00 PM

Click here to participate in the hackathon. 

Problem statement & description 

Swiss Re is inviting data scientists, machine learning practitioners and analytics professionals to build a machine learning model to improve auto insurance pricing.

According to IBEF, “Domestic automobiles production increased at 2.36% CAGR between FY16-20 with 26.36 million vehicles being manufactured in the country in FY20. Overall, domestic automobiles sales increased at 1.29% CAGR between FY16-FY20 with 21.55 million vehicles being sold in FY20”. The rise in the number of vehicles on the road brings its own challenges, chief among them a greater likelihood of accidents. Higher accident rates in turn mean more insurance claims and larger payouts for insurance companies.

To plan pre-emptively for these losses, insurance firms use accident data to understand risk across geographical units such as postal codes or districts.

In this challenge, we are providing you with the dataset to predict the “Accident_risk_index” against the postcodes.

Accident_risk_index (mean casualties at a postcode) = sum(Number_of_casualities)/count(Accident_ID)

Working example:

Train Data (given)

Accident_ID   Postcode   Number_of_casualities
1             AL1 1JJ    2
2             AL1 1JP    3
3             AL1 3PS    2
4             AL1 3PS    1
5             AL1 3PS    1

Modelling Train Data (Rolled up at Postcode level)

Postcode   Derived_feature1   Derived_feature2   Accident_risk_index
AL1 1JJ    _                  _                  2
AL1 1JP    _                  _                  3
AL1 3PS    _                  _                  1.33
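
The roll-up in the working example can be reproduced with a few lines of plain Python. This is only a sketch: the rows are copied from the example above, the function name accident_risk_index is our own, and on the real train.csv the same step would more likely be a group-by in a dataframe library.

```python
from collections import defaultdict

# Rows copied from the working example:
# (Accident_ID, Postcode, Number_of_casualities)
train_rows = [
    (1, "AL1 1JJ", 2),
    (2, "AL1 1JP", 3),
    (3, "AL1 3PS", 2),
    (4, "AL1 3PS", 1),
    (5, "AL1 3PS", 1),
]


def accident_risk_index(rows):
    """Return {postcode: sum(casualties) / count(accidents)}."""
    totals = defaultdict(int)   # casualties summed per postcode
    counts = defaultdict(int)   # accidents counted per postcode
    for _accident_id, postcode, casualties in rows:
        totals[postcode] += casualties
        counts[postcode] += 1
    return {postcode: totals[postcode] / counts[postcode] for postcode in totals}


risk_index = accident_risk_index(train_rows)
for postcode, value in risk_index.items():
    print(postcode, round(value, 2))
```

For AL1 3PS the three accidents have 2, 1 and 1 casualties, so the index is 4/3, which rounds to the 1.33 shown in the rolled-up table.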

Submission guidelines

  • Participants are required to predict the ‘Accident_risk_index’ for each postcode in test.csv.
  • Then submit your ‘my_submission_file.csv’ on the submission tab of the hackathon page.

Pro-tip: Participants are required to perform feature engineering that first rolls up the train data to postcode level, creates an “Accident_risk_index” column, and then optimises the model at postcode level.

A few hypotheses to help you think:

“More accidents happen in the latter part of the day as those are office hours causing congestion”

“Postal codes with more single carriage roads have more accidents”

(From the above hypotheses, features such as office_hours_flag and the number of single carriage roads can be formed.)
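
As a rough illustration, the two hypothesis features might be derived as below. Several things here are assumptions rather than anything the organisers specify: the ‘Time’ column is taken to be an "HH:MM" string, the 16:00 to 20:00 window is a placeholder for "office hours", and single_label stands for whatever encoded value marks a single carriage road in ‘Road_Type’.

```python
from datetime import datetime, time

# Assumption: placeholder window for "office hours"; tune it against the data.
OFFICE_START = time(16, 0)
OFFICE_END = time(20, 0)


def office_hours_flag(time_str, start=OFFICE_START, end=OFFICE_END):
    """Return 1 if an 'HH:MM' time falls in [start, end), else 0."""
    t = datetime.strptime(time_str, "%H:%M").time()
    return int(start <= t < end)


def single_carriage_count(road_types, single_label):
    """Count the accidents whose road type equals the (assumed) single-carriage label."""
    return sum(1 for road_type in road_types if road_type == single_label)


# Hypothetical values for one postcode, for illustration only.
times = ["09:15", "17:30", "18:05"]
road_types = ["single", "dual", "single"]

flags = [office_hours_flag(t) for t in times]
n_single = single_carriage_count(road_types, single_label="single")
print(flags, n_single)
```

Both features are computed per accident and would then be aggregated (summed or averaged) when the data is rolled up to postcode level.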

Additionally, we are providing you with road network data (contains info on the nearest road to a postcode and its characteristics) and population data (contains info about the population at area level). This info is for augmentation of features, but is not mandatory to use.
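
Augmenting the postcode-level table with the population data amounts to a left join on ‘postcode’. The sketch below shows the idea in plain Python; the sample rows and their values are made up, and the helper name left_join_on_postcode is hypothetical. It keeps one row per postcode from the extra table, so a table that can hold several rows per postcode, as the road network data may, would need to be aggregated first.

```python
def left_join_on_postcode(base_rows, extra_rows, extra_fields):
    """Attach extra_fields from extra_rows to each base row, matched on 'postcode'.

    Postcodes missing from extra_rows get None for every extra field.
    """
    extra_by_postcode = {row["postcode"]: row for row in extra_rows}
    joined = []
    for row in base_rows:
        match = extra_by_postcode.get(row["postcode"], {})
        merged = dict(row)
        for field in extra_fields:
            merged[field] = match.get(field)
        joined.append(merged)
    return joined


# Made-up rows for illustration; only the column names come from the dataset description.
postcode_rows = [
    {"postcode": "AL1 1JJ", "Accident_risk_index": 2.0},
    {"postcode": "AL1 1JP", "Accident_risk_index": 3.0},
]
population_rows = [
    {"postcode": "AL1 1JJ", "Rural Urban": "Urban"},
]

augmented = left_join_on_postcode(postcode_rows, population_rows, ["Rural Urban"])
print(augmented)
```

A left join keeps every postcode from the modelling table even when the augmentation data has no matching row, which matters because the extra data is optional and may not cover every postcode.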

Evaluation criteria 

mean_squared_error(y_true, y_pred, squared=False)
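
The expression above is scikit-learn's mean_squared_error called with squared=False, which returns the root mean squared error (RMSE); recent scikit-learn releases also provide a separate root_mean_squared_error function. For a quick local check without any dependency, the same quantity can be computed as in this sketch, where the sample values are made up:

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up predictions for three postcodes, for illustration only.
y_true = [2.0, 3.0, 1.33]
y_pred = [2.0, 2.0, 1.33]
score = rmse(y_true, y_pred)
print(round(score, 4))
```

Lower is better: a perfect prediction scores 0, and because errors are squared before averaging, a few badly mispredicted postcodes are penalised more than many small misses.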

  • This hackathon supports private and public leaderboards.
  • The public leaderboard is evaluated on 30% of Test data.
  • The private leaderboard will be made available at the end of the hackathon, which will be evaluated on 100% of Test data.
  • The Final Score represents the score achieved based on the Best Score on the public leaderboard.

Prizes

  • First Prize: INR 75,000
  • Second Prize: INR 50,000
  • Third Prize: INR 25,000

The hackathon will end on March 28, 2022, at 6:00 PM.

Dataset details

  • Train.csv – 4,78,741 rows x 27 columns
  • Test.csv – 1,21,259 rows x 27 columns 
  • Sample Submission.csv — Please check the ‘Evaluation’ section on MachineHack Page for more details on generating a valid submission.

train.csv & test.csv:

  • ‘Accident_ID’,
  • ‘Police_Force’,
  • ‘Number_of_Vehicles’,
  • ‘Number_of_Casualties’,
  • ‘Date’,
  • ‘Day_of_Week’,
  • ‘Time’,
  • ‘Local_Authority_(District)’,
  • ‘Local_Authority_(Highway)’,
  • ‘1st_Road_Class’,
  • ‘1st_Road_Number’,
  • ‘Road_Type’,
  • ‘Speed_limit’,
  • ‘2nd_Road_Class’,
  • ‘2nd_Road_Number’,
  • ‘Pedestrian_Crossing-Human_Control’,
  • ‘Pedestrian_Crossing-Physical_Facilities’,
  • ‘Light_Conditions’,
  • ‘Weather_Conditions’,
  • ‘Road_Surface_Conditions’,
  • ‘Special_Conditions_at_Site’,
  • ‘Carriageway_Hazards’,
  • ‘Urban_or_Rural_Area’,
  • ‘Did_Police_Officer_Attend_Scene_of_Accident’,
  • ‘state’,
  • ‘postcode’,
  • ‘country’

Population: 8,035 rows x 10 columns

population.csv:

  • ‘postcode’,
  • ‘Rural Urban’,
  • ‘Variable: All usual residents; measures: Value’,
  • ‘Variable: Males; measures: Value’,
  • ‘Variable: Females; measures: Value’,
  • ‘Variable: Lives in a household; measures: Value’,
  • ‘Variable: Lives in a communal establishment; measures: Value’,
  • ‘Variable: Schoolchild or full-time student aged 4 and over at their non term-time address; measures: Value’,
  • ‘Variable: Area (Hectares); measures: Value’,
  • ‘Variable: Density (number of persons per hectare); measures: Value’

Road Network: 91,566 rows x 8 columns

roads_network.csv:

  • ‘WKT’,
  • ‘roadClassi’,
  • ‘roadFuncti’,
  • ‘formOfWay’,
  • ‘length’,
  • ‘primaryRou’,
  • ‘distance to the nearest point on rd’,
  • ‘postcode’

Evaluation criteria: Root Mean Square Error

Note: The target variables are all encoded in the training dataset for convenience. Please submit the test results in a similar encoded fashion for us to evaluate your results.

Disqualification: 

  • If any of the details entered are found incorrect, Analytics India Magazine and Swiss Re reserve the right to disqualify any participant. 
  • Any external dataset usage is strictly prohibited. The participants will be disqualified if found using any external dataset.

Skills:

  • Optimising root mean square error
  • Risk prediction 
  • Feature engineering 

Amit Raja Naik
Amit Raja Naik is a seasoned technology journalist who covers everything from data science to machine learning and artificial intelligence for Analytics India Magazine, where he examines the trends, challenges, ideas, and transformations across the industry.