Another weekend, another exciting hackathon, and this time with an open dataset. Yes, you heard that right!
For this weekend hackathon, we are using an open dataset, with some noise added to the target variable to keep the competition fair. Participants are provided with 55 distinguishing features to build a classification model that can predict the forest cover type.
The goal of this competition is to predict the forest cover type (the predominant kind of tree cover) from strictly cartographic variables (as opposed to remotely sensed data).
The challenge will start on Friday, July 17th, at 6 pm IST.
Problem Statement & Description
The dataset has been taken from UCI, but to keep the competition fair, we have added some noise to the labels. In this hackathon, we challenge all Machinehackers to predict the forest cover type (the predominant kind of tree cover) from strictly cartographic variables (as opposed to remotely sensed data).
The actual forest cover type for a given 30 x 30-meter cell was determined from US Forest Service (USFS) Region 2 Resource Information System data. Independent variables were then derived from data obtained from the US Geological Survey and USFS.
The data is in raw form (not scaled) and contains binary, one-hot-encoded columns for qualitative independent variables such as wilderness area and soil type.
You are given 55 distinguishing factors that can predict the forest cover type. Your objective as a data scientist is to build a machine learning model that can accurately classify the forest cover type (the predominant kind of tree cover) from strictly cartographic variables.
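Since the qualitative variables arrive as one binary column per category, it can be convenient to collapse each group back into a single categorical value, for example before feeding a model that handles categories natively. The snippet below is a minimal sketch of that idea, not part of the official starter material. The column names (`Wilderness_Area1`, `Soil_Type2`, and so on) are assumptions modelled on the original UCI naming and may differ in the hackathon files, and the sample row is made up for illustration.

```python
def collapse_one_hot(row, prefix):
    """Return the numeric suffix of the single active column in a one-hot group.

    `row` maps column names to values (for example, one csv.DictReader row).
    `prefix` is the shared column prefix, e.g. "Soil_Type" (assumed naming).
    """
    active = [
        name
        for name, value in row.items()
        if name.startswith(prefix) and float(value) == 1
    ]
    if len(active) != 1:
        # A valid one-hot group has exactly one column set to 1.
        raise ValueError(f"expected one active '{prefix}' column, found {len(active)}")
    return int(active[0][len(prefix):])


# Made-up row for illustration; real rows have many more columns.
row = {
    "Elevation": "2596",
    "Wilderness_Area1": "1",
    "Wilderness_Area2": "0",
    "Soil_Type1": "0",
    "Soil_Type2": "1",
}
soil = collapse_one_hot(row, "Soil_Type")
wilderness = collapse_one_hot(row, "Wilderness_Area")
```

The explicit check for exactly one active column doubles as a data-quality test: any row that raises here does not follow the one-hot layout described above.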
The unzipped folder will have the following files.
- Train.csv – 29050 rows x 55 columns
- Test.csv – 551962 rows x 54 columns
- Sample Submission – Sample format for the submission.
Target Variable: Cover_Type
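As a starting point, the training file can be split into features and the `Cover_Type` target with the standard library alone. This is only a sketch under assumptions: the helper name `load_xy` is hypothetical, the feature columns in the toy sample (`Elevation`, `Slope`) are borrowed from the UCI naming, and the two sample rows are invented so that the snippet runs without the real files.

```python
import csv
import io


def load_xy(handle, target="Cover_Type"):
    """Read CSV rows from an open file handle and separate the target column."""
    features, labels = [], []
    for row in csv.DictReader(handle):
        labels.append(int(row.pop(target)))  # the class label is an integer code
        features.append({name: float(value) for name, value in row.items()})
    return features, labels


# Toy stand-in for Train.csv (invented values, only three columns).
sample = io.StringIO("Elevation,Slope,Cover_Type\n2596,3,5\n2590,2,2\n")
X, y = load_xy(sample)
```

With the real data, the same function could be called as `with open("Train.csv", newline="") as f: X, y = load_xy(f)`. Test.csv has no `Cover_Type` column, so it would need a variant that skips the `pop` step.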
The datasets will be made available for download on Friday, July 17th, at 6 pm IST.
This hackathon and the bounty will expire on Monday, July 20th, at 7 am IST.
Below are the file formats for the provided data:
Glimpse of the training set (not all features are shown).
Glimpse of the test set (not all features are shown).
Glimpse of the sample submission.
The top 3 competitors will receive a free pass to the Computer Vision DevCon 2020.
- One account per participant. Submissions from multiple accounts will lead to disqualification
- The submission limit for the hackathon is 10 per day, after which further submissions will not be evaluated
- All registered participants are eligible to compete in the hackathon
- This competition counts towards your overall ranking points
- You will not be able to submit once you click the “Complete Hackathon” button. You may ignore this feature
- We ask that you respect the spirit of the competition and do not cheat
- This hackathon will expire on 20th July, Monday at 7 am IST
- Use of any external dataset is prohibited and doing so will lead to disqualification
Submissions are ranked on the leaderboard using multi-class log loss.
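Multi-class log loss is the average negative logarithm of the probability a model assigns to the true class, so confident wrong predictions are punished heavily. The sketch below shows one common way to compute it for local validation. It is an assumption that the platform's scorer behaves the same way, in particular the clipping constant `eps` and the renormalisation of each probability row are conventions chosen here, not details confirmed by the organisers.

```python
import math


def multiclass_log_loss(y_true, y_prob, labels, eps=1e-15):
    """Mean negative log-probability assigned to the true class.

    y_true: sequence of true class labels.
    y_prob: sequence of probability rows, one value per entry in `labels`.
    labels: class labels in the same order as the columns of `y_prob`.
    """
    column = {label: i for i, label in enumerate(labels)}
    total = 0.0
    for label, probs in zip(y_true, y_prob):
        p = probs[column[label]] / sum(probs)  # renormalise the row to sum to 1
        p = min(max(p, eps), 1 - eps)          # clip to avoid log(0)
        total -= math.log(p)
    return total / len(y_true)


# Two made-up predictions over two classes.
score = multiclass_log_loss(
    y_true=[1, 2],
    y_prob=[[0.5, 0.5], [0.25, 0.75]],
    labels=[1, 2],
)
```

A useful reference point: predicting a uniform distribution over k classes always scores ln(k). If the labels follow the seven cover types of the original UCI dataset, that baseline is ln(7), roughly 1.946, and any useful model should score below it.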
Experienced Data Scientist with a demonstrated history of working in Industrial IOT (IIOT), Industry 4.0, Power Systems and Manufacturing domain. I have experience in designing robust solutions for various clients using Machine Learning, Artificial Intelligence, and Deep Learning. I have been instrumental in developing end to end solutions from scratch and deploying them independently at scale.