We are back with a new hackathon, and this one is a month-long challenge based on a community-contributed dataset. In this hackathon, we are challenging the MachineHack community to build a regression model that analyzes and accurately predicts house prices in India.
Accurately predicting house prices can be a daunting task. Buyers are not concerned only with the size (square feet) of the house; various other factors also play a key role in deciding the price of a house or property.
In this competition, your role as a data scientist is to use the 12 influencing factors provided to predict the prices as accurately as possible.
The challenge started on Friday, 25th Sep at 6 pm IST.
Problem Statement & Description
In this hackathon, your goal as a data scientist is to create a regression model that, given the 12 influencing factors, predicts the price of a house.
It can be extremely difficult to figure out which set of attributes actually explains buyer behavior. This dataset has been collected from various property aggregators across India.
Dataset Description:
The unzipped folder will have the following files.
- Train.csv – 29451 rows x 12 columns (Includes target column)
- Test.csv – 68720 rows x 11 columns
- Sample Submission – Acceptable submission format. (.csv/.xlsx file with 68720 rows)
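As a quick sanity check after unzipping, the file shapes listed above can be verified before any modelling starts. The sketch below uses only the standard library; the helper names `csv_shape` and `check_files` are hypothetical, and it assumes that the quoted row counts refer to data rows (excluding the header line) and that the files sit in the current folder.

```python
import csv
import os

# Shapes quoted in the dataset description, read as (data rows, columns).
# Assumption: the row counts exclude the header line.
EXPECTED_SHAPES = {
    "Train.csv": (29451, 12),
    "Test.csv": (68720, 11),
}


def csv_shape(lines):
    """Return (data_rows, columns) for CSV text whose first row is a header.

    `lines` can be an open text file or any iterable of strings.
    """
    reader = csv.reader(lines)
    header = next(reader, None)
    if header is None:
        return (0, 0)
    n_rows = sum(1 for row in reader if row)  # blank lines are skipped
    return (n_rows, len(header))


def check_files(folder="."):
    """Return {filename: (found, expected)} for every file whose shape differs."""
    mismatches = {}
    for name, expected in EXPECTED_SHAPES.items():
        path = os.path.join(folder, name)
        with open(path, newline="", encoding="utf-8") as handle:
            found = csv_shape(handle)
        if found != expected:
            mismatches[name] = (found, expected)
    return mismatches
```

Calling `check_files()` in the unzipped folder should return an empty dictionary if the files match the description; any entry in the result points to a truncated or wrongly parsed file.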
Attribute Description:
- POSTED_BY – Category marking who has listed the property
- UNDER_CONSTRUCTION – Under Construction or Not
- RERA – RERA approved or not
- BHK_NO – Number of Rooms
- BHK_OR_RK – Type of property
- SQUARE_FT – Total area of the house in square feet
- READY_TO_MOVE – Category marking Ready to move or Not
- RESALE – Category marking Resale or not
- ADDRESS – Address of the property
- LONGITUDE – Longitude of the property
- LATITUDE – Latitude of the property
The datasets will be made available for download on Sep 25th, Friday at 6 pm IST.
This hackathon and the bounty will expire on Oct 26th, Monday at 7 am IST.
Bounties
The top 3 competitors in this competition will receive a free pass to the Deep Learning DevCon 2020.
Rules
- One account per participant. Submissions from multiple accounts will lead to disqualification
- The submission limit for the hackathon is 10 per day; submissions beyond that limit will not be evaluated
- All registered participants are eligible to compete in the hackathon
- This competition counts towards your overall ranking points
- We ask that you respect the spirit of the competition and do not cheat
- This hackathon will expire on 26th October, Monday at 7 am IST
- Use of any external dataset is prohibited and doing so will lead to disqualification
Evaluation
- Submissions will be evaluated using the RMSLE (Root Mean Squared Logarithmic Error) metric. One can compute it as np.sqrt(mean_squared_log_error(actual, predicted)), where mean_squared_log_error comes from sklearn.metrics
- This hackathon supports private and public leaderboards
- The public leaderboard is evaluated on 30% of the Test data
- The private leaderboard, which is evaluated on 100% of the Test data, will be made available at the end of the hackathon
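To make the evaluation metric concrete, here is a minimal standard-library sketch of RMSLE. It follows the usual definition, the square root of the mean of (log(1 + predicted) - log(1 + actual))², which is also what scikit-learn's mean_squared_log_error computes before the square root is taken. The function name `rmsle` and the sample prices are illustrative only and are not part of the official scoring code.

```python
import math


def rmsle(actual, predicted):
    """Root Mean Squared Logarithmic Error of two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("inputs must be non-empty")
    if any(a < 0 for a in actual) or any(p < 0 for p in predicted):
        raise ValueError("RMSLE is not defined for negative values")
    # log1p(x) = log(1 + x), so a value of 0 is handled without error.
    squared_log_errors = [
        (math.log1p(p) - math.log1p(a)) ** 2
        for a, p in zip(actual, predicted)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


if __name__ == "__main__":
    # Made-up prices, purely for illustration.
    print(rmsle([50.0, 100.0, 250.0], [55.0, 90.0, 300.0]))
```

Because the error is measured on a log scale, the metric penalizes relative rather than absolute differences: predicting 110 for a true price of 100 costs roughly the same as predicting 1100 for a true price of 1000. One common consequence, offered here as a suggestion rather than a requirement of the hackathon, is to train the model on log1p of the target and convert predictions back with expm1.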