Weekend hackathons are fun, aren’t they? In our last weekend hackathon, we introduced a new and unique problem statement using a UCI open dataset. We were disappointed, however, that some of the participants ended up probing the leaderboard. Even so, we have decided to host another open UCI dataset competition this weekend. This time, instead of adding noise manually, we have trained a machine learning model to perturb the target column. Yes, you heard it right: in this hackathon, we are challenging all the MachineHackers to capture our leaderboard and prove their mettle by competing against MachineHack’s AI.
The challenge will start on July 24th, Friday at 6 pm IST.
Problem Statement & Description
The dataset was collected from a Combined Cycle Power Plant over 6 years (2006-2011), when the power plant was set to work with a full load. The features are hourly averages of the ambient variables Temperature (T), Ambient Pressure (AP), Relative Humidity (RH), and Exhaust Vacuum (V), which are used to predict the net hourly electrical energy output (PE) of the plant. A combined-cycle power plant (CCPP) is composed of gas turbines (GT), steam turbines (ST), and heat recovery steam generators.
In a CCPP, the electricity is generated by gas and steam turbines, which are combined in one cycle, and is transferred from one turbine to another. While the Vacuum is collected from and has an effect on the Steam Turbine, the other three ambient variables affect the GT performance.
You are given four distinguishing factors (T, AP, RH, and V) that can predict the electrical energy output. Your objective as a data scientist is to build a machine learning model that accurately predicts the electrical energy output from these attributes.
Data Description
The unzipped folder will have the following files.
- Train.csv – 9568 rows x 5 columns
- Test.csv – 38272 rows x 4 columns
- Sample_Submission.xlsx – Sample format for the submission
Target Variable: PE (electrical energy output)
The datasets will be made available for download on July 24th, Friday at 6 pm IST.
Below are the file formats for the provided data
Train.csv
Test.csv
Sample_Submission.xlsx
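To make the data layout concrete, here is a minimal sketch of reading one of the CSV files and separating the four features from the target. The header names `T`, `AP`, `RH`, `V`, and `PE` are an assumption based on the abbreviations in the description above; the actual headers in Train.csv and Test.csv may differ, so check them once the files are released. The helper `load_xy` is a hypothetical name used only for illustration.

```python
import csv
import io

# Assumed header names, taken from the abbreviations in the description.
# The real files may use different headers; adjust these if so.
FEATURES = ["T", "AP", "RH", "V"]
TARGET = "PE"


def load_xy(handle, features=FEATURES, target=TARGET):
    """Read CSV rows from an open file and return (X, y) as lists of floats.

    y is None when the target column is absent, as expected for Test.csv.
    """
    reader = csv.DictReader(handle)
    has_target = target in (reader.fieldnames or [])
    X, y = [], []
    for row in reader:
        X.append([float(row[name]) for name in features])
        if has_target:
            y.append(float(row[target]))
    return X, (y if has_target else None)


# Made-up row, only to show the shape of the result (not real plant data).
demo = io.StringIO("T,AP,RH,V,PE\n20.0,1010.0,70.0,50.0,450.0\n")
X, y = load_xy(demo)
```

With the real files, the same function could be called as `load_xy(open("Train.csv", newline=""))`. For Test.csv, which has no target column, the second return value would be `None`.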
Bounties
The top 3 competitors will receive a free pass to the Computer Vision DevCon 2020
Know more about the Computer Vision DevCon 2020.
This hackathon and the bounty will expire on July 27th, Monday at 7 am IST
Rules
- One account per participant. Submissions from multiple accounts will lead to disqualification
- The submission limit for the hackathon is 10 per day; any further submissions that day will not be evaluated
- All registered participants are eligible to compete in the hackathon
- This competition counts towards your overall ranking points
- You will not be able to submit once you click the “Complete Hackathon” button. You may ignore this feature
- We ask that you respect the spirit of the competition and do not cheat
- This hackathon will expire on 27th July, Monday at 7 am IST
- Use of any external dataset is prohibited and doing so will lead to disqualification
Evaluation
Submissions are scored on the leaderboard using Root Mean Squared Error (RMSE).
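For local validation before submitting, the metric can be sketched as below. This uses the standard definition of Root Mean Squared Error (the square root of the mean squared difference between actual and predicted values); the platform's exact scoring code is not published here, so treat this as an approximation of how the leaderboard scores a submission. The function name `rmse` is our own.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Squared Error between two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up PE values for illustration: the errors are 3 and 4,
# so RMSE = sqrt((9 + 16) / 2) = sqrt(12.5), roughly 3.54.
score = rmse([450.0, 460.0], [453.0, 456.0])
```

Lower is better, and because the errors are squared before averaging, a few large misses hurt the score more than many small ones.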