MachineHack is back with an exciting challenge in yet another Weekend Hackathon. The challenge is to train a model to predict the genre and quality of music.
As part of MachineHack Weekend Hackathon Edition #2 — The Last Hacker Standing, we pose unique problem statements every week. Weekend Hackathon Edition #2 will run for six weeks, from 30th July to 9th September 2021.
PARTICIPATE & STAND A CHANCE TO WIN FREE PASSES TO THE DLDC 2021!!!
Problem statement & description
Music has been an important part of our lives since time immemorial. Every artist has a signature, making music a subjective art. We have scales/metrics to measure the quality of music. But is it possible to train a machine learning model to predict the genre and quality of the music?
Currently, many music aggregator applications rely on machine learning to power their recommendation engines and curate playlists. MachineHack is challenging data scientists and machine learning practitioners to build a highly scalable ML model for a music aggregator app (Company ABC) to accurately predict the genre of songs in the dataset.
The hackathon will start on 6th August 2021 at 8 pm (IST).
The participants will work on a list of songs provided in the data set.
MachineHack has created a training dataset of 17,996 rows with 17 columns: artist name; track name; popularity; ‘danceability’; energy; key; loudness; mode; ‘speechiness’; ‘acousticness’; ‘instrumentalness’; liveness; valence; tempo; duration in milliseconds; time_signature; and the target variable ‘Class’, with values such as Rock, Indie, Alt, Pop, Metal, HipHop, Alt_Music, Blues, Acoustic/Folk, Instrumental, Country, Bollywood. The dataset for testing includes 7,713 rows with 16 columns.
The prerequisites for the hackathon include knowledge of multi-class classification and the ability to optimise log loss.
The participants must submit a .csv/.xlsx file with exactly 7,713 rows and 11 columns for Class/Genres. The submission will return an ‘Invalid Score’ if it contains extra columns or rows.
Scikit-learn models support the predict() method to generate the predicted values.
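Since the submission needs one column per class and is scored on log loss, the rows are most naturally per-class probabilities, which scikit-learn classifiers typically expose through predict_proba() rather than predict(). The sketch below is only an illustration of checking the shape before uploading. It uses placeholder uniform predictions, hypothetical column names (class_0 to class_10) and an assumed rule that each row sums to 1; none of these come from the problem statement, and the platform's expected header may differ.

```python
import csv
import io

N_ROWS = 7713     # rows in Test.csv, per the problem statement
N_CLASSES = 11    # one column per genre class, per the problem statement


def check_submission(rows, n_rows=N_ROWS, n_classes=N_CLASSES, tol=1e-6):
    """Return a list of problems found; an empty list means the shape looks valid."""
    problems = []
    if len(rows) != n_rows:
        problems.append(f"expected {n_rows} rows, got {len(rows)}")
    for i, row in enumerate(rows):
        if len(row) != n_classes:
            problems.append(f"row {i}: expected {n_classes} columns, got {len(row)}")
            break
        # Assumption: each row is a probability distribution over the classes.
        if abs(sum(row) - 1.0) > tol:
            problems.append(f"row {i}: probabilities sum to {sum(row):.4f}, not 1")
            break
    return problems


def write_submission(rows, fileobj, header):
    """Write a header line followed by one line per test song."""
    writer = csv.writer(fileobj)
    writer.writerow(header)
    writer.writerows(rows)


# Placeholder predictions: a uniform distribution over the classes.
# In practice these rows would come from a classifier's predict_proba().
uniform = [[1.0 / N_CLASSES] * N_CLASSES for _ in range(N_ROWS)]
header = [f"class_{k}" for k in range(N_CLASSES)]  # hypothetical column names

problems = check_submission(uniform)
buffer = io.StringIO()  # stands in for open("submission.csv", "w", newline="")
write_submission(uniform, buffer, header)
```

Running the check locally before uploading is a cheap way to avoid an ‘Invalid Score’ caused by a stray row or column.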
The submission limit for this hackathon is one account per participant.
The evaluation of the hackathon will be done using the Log Loss metric.
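For readers new to the metric, multi-class log loss is the average negative log of the probability a model assigns to the true class, so lower is better and confident wrong answers are punished heavily. The following is a minimal sketch of the standard definition, not MachineHack's actual scoring code; the function name and the clipping constant eps are illustrative choices, and library implementations such as scikit-learn's log_loss may handle clipping and normalisation differently.

```python
import math


def multiclass_log_loss(y_true, probs, eps=1e-15):
    """Mean negative log-probability assigned to the true class.

    y_true: list of integer class indices (0 .. n_classes - 1).
    probs:  list of rows, one predicted probability per class.
    """
    if len(y_true) != len(probs):
        raise ValueError("y_true and probs must have the same length")
    total = 0.0
    for label, row in zip(y_true, probs):
        # Clip so that a predicted probability of exactly 0 does not give log(0).
        p = min(max(row[label], eps), 1.0 - eps)
        total -= math.log(p)
    return total / len(y_true)


# Toy example with 3 classes (the hackathon has 11).
y_true = [0, 2, 1]
probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],
    [0.3, 0.4, 0.3],
]
loss = multiclass_log_loss(y_true, probs)

# A model that always predicts a uniform distribution over 11 classes
# scores ln(11), a useful baseline to beat.
baseline = multiclass_log_loss([0], [[1.0 / 11] * 11])
```

Any submission that scores worse than the uniform baseline is doing worse than not looking at the features at all.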
The hackathon will also support private and public leaderboards. While the public leaderboard will be evaluated on 30% of the test data, the private leaderboard will be made available at the end of the hackathon and will be assessed on 100% of the test data.
The final score will be based on the ‘Best Score’ on the public leaderboard.
The hackathon will end on 12th August 2021 at 6 pm (IST).
The top three winners will get free passes to the Deep Learning DevCon 2021 (DLDC), scheduled to be held on 23-24 September 2021. In addition, the winners will also get a chance to improve their Global Leaderboard rankings and become the ultimate MachineHack Grand Master.
- Train.csv — 17996 rows x 17 columns (includes ‘Class’ as a target variable)
- Test.csv — 7713 rows x 16 columns
- Multi-Class Classification
- Optimising Log Loss