Analytics India Magazine’s online hackathon platform, MachineHack, is working hard to bring the most stimulating problems to all budding data scientists. The first hackathon, Predicting House Prices In Bengaluru, ends on 31 August 2018, and the top three winners will receive individual passes to our famous event, Cypher 2018.
The hackathon platform is fast becoming a hub for data scientists. We recently concluded the How To Choose The Perfect Beer hackathon and announced prizes worth ₹50,000. We have also launched a global leaderboard to recognise the data scientists who are working hard on multiple problems on the platform.
Now the wait for our next hackathon is over. MachineHack has launched a hackathon called Whose Line Is It Anyway: Identify The Author Hackathon. Does anyone know the total number of books in the world? Google’s response is cautious but accurate. “The answer changes every time the computation is performed, as we accumulate more data and fine-tune the algorithm. The current number is around 210 million,” says the search engine giant. According to Google, the number of books published in the world is 129,864,880. Now MachineHack is offering data scientists and machine learning engineers the chance to find unique patterns in authors’ writing styles and texts.
The dataset is based on English-language literature by 10 famous authors. The train and test data consist of short samples of text, where each sample is a set of 10 sentences. Each sample, irrespective of the number of words it contains, constitutes the X data, and the corresponding Y data is the author. The training and test data comprise 18,977 and 6,326 samples, respectively. The dataset has been collected over some time to gather the works of the best authors from many generations.
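For participants wondering where to begin, the X and Y structure described above maps naturally onto a text classification setup. The following is only a minimal baseline sketch, not the organisers' method: it assumes a simple word-count Naive Bayes model written with the Python standard library, and the class name `NaiveBayesAuthorModel`, the labels `author_a` and `author_b`, and the toy sentences are all hypothetical placeholders rather than samples from the hackathon dataset.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesAuthorModel:
    """Multinomial Naive Bayes over word counts with add-one smoothing."""

    def fit(self, texts, authors):
        self.word_counts = defaultdict(Counter)  # author -> word -> count
        self.doc_counts = Counter(authors)       # author -> number of samples
        self.vocab = set()
        for text, author in zip(texts, authors):
            tokens = tokenize(text)
            self.word_counts[author].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = len(texts)
        return self

    def predict(self, text):
        """Return the author with the highest log-probability for the text."""
        tokens = tokenize(text)
        best_author, best_score = None, -math.inf
        for author, n_docs in self.doc_counts.items():
            counts = self.word_counts[author]
            # Denominator includes the vocabulary size for add-one smoothing.
            total = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / self.total_docs)  # log prior
            for tok in tokens:
                score += math.log((counts[tok] + 1) / total)
            if score > best_score:
                best_author, best_score = author, score
        return best_author


# Hypothetical toy data: X is the text sample, Y is the author label.
train_texts = [
    "the whale rose from the sea and the ship sailed on",
    "the captain watched the sea from the deck of the ship",
    "she danced at the ball and spoke of marriage",
    "her sister hoped for a proposal at the ball",
]
train_authors = ["author_a", "author_a", "author_b", "author_b"]

model = NaiveBayesAuthorModel().fit(train_texts, train_authors)
prediction = model.predict("the ship left the harbour for the open sea")
print(prediction)
```

On the real data, each X entry would be a full 10-sentence sample and Y would be one of the 10 authors. A stronger entry would likely replace raw word counts with richer stylistic features, but a baseline like this gives a quick reference score to improve on.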