Build The Next Best Code Curator With MachineHack’s New Hackathon

“Can you come up with an algorithm that can predict the bugs, features, and questions based on GitHub titles?”

An average smartphone OS contains more than 10 million lines of code. Printed out, a million lines of code take about 18,000 pages, roughly 14 copies of Tolstoy’s War and Peace. There is usually a simpler, shorter version of the code alongside a longer, more exhaustive one.

The number of tools, languages, techniques, and applications that the machine learning ecosystem has nurtured can be overwhelming to a developer. Even more daunting is keeping the code from going stale. Hidden technical debt within a pipeline can make the product dysfunctional. So, what if there were a tool that did this job for us, serving us clean code and answering all our queries?

If you are one of those ML fanatics who think that this can and should be done, then you should definitely check out this new hackathon brought to you by MachineHack in association with Embold.

Embold.io is a software quality platform that helps companies achieve quality code in a short time through an easy-to-navigate interface. Embold combines machine learning, rigorous statistical algorithms, and powerful programming techniques to develop cutting-edge products for the industry.

Why Should You Participate?

  • A chance to win bounties worth INR 25,000 by competing against top MachineHackers.
  • An opportunity to deploy state-of-the-art language models like BERT.
  • Exposure to solving use cases at the organizational level.

Overview Of The Hackathon

In this hackathon, we are challenging the machine learning community to come up with an algorithm that can predict bugs, features, and questions from GitHub titles and text bodies. Text data can pose a lot of challenges, especially when the dataset is big.

Leverage state-of-the-art NLP models like BERT and other pretrained models at your disposal to come up with the best model. The winner’s model will be evaluated using a code quality score check on the Embold Code Analysis platform.

Dataset Description:

  • Training set: 150,000 rows x 3 columns (includes the Label column as the target variable)
  • Test set: 30,000 rows x 2 columns

Attribute Description:

  • Title – the title of the GitHub bug, feature, or question
  • Body – the body of the GitHub bug, feature, or question
  • Label – the class of the entry (the target variable)
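
To make the task concrete, here is a minimal baseline sketch of the classification setup described above. It is not the hackathon's reference solution and it does not use BERT: it swaps in a simple multinomial Naive Bayes classifier written with only the Python standard library. The column names Title, Body, and Label come from the attribute description; the label values ("bug", "feature", "question") and the sample rows are hypothetical stand-ins for the real dataset, which may encode labels differently.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def combine(row):
    """Join the Title and Body columns into a single text field."""
    return f"{row['Title']} {row['Body']}"


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, rows):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.class_counts = Counter()            # label -> number of rows
        self.vocab = set()
        for row in rows:
            label = row["Label"]
            tokens = tokenize(combine(row))
            self.class_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, row):
        tokens = tokenize(combine(row))
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical sample rows; the real training set has 150,000 rows.
train_rows = [
    {"Title": "App crashes on startup", "Body": "Crash with a null pointer error after update", "Label": "bug"},
    {"Title": "Error when saving file", "Body": "Saving fails and throws an exception", "Label": "bug"},
    {"Title": "Add dark mode", "Body": "Please add support for a dark theme option", "Label": "feature"},
    {"Title": "Support export to CSV", "Body": "It would be nice to add an export option", "Label": "feature"},
    {"Title": "How do I configure the proxy?", "Body": "What is the right way to set proxy settings?", "Label": "question"},
    {"Title": "How to install on Windows?", "Body": "What steps do I need to install this?", "Label": "question"},
]

model = NaiveBayes().fit(train_rows)

# Test rows carry only Title and Body, matching the 2-column test set.
test_row = {"Title": "Crash when opening settings", "Body": "The app throws an error and crashes"}
prediction = model.predict(test_row)
print(prediction)
```

A bag-of-words baseline like this is mainly useful as a sanity check on the data pipeline before moving on to a pretrained model such as BERT.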

Anurag Upadhyaya

Experienced Data Scientist with a demonstrated history of working in the Industrial IoT (IIoT), Industry 4.0, Power Systems, and Manufacturing domains. I have experience in designing robust solutions for various clients using Machine Learning, Artificial Intelligence, and Deep Learning. I have been instrumental in developing end-to-end solutions from scratch and deploying them independently at scale.
