Stop Bashing ML Hackathons Already, Because They Are Not Close To Real-World

What a gym is for athletes, hackathon platforms are for data scientists and machine learning professionals – a great place to practice and learn.

For years, people have been comparing machine learning and data science hackathons with real-world projects. Yet the debates are never-ending and often ambiguous.

Consider online hackathon platforms like Kaggle or MachineHack. These platforms allow users to find and publish datasets, explore and build models in a web-based data science environment, collaborate with other data scientists and machine learning engineers, and enter competitions to solve data science and machine learning challenges at every experience level, from beginner to intermediate and expert.

Hackathon platforms have been serving as a test-bed for data scientists and machine learning professionals. According to Kaggle, more than 55 per cent of data scientists have less than three years of experience, and six per cent have been using machine learning for more than a decade.

Participating in hackathons brings far more gains than losses. Some of the benefits include:

  • Learning and collaborating. Participants get to network with like-minded people and discuss their approaches to the problems. Plus, working in groups helps them approach a problem from new perspectives and collaborate to achieve results.
  • Experimenting with many SOTA approaches and datasets.
  • At times, you end up making great contacts and landing an awesome job by showcasing your passion and skills to the world.
  • It is fun to participate and see how you fare on the leaderboard.
  • If you win, the prize money is always a bonus, but that should not be the only criterion for taking part in hackathons.

In this article, we will talk about the differences between hackathon platforms and real-world machine learning projects and draw a clear comparison between the two.

ML Lifecycle 

Before we delve into the difference between hackathons and real-world machine learning projects, let’s look at the lifecycle of a machine learning project. As explained by Steve Nouri, founder, AI4Diversity, it typically involves:

  1. Scoping the project 
  2. Collecting the data
  3. Training the model 
  4. Deploying in production  
  5. Repeating steps 2, 3 and 4
A lifecycle of a machine learning project (Source: Neptune.ai) 

Bashing Hackathons 

Many industry experts believe that hackathon platforms are an amazing way to experiment and learn, but that they align with only a single stage of the ML lifecycle: training the model. When a data scientist builds a model in the real world and optimises a metric, they also need to consider the RoI, inference and re-training costs, and costs in general. That piece of the puzzle is completely missing on hackathon platforms.
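To make the cost argument concrete, here is a minimal Python sketch of how a leaderboard winner can lose once running costs are counted. Everything in it is an assumption made for illustration: the two candidate models, their accuracy and cost figures, the `net_value` formula and the workload numbers are hypothetical and do not come from any real project or from the experts quoted here.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    accuracy: float         # offline metric, as shown on a leaderboard
    inference_cost: float   # cost per 1,000 predictions (hypothetical units)
    retraining_cost: float  # cost of one re-training run (hypothetical units)


def net_value(model, value_per_correct, predictions, retrains):
    """Value of correct predictions minus serving and re-training costs."""
    benefit = model.accuracy * predictions * value_per_correct
    serving = model.inference_cost * predictions / 1000
    upkeep = model.retraining_cost * retrains
    return benefit - serving - upkeep


# Hypothetical candidates: a heavy ensemble with a slightly better score
# and a small model that is much cheaper to serve and re-train.
candidates = [
    Candidate("large_ensemble", accuracy=0.912, inference_cost=4.0, retraining_cost=300.0),
    Candidate("small_model", accuracy=0.905, inference_cost=0.5, retraining_cost=20.0),
]


def yearly_value(model):
    # Assumed workload: one million predictions and twelve re-trainings a year.
    return net_value(model, value_per_correct=0.01, predictions=1_000_000, retrains=12)


best_on_leaderboard = max(candidates, key=lambda m: m.accuracy)
best_on_value = max(candidates, key=yearly_value)

print("Best on leaderboard:", best_on_leaderboard.name)
print("Best on net value:", best_on_value.name)
```

Under these made-up numbers the two rankings disagree, which is the point the experts are making: a hackathon scores only the first ranking, while a business cares about the second.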

“To drive the adoption of an ML model within the business stakeholders, it is important we think about ‘interpretability’ as well,” said Sushanth Dasari, data scientist at Trust, adding that interpretability drives many key decisions at each step of the life cycle, which is never the case in a hackathon.

“In real-world ML projects, 90 per cent of the time is spent on acquiring, cleaning and processing the data, often querying different databases and merging this data. The quality of the input data needs to be carefully assessed and checked for correctness, integrity, and consistency,” said Daniele Gadler, data scientist at ONE LOGIC GmbH. 

Further, he said that once the ML model has been developed and deployed, a lot of time goes into monitoring it and re-training it on newly ingested data (MLOps). In hackathons, by contrast, the data is already provided and is generally cleaner than in real-world projects. Furthermore, there are no concerns about real-world issues such as model stability, maintainability and deployability. “You can just focus on developing a super-complex ‘unmaintainable’ huge model with the goal of obtaining the best performance on the data provided for the competition, hoping it will generalise on newly unseen data,” said Gadler.
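The monitoring and re-training work described above can be pictured with a small Python sketch. It is only a crude illustration under stated assumptions: the `drift_score` and `needs_retraining` helpers, the threshold of 2.0 and the sample values are all hypothetical, and a simple shift in the mean stands in for the more careful drift tests a real MLOps setup would use.

```python
from statistics import mean, stdev


def drift_score(reference, recent):
    """How many reference standard deviations the recent mean has moved."""
    spread = stdev(reference)
    if spread == 0:
        # A constant reference feature: any change at all counts as drift.
        return 0.0 if mean(recent) == mean(reference) else float("inf")
    return abs(mean(recent) - mean(reference)) / spread


def needs_retraining(reference, recent, threshold=2.0):
    """Flag a re-training run when the drift score exceeds the threshold."""
    return drift_score(reference, recent) > threshold


# Hypothetical values of one input feature: at training time, and in two
# batches of newly ingested production data.
training_feature = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2]
stable_batch = [10.0, 10.1, 9.9, 10.2]
shifted_batch = [12.9, 13.1, 13.0, 12.8]

for label, batch in [("stable", stable_batch), ("shifted", shifted_batch)]:
    print(label, "-> retrain" if needs_retraining(training_feature, batch) else "-> keep")
```

In a hackathon the dataset is frozen, so a check like this is never needed. In production it has to run on every new batch, and someone has to own what happens when it fires.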

Joseph Wehbe, co-founder and CEO of DAIMLAS.com, said that time is wasted on hackathon platforms improving accuracy by 0.000001, which you do not do in the real world. Hackathons focus on only one performance metric, whereas in the real world you focus on scalability, speed, deployment and cost. “You don’t learn how to clean raw data. You don’t learn understanding the business problem, deployment skills, team skills interacting with leadership, and analysis to understand what ‘business problem’ you are trying to solve,” he added.

So what? 

Hackathon platforms like Kaggle and MachineHack push users to explore new problems, and they also help them understand the science well enough to do real-world work.

Hackathon problems can be as real as real-world ones; only the environments are different. What a gym is for athletes, hackathon platforms are for data scientists and machine learning professionals: a great place to practice and learn.

Amit Raja Naik
Amit Raja Naik is a seasoned technology journalist who covers everything from data science to machine learning and artificial intelligence for Analytics India Magazine, where he examines the trends, challenges, ideas, and transformations across the industry.
