The Solution Approach Of Winners Of Fake News Content Detection Hackathon

MachineHack successfully concluded the twentieth installment of its weekend hackathon series this Monday. Fake News Content Detection: Weekend Hackathon #20 gave contestants an opportunity to develop an NLP model to combat the problem of fake content. Data science enthusiasts welcomed the hackathon warmly, with over 229 registrations and active participation from close to 89 practitioners.

Out of the 229 competitors, three topped our leaderboard. In this article, we will introduce you to the winners and describe the approach they took to solve the problem.

#1| Salil Gautam

Salil is currently an Associate Data Scientist at Société Générale. He took up data science during his second year of college and has been participating in hackathons ever since. He has been lucky enough to win 30+ product-based and data science hackathons over these years.

Approach to solve the problem

Salil’s final submission is an ensemble of three models:

Model1 – Catboost Model using Auto ViML

Model2 – Catboost Model using sentence_embeddings [roberta-large tokens]

Model3 – RoBERTa-large [fine tuned] using simple transformers 

Model3 alone gave him a score of 1.63 on the public leaderboard. Finally, he ensembled all three models to create a robust final submission.
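The article does not say how the three models were combined. One common way to ensemble classifiers is to take a weighted average of their predicted class probabilities, and the sketch below illustrates that idea only. The function name `blend_probabilities`, the weights, and the toy probabilities are hypothetical and are not taken from Salil's solution.

```python
def blend_probabilities(model_outputs, weights=None):
    """Weighted average of per-class probabilities from several models.

    model_outputs: list of predictions, one per model; each prediction is a
    list of rows, and each row is a list of class probabilities.
    weights: optional list of one weight per model (defaults to equal weights).
    """
    n_models = len(model_outputs)
    if weights is None:
        weights = [1.0] * n_models
    total = sum(weights)

    n_rows = len(model_outputs[0])
    n_classes = len(model_outputs[0][0])
    blended = []
    for r in range(n_rows):
        row = []
        for c in range(n_classes):
            # Normalise by the weight total so each row still sums to 1.
            value = sum(w * out[r][c] for w, out in zip(weights, model_outputs))
            row.append(value / total)
        blended.append(row)
    return blended


# Toy probabilities for one sample and two classes (made up for illustration).
model1 = [[0.6, 0.4]]
model2 = [[0.2, 0.8]]
model3 = [[0.4, 0.6]]
blended = blend_probabilities([model1, model2, model3])
```

Giving a larger weight to the strongest single model is a common variation, for example `weights=[1, 1, 2]`.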

“I have been a member of MachineHack since Dec 2018. I have seen the platform evolve, and more frequent competitions and a better interface now make it more enjoyable to participate. Looking forward to more such events,” Salil said, sharing his opinion about MachineHack.

#2| Eric Vos

Eric is not a data scientist by training. He studied industrial IT and robotics 30 years ago, and the basics of traditional AI were covered as part of the curriculum. A few years ago, he became curious about newer ML techniques such as neural networks and deep learning, and followed some great MOOCs (Andrew Ng, Geoffrey Hinton, ...). To practice and improve the skills he learned, he took part in various data science competitions and hackathons.

Approach to solve the problem

Eric started with regular EDA and quickly built a baseline using AutoViML. His next kernel combined transformers and RoBERTa with fastai, using slanted triangular learning rates, discriminative learning rates, and gradual unfreezing. Without tuning the parameters, he quickly obtained state-of-the-art results. A first-level blend of his initial baseline with the fastai output improved the result significantly. Later, he was also able to improve the results of the fastai kernel (v2) with cosmetic tuning. The final blend of the three models' outputs gave him his best result on the leaderboard.
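Two of the techniques named above come from the ULMFiT paper (Howard and Ruder, 2018) and are easy to state as formulas. The sketch below follows the paper's definitions, not Eric's kernel, which the article does not show. The function names are hypothetical, and the defaults (`cut_frac=0.1`, `ratio=32`, `lr_max=0.01`, and the per-layer factor of 2.6) are the values suggested in the paper.

```python
import math


def slanted_triangular_lr(t, total_steps, lr_max=0.01, cut_frac=0.1, ratio=32):
    """Learning rate at step t: short linear warm-up, then long linear decay.

    cut_frac is the fraction of steps spent increasing the rate, and ratio is
    how much smaller the lowest rate is than lr_max.
    """
    cut = math.floor(total_steps * cut_frac)
    if t < cut:
        p = t / cut  # rising part of the triangle
    else:
        p = 1 - (t - cut) / (cut * (1 / cut_frac - 1))  # falling part
    return lr_max * (1 + p * (ratio - 1)) / ratio


def discriminative_lrs(top_lr, n_groups, factor=2.6):
    """Per-layer-group rates, lowest layer first; each lower group is
    trained with a rate `factor` times smaller than the group above it."""
    return [top_lr / factor ** (n_groups - 1 - i) for i in range(n_groups)]


schedule = [slanted_triangular_lr(t, total_steps=100) for t in range(101)]
group_lrs = discriminative_lrs(top_lr=0.01, n_groups=3)
```

With 100 steps, the schedule peaks at step 10 and then decays back to `lr_max / ratio`. Gradual unfreezing, the third technique, would train only the top layer group first and unfreeze one more group per epoch.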

“I participated in several MH hackathons and learned a lot from published solutions from top Machine Hackers. It’s a great place to improve my machine learning skills and play with various original datasets. I like the ‘weekend’ format, it’s now my weekly brain sport,” Eric said, sharing his opinion.

#3| Chandrashekhar Kanduri

Chandrashekhar graduated in information technology. During his sophomore year, one of his professors encouraged him to pursue a career in data science. Initially, he didn’t have a clue what it was. He started searching for information on it and found that the field is vast and challenging. Chandrashekhar is a math person, and he saw a close relationship between data science and maths. He then took a six-month course in data science, which changed his perception of data: he found that everything is just data and patterns.

Approach to solve the problem

Chandrashekhar started by dropping duplicate rows and null values, since there were very few such rows. He then used four sentence transformer models (RoBERTa large and base, BERT large, and DistilBERT base) on the Text column and applied a count vectorizer to the Text_Tag column. From the embeddings and the count-vectorizer output he prepared four dataframes, one per embedding model. He then trained separate VotingClassifier models (CatBoost and LGBM) on each dataframe and averaged the results across all dataframes, which gave him his best score.

Further, he ensembled models with different numbers of estimators, which gave him his best solution.
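To make the count-vectorizer step concrete, here is a minimal pure-Python sketch of what such a vectorizer does: build a vocabulary from the text and count how often each term appears in each row. It is a simplified stand-in written for illustration, not scikit-learn's `CountVectorizer` itself and not Chandrashekhar's code. The function name `count_vectorize` and the sample tag strings are hypothetical, and the tokenisation rule (lowercased words of two or more characters) is an assumption chosen to mirror scikit-learn's default.

```python
import re
from collections import Counter

# Words of two or more characters, as in scikit-learn's default token pattern.
TOKEN_PATTERN = re.compile(r"\b\w\w+\b")


def count_vectorize(documents):
    """Return (vocabulary, matrix) where matrix[i][j] is the number of times
    vocabulary[j] occurs in documents[i]."""
    tokenised = [TOKEN_PATTERN.findall(doc.lower()) for doc in documents]
    # Sorted vocabulary gives a stable column order.
    vocabulary = sorted({token for tokens in tokenised for token in tokens})
    matrix = []
    for tokens in tokenised:
        counts = Counter(tokens)
        matrix.append([counts[term] for term in vocabulary])
    return vocabulary, matrix


# Hypothetical tag strings; the real Text_Tag values are not shown in the article.
tags = ["economy,taxes", "health-care", "economy,jobs"]
vocabulary, tag_counts = count_vectorize(tags)
```

The resulting count columns can be placed alongside the sentence embeddings of the Text column to form one feature table per embedding model.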

“MachineHack is an amazing platform. I have been participating in MachineHack hackathons and it has improved my fundamentals. This is my second winning hackathon. I strongly believe MachineHack is providing a great platform for data science aspirants. This platform conducts hackathons on a variety of problems and I met some cool people through MachineHack,” Chandrashekhar said, sharing his opinion.


Anurag Upadhyaya
Experienced Data Scientist with a demonstrated history of working in Industrial IOT (IIOT), Industry 4.0, Power Systems and Manufacturing domain. I have experience in designing robust solutions for various clients using Machine Learning, Artificial Intelligence, and Deep Learning. I have been instrumental in developing end to end solutions from scratch and deploying them independently at scale.
