The Solution Approach Of Winners Of Fake News Content Detection Hackathon

Anurag Upadhyaya

MachineHack successfully conducted the twentieth installment of its weekend hackathon series this Monday. Fake News Content Detection: Weekend Hackathon #20 gave contestants an opportunity to develop an NLP model to combat fake content. Data science enthusiasts welcomed the hackathon, with over 229 registrations and active participation from close to 89 practitioners.

Of all the competitors, three topped our leaderboard. In this article, we introduce the winners and describe the approach each took to solve the problem.

#1| Salil Gautam

Salil is currently an Associate Data Scientist at Société Générale. He started with data science during his second year of college and has been participating in hackathons ever since. He has been lucky enough to win 30+ product-based and data science hackathons over these years.

Approach to solve the problem

Salil’s final submission is an ensemble of three models:

Model1 – CatBoost model using Auto ViML

Model2 – CatBoost model using sentence_embeddings [roberta-large tokens]

Model3 – RoBERTa-large [fine-tuned] using Simple Transformers

Model3 gave him a score of 1.63 on the public leaderboard. Finally, he ensembled all three models to create a robust final submission.
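The article does not say how the three models were combined. One common way to ensemble classifiers is to average their predicted class probabilities, optionally with weights. The sketch below assumes that approach; the function name, the equal default weights, and the sample numbers are all hypothetical and are not taken from Salil's solution.

```python
def average_probabilities(model_probs, weights=None):
    """Blend per-model class probabilities with a (weighted) mean.

    model_probs: list of models, each a list of rows, each row a list of
    class probabilities. All models must cover the same rows and classes.
    """
    n_models = len(model_probs)
    if weights is None:
        # Assumption: equal weights when none are supplied.
        weights = [1.0] * n_models
    total = sum(weights)
    n_rows = len(model_probs[0])
    n_classes = len(model_probs[0][0])
    blended = []
    for i in range(n_rows):
        row = [
            sum(w * probs[i][c] for w, probs in zip(weights, model_probs)) / total
            for c in range(n_classes)
        ]
        blended.append(row)
    return blended


# Hypothetical outputs of three models for two news items and three classes.
catboost_autoviml = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
catboost_embeddings = [[0.5, 0.4, 0.1], [0.1, 0.6, 0.3]]
roberta_large = [[0.7, 0.2, 0.1], [0.3, 0.3, 0.4]]

blend = average_probabilities(
    [catboost_autoviml, catboost_embeddings, roberta_large]
)
```

Averaging probabilities rather than hard labels keeps each model's confidence in play, which tends to matter when the leaderboard metric is computed on probabilities.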

“I have been a member of MachineHack since Dec 2018. I have seen the platform evolve, with more frequent competitions and a better interface that makes it more enjoyable to participate. Looking forward to more such events,” Salil shared his opinion about MachineHack.

#2| Eric Vos

Eric is not a data scientist. He studied industrial IT and robotics 30 years ago, and the curriculum covered the basics of traditional AI. A few years ago, he became curious about newer ML techniques such as neural networks and deep learning, and he followed some great MOOCs (Andrew Ng, Geoffrey Hinton, ...). To practice and improve his skills, he took part in various data science competitions and hackathons.

Approach to solve the problem

Eric started with regular EDA and quickly built a baseline using Auto ViML. His next kernel combined transformers and RoBERTa with Fastai, using slanted triangular learning rates, discriminative learning rates, and gradual unfreezing. Without tuning the parameters, he rapidly obtained state-of-the-art results. A first-level blend of his initial baseline with the Fastai output improved the result significantly. Later, he was also able to improve the results of Fastai kernel v2 with cosmetic tuning. The final blend of the three models' outputs gave him the best result on the leaderboard.
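Slanted triangular learning rates and discriminative learning rates both come from the ULMFiT fine-tuning recipe. The sketch below writes out those two formulas in plain Python to show their shape. It is not Eric's code and not the Fastai API; the function names are hypothetical, and the defaults (`cut_frac=0.1`, `ratio=32`, decay factor `2.6`) are the values suggested in the ULMFiT paper, not settings reported for this hackathon.

```python
import math


def slanted_triangular_lr(t, total_steps, lr_max=0.01, cut_frac=0.1, ratio=32):
    """Learning rate at step t: short linear warm-up, then long linear decay."""
    # Step at which the schedule switches from increasing to decreasing.
    cut = max(1, math.floor(total_steps * cut_frac))
    if t < cut:
        p = t / cut
    else:
        p = 1 - (t - cut) / (cut * (1 / cut_frac - 1))
    # The rate moves between lr_max / ratio and lr_max.
    return lr_max * (1 + p * (ratio - 1)) / ratio


def discriminative_lrs(top_lr, n_groups, decay=2.6):
    """One learning rate per layer group, lowest group first.

    The top group trains at top_lr; each group below it trains at the
    rate of the group above divided by `decay`.
    """
    return [top_lr / decay ** (n_groups - 1 - i) for i in range(n_groups)]


# Example: a 1000-step run peaks at step 100, then decays back down.
schedule = [slanted_triangular_lr(t, 1000) for t in range(1001)]
group_lrs = discriminative_lrs(1e-3, n_groups=3)
```

Gradual unfreezing, the third technique mentioned, would then train only the top layer group first and unlock one more group at a time, each with its own rate from `group_lrs`.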

“I participated in several MH hackathons and learned a lot from the solutions published by top Machine Hackers. It’s a great place to improve my machine learning skills and play with various original datasets. I like the ‘weekend’ format; it’s now my weekly brain sport,” Eric shared his opinion.

#3| Chandrashekhar Kanduri

Chandrashekhar graduated in information technology. During his sophomore year, one of his professors encouraged him to pursue a career in data science. Initially, he didn’t have a clue what it was. He started searching for information on it and found that the field is vast and challenging. Chandrashekhar is a math person, and he saw a close relationship between data science and maths. He took a six-month course in data science, which changed his perception of data: he found that everything is just data and patterns.

Approach to solve the problem

Chandrashekhar started by dropping duplicate rows and rows with null values, since there were very few of them. Then he used four sentence transformer models (RoBERTa large & base, BERT large & DistilBERT-base) on the Text column and applied a count vectorizer to the Text_Tag column. From these embeddings and count-vectorizer features he prepared four dataframes. He then trained a separate VotingClassifier model (CatBoost and LGBM) on each dataframe and averaged the results across all dataframes, which gave him the best score.

Further, he ensembled models with different numbers of estimators, which gave him the best solution.
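The cleaning and count-vectorizing steps described above can be illustrated with a minimal standard-library sketch. It is a stand-in for the pandas and scikit-learn calls a solution like this would normally use, not Chandrashekhar's code; the helper names, the comma-splitting of tags, and the sample rows are hypothetical. Only the column roles (Text and Text_Tag) come from the article.

```python
from collections import Counter


def drop_duplicates_and_nulls(rows):
    """Keep the first copy of each (text, tag) pair and skip rows with a missing value."""
    seen = set()
    clean = []
    for text, tag in rows:
        if text is None or tag is None:
            continue
        key = (text, tag)
        if key in seen:
            continue
        seen.add(key)
        clean.append((text, tag))
    return clean


def count_vectorize(documents):
    """Turn each document into a row of token counts over a shared vocabulary."""
    # Assumption: tags are separated by commas or whitespace.
    tokenized = [doc.lower().replace(",", " ").split() for doc in documents]
    vocabulary = sorted({tok for toks in tokenized for tok in toks})
    matrix = []
    for toks in tokenized:
        counts = Counter(toks)
        matrix.append([counts[tok] for tok in vocabulary])
    return vocabulary, matrix


# Hypothetical (Text, Text_Tag) rows.
rows = [
    ("Claim about taxes", "economy,taxes"),
    ("Claim about taxes", "economy,taxes"),  # exact duplicate
    ("Claim with no tag", None),             # null value
    ("Claim about schools", "education"),
]

clean = drop_duplicates_and_nulls(rows)
vocab, tag_counts = count_vectorize([tag for _, tag in clean])
```

In the approach described, count features like `tag_counts` would be joined with the sentence embeddings of the Text column to form each of the four training dataframes.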

“MachineHack is an amazing platform. I have been participating in MachineHack hackathons, and it has improved my fundamentals. This is my second hackathon win. I strongly believe MachineHack is providing a great platform for data science aspirants. The platform conducts hackathons on a variety of problems, and I met some cool people through MachineHack,” Chandrashekhar shared his opinion.

Copyright Analytics India Magazine Pvt Ltd