MachineHack successfully conducted the eighteenth chapter of its weekend hackathon series this Monday. Detecting Anomalies in Wafer Manufacturing: Weekend Hackathon #18 gave contestants an opportunity to revisit their classification skills by classifying products into two classes. The hackathon was warmly welcomed by data science enthusiasts, with over 280 registrations and active participation from close to 180 practitioners.
Of the more than 280 registrants, three topped our leaderboard. In this article, we introduce the winners and describe the approach each took to solve the problem.
To build his model for this hackathon, Manoj first trained a cost-sensitive LGBM model, then moved on to the AutoGluon package with stacking and hyperparameter optimization.
His journey with data science started in 2015 when he joined IIIT Bangalore. There he started learning and exploring ML, after which he joined Amazon as an SDE and took a break from ML. Last year he returned to ML when he joined the ML team at Amazon and began working on projects that directly affect the people around us, including buyers and sellers on Amazon.
He would like to thank MachineHack for giving him a chance to keep up to date with current trends.
“This is my first hackathon on MachineHack. Overall experience was good, I hope in future we see hackathons solving real-life issues. A bit disappointed with discontinuation of Goodies 😛 ” – Manoj shared his opinion about MachineHack
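The write-up does not say how Manoj made his LGBM model cost-sensitive. One common way to do it is to give the rarer class a larger weight in the loss. The sketch below illustrates that idea in plain Python under that assumption; the helper names, the "balanced" weighting heuristic, and the toy labels are all hypothetical and not taken from his solution.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def weighted_log_loss(labels, probs, weights, eps=1e-15):
    """Binary log loss in which each row is scaled by its class weight."""
    total = 0.0
    weight_sum = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)  # keep log() away from 0 and 1
        w = weights[y]
        total += -w * (y * math.log(p) + (1 - y) * math.log(1 - p))
        weight_sum += w
    return total / weight_sum


# Made-up labels: 8 normal items (0) and 2 anomalous ones (1).
labels = [0] * 8 + [1] * 2
weights = balanced_class_weights(labels)

# A model that misses both anomalies is penalised more heavily
# than one that raises two false alarms on normal items.
missed_anomalies = [0.1] * 8 + [0.1, 0.1]
false_alarms = [0.9, 0.9] + [0.1] * 6 + [0.9, 0.9]
print(weights)
print(weighted_log_loss(labels, missed_anomalies, weights))
print(weighted_log_loss(labels, false_alarms, weights))
```

With weights like these, a mistake on the minority class costs more than a mistake on the majority class, which is the general aim of cost-sensitive training on imbalanced data.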
To create a robust anomaly detection model, Neptune stacked 3 different models to generate a final model using the AutoGluon package.
Some of the quick insights are as follows:
- Model 1 – Train AutoGluon with AutoStacking, 5 folds, and 5 bagging folds
- Model 2 – Overriding Model 1 outputs with overlapping training data points
- Model 3 – AutoGluon with hyperparameter tuning + outlier detection outputs as features
- Final model – Average the outputs from models 1, 2, and 3
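The final averaging step in the list above can be sketched in a few lines of plain Python. This is only an illustration: the function name and the probability values are made up, and the article does not say whether Neptune averaged probabilities or some other model output.

```python
def average_predictions(*model_outputs):
    """Return the row-wise mean of several equally long prediction lists."""
    if not model_outputs:
        raise ValueError("at least one list of predictions is required")
    n_rows = len(model_outputs[0])
    if any(len(outputs) != n_rows for outputs in model_outputs):
        raise ValueError("all models must predict the same number of rows")
    return [sum(row) / len(row) for row in zip(*model_outputs)]


# Hypothetical anomaly probabilities from three models on three test rows.
model_1 = [0.10, 0.80, 0.40]
model_2 = [0.20, 0.90, 0.60]
model_3 = [0.30, 0.70, 0.50]

blended = average_predictions(model_1, model_2, model_3)
print(blended)  # roughly 0.2, 0.8, 0.5 (up to floating-point rounding)
```

A simple unweighted mean treats the three models as equally reliable; a weighted mean would be a natural variation if one model were known to be stronger.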
He has always been excited about math and numerical computing, which naturally led him to a career in data science. He has been in this space for more than 10 years now and has worked on a variety of ML problems across industries. Over time, however, he lost touch with hands-on modeling. Now he is dusting off his programming skills and playing with ML recipes over the weekends, which is great fun!
“Machine Hack has provided the perfect platform for me to practice hands-on ML with interesting problem statements that are byte-sized and can be tackled in a short period of time. Also, a competitive setting is best suited to tune pipelines that would give robust and high performing models. Appreciate AIM for this great service to the learners in ML community” – Neptune shared his opinion.
Nacir explains his solution approach as follows.
- Eliminating features with a low sum of ones.
- Applying gradient boosting classifiers, but none of them scored above 90.
- Applying logistic regression, which gave better results (a score above 91).
- Ranking features with recursive feature elimination and cross-validation (RFECV), but the score didn’t improve.
- Trying TabNet, which got me the best public score (above 93). The key was using AdamW with a 15-fold StratifiedKFold.
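Two of the steps above, dropping features with a low sum of ones and splitting the data into stratified folds, can be sketched in plain Python. This is a simplified illustration, not Nacir's code: the function names, the threshold, and the toy data are hypothetical, and the fold assignment is a deterministic round-robin rather than the shuffled split a library such as scikit-learn would offer.

```python
from collections import defaultdict


def drop_sparse_binary_columns(rows, min_ones):
    """Keep only the columns whose count of ones is at least min_ones."""
    n_cols = len(rows[0])
    column_sums = [sum(row[j] for row in rows) for j in range(n_cols)]
    keep = [j for j, total in enumerate(column_sums) if total >= min_ones]
    filtered = [[row[j] for j in keep] for row in rows]
    return keep, filtered


def stratified_fold_ids(labels, n_folds):
    """Assign a fold id to every row, spreading each class evenly over folds."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    fold_ids = [None] * len(labels)
    for indices in by_class.values():
        # Deal the rows of each class out to the folds in turn.
        for position, idx in enumerate(indices):
            fold_ids[idx] = position % n_folds
    return fold_ids


# Toy binary feature matrix: the middle column contains a single one.
rows = [
    [1, 0, 1],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 1],
]
kept_columns, filtered_rows = drop_sparse_binary_columns(rows, min_ones=2)

# Toy labels: four normal items and two anomalies, split into 2 folds.
labels = [0, 0, 0, 0, 1, 1]
fold_ids = stratified_fold_ids(labels, n_folds=2)

print(kept_columns)
print(filtered_rows)
print(fold_ids)
```

Stratifying matters most when one class is rare, because a plain random split can leave some folds with very few or no positive examples.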
His journey in data science started 5 months ago. Back then, he had no clue what a dataframe was, so he started digging into this great field, a course a day, learning everything he came across. He started with statistics and machine learning, then moved on to deep learning and to learning different frameworks. For the first time in his life, he feels he is doing something he loves, and he can’t get enough of it. Once he had a bigger picture of what data science is, he began participating in competitions on different platforms to apply what he had learned to real-life problems.
“MachineHack is one of them. I met great minds and skilled people who helped me in different ways. I learned a lot from each one of them. Without your help, this would not have been possible! I still have too much to learn, a lot of competitions to participate in, and a long way to go, but ‘it’s all about the journey, not the destination’.” – Nacir shared his opinion.