Last updated February 2, 2021

Product Sentiment Classification: Weekend Hackathon #19

Published on September 4, 2020
by Anurag Upadhyaya

We are back with another weekend hackathon, and this weekend we are challenging the MachineHack community to build an NLP model that analyzes sentiment in reviews of various electronic products.

Analyzing sentiment about products such as tablets, mobiles, and other gizmos can be both fun and difficult, especially when the reviews are collected across demographics around the world. In this weekend hackathon, we challenge the MachineHack community to develop a machine learning model that accurately classifies product reviews into 4 sentiment classes based on the raw review text provided by the user. Analyzing these sentiments will not only help us serve customers better but can also reveal customer traits that are present or hidden in the reviews.

The challenge will start on 4th Sep Friday at 6 pm IST.

Click here to participate

Problem Statement & Description

Sentiment analysis requires a lot to be taken into account, mainly because of the preprocessing needed to turn raw text into a machine-understandable representation. Usually, we stem and lemmatize the raw text and then represent it using TF-IDF, word embeddings, etc. However, with state-of-the-art NLP models such as Transformer-based BERT, one can skip manual feature engineering like TF-IDF and count vectorizers.

The dataset has close to 9,000 rows and 4 columns, and the reviews are in the form of raw text. Each review in the training set is labeled as positive, negative, no sentiment, or cannot say (a neutral sentence).

In this short span of time, we encourage you to leverage NLP's ImageNet moment (transfer learning) by using pre-trained models to classify the product reviews correctly, with multi-class log loss as the metric.

You are given raw customer reviews of various types of products, each with one of 4 sentiment classes. Your objective as a data scientist is to build a natural language processing model that classifies the sentiment of each review as accurately as possible.
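As a point of comparison for the classical route described above, here is a minimal sketch of a TF-IDF baseline with scikit-learn. The four reviews and their labels are made up for illustration and are not taken from the hackathon data; the choice of logistic regression and of unigram plus bigram features is an assumption, not a recommended recipe.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny made-up reviews, for illustration only (not from the hackathon data).
texts = [
    "love this tablet, the screen is great",
    "great phone and great battery",
    "terrible battery, the phone died in a day",
    "awful screen, terrible tablet",
]
labels = [2, 2, 1, 1]  # 2 = Positive, 1 = Negative

# TF-IDF turns each review into a sparse, weighted bag-of-words vector;
# logistic regression then learns a weight for every term.
baseline = make_pipeline(
    TfidfVectorizer(lowercase=True, ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
baseline.fit(texts, labels)

# predict_proba returns one row per review and one column per class seen
# during fit, ordered as in baseline.classes_.
probabilities = baseline.predict_proba(["great tablet", "terrible phone"])
```

On the real training file, all four classes would be present, so `predict_proba` would return four columns per review.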

Dataset Description:

The unzipped folder will have the following files.

Train.csv – 6364 rows x 4 columns (Includes Sentiment Column as Target)
Test.csv – 2728 rows x 3 columns
Sample Submission.csv – sample format for submission file.

How to Generate a valid Submission File

Most sklearn classifiers support the predict_proba() method, which generates the probabilities for every class.

You should submit a .csv/.xlsx file with exactly 2728 rows and 4 columns (one column per class). Your submission will return an Invalid Score if it has extra columns or rows.

The file should have exactly 4 columns (0-3), one for each class.
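The shape requirements above can be checked before uploading. The following is a rough sketch using only the Python standard library; the function name `write_submission` is hypothetical, and the header row `0,1,2,3` is an assumption based on the "(0-3)" hint, so compare it against Sample Submission.csv before relying on it.

```python
import csv
import io

N_CLASSES = 4         # classes 0-3, per the attribute description
EXPECTED_ROWS = 2728  # number of rows in Test.csv


def write_submission(probabilities, out_file, expected_rows=EXPECTED_ROWS):
    """Validate per-class probabilities and write them as CSV to out_file."""
    if len(probabilities) != expected_rows:
        raise ValueError(f"expected {expected_rows} rows, got {len(probabilities)}")
    for i, row in enumerate(probabilities):
        if len(row) != N_CLASSES:
            raise ValueError(f"row {i} has {len(row)} columns, expected {N_CLASSES}")
        if abs(sum(row) - 1.0) > 1e-6:
            raise ValueError(f"row {i} does not sum to 1")
    writer = csv.writer(out_file)
    # Assumed header: the class indices 0-3 (check Sample Submission.csv).
    writer.writerow(range(N_CLASSES))
    writer.writerows(probabilities)


# Demo with a uniform placeholder prediction; real rows would come from
# something like model.predict_proba(test_texts).
uniform = [[0.25] * N_CLASSES for _ in range(EXPECTED_ROWS)]
buffer = io.StringIO()
write_submission(uniform, buffer)
lines = buffer.getvalue().splitlines()
```

To produce an actual file, pass a handle opened with `open("submission.csv", "w", newline="")` in place of the in-memory buffer.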

Attribute Description:

Text_ID – Unique Identifier
Product_Description – Description of the product review by a user
Product_Type – Different types of product (9 unique products)
Class – Represents various sentiments
- 0 – Cannot Say
- 1 – Negative
- 2 – Positive
- 3 – No Sentiment
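For reading model output, the class codes above can be mapped back to their names. This is a small illustrative sketch; the helper name `most_likely_sentiment` and the example probabilities are made up, and the submission itself still needs the full probability row rather than a single label.

```python
# Class codes as listed in the attribute description.
CLASS_NAMES = {0: "Cannot Say", 1: "Negative", 2: "Positive", 3: "No Sentiment"}


def most_likely_sentiment(row):
    """Return the name of the class with the highest predicted probability."""
    best_index = max(range(len(row)), key=lambda i: row[i])
    return CLASS_NAMES[best_index]


# Made-up probabilities for classes 0, 1, 2, 3.
example = most_likely_sentiment([0.05, 0.10, 0.70, 0.15])
```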

Skills:

NLP, Sentiment Analysis
Feature extraction from raw text using TF-IDF, CountVectorizer
Using Word Embedding to represent words as vectors
Using Pretrained models like Transformers, BERT
Optimizing multi-class log loss to generalize well on unseen data

The datasets will be made available for download on Sep 4th, Friday at 6 pm IST.

This hackathon and the bounty will expire on Sep 7th, Monday at 7 am IST.


Bounties

The top 3 competitors in this competition will receive a free pass to the Deep Learning DevCon 2020

We have also introduced a new set of prizes going forward.

Participants who finish in the top 3 of the private leaderboard in 3 continuous Weekend Hackathons will be interviewed for #HackeroftheMonth.
Stand a chance to get an exclusive interview about your Data Science/Machine Learning journey with Analytics India Magazine

Who is the #hackerofthemonth?

Any participant can become #hackerofthemonth by proving their mettle on the weekend hackathon leaderboards. We will award the #hackerofthemonth community recognition to participants who finish in the top 3 in 3 consecutive weekend hackathons. Yes, you got it right, it’s a hat-trick!

Stand a chance to get interviewed by the biggest AI/ML media house in the country about your Data Science and Machine Learning journey.

Please note this PRIZE is only for the Weekend Hackathon series of competitions.


Rules

One account per participant. Submissions from multiple accounts will lead to disqualification
The submission limit for the hackathon is 10 per day, after which submissions will not be evaluated
All registered participants are eligible to compete in the hackathon
This competition counts towards your overall ranking points
We ask that you respect the spirit of the competition and do not cheat
This hackathon will expire on 7th September, Monday at 7 am IST
Use of any external dataset is prohibited and doing so will lead to disqualification

Evaluation

The submission will be evaluated using the Log Loss metric. One can use sklearn.metrics.log_loss to calculate the same
This hackathon supports private and public leaderboards
The public leaderboard is evaluated on 30% of Test data
The private leaderboard, which will be made available at the end of the hackathon, is evaluated on 100% of the Test data
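To make the metric concrete, here is a minimal pure-Python sketch of multi-class log loss: the average negative log of the probability assigned to the true class. It assumes each row already sums to 1 and that labels are the column indices 0-3; the function name and the toy predictions are illustrative, and the platform's exact scoring code is not shown in this article.

```python
import math


def multiclass_log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-probability assigned to the true class."""
    total = 0.0
    for label, row in zip(y_true, y_prob):
        # Clip so a confidently wrong prediction does not hit log(0).
        p = min(max(row[label], eps), 1.0 - eps)
        total -= math.log(p)
    return total / len(y_true)


y_true = [0, 1, 2, 3]

# Predicting 0.25 for every class scores ln(4), about 1.386.
uniform = [[0.25, 0.25, 0.25, 0.25]] * 4
uniform_loss = multiclass_log_loss(y_true, uniform)

# Putting 0.7 on the correct class scores -ln(0.7), about 0.357.
confident = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.1, 0.1, 0.7, 0.1],
    [0.1, 0.1, 0.1, 0.7],
]
confident_loss = multiclass_log_loss(y_true, confident)
```

Lower is better, and the uniform score of about 1.386 is a useful reference point: a submission scoring worse than that is doing worse than guessing all four classes equally.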


Anurag Upadhyaya

Experienced Data Scientist with a demonstrated history of working in Industrial IOT (IIOT), Industry 4.0, Power Systems and Manufacturing domain. I have experience in designing robust solutions for various clients using Machine Learning, Artificial Intelligence, and Deep Learning. I have been instrumental in developing end to end solutions from scratch and deploying them independently at scale.

