How To Implement LSTM RNN Network For Sentiment Analysis

In this article, we will build a deep learning model using an LSTM recurrent neural network to classify the sentiment of tweets.
Sentiment Analysis using LSTM

Sentiment analysis is a predictive modelling task in which a model is trained to predict the polarity of textual data, that is, whether the sentiment is positive, neutral, or negative. Businesses use sentiment analysis to understand how customers feel about their products. It gives them automated feedback from customers that helps them act accordingly.

Because we are already overloaded with unstructured data, analyzing large volumes of text by hand is very difficult. Sentiment analysis helps businesses label this text automatically. It can be applied to customer feedback, movie reviews, and so on. Emotion detection is a related task, in which we analyze whether a person is happy, angry, sad, shocked, etc.

Long Short-Term Memory (LSTM) was introduced by Hochreiter & Schmidhuber in 1997. LSTM is a type of RNN that can capture long-term dependencies. LSTMs are widely used today for a variety of tasks such as speech recognition, text classification, and sentiment analysis.

What are Recurrent Neural Networks and Long Short Term Memory? 

We have already seen feed-forward networks, where the inputs are multiplied by weights, a bias is added, and the result is passed from layer to layer until the last layer produces the output. The problem with these networks is that they keep no memory of earlier inputs, so they are poorly suited to sequential data. Their input and output sizes are also fixed. This makes them a poor fit for problems like stock price prediction, where the order of the observations matters.

This is why Recurrent Neural Networks (RNNs) were introduced. RNNs are designed to handle sequential or time-series data. At each time step, an RNN multiplies the current input by the weight associated with the input (w1) and the previous state by the weight associated with the previous state, and passes their sum through the tanh function to get the new state. To get the output vector, the new state is multiplied by an output weight. RNNs are usually not made very deep.
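The recurrence described above can be sketched in a few lines of NumPy. This is only an illustration of a plain RNN step, not the Keras implementation; the names W_x, W_h, W_y and b, and all sizes, are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, state_dim, output_dim = 3, 4, 2   # illustrative sizes (assumed)

W_x = rng.normal(size=(state_dim, input_dim))   # weights on the current input
W_h = rng.normal(size=(state_dim, state_dim))   # weights on the previous state
W_y = rng.normal(size=(output_dim, state_dim))  # weights producing the output
b = np.zeros(state_dim)                         # bias of the state update

def rnn_step(x_t, h_prev):
    """One time step: compute the new state and the output."""
    h_t = np.tanh(W_x @ x_t + W_h @ h_prev + b)  # new state, squashed into (-1, 1)
    y_t = W_y @ h_t                              # output vector for this step
    return h_t, y_t

h = np.zeros(state_dim)                      # initial state
sequence = rng.normal(size=(5, input_dim))   # a made-up sequence of 5 inputs
for x_t in sequence:
    h, y = rnn_step(x_t, h)                  # the state carries over between steps

print(h.shape, y.shape)
```

The same weights are reused at every time step, which is what lets the network process sequences of any length.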

However, RNNs suffer from the vanishing gradient problem: as gradients are propagated back through many time steps they become very small, so the weight updates are too insignificant to help the model learn. LSTM was introduced to overcome this. For more on RNNs and LSTMs, see the article “Comparison of RNN LSTM model with Arima Models for Forecasting Problem”.
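To get a feel for why the gradient vanishes, consider a toy scalar RNN with no input, h_t = tanh(w * h_{t-1}). The sketch below accumulates the derivative of the current state with respect to the initial state by the chain rule. The weight w = 0.9 and the starting state are arbitrary values picked for illustration.

```python
import numpy as np

w = 0.9      # recurrent weight (illustrative value)
h = 0.5      # initial state (illustrative value)
grad = 1.0   # d h_t / d h_0, built up step by step with the chain rule
grads = []

for t in range(50):
    h = np.tanh(w * h)
    # derivative of tanh(w * h_prev) with respect to h_prev
    grad *= w * (1.0 - h ** 2)
    grads.append(grad)

# Gradient after 1, 10 and 50 steps: each factor is below 1, so the product shrinks
print(grads[0], grads[9], grads[49])
```

Because every factor in the product is smaller than 1, the influence of early time steps on the loss fades quickly, which is the effect LSTM's gating is meant to counter.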

Sentiment Analysis using LSTM

Let us first import the required libraries and the data. You can download the data from Kaggle. There are also many publicly available datasets for sentiment analysis of tweets and reviews. We will use the Twitter Sentiment Data for this experiment. Use the code below to do this.

import numpy as np
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense, Embedding, LSTM, SpatialDropout1D
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
# keras.utils.np_utils is gone in newer Keras releases; import from keras.utils instead
from keras.utils import to_categorical
import re

# Load the tweets downloaded from Kaggle
df = pd.read_csv("Sentiment.csv")

We will now explore the data we just imported, starting with the columns it contains.



We will only use the tweets and their corresponding sentiments in this experiment, so we will create a new data frame that holds just these two columns. We will also check the different sentiments present. Use the code below to do this.

new_df = df[['text','sentiment']]
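The exploration steps can be sketched as follows. Since Sentiment.csv is not available here, the example uses a tiny made-up data frame as a stand-in; the "id" column and all rows are invented for illustration, and only the "text" and "sentiment" column names come from the article.

```python
import pandas as pd

# Toy stand-in for Sentiment.csv (hypothetical rows and "id" column)
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "text": ["Great debate!", "That was awful", "Not sure yet", "Loved the answers"],
    "sentiment": ["Positive", "Negative", "Neutral", "Positive"],
})

# See which columns are present
print(df.columns.tolist())

# Keep only the two columns we need; copy() avoids modifying a view of df later
new_df = df[["text", "sentiment"]].copy()

# Check the different sentiments and how often each occurs
counts = new_df["sentiment"].value_counts()
print(counts)
```

Checking the class counts early is useful because a heavily imbalanced data set makes plain accuracy harder to interpret.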



Preprocessing Of Tweets 

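We will now preprocess the tweets: we drop the neutral tweets so that only positive and negative ones remain, convert the text to lowercase, and remove every character that is not a letter, digit, or whitespace. Use the code below to perform this.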

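# Keep only positive and negative tweets
new_df = new_df[new_df.sentiment != "Neutral"].copy()
# Convert the text to lowercase
new_df['text'] = new_df['text'].str.lower()
# Remove everything except letters, digits and whitespace
new_df['text'] = new_df['text'].apply(lambda x: re.sub(r'[^a-zA-Z0-9\s]', '', x))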
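As a quick check of the cleaning step, the sketch below applies the same lowercase-and-strip logic to a single made-up tweet. The helper name clean_tweet is hypothetical. It also shows why the character class should be written A-Z rather than A-z: the range A-z spans several punctuation characters that sit between the upper-case and lower-case letters in ASCII, so they would not be removed.

```python
import re

def clean_tweet(text):
    """Lowercase a tweet and drop everything except letters, digits and whitespace."""
    return re.sub(r"[^a-zA-Z0-9\s]", "", text.lower())

sample = "RT @user: LOVED the #debate!!! 10/10 :)"   # made-up tweet
cleaned = clean_tweet(sample)
print(repr(cleaned))

# The range A-z also covers [ \ ] ^ _ and `, so those characters survive
loose = re.sub(r"[^a-zA-z0-9\s]", "", "a_b^c")
strict = re.sub(r"[^a-zA-Z0-9\s]", "", "a_b^c")
print(loose, strict)
```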

Next, we fix the vocabulary size and use a tokenizer to convert each tweet into a sequence of integer word indices, padded to a common length. The result is stored in the variable X. Use the code below to do so.

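tokenizer = Tokenizer(num_words=1500, split=' ')
# Build the word index first; texts_to_sequences needs a fitted tokenizer
tokenizer.fit_on_texts(new_df['text'].values)
X = tokenizer.texts_to_sequences(new_df['text'].values)
# Pad every sequence with zeros to a common length
X = pad_sequences(X)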
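To make the tokenizing and padding steps concrete, here is a simplified pure-Python sketch of the idea. It is not the Keras implementation: the three helper functions are hypothetical stand-ins that mimic the default behaviour (indices start at 1 for the most frequent word, and shorter sequences are padded with zeros on the left).

```python
from collections import Counter
import numpy as np

def fit_word_index(texts):
    """Rank words by frequency; the most frequent word gets index 1."""
    counts = Counter(word for text in texts for word in text.split())
    ranked = sorted(counts.items(), key=lambda kv: -kv[1])  # ties keep first-seen order
    return {word: i + 1 for i, (word, _) in enumerate(ranked)}

def texts_to_sequences(texts, word_index, num_words):
    """Replace each word by its index, dropping words outside the vocabulary."""
    return [[word_index[w] for w in text.split()
             if w in word_index and word_index[w] < num_words]
            for text in texts]

def pad_sequences(sequences):
    """Left-pad with zeros so every row is as long as the longest sequence."""
    max_len = max(len(s) for s in sequences)
    padded = np.zeros((len(sequences), max_len), dtype=int)
    for row, seq in enumerate(sequences):
        if seq:
            padded[row, max_len - len(seq):] = seq
    return padded

texts = ["good debate", "bad debate tonight", "good"]   # made-up cleaned tweets
word_index = fit_word_index(texts)
X = pad_sequences(texts_to_sequences(texts, word_index, num_words=1500))
print(word_index)
print(X)
```

Index 0 is reserved for padding, which is why the word indices start at 1.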

We then define the LSTM model architecture. The model is built layer by layer, much like a ConvNet in Keras; the only difference is that we define two hyperparameters, embed_dim (the size of each word embedding) and lstm_out (the number of LSTM units). We then compile the model with the Adam optimizer and binary cross-entropy loss. Use the code below to define it.

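vocabSize = 1500  # must match num_words used for the Tokenizer
embed_dim = 128   # size of each word embedding
lstm_out = 196    # number of LSTM units
model = Sequential()
# input_length is the padded tweet length (28 in the original run)
model.add(Embedding(vocabSize, embed_dim, input_length = X.shape[1]))
model.add(LSTM(lstm_out, dropout=0.2, recurrent_dropout=0.2))
# Assumed output layer: one sigmoid unit, matching the binary cross-entropy loss
model.add(Dense(1, activation='sigmoid'))
model.compile(loss = 'binary_crossentropy', optimizer='adam', metrics = ['accuracy'])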
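To get a sense of the model's size, the number of trainable parameters can be worked out by hand from the hyperparameters. The sketch below uses the standard formulas for an embedding layer and an LSTM layer; the vocabulary size of 1500 is taken from the tokenizer, and the single-unit sigmoid output layer is an assumption, since the article's code does not show the output layer.

```python
vocab_size, embed_dim, lstm_out = 1500, 128, 196

# Embedding: one embed_dim-sized vector per vocabulary entry
embedding_params = vocab_size * embed_dim

# LSTM: four weight blocks (input, forget and output gates plus the candidate
# cell state), each with input weights, recurrent weights and a bias
lstm_params = 4 * ((embed_dim + lstm_out) * lstm_out + lstm_out)

# Assumed output layer: a single sigmoid unit on top of the LSTM output
dense_params = lstm_out * 1 + 1

print(embedding_params, lstm_params, dense_params)
```

Most of the parameters sit in the LSTM layer, so lstm_out is the hyperparameter with the largest effect on model size.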

After this, we encode the sentiments as integers using LabelEncoder. The tweets are stored in X and the corresponding sentiments in y. Use the code below to do that.

from sklearn.preprocessing import LabelEncoder
Le = LabelEncoder()
y = Le.fit_transform(new_df['sentiment'])
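It helps to know which integer each sentiment is mapped to. A small sketch on made-up labels shows that LabelEncoder sorts the classes, so "Negative" becomes 0 and "Positive" becomes 1.

```python
from sklearn.preprocessing import LabelEncoder

labels = ["Positive", "Negative", "Negative", "Positive"]  # made-up labels
le = LabelEncoder()
y = le.fit_transform(labels)

print(list(le.classes_))                       # classes in sorted order
print(y.tolist())                              # integer code for each label
print(le.inverse_transform([0, 1]).tolist())   # map codes back to label names
```

The inverse_transform call is handy later for turning predicted 0/1 values back into readable sentiment names.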

Then we split the data set into training and testing sets, holding out 15% of the rows for testing, and pass the training data and the validation data to the model for training. Use the code below to do so.

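X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.15, random_state = 42)
# Train for 10 epochs, validating on the held-out test set after each epoch
model.fit(X_train, y_train, validation_data = (X_test, y_test), epochs = 10, batch_size = 32)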



Now we will evaluate the model's performance on the test set by calling model.evaluate(X_test, y_test), which returns the loss followed by the accuracy.
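To make the two evaluation numbers concrete, here is a sketch of how accuracy and binary cross-entropy are computed from predicted probabilities. The labels and probabilities are made up, and the two helper functions are illustrative, not the Keras code.

```python
import numpy as np

def binary_crossentropy(y_true, p_pred, eps=1e-7):
    """Mean negative log-likelihood of the true labels under the predictions."""
    p = np.clip(p_pred, eps, 1 - eps)   # avoid log(0)
    return float(np.mean(-(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))))

def accuracy(y_true, p_pred, threshold=0.5):
    """Fraction of examples whose thresholded prediction matches the label."""
    return float(np.mean((p_pred > threshold).astype(int) == y_true))

y_true = np.array([1, 0, 1, 1, 0])             # made-up true labels
p_pred = np.array([0.9, 0.2, 0.4, 0.8, 0.6])   # made-up predicted probabilities

acc = accuracy(y_true, p_pred)
loss = binary_crossentropy(y_true, p_pred)
print(acc, loss)
```

Note that the loss penalizes confident wrong predictions heavily, so a model can have reasonable accuracy and still a fairly high loss.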




We got 82% accuracy and a loss of 0.655. Now we will make predictions for some of the data and check whether the model classifies them correctly. Use the code below to make predictions for 5 rows.

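# predict_classes is no longer available in recent Keras releases, so threshold
# the predicted probabilities instead (assumes a single sigmoid output unit)
pred = (model.predict(X_test[5:10]) > 0.5).astype("int32").ravel()
print("Prediction: ", pred)
print("Actual: \n", y_test[5:10])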



In the output, 4 of the 5 predictions were classified correctly by the model and 1 was misclassified.


In this article, I have explored sentiment analysis using LSTM. You can now try applying this type of sequential network to other problems and build new use cases. For one more experiment, see the article titled “Foreign Rate Exchange Prediction using LSTM RNN Networks”.


Rohit Dwivedi
I am currently enrolled in a Post Graduate Program in Artificial Intelligence and Machine Learning. I am a data science enthusiast who likes to draw insights from data and am always amazed by the intelligence of AI. It's really fascinating teaching a machine to see and understand images, and the interest doubles when the machine can tell you what it just saw. This is why I am highly interested in Computer Vision and Natural Language Processing. I love exploring different use cases that can be built with the power of AI. I am the person who first develops something and then explains it to the whole community through my writing.
