How To Implement LSTM RNN Network For Sentiment Analysis

Rohit Dwivedi

Sentiment analysis is a predictive modelling task in which a model is trained to predict the polarity of a piece of text: positive, neutral, or negative. Businesses use it to understand how customers feel about their products; it gives them automated feedback that they can act on. Because so much of the data we collect is unstructured, manually analyzing large volumes of text is very hard, and sentiment analysis offers a way to label such text automatically. It can be applied to customer feedback, movie reviews, and similar data. Emotion detection, where we identify whether a person is happy, angry, sad, shocked, and so on, is a closely related task.

Long Short-Term Memory (LSTM) was introduced by Hochreiter & Schmidhuber in 1997. An LSTM is a type of recurrent neural network (RNN) that can learn long-term dependencies. LSTMs are widely used today for tasks such as speech recognition, text classification, and sentiment analysis. In this article, we will build a deep learning model using an LSTM recurrent neural network that classifies the sentiment of tweets.

What are Recurrent Neural Networks and Long Short Term Memory? 



We have already seen feed-forward networks, where the inputs are multiplied by weights, a bias is added, and the result is passed on layer by layer until the last layer produces the output. The problem with these networks is that they keep no memory of earlier inputs, so they are a poor fit for sequential data. Their input and output sizes are also fixed. This makes them hard to apply to problems such as stock price prediction.

This is why Recurrent Neural Networks (RNNs) were introduced. RNNs are designed to handle sequential or time-series data. At each time step, an RNN multiplies the current input by an input weight and the previous state by a recurrent weight, adds the two, and passes the sum through the tanh function to get the new state. The output vector is then obtained by multiplying the new state by an output weight. Very deep stacks of recurrent layers are generally not preferred.
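The state update described above can be sketched in a few lines of plain Python. This is only an illustration: the function name rnn_step, the weight names W_x, W_h and b, and all the numbers are hypothetical and do not come from the article's model.

```python
import math

def rnn_step(x_t, h_prev, W_x, W_h, b):
    """One step of a plain RNN: h_t = tanh(W_x . x_t + W_h . h_prev + b)."""
    h_t = []
    for i in range(len(b)):
        total = b[i]
        total += sum(W_x[i][j] * x_t[j] for j in range(len(x_t)))
        total += sum(W_h[i][j] * h_prev[j] for j in range(len(h_prev)))
        h_t.append(math.tanh(total))
    return h_t

# Hypothetical toy weights: 2 input features, 2 hidden units.
W_x = [[0.5, -0.3], [0.1, 0.8]]
W_h = [[0.2, 0.0], [0.0, 0.2]]
b = [0.0, 0.0]

sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
h = [0.0, 0.0]  # initial state
states = []
for x_t in sequence:
    # The same weights are reused at every time step; only the state changes.
    h = rnn_step(x_t, h, W_x, W_h, b)
    states.append(h)

print(states)
```

Because each new state depends on the previous one, the last state carries information from the whole sequence, which is what a feed-forward network lacks.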

But RNNs suffer from the vanishing gradient problem: as the error is propagated back through many time steps, the gradients become very small, so the resulting weight updates are too insignificant to help the model learn. LSTM was introduced to overcome this. The article “Comparison of RNN LSTM model with Arima Models for Forecasting Problem” explains more about RNNs and LSTMs.
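To get a feel for how quickly the gradient can shrink, consider a scalar RNN with state h_t = tanh(a_t), where a_t includes the term w * h_{t-1}. The derivative of h_t with respect to h_{t-1} is w * (1 - tanh(a_t)^2), and the gradient across many steps is the product of these factors. The sketch below uses a hypothetical weight of 0.5 and a constant pre-activation of 0.5 purely for illustration.

```python
import math

def gradient_through_time(w, pre_activations):
    """Product of the per-step factors w * (1 - tanh(a_t)^2)."""
    grad = 1.0
    for a in pre_activations:
        grad *= w * (1.0 - math.tanh(a) ** 2)
    return grad

w = 0.5  # hypothetical recurrent weight
grad_5_steps = gradient_through_time(w, [0.5] * 5)
grad_50_steps = gradient_through_time(w, [0.5] * 50)

print(grad_5_steps, grad_50_steps)
```

With these values every factor is below 1, so the product over 50 steps is many orders of magnitude smaller than over 5 steps, and the earliest inputs barely influence the weight update.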

Sentiment Analysis using LSTM

Let us first import the required libraries and the data. You can download the data from Kaggle; there are also many other publicly available datasets for sentiment analysis of tweets and reviews. We will use the Twitter Sentiment Data for this experiment. Use the code below to do so.

import numpy as np

import pandas as pd

from keras.models import Sequential

from keras.layers import Dense, Embedding, LSTM, SpatialDropout1D

from sklearn.model_selection import train_test_split

from sklearn.feature_extraction.text import CountVectorizer

from keras.preprocessing.text import Tokenizer

from keras.preprocessing.sequence import pad_sequences

from keras.utils import to_categorical

import re

df = pd.read_csv("Sentiment.csv")

We will now explore the data we just imported, starting with the columns it contains.

print(df.columns)

We will only use the tweets and their corresponding sentiments in this experiment, so we create a new data frame that holds just these two columns. We also check the sentiments present. Use the code below to do so.

new_df = df[['text','sentiment']].copy()

print(new_df.sentiment)

Preprocessing Of Tweets 

We will now preprocess the tweets: we drop the tweets labelled Neutral, convert the text to lowercase, and remove every character that is not a letter, digit, or whitespace. Use the code below to do this.

new_df = new_df[new_df.sentiment != "Neutral"].copy()

new_df['text'] = new_df['text'].str.lower()

new_df['text'] = new_df['text'].apply(lambda x: re.sub(r'[^a-zA-Z0-9\s]', '', x))
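To see what this cleaning step does to a single tweet, here is a small standalone sketch that uses only the standard library. The helper name clean_tweet and the sample strings are made up for illustration and are not taken from the dataset.

```python
import re

def clean_tweet(text):
    """Lowercase a tweet and drop every character that is not a letter, digit, or whitespace."""
    text = text.lower()
    return re.sub(r'[^a-z0-9\s]', '', text)

# Hypothetical sample tweets.
samples = ["LOVED the new phone!!! #happy", "Not impressed... 0/10"]
cleaned = [clean_tweet(s) for s in samples]

print(cleaned)
```

Note that punctuation is removed without inserting a space, so "0/10" becomes "010"; whether that matters depends on the data.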

After this, we define the vocabulary size and use a tokenizer to convert the tweets into sequences of integer word indices, which we pad to a common length and store in the variable X. Use the code below to do so.

tokenizer = Tokenizer(num_words=1500, split=' ')

tokenizer.fit_on_texts(new_df['text'].values)

X = tokenizer.texts_to_sequences(new_df['text'])

X = pad_sequences(X)
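The tokenizing and padding steps can be pictured with a simplified pure-Python sketch. It mimics the general idea (more frequent words get smaller indices, index 0 is reserved for padding, and shorter sequences are padded on the left) but it is not the Keras implementation; the helper names and the sample texts are hypothetical.

```python
from collections import Counter

def build_word_index(texts):
    """Rank words by frequency; the most frequent word gets index 1."""
    counts = Counter(word for text in texts for word in text.split())
    # sorted() is stable, so words with equal counts keep first-seen order.
    ranked = sorted(counts.items(), key=lambda kv: -kv[1])
    return {word: rank for rank, (word, _) in enumerate(ranked, start=1)}

def texts_to_sequences(texts, word_index, num_words):
    """Replace each word by its index, dropping words whose index is >= num_words."""
    return [[word_index[w] for w in text.split()
             if w in word_index and word_index[w] < num_words]
            for text in texts]

def pad_left(sequences):
    """Left-pad every sequence with zeros to the length of the longest one."""
    max_len = max(len(seq) for seq in sequences)
    return [[0] * (max_len - len(seq)) + seq for seq in sequences]

texts = ["good movie", "bad movie", "really good movie"]
word_index = build_word_index(texts)
padded = pad_left(texts_to_sequences(texts, word_index, num_words=1500))

print(word_index)
print(padded)
```

Every row of the padded result has the same length, which is what lets the tweets be stacked into a single array and fed to the network in batches.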

We then define the LSTM model architecture. Use the code below to define it. The model is built with the same Sequential API used for convolutional networks (ConvNets); here we define two hyperparameters, embed_dim and lstm_out. We then compile the model with the adam optimizer and sparse categorical cross-entropy loss, which matches the two-unit softmax output layer and the integer-encoded sentiment labels.

vocabSize = 1500  # must match num_words passed to the Tokenizer

embed_dim = 128

lstm_out = 196

model = Sequential()

model.add(Embedding(vocabSize, embed_dim, input_length = X.shape[1]))

model.add(LSTM(lstm_out, dropout=0.2, recurrent_dropout=0.2))

model.add(Dense(2,activation='softmax'))

model.compile(loss = 'sparse_categorical_crossentropy', optimizer='adam',metrics = ['accuracy'])
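As a rough check on the size of this network, the number of trainable parameters can be worked out by hand. The sketch below assumes the vocabulary size passed to the Embedding layer is 1500 (the num_words value given to the Tokenizer) and uses the standard parameter-count formulas for an embedding, an LSTM layer with biases, and a dense layer; the helper function names are made up for this example.

```python
def embedding_params(vocab_size, embed_dim):
    # One embed_dim-sized vector per word index.
    return vocab_size * embed_dim

def lstm_params(input_dim, units):
    # Four weight sets (three gates plus the cell candidate), each with
    # input weights, recurrent weights, and a bias vector.
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    # Weight matrix plus one bias per output unit.
    return input_dim * units + units

vocab_size, embed_dim, lstm_out, n_classes = 1500, 128, 196, 2

counts = {
    "embedding": embedding_params(vocab_size, embed_dim),
    "lstm": lstm_params(embed_dim, lstm_out),
    "dense": dense_params(lstm_out, n_classes),
}
total = sum(counts.values())

print(counts, total)
```

Under these assumptions the LSTM layer holds more than half of the weights, so lstm_out is the hyperparameter with the largest effect on model size.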

After this, we encode the sentiments using a label encoder. Use the code below to do that. The tweets are stored in X and the corresponding sentiments in y.

from sklearn.preprocessing import LabelEncoder

Le = LabelEncoder()

y = Le.fit_transform(new_df['sentiment'])
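A label encoder sorts the distinct class names and maps each one to an integer. The following pure-Python sketch reproduces that idea; it assumes the remaining labels are the strings "Positive" and "Negative", and the helper name fit_label_encoder is hypothetical.

```python
def fit_label_encoder(labels):
    """Map each distinct label to an integer, in sorted order of the labels."""
    classes = sorted(set(labels))
    return {label: code for code, label in enumerate(classes)}

# Hypothetical sentiment column after the Neutral tweets were dropped.
labels = ["Positive", "Negative", "Negative", "Positive"]
mapping = fit_label_encoder(labels)
encoded = [mapping[label] for label in labels]

print(mapping, encoded)
```

With these two labels, "Negative" is encoded as 0 and "Positive" as 1, which is the order to keep in mind when reading the predictions later.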

Then we divide the data set into training and testing sets and pass the training data, along with the test set as validation data, to the model. Use the code below to do so.

X_train, X_test, y_train, y_test = train_test_split(X,y, test_size = 0.15, random_state = 42)

model.fit(X_train, y_train,validation_data = (X_test,y_test),epochs = 10, batch_size=32)

Now we will evaluate the model performance. Use the below code to evaluate the model. 

model.evaluate(X_test,y_test)

We got 82% accuracy and a loss of 0.655. Now we will make predictions for a few rows and check whether the model classifies them correctly. Use the code below to make predictions for 5 rows.

print("Prediction: ", np.argmax(model.predict(X_test[5:10]), axis=1))

print("Actual: \n",y_test[5:10])

In this run, 4 of the 5 predictions were classified correctly by the model, and 1 was misclassified.
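Turning class probabilities into labels and counting how many match the actual labels can be sketched as follows. The probability rows and the actual labels here are made-up values, not the article's output; they are chosen only so that 4 of the 5 predictions match.

```python
def argmax(row):
    """Index of the largest value in a row of class probabilities."""
    return max(range(len(row)), key=lambda i: row[i])

# Hypothetical softmax outputs for 5 tweets: [P(class 0), P(class 1)].
probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7], [0.55, 0.45]]
actual = [0, 1, 0, 0, 0]

predicted = [argmax(row) for row in probs]
correct = sum(p == a for p, a in zip(predicted, actual))
accuracy = correct / len(actual)

print(predicted, correct, accuracy)
```

Five rows are far too few to judge a model, so the accuracy reported on the full test set is the number to rely on.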

Conclusion

In this article, I have explored sentiment analysis using LSTM. You can now try applying this type of sequential network to other problems and build new use cases. You can also explore one more experiment in the article titled “Foreign Rate Exchange Prediction using LSTM RNN Networks”.

Copyright Analytics India Magazine Pvt Ltd
