Comparing ARIMA Model and LSTM RNN Model in Time-Series Forecasting

In this article, we will see a comparison between two time-series forecasting models - ARIMA model and LSTM RNN model. Both of these models are applied in stock price prediction to see the comparison between them.

There are many business applications of time-series forecasting, such as stock price prediction, sales forecasting and weather forecasting. A variety of machine learning models are applied to this task, and each has its own advantages and disadvantages.


The ARIMA (Auto-Regressive Integrated Moving Average) model is fitted to time-series data either to analyze the data or to predict future data points. Its biggest advantage is that it can be applied in cases where the data shows evidence of non-stationarity.

Auto-regressive means that the variable of interest is regressed on its own prior (lagged) values. Moving average means that the regression error is a linear combination of error terms from the current time step and from various times in the past. Integrated means that the data values have been replaced with the differences between each value and the previous one, and this differencing may be applied more than once.
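
To make the "integrated" part concrete, here is a minimal sketch of differencing and its inverse using NumPy. The price values are made up for illustration and are not taken from the stock data used later in this article.

```python
import numpy as np

# Hypothetical prices with an upward trend (illustrative values only)
prices = np.array([100.0, 102.0, 105.0, 109.0, 114.0])

# First-order differencing (d = 1): each value minus the previous value
diff1 = np.diff(prices)

# Second-order differencing (d = 2): difference the differences again
diff2 = np.diff(prices, n=2)

# Integration undoes differencing: first value plus the cumulative sum of differences
restored = np.concatenate(([prices[0]], prices[0] + np.cumsum(diff1)))

print(diff1)     # [2. 3. 4. 5.]
print(diff2)     # [1. 1. 1.]
print(restored)  # [100. 102. 105. 109. 114.]
```

In this toy series the first differences still trend upward, while the second differences are constant, which is the kind of behaviour the d parameter of an ARIMA model is meant to capture.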



For more details on time series analysis using the ARIMA model, please refer to the following articles:

  1. An Introductory Guide to Time Series Forecasting
  2. Time Series Modeling and Stress Testing – Using ARIMAX

LSTM Recurrent Neural Network

LSTM (Long Short-Term Memory) networks are a variant of recurrent neural networks (RNNs), which are themselves a class of artificial neural networks. Unlike feedforward networks, where signals travel in the forward direction only, an LSTM RNN has feedback connections, so information from earlier time steps is fed back into the network at later steps. LSTM RNNs are popularly used in time-series forecasting. For more details on this model, please refer to the following articles:


  1. How to Code Your First LSTM Network in Keras
  2. Hands-On Guide to LSTM Recurrent Neural Network For Stock Market Prediction.
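
The feedback connection described above can be sketched as a single LSTM cell applied step by step in NumPy. This is a simplified illustration under stated assumptions, not the Keras implementation: the weights are random rather than trained, and the names lstm_step, W, U and b are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step. W: (4*units, features), U: (4*units, units), b: (4*units,)."""
    units = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b           # h_prev is the feedback from the previous step
    i = sigmoid(z[0 * units:1 * units])    # input gate
    f = sigmoid(z[1 * units:2 * units])    # forget gate
    g = np.tanh(z[2 * units:3 * units])    # candidate cell state
    o = sigmoid(z[3 * units:4 * units])    # output gate
    c_t = f * c_prev + i * g               # new cell state (long-term memory)
    h_t = o * np.tanh(c_t)                 # new hidden state, passed to the next step
    return h_t, c_t

rng = np.random.default_rng(0)
units, features = 4, 1
W = rng.normal(scale=0.5, size=(4 * units, features))
U = rng.normal(scale=0.5, size=(4 * units, units))
b = np.zeros(4 * units)

h, c = np.zeros(units), np.zeros(units)
sequence = np.array([[0.1], [0.2], [0.3]])  # three time steps, one feature each
for x_t in sequence:
    h, c = lstm_step(x_t, h, c, W, U, b)   # state from one step feeds the next

print(h.shape, c.shape)
```

A feedforward layer would compute its output from x_t alone; here the hidden state h and cell state c carry information from earlier steps into later ones.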

Now, we will compare the forecasts produced by the two models. For the implementation, we use historical stock prices to train and test the models. The historical prices are downloaded with nsepy, a Python library for fetching data from the National Stock Exchange (NSE) of India.

Implementation of Time Series Forecasting

First of all, we need to import all the required libraries. nsepy must be installed using ‘pip install nsepy’ before it can be imported. To use the LSTM model, TensorFlow must be installed, because Keras uses it as its backend here. pmdarima must also be installed using ‘pip install pmdarima’ to use the ARIMA model.

#Importing the libraries
from nsepy import get_history as gh
import datetime as dt
from matplotlib import pyplot as plt
from sklearn import model_selection
from sklearn.metrics import confusion_matrix
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers import Dropout
from pmdarima import auto_arima 
import warnings 
from statsmodels.tsa.seasonal import seasonal_decompose 
from statsmodels.tsa.statespace.sarimax import SARIMAX 

Once the libraries are imported, we fetch the data by passing the stock symbol, the start date and the end date to the API function. The downloaded data is then preprocessed.

#Setting start and end dates and fetching the historical data
start = dt.datetime(2013,1,1)
end = dt.datetime(2019,12,31)
stk_data = gh(symbol='SBIN',start=start,end=end)

#Data Preprocessing
stk_data['Date'] = stk_data.index
data2 = pd.DataFrame(columns = ['Date', 'Open', 'High', 'Low', 'Close'])
data2['Date'] = stk_data['Date']
data2['Open'] = stk_data['Open']
data2['High'] = stk_data['High']
data2['Low'] = stk_data['Low']
data2['Close'] = stk_data['Close']

Once we are ready with the dataset, we will fit the ARIMA model using the below code snippet and plot the result.

# Ignore harmless warnings
warnings.filterwarnings("ignore")

# Fit auto_arima function to Stock Market Data
stepwise_fit = auto_arima(data2['Close'], start_p = 1, start_q = 1,
                          max_p = 3, max_q = 3, m = 12,
                          start_P = 0, seasonal = True,
                          d = None, D = 1, trace = True,
                          error_action = 'ignore',
                          suppress_warnings = True,
                          stepwise = True)

# To print the summary
stepwise_fit.summary()
# Split data into train / test sets 
train = data2.iloc[:len(data2)-150] 
test = data2.iloc[len(data2)-150:]

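# Fit the SARIMAX model. Note that the model sees the full series (train + test),
# so the predictions for the test window below are in-sample, one-step-ahead predictions.
model = SARIMAX(data2['Close'], order = (0, 1, 1), seasonal_order = (2, 1, 1, 12))

result = model.fit()

# Positional start and end of the test window
start = len(train)
end = len(train) + len(test) - 1

# Predictions over the 150-row test window (SARIMAX predicts in levels by default)
predictions = result.predict(start, end).rename("Predictions")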

# plot predictions and actual values 
predictions.plot(legend = True) 
test['Close'].plot(legend = True)

[Figure: ARIMA model predictions plotted against the actual closing prices]

Having plotted the ARIMA predictions, we now repeat the analysis with the LSTM model. The training data is first scaled and then arranged into windows of 60 time steps.

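# Use the first 1333 rows for training; column 1 of data2 is 'Open'
train_set = data2.iloc[0:1333, 1:2].values
# Scale the prices to the [0, 1] range
sc = MinMaxScaler(feature_range = (0, 1))
training_set_scaled = sc.fit_transform(train_set)
# Each sample holds the 60 previous values; the target is the value that follows
X_train = []
y_train = []
for i in range(60, 1333):
    X_train.append(training_set_scaled[i-60:i, 0])
    y_train.append(training_set_scaled[i, 0])
X_train, y_train = np.array(X_train), np.array(y_train)
# Reshape to (samples, time steps, features), as the LSTM layer expects
X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))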

#Defining the LSTM Recurrent Model
regressor = Sequential()
regressor.add(LSTM(units = 50, return_sequences = True, input_shape = (X_train.shape[1], 1)))
regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(LSTM(units = 50))
regressor.add(Dense(units = 1))

#Compiling and fitting the model
regressor.compile(optimizer = 'adam', loss = 'mean_squared_error')
regressor.fit(X_train, y_train, epochs = 15, batch_size = 32)

#Fetching the test data and preprocessing
testdataframe = gh(symbol='SBIN',start=dt.datetime(2018,5,23),end=dt.datetime(2018,12,31))
testdataframe['Date'] = testdataframe.index
testdata = pd.DataFrame(columns = ['Date', 'Open', 'High', 'Low', 'Close'])
testdata['Date'] = testdataframe['Date']
testdata['Open'] = testdataframe['Open']
testdata['High'] = testdataframe['High']
testdata['Low'] = testdataframe['Low']
testdata['Close'] = testdataframe['Close']
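real_stock_price = testdata.iloc[:, 1:2].values
# Prepend the training rows so that the first test window has 60 preceding values
# (assumes the 1333 training rows end just before the test period starts)
dataset_total = pd.concat((data2['Open'].iloc[:1333], testdata['Open']), axis = 0)
inputs = dataset_total[len(dataset_total) - len(testdata) - 60:].values
inputs = inputs.reshape(-1,1)
inputs = sc.transform(inputs)
X_test = []
# One 60-step window per test row, instead of a hard-coded upper bound
for i in range(60, 60 + len(testdata)):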
    X_test.append(inputs[i-60:i, 0])
X_test = np.array(X_test)
X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))

#Making predictions on the test data
predicted_stock_price = regressor.predict(X_test)
predicted_stock_price = sc.inverse_transform(predicted_stock_price)

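#Visualizing the prediction
# real_stock_price is taken from column 1 ('Open'), so it is labelled accordingly
plt.plot(real_stock_price, color = 'r', label = 'Actual (Open)')
plt.plot(predicted_stock_price, color = 'b', label = 'Prediction')
plt.legend()
plt.show()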

[Figure: LSTM model predictions plotted against the actual prices]
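
The scaling and windowing steps used to prepare the LSTM inputs can be sketched on a tiny series. This is an illustration only: the values are made up, the window is 3 instead of 60, and the min-max scaling is written out by hand rather than with MinMaxScaler.

```python
import numpy as np

# Hypothetical price series (illustrative values only)
series = np.array([10.0, 12.0, 11.0, 14.0, 18.0, 16.0, 17.0])

# Min-max scaling to [0, 1]: (x - min) / (max - min)
lo, hi = series.min(), series.max()
scaled = (series - lo) / (hi - lo)

# Sliding windows: each sample holds `window` past values, the target is the next value
window = 3  # the article uses 60
X = np.array([scaled[i - window:i] for i in range(window, len(scaled))])
y = scaled[window:]

# Keras LSTM layers expect input shaped (samples, time steps, features)
X = X.reshape(X.shape[0], X.shape[1], 1)

# Inverting the scaling maps values back to price units
y_prices = y * (hi - lo) + lo

print(X.shape, y.shape)  # (4, 3, 1) (4,)
print(y_prices)          # [14. 18. 16. 17.]
```

A series of length 7 with a window of 3 yields 4 samples, because the first 3 values can only serve as history. The same arithmetic explains why the test inputs need 60 extra rows in front of the test period.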

By comparing the two forecasting plots, we can see that the ARIMA model has predicted closing prices well below the actual prices, and this large deviation appears across most of the plot. The LSTM predictions, in contrast, are higher than the actual values, but only in a few places; most of the time, the predicted value stays close to the actual value. On the evidence of these plots, the LSTM model has outperformed the ARIMA model in this stock-prediction task.

Finally, to confirm this, we compute the Root Mean Squared Error (RMSE) of the predictions from both models.

from sklearn.metrics import mean_squared_error 
from statsmodels.tools.eval_measures import rmse

# RMSE for ARIMA model
err_ARIMA = rmse(test["Close"], predictions) 
print('RMSE with ARIMA', err_ARIMA)

#RMSE for LSTM Model
# Compare against the prices of the LSTM test period, flattened to 1-D arrays
err_LSTM = rmse(real_stock_price.ravel(), predicted_stock_price.ravel())
print('RMSE with LSTM', err_LSTM)
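
For reference, RMSE is the square root of the mean squared difference between the actual and predicted values. The sketch below computes it with NumPy; rmse_manual is a hypothetical helper written for illustration, not the statsmodels function, and the numbers are made up.

```python
import numpy as np

def rmse_manual(actual, predicted):
    """Root Mean Squared Error: square root of the mean squared difference."""
    # Flatten both inputs so that a column vector (n, 1) and a flat array (n,)
    # are compared element by element instead of being broadcast against each other
    actual = np.asarray(actual, dtype=float).ravel()
    predicted = np.asarray(predicted, dtype=float).ravel()
    if actual.shape != predicted.shape:
        raise ValueError("actual and predicted must have the same number of values")
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

# Hypothetical values for illustration only
actual = [100.0, 102.0, 104.0, 106.0]
predicted = [101.0, 101.0, 107.0, 103.0]

print(rmse_manual(actual, predicted))
```

The errors here are -1, 1, -3 and 3, so the mean squared error is 5 and the RMSE is the square root of 5. The explicit length check is a design choice: it makes a mismatch between the test window and the predictions fail loudly rather than produce a misleading number.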

The RMSE values confirm that, of the two models, the LSTM model performs better in this task.


Dr. Vaibhav Kumar
Dr. Vaibhav Kumar is a seasoned data science professional with great exposure to machine learning and deep learning. He has good exposure to research, where he has published several research papers in reputed international journals and presented papers at reputed international conferences. He has worked across industry and academia and has led many research and development projects in AI and machine learning. Along with his current role, he has also been associated with many reputed research labs and universities where he contributes as visiting researcher and professor.
