Comparing ARIMA Model and LSTM RNN Model in Time-Series Forecasting

Time-series forecasting has many business applications, such as stock price prediction, sales forecasting and weather forecasting. A variety of machine learning models are applied to this task, each with its own advantages and disadvantages. In this article, we compare two time-series forecasting models – the ARIMA model and the LSTM RNN model – by applying both to stock price prediction.

ARIMA Model

The ARIMA (Auto-Regressive Integrated Moving Average) model is fitted to time-series data either to analyse the data or to predict future points on the time scale. Its biggest advantage is that it can be applied to data that show evidence of non-stationarity.

"Auto-regressive" means that the evolving variable of interest is regressed on its own prior values, and "moving average" indicates that the regression error is a linear combination of error terms occurring contemporaneously and at various times in the past. "Integrated" means that the data values are replaced with the differences between consecutive values, which helps to remove trends and make the series stationary.
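As a quick illustration of the "integrated" part, here is a minimal pandas sketch (using a tiny made-up price series, not the stock data used later in this article) showing how first-order differencing replaces values with period-to-period changes:

import pandas as pd

# A tiny made-up price series, for illustration only
prices = pd.Series([100.0, 102.0, 101.0, 105.0, 107.0])

# First-order differencing: the 'I' in ARIMA with d = 1
diffed = prices.diff().dropna()
print(diffed.tolist())   # [2.0, -1.0, 4.0, 2.0]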



For more details on time series analysis using the ARIMA model, please refer to the following articles:

  1. An Introductory Guide to Time Series Forecasting
  2. Time Series Modeling and Stress Testing – Using ARIMAX

LSTM Recurrent Neural Network

LSTM (Long Short-Term Memory) recurrent neural networks are a variant of artificial neural networks. Unlike feedforward networks, where signals travel in the forward direction only, an LSTM RNN has feedback connections, so the output of each step is fed back into the next step and the network can retain information across time. LSTM RNNs are popularly used in time series forecasting; a minimal sketch of the recurrence idea follows the reading list below. For more details on this model, please refer to the following articles:

  1. How to Code Your First LSTM Network in Keras
  2. Hands-On Guide to LSTM Recurrent Neural Network For Stock Market Prediction.
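To make the feedback idea concrete, here is a minimal NumPy sketch of a plain recurrent update, where the hidden state from the previous step is fed back into the current step. This is a simple RNN cell shown for intuition only; an LSTM adds gating on top of this recurrence, and the weights below are random placeholders.

import numpy as np

np.random.seed(0)
n_features, n_hidden = 1, 4

# Random placeholder weights (an LSTM would also have gate weights)
W_x = np.random.randn(n_hidden, n_features) * 0.1
W_h = np.random.randn(n_hidden, n_hidden) * 0.1

sequence = np.array([[100.0], [102.0], [101.0]])   # toy input sequence
h = np.zeros(n_hidden)                             # hidden state

for x_t in sequence:
    # The previous hidden state h is fed back into the update: this is the recurrence
    h = np.tanh(W_x @ x_t + W_h @ h)

print(h)   # final hidden state summarising the whole sequence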

Now, we will compare forecasting by both of the above models. For the implementation, we use historical stock prices to train and test the models. The historical values are downloaded using nsepy, a Python API for National Stock Exchange (NSE) data.

Implementation of Time Series Forecasting

First of all, we need to import all the required libraries. nsepy must be installed using ‘pip install nsepy’ before importing it here. To use the LSTM model, TensorFlow must be installed, as Keras runs on the TensorFlow backend. pmdarima must also be installed using ‘pip install pmdarima’ to use the auto_arima function for the ARIMA model.

#Importing the libraries
from nsepy import get_history as gh
import datetime as dt
from matplotlib import pyplot as plt
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import Dense, LSTM, Dropout
from pmdarima import auto_arima
import warnings
from statsmodels.tsa.statespace.sarimax import SARIMAX

Once the libraries are imported, we fetch the data by passing the start date and the end date to the API function. The downloaded data is then preprocessed into a simple data frame of Open, High, Low and Close prices.

#Setting start and end dates and fetching the historical data
start = dt.datetime(2013,1,1)
end = dt.datetime(2019,12,31)
stk_data = gh(symbol='SBIN',start=start,end=end)


#Data Preprocessing
stk_data['Date'] = stk_data.index
data2 = pd.DataFrame(columns = ['Date', 'Open', 'High', 'Low', 'Close'])
data2['Date'] = stk_data['Date']
data2['Open'] = stk_data['Open']
data2['High'] = stk_data['High']
data2['Low'] = stk_data['Low']
data2['Close'] = stk_data['Close']
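Since the ARIMA family handles non-stationarity through differencing, it can be useful to check how non-stationary the closing-price series is before fitting. The sketch below uses the Augmented Dickey-Fuller test from statsmodels; this is an optional extra step, not part of the original walkthrough.

from statsmodels.tsa.stattools import adfuller

# ADF test on the raw closing prices: a large p-value suggests non-stationarity
adf_stat, p_value = adfuller(data2['Close'])[:2]
print('ADF statistic:', adf_stat, 'p-value:', p_value)

# The same test on the first-differenced series usually gives a much smaller p-value
adf_stat_d, p_value_d = adfuller(data2['Close'].diff().dropna())[:2]
print('ADF statistic (differenced):', adf_stat_d, 'p-value:', p_value_d)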

Once we are ready with the dataset, we will fit the ARIMA model using the below code snippet and plot the result.


#####################ARIMA###############################
# Ignore harmless warnings 
warnings.filterwarnings("ignore") 

# Fit auto_arima function to Stock Market Data
stepwise_fit = auto_arima(data2['Close'], start_p = 1, start_q = 1,
                          max_p = 3, max_q = 3, m = 12,
                          start_P = 0, seasonal = True,
                          d = None, D = 1, trace = True,
                          error_action = 'ignore',
                          suppress_warnings = True,
                          stepwise = True)

  
# To print the summary 
stepwise_fit.summary() 

# Split data into train / test sets 
train = data2.iloc[:len(data2)-150] 
test = data2.iloc[len(data2)-150:]

# Fit a SARIMAX on the training set, using the orders suggested by auto_arima
model = SARIMAX(train['Close'],
                order = (0, 1, 1),
                seasonal_order = (2, 1, 1, 12))

result = model.fit()
result.summary()
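The order (0, 1, 1) and seasonal order (2, 1, 1, 12) above are hard-coded, presumably copied from the auto_arima summary. Since the fitted auto_arima object exposes the selected orders directly, a slightly more robust variant (a sketch, reusing the stepwise_fit object from above) is:

# Reuse whatever orders auto_arima actually selected instead of hard-coding them
model = SARIMAX(train['Close'],
                order = stepwise_fit.order,
                seasonal_order = stepwise_fit.seasonal_order)
result = model.fit()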


start = len(train) 
end = len(train) + len(test) - 1

  
# Predictions against the 150-day test set 
predictions = result.predict(start, end, typ = 'levels').rename("Predictions") 

  
# Plot predictions and actual values 
predictions.plot(legend = True) 
test['Close'].plot(legend = True)
plt.show()

[Plot: ARIMA (SARIMAX) predictions vs actual closing prices]

After visualizing the forecast from the ARIMA model, we will perform the same analysis with the LSTM model.

#############LSTM########################
# Use the 'Open' column; the first 1333 rows are taken as the training window
train_set = data2.iloc[0:1333, 1:2].values
sc = MinMaxScaler(feature_range = (0, 1))
training_set_scaled = sc.fit_transform(train_set)

# Build sliding windows of 60 previous prices to predict the next one
X_train = []
y_train = []
for i in range(60, 1333):
    X_train.append(training_set_scaled[i-60:i, 0])
    y_train.append(training_set_scaled[i, 0])
X_train, y_train = np.array(X_train), np.array(y_train)

# Reshape to (samples, timesteps, features) as expected by Keras LSTM layers
X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))
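The window length of 60 and the training size of 1333 rows are hard-coded above. A small helper (a hypothetical make_windows function, not part of the original code) makes the same windowing logic reusable for other window lengths or datasets:

def make_windows(scaled_values, window = 60):
    # Turn a scaled 2-D array of prices into (samples, window, 1) inputs and next-step targets
    X, y = [], []
    for i in range(window, len(scaled_values)):
        X.append(scaled_values[i - window:i, 0])
        y.append(scaled_values[i, 0])
    X, y = np.array(X), np.array(y)
    return np.reshape(X, (X.shape[0], X.shape[1], 1)), y

# Equivalent to the manual loop above
X_train, y_train = make_windows(training_set_scaled, window = 60)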


#Defining the LSTM Recurrent Model: four stacked LSTM layers, each followed by dropout
regressor = Sequential()
# return_sequences = True is required when another LSTM layer follows
regressor.add(LSTM(units = 50, return_sequences = True, input_shape = (X_train.shape[1], 1)))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units = 50))
regressor.add(Dropout(0.2))
regressor.add(Dense(units = 1))


#Compiling and fitting the model
regressor.compile(optimizer = 'adam', loss = 'mean_squared_error')
regressor.fit(X_train, y_train, epochs = 15, batch_size = 32)
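The model above trains for a fixed 15 epochs. If we want training to stop automatically once the validation loss stops improving, a variant using a Keras EarlyStopping callback and a validation split could look like the sketch below; this is an optional refinement, not part of the original setup.

from keras.callbacks import EarlyStopping

# Stop training when the validation loss has not improved for 3 epochs
early_stop = EarlyStopping(monitor = 'val_loss', patience = 3, restore_best_weights = True)
regressor.fit(X_train, y_train,
              epochs = 50,
              batch_size = 32,
              validation_split = 0.1,
              callbacks = [early_stop])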


#Fetching the test data and preprocessing
testdataframe = gh(symbol='SBIN',start=dt.datetime(2018,5,23),end=dt.datetime(2018,12,31))
testdataframe['Date'] = testdataframe.index
testdata = pd.DataFrame(columns = ['Date', 'Open', 'High', 'Low', 'Close'])
testdata['Date'] = testdataframe['Date']
testdata['Open'] = testdataframe['Open']
testdata['High'] = testdataframe['High']
testdata['Low'] = testdataframe['Low']
testdata['Close'] = testdataframe['Close']
# Actual 'Open' prices for the test period
real_stock_price = testdata.iloc[:, 1:2].values

# Build test inputs: the last 60 training prices are needed for the first test window
dataset_total = pd.concat((data2['Open'], testdata['Open']), axis = 0)
inputs = dataset_total[len(dataset_total) - len(testdata) - 60:].values
inputs = inputs.reshape(-1,1)
inputs = sc.transform(inputs)
X_test = []
for i in range(60, 60 + len(testdata)):
    X_test.append(inputs[i-60:i, 0])
X_test = np.array(X_test)
X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))


#Making predictions on the test data
predicted_stock_price = regressor.predict(X_test)
predicted_stock_price = sc.inverse_transform(predicted_stock_price)


#Visualizing the prediction
plt.figure()
plt.plot(real_stock_price, color = 'r', label = 'Actual Open Price')
plt.plot(predicted_stock_price, color = 'b', label = 'Predicted Open Price')
plt.xlabel('Trading Day')
plt.legend()
plt.show()
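The x-axis above is just the sample index. If we want actual dates on the axis, a small variant (a sketch that wraps the arrays in pandas Series indexed by the test dates) is:

# Index both series by the test dates so matplotlib shows real dates on the x-axis
actual_series = pd.Series(real_stock_price.flatten(), index = testdata['Date'])
predicted_series = pd.Series(predicted_stock_price.flatten(), index = testdata['Date'])

plt.figure()
actual_series.plot(color = 'r', label = 'Actual Open Price')
predicted_series.plot(color = 'b', label = 'Predicted Open Price')
plt.xlabel('Date')
plt.legend()
plt.show()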

[Plot: LSTM predictions vs actual open prices]

By comparing the two forecasting plots, we can see that the ARIMA model has predicted prices well below the actual values, and this large gap appears across the majority of the plot. In the case of the LSTM model, the predictions tend to run slightly above the actual values, but this deviation appears in only a few places and, for most of the plot, the predicted value stays close to the actual value. So we can conclude that, in the task of stock price prediction, the LSTM model has outperformed the ARIMA model.

Finally, to confirm this visual impression, we compute the Root Mean Squared Error (RMSE) of the predictions from both models.

######RMSE#######
from statsmodels.tools.eval_measures import rmse

# RMSE for the ARIMA model (predictions vs the 150-day test set of closing prices)
err_ARIMA = rmse(test["Close"], predictions)
print('RMSE with ARIMA', err_ARIMA)


# RMSE for the LSTM model (predicted vs actual open prices over its test period)
err_LSTM = rmse(real_stock_price.flatten(), predicted_stock_price.flatten())
print('RMSE with LSTM', err_LSTM)
[Output: RMSE values for the ARIMA and LSTM models]

Looking at the RMSE values, it is clear that the LSTM model performs better on this task.
