 # A Comprehensive Guide To Regression Techniques For Time Series Forecasting

In mathematics, a time series is a series of data points indexed in time order; most commonly, it is a sequence taken at successive, equally spaced points in time. Common examples of time series are daily closing values of the stock market, counts of sunspots, etc. Time series analysis comprises methods for analysing time-series data to extract meaningful statistics and other characteristics of the data. Time series forecasting, in contrast, uses a model to predict future values based on previously observed values. In this article, we are going to explore the following regression techniques used for time series forecasting:

1. AR and MA
2. ARIMA
3. SARIMA
4. VAR

## Code Implementation of Regression Techniques

The dataset we are using remains the same for all the techniques: daily weather data collected for the city of Delhi over four years, from 2013 to 2017.


```
import pandas as pd
import matplotlib.pyplot as plt

# read the Delhi daily climate CSV (file name assumed) and parse the date column
data = pd.read_csv('DailyDelhiClimateTrain.csv', parse_dates=['date'])
```

Let's plot the line chart for humidity.

```
import plotly.express as px
fig = px.line(data, x='date', y='humidity', title='Humidity with slider')
fig.update_xaxes(rangeslider_visible=True)
fig.show()
```

##### 1. Autoregressive and Moving Average (AR and MA):

In multiple regression models, we forecast the variable of interest using a linear combination of predictors. In an autoregressive model, we instead forecast the variable of interest using a linear combination of past values of that same variable. The term autoregression indicates that it is a regression of the variable against itself.

The model can be formulated as:

Yt = C + Ø1Yt-1 + Ø2Yt-2 + … + ØpYt-p + εt

Where: Yt is the value of the time series at time t

C is the intercept

Ø1, …, Øp are the slope coefficients

Yt-1, …, Yt-p are the lagged values of the time series

εt is the error term

This method is suitable for univariate time series without trend and seasonal components.

Code Implementation:

```
# AR example
from statsmodels.tsa.ar_model import AutoReg
import matplotlib.pyplot as plt

# train/test split
train, test = data[0:1000], data[1000:]
# fit an autoregressive model on the humidity column
model = AutoReg(train.humidity, lags=350)
model_fit = model.fit()
# predict over the test period
pred = model_fit.predict(len(train), len(train) + len(test) - 1, dynamic=False)
plt.plot(test.humidity)
plt.plot(pred, color='red')
plt.show()
```

Rather than using past values of the variable in the regression, a moving average model uses past forecast errors in a regression-like model. In other words, a moving average model expresses the next value in the sequence as a linear function of the residual errors from a mean process at earlier time steps. (Combining the autoregressive and moving average models yields the ARMA model, which underlies ARIMA below.)

This method is suitable for univariate time series without trend and seasonal components.

Code Implementation:

```
# MA example
from statsmodels.tsa.arima.model import ARIMA
# fit model: in order=(p, d, q) the moving average order is q,
# so a pure MA model uses (0, 0, q); a small q keeps fitting tractable
model = ARIMA(train.humidity, order=(0, 0, 5))
model_fit = model.fit()
# make prediction
pred = model_fit.predict(len(train), len(train) + len(test) - 1)
plt.plot(test.humidity)
plt.plot(pred, color='red')
plt.show()
```

##### 2. Autoregressive integrated moving average (ARIMA):

It explicitly models a suite of standard temporal structures in time series data and provides a simple but powerful method for forecasting. It combines the autoregressive and moving average models, together with a differencing pre-processing step (the "integrated" part) that makes the sequence stationary.

This method supports univariate time series with a trend and without a seasonal component.

The statsmodels library provides the capability to fit ARIMA models.

Code Implementation:

```
from statsmodels.tsa.arima.model import ARIMA

X = data.humidity[0:1000].values
size = int(len(X) * 0.66)
train, test = X[0:size], X[size:len(X)]
history = [x for x in train]
predictions = list()
# walk-forward validation: refit on the growing history at each step
for i in range(len(test)):
    model = ARIMA(history, order=(5, 1, 0))
    model_fit = model.fit()
    output = model_fit.forecast()
    pred = output[0]
    predictions.append(pred)
    true = test[i]
    history.append(true)
    print('predicted=%f, expected=%f' % (pred, true))
plt.plot(test)
plt.plot(predictions, color='red')
plt.show()
```

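To compare models quantitatively, the walk-forward predictions can be scored against the held-out values, e.g. with root mean squared error. A minimal sketch on hypothetical numbers (the values below are made up for illustration, not taken from the dataset):

```python
import numpy as np

# hypothetical held-out humidity values and walk-forward predictions
true_vals = np.array([52.0, 55.0, 61.0, 58.0])
pred_vals = np.array([50.0, 56.5, 59.0, 60.0])

# root mean squared error, in the same units as humidity
rmse = np.sqrt(np.mean((true_vals - pred_vals) ** 2))
print(rmse)
```

The same computation applied to the `predictions` and `test` arrays above gives a single score for tuning the (p, d, q) orders.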
##### 3. Seasonal Autoregressive integrated moving average (SARIMA):

An extension of ARIMA that supports direct modelling of the seasonal component of the series is called SARIMA. The problem with ARIMA is that it does not support seasonal data, i.e. repeating cycles; ARIMA expects data that is either not seasonal or has had the seasonal component removed.

SARIMA adds three hyperparameters to specify the autoregression, differencing and moving average for the seasonal component of the series, plus a parameter for the period of the seasonality.

This model is suitable for univariate time series with trend and seasonal components.

Code Implementation:

```
from statsmodels.tsa.statespace.sarimax import SARIMAX

X = data.humidity[0:1000].values
size = int(len(X) * 0.66)
train, test = X[0:size], X[size:len(X)]
history = [x for x in train]
predictions = list()
# walk-forward validation
for t in range(len(test)):
    model = SARIMAX(history, seasonal_order=(3, 1, 0, 2))
    model_fit = model.fit()
    output = model_fit.forecast()
    pred = output[0]
    predictions.append(pred)
    true = test[t]
    history.append(true)
    print('predicted=%f, expected=%f' % (pred, true))
plt.plot(test)
plt.plot(predictions, color='red')
plt.show()
```

##### 4. Vector Autoregression (VAR):

The vector autoregression (VAR) model can be used when two or more time series influence each other, i.e. the relationship between the series is bi-directional. The model expresses each variable as a function of the past values (time lags) of all the variables in the system; in this sense, it applies an autoregressive model to a vector of series.

The main difference between the previous models and VAR is that those models are unidirectional: the predictors influence Y but not vice versa. The VAR model is bidirectional: the variables influence each other.

This model is suitable for multivariate time series without trend and seasonal components.

Code Implementation:

```
# collect humidity and mean temperature into rows of [humidity, meantemp]
x1 = data.humidity.values
x2 = data.meantemp.values
list1 = list()
for i in range(len(x1)):
    row1 = [x1[i], x2[i]]
    list1.append(row1)
```

Fit the model and forecast a few steps ahead:

```
from statsmodels.tsa.vector_ar.var_model import VAR
# fit model
model = VAR(list1)
model_fit = model.fit()
# forecast the next 5 steps from the last k_ar observations
lag_order = model_fit.k_ar
forecast = model_fit.forecast(model_fit.endog[-lag_order:], steps=5)
print(forecast)
```

Output:

```
[[95.76561271 10.57589906]
 [92.08148688 11.10511153]
 [88.87374484 11.59330815]
 [86.07847799 12.04540676]
 [83.64040052 12.46567364]]
```

## Conclusion

In this article, we have seen the major regression techniques used to forecast time series, along with a practical use case. The most time-consuming part of the univariate techniques is tuning the lag values, since the chosen lag largely determines the quality of the forecast; the rest of the techniques are straightforward to apply.

Vijaysinh is an enthusiast in machine learning and deep learning. He is skilled in ML algorithms, data manipulation, handling and visualization, and model building.
