This post assumes that the reader has a basic understanding of time series forecasting. You can get a full introduction to forecasting analysis here, and if you are just starting to learn about it, you can refer to this blog.
Introduction
Smoothing techniques are members of the time series forecasting family of methods; they use weighted averages of past observations to forecast new values. These techniques are well suited to time-series data that shows little deviation over time. They reduce the ETS (error, trend, seasonality) components to a small set of computable, smoothed parameters.
In this article, we will learn about the three different smoothing methods:
- Simple exponential smoothing
- Double exponential smoothing
- Triple exponential smoothing
For this article, you can get the data set from here. In the following steps, we will use the Python programming language to build time series forecasting models using smoothing techniques and compare them to understand the whole procedure better. The code implementation below follows the official implementation.
Code Implementation for Exponential Smoothing
Set up the environment in Colab
Requirements: Python 3.6 or above, pandas 1.2.5, NumPy 1.21.0, scikit-learn 0.24.2, statsmodels 0.12.2
Importing the required libraries:

import pandas as pd
import numpy as np
from sklearn import metrics
import matplotlib.pyplot as plt
%matplotlib inline
from sklearn.model_selection import ParameterGrid
from statsmodels.tsa.api import SimpleExpSmoothing
from statsmodels.tsa.api import Holt
from statsmodels.tsa.api import ExponentialSmoothing
Loading the Data Set
The data contains the share price and volume of Facebook; we are going to work with the ‘Close’ column of the data.
data = pd.read_csv('FB.csv')
data.head()
Generating the train and test data for the ‘Close’ column, which contains the closing prices of the Facebook share.
close = data['Close']
testclose = close.iloc[-30:]   # hold out the last 30 observations as the test set
trainclose = close.iloc[:-30]  # everything before that is used for training
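As a quick check (the exact row counts depend on the downloaded file), we can verify the split:

print(len(trainclose), len(testclose))  # the test set should contain exactly 30 rows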
Defining functions
We define two helper functions: one calculates the mean absolute percentage error (MAPE), and the other prints a set of evaluation metrics.
Function 1:

def MAPE(y_true, y_pred):
    y_true, y_pred = np.array(y_true), np.array(y_pred)
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100
Function 2:
def timeseries_evaluation_metrics(y_true, y_pred):
    print('Evaluation metric results:')
    print(f'MSE value : {metrics.mean_squared_error(y_true, y_pred)}')
    print(f'MAE value : {metrics.mean_absolute_error(y_true, y_pred)}')
    print(f'RMSE value : {np.sqrt(metrics.mean_squared_error(y_true, y_pred))}')
    print(f'MAPE value : {MAPE(y_true, y_pred)}')  # uses Function 1 defined above
    print(f'R2 score : {metrics.r2_score(y_true, y_pred)}', end='\n\n')
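As a quick sanity check, we can call the helper directly on a few made-up values:

timeseries_evaluation_metrics([100, 102, 104], [101, 103, 103])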
Simple Exponential Smoothing Function
In the simple exponential smoothing function, the predicted value at time t + 1 is a weighted average of the observed value at time t and the smoothed (predicted) value at time t; mathematically, it can be expressed as

ŷt+1 = α Yt + (1 – α) ŷt

Where,

α = smoothing factor (0 ≤ α ≤ 1)

Yt = current observation value

ŷt = smoothed statistic (the previous forecast)
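To make the recursion concrete, here is a minimal hand-rolled sketch of this update (for intuition only; the statsmodels SimpleExpSmoothing class used later is the actual implementation):

def simple_exp_smoothing(y, alpha):
    # initialize the first forecast with the first observation
    forecasts = [y[0]]
    # each new forecast blends the latest observation with the previous forecast
    for t in range(1, len(y)):
        forecasts.append(alpha * y[t - 1] + (1 - alpha) * forecasts[t - 1])
    return forecasts

print(simple_exp_smoothing([10, 12, 11, 13], 0.5))  # prints [10, 10.0, 11.0, 11.0]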
Simple (single) exponential smoothing is the simplest of the smoothing algorithms; it is used to forecast datasets in which neither trend nor seasonality is present.
We assume that our time series data has the following:
- Level
- No trends
- No seasonality
- Noise
Let’s use the Simple Exponential Smoothing(SES) technique with our data set.
For this, we are finding the most suited parameter for SES.
In SES, we can iterate for different values of these parameters:
- smoothing_level (float, optional)
Fitting the Simple Exponential Smoothing Model
We are applying a grid search to find the best fit. It will try smoothing-level values between 0 and 1 and report the error metrics for each, so we can identify the value with the lowest possible RMSE.
for i in [0, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90, 1]:
    print(f'Fitting for smoothing level= {i}')
    fit_v = SimpleExpSmoothing(np.asarray(trainclose)).fit(smoothing_level=i)
    fcst_pred_v = fit_v.forecast(len(testclose))
    timeseries_evaluation_metrics(testclose, fcst_pred_v)
This output is long, which makes it difficult to compare the root mean squared errors. What if we gather all the RMSE values in one table? Let's write a loop that collects the RMSE for each smoothing level into a single DataFrame.
temp_df = pd.DataFrame()
for i in [0, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90, 1]:
    fit_v = SimpleExpSmoothing(np.asarray(trainclose)).fit(smoothing_level=i)
    fcst_pred_v = fit_v.forecast(len(testclose))
    rmse = np.sqrt(metrics.mean_squared_error(testclose, fcst_pred_v))
    df3 = {'smoothing parameter': i, 'RMSE': rmse}
    temp_df = temp_df.append(df3, ignore_index=True)
temp_df.sort_values(by=['RMSE'])
The above table shows that the RMSE is lowest at smoothing parameter 1.0; we will fit a model with that smoothing parameter and check the other evaluation metrics.
SES = SimpleExpSmoothing(np.asarray(trainclose))
fit_SES = SES.fit(smoothing_level=1, optimized=False)
fcst_gs_pred = fit_SES.forecast(len(testclose))
timeseries_evaluation_metrics(testclose, fcst_gs_pred)
statsmodels also has a built-in search that finds the best-fitting parameters and reports the results accordingly. Let's see how it works. For the next model, we won't provide any smoothing parameter; we will just change the fit argument optimized from False to True.
SES = SimpleExpSmoothing(np.asarray(trainclose))
fit_SES_auto = SES.fit(optimized=True, use_brute=True)
fcst_auto_pred = fit_SES_auto.forecast(len(testclose))
timeseries_evaluation_metrics(testclose, fcst_auto_pred)
Output:
Comparing the evaluation metric values, we can see that the model performs better when we let it search for the best parameter itself instead of supplying one from the grid search.
We can check the summary of the model by:
fit_SES_auto.summary()
As we can see, it selected a smoothing parameter of 0.9808447 and performed better.
Let’s plot the results to compare performance:
Here we reindex the predicted values so that they line up with the test set; this gives a better visualization and comparison between the two models.
For the grid search model:
df_fcst_gs_pred = pd.DataFrame(fcst_gs_pred, columns=['Close_grid_Search'])
df_fcst_gs_pred["new_index"] = range(len(trainclose), len(close))
df_fcst_gs_pred = df_fcst_gs_pred.set_index("new_index")
For the automatic model:
df_fcst_auto_pred = pd.DataFrame(fcst_auto_pred, columns=['Close_auto_search'])
df_fcst_auto_pred["new_index"] = range(len(trainclose), len(close))
df_fcst_auto_pred = df_fcst_auto_pred.set_index("new_index")
Making the trend plot.
plt.rcParams["figure.figsize"] = [16, 9]
plt.plot(trainclose, label='Train')
plt.plot(testclose, label='Test')
plt.plot(df_fcst_gs_pred, label='Simple Exponential Smoothing using custom grid search')
plt.plot(df_fcst_auto_pred, label='Simple Exponential Smoothing using optimized=True')
plt.legend(loc='best')
plt.show()
In the plot we can see that, as discussed before, the performance of both models is almost identical; the red and green lines nearly overlap. But the overall performance of the simple smoothing technique on this data is not that good.
Next, we will discuss the other smoothing techniques and see whether they work better.
Double Exponential Smoothing
As discussed, simple exponential smoothing uses a single smoothing parameter. Double exponential smoothing uses two: one for the level component and one for the trend component at each period. The equations of double exponential smoothing are as follows (a minimal hand-rolled sketch appears after the notation list below).
Lt = α Yt + (1 – α) [Lt–1 + Tt–1]

Tt = γ [Lt – Lt–1] + (1 – γ) Tt–1

ŷt = Lt–1 + Tt–1
Because the first observation occurs at time one, the level and trend must be initialized at time zero before smoothing can proceed. The initialization method determines how these starting values are obtained: either with optimized weights or with user-specified weights.
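In statsmodels 0.12, these two initialization strategies can be selected through the initialization_method argument; a minimal sketch, using the trainclose series defined earlier:

# let statsmodels estimate the starting level and trend
fit_est = Holt(trainclose, initialization_method='estimated').fit()

# or specify the starting values explicitly
fit_known = Holt(trainclose, initialization_method='known',
                 initial_level=trainclose.iloc[0],
                 initial_trend=trainclose.iloc[1] - trainclose.iloc[0]).fit()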
In the equations above, the notation is as follows:
Lt = level at time t
α = weight of level
Tt = trend at time t
γ = weight of trend
Yt = data value at time t
ŷt = predicted value at time t
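Using this notation, a minimal hand-rolled version of the update equations might look like this (a sketch for intuition; the statsmodels Holt class used below is the real implementation):

def holt_linear(y, alpha, gamma):
    # initialize level with the first observation and trend with the first difference
    level, trend = y[0], y[1] - y[0]
    forecasts = [level + trend]  # forecast for the next period
    for t in range(1, len(y)):
        prev_level = level
        level = alpha * y[t] + (1 - alpha) * (prev_level + trend)   # Lt
        trend = gamma * (level - prev_level) + (1 - gamma) * trend  # Tt
        forecasts.append(level + trend)                             # ŷt+1 = Lt + Tt
    return forecasts

print(holt_linear([10, 12, 14, 17], alpha=0.8, gamma=0.5))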
Double exponential smoothing is a more reliable method for data in which a trend is present but seasonality is not, so the assumptions for it are as follows:
- Level
- Trends
- No seasonality
- Noise
In the double exponential smoothing function, we can iterate over values of the following parameters:
- damped (bool, optional)
- smoothing_level (float, optional)
- smoothing_slope (float, optional)
- damping_slope (float, optional)
Fitting the Double Exponential Smoothing Models
We will perform a grid search to find the best-fitting parameters for the model.
param_grid_DES = {'smoothing_level': [0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90],
                  'smoothing_slope': [0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90],
                  'damping_slope': [0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90],
                  'damped': [True, False]}
pg_DES = list(ParameterGrid(param_grid_DES))
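With these grids, ParameterGrid enumerates 9 × 9 × 9 × 2 = 1,458 parameter combinations, which can be verified with:

print(len(pg_DES))  # 1458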
In this grid search, we have provided a list of possible combinations of the four parameters. Each combination goes through the model fitting process, and we pick the best combination based on the RMSE and R-square values.
df_results_DES = pd.DataFrame(columns=['smoothing_level', 'smoothing_slope', 'damping_slope', 'damped', 'RMSE', 'R²'])

for b in pg_DES:
    smoothing_level = b.get('smoothing_level')
    smoothing_slope = b.get('smoothing_slope')
    damping_slope = b.get('damping_slope')
    damped = b.get('damped')
    fit_Holt = Holt(trainclose, damped=damped).fit(smoothing_level=smoothing_level,
                                                   smoothing_slope=smoothing_slope,
                                                   damping_slope=damping_slope,
                                                   optimized=False)
    fcst_gs_pred_Holt = fit_Holt.forecast(len(testclose))
    df_pred = pd.DataFrame(fcst_gs_pred_Holt, columns=['Forecasted_result'])
    RMSE = np.sqrt(metrics.mean_squared_error(testclose, df_pred.Forecasted_result))
    r2 = metrics.r2_score(testclose, df_pred.Forecasted_result)
    df_results_DES = df_results_DES.append({'smoothing_level': smoothing_level,
                                            'smoothing_slope': smoothing_slope,
                                            'damping_slope': damping_slope,
                                            'damped': damped,
                                            'RMSE': RMSE, 'R²': r2}, ignore_index=True)

df_results_DES.sort_values(by=['RMSE', 'R²']).head(10)
Output:
As we can see in the output, damping_slope has less influence than smoothing_level, smoothing_slope, and damped: the RMSE and R-square values remain the same through the first nine rows of the table. We select the parameters at index 806 as our best combination. Using those parameters, we check the evaluation metric values:
# the *_DES variables below are placeholders for the parameter values read from the selected grid-search row
DES = Holt(trainclose, damped=damped_setting_DES)
fit_Holt = DES.fit(smoothing_level=smoothing_level_value_DES,
                   smoothing_slope=smoothing_slope_value_DES,
                   damping_slope=damping_slope_value_DES,
                   optimized=False)
fcst_gs_pred_Holt = fit_Holt.forecast(len(testclose))
timeseries_evaluation_metrics(testclose, fcst_gs_pred_Holt)
Output:
As with simple exponential smoothing, we can instead let the model search for the best fit of its parameters. We just pass the data to the model and change the fit argument optimized from False to True.
DES = Holt(trainclose)
fit_Holt_auto = DES.fit(optimized=True, use_brute=True)
fcst_auto_pred_Holt = fit_Holt_auto.forecast(len(testclose))
timeseries_evaluation_metrics(testclose, fcst_auto_pred_Holt)
Again, there is no very significant change in the results when we compare the grid search model with the optimized=True model.
We can check the summary of the model to see the parameters selected during model fitting.
fit_Holt_auto.summary()
Output:
We will make a trend plot to visualize the comparison between the models of double exponential smoothing.
plt.rcParams["figure.figsize"] = [16, 9]
plt.plot(trainclose, label='Train')
plt.plot(testclose, label='Test')
plt.plot(fcst_gs_pred_Holt, label='Double Exponential Smoothing with custom grid search')
plt.plot(fcst_auto_pred_Holt, label='Double Exponential Smoothing using optimized=True')
plt.legend(loc='best')
plt.show()
Output:
In the visualization, double exponential smoothing performed far better than single exponential smoothing on this data set. We have seen how the model performs on data where we assumed no seasonality, but trend and level are present along with noise. For time series with seasonality, the recommended technique is triple exponential smoothing, which we discuss next.
Triple Exponential Smoothing
Triple exponential smoothing is used when all three components (level, trend, seasonality) are present in a time series; so far, we have learned how to handle level and trend and make predictions. We most commonly use triple exponential smoothing when we have seasonality in our time series.
Seasonality can be additive or multiplicative in nature. For example, if we sell agricultural commodities and we sell 10,000 kg more every February than we sell every January, the seasonal component is added on top of the rest of the series: the size of the seasonal peak stays the same throughout the time series, and hence the seasonality is additive.
But if we sell 20% more every year in the summer months than in the winter months, the seasonal component multiplies with the rest of the series: the amplitude of the seasonal swings grows or shrinks with the level, and hence the seasonality is multiplicative.
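As a rough illustration with made-up numbers, the two patterns can be simulated like this:

import numpy as np

t = np.arange(48)                            # four years of monthly data
trend = 100 + 2 * t                          # a rising level
season = np.sin(2 * np.pi * t / 12)          # a 12-month seasonal cycle

additive = trend + 10 * season               # constant-size seasonal swings
multiplicative = trend * (1 + 0.1 * season)  # swings grow as the level grows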
In the case of additive seasonality, the forecasting formulas are as follows (a one-step update sketch follows the notation list below):
- St = α × (Xt − Ct−L) + (1 − α) × (St−1 + Φ × Bt−1)
- Bt = β × (St − St−1) + (1 − β) × Φ × Bt−1
- Ct = γ × (Xt − St) + (1 − γ) × Ct−L
- Ft+m = St + (Φ + Φ² + … + Φᵐ) × Bt + Ct−L+1+((m−1) mod L)
Here α, β, and γ are smoothing constants whose values vary between 0 and 1.
Φ = damped smoothing factor
X = Observation
S = smoothed observation
B = trend factor
C = seasonal index
F = the forecast m periods ahead
t = the index that denotes a time period.
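To make the recursions concrete, here is a minimal sketch of one damped additive Holt-Winters update, written directly from the formulas above (a teaching aid, not the statsmodels implementation):

def holt_winters_additive_step(x_t, level, trend, season_t_minus_L,
                               alpha, beta, gamma, phi=1.0):
    # St: smooth the deseasonalized observation against the damped level-plus-trend
    new_level = alpha * (x_t - season_t_minus_L) + (1 - alpha) * (level + phi * trend)
    # Bt: smooth the latest level change against the damped previous trend
    new_trend = beta * (new_level - level) + (1 - beta) * phi * trend
    # Ct: smooth the new seasonal index against the one from a season ago
    new_season = gamma * (x_t - new_level) + (1 - gamma) * season_t_minus_L
    return new_level, new_trend, new_season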
Let’s assume that a time series consists of the following:
- Level
- Trends
- Seasonality
- Noise
In the triple exponential smoothing function, we can iterate over the following parameters:
- trend ({‘additive’, ‘multiplicative’, None}, optional)
- seasonal ({‘additive’, ‘multiplicative’, None}, optional)
- seasonal_periods (int, optional)
- smoothing_level (float, optional)
- smoothing_slope (float, optional)
- damping_slope (float, optional)
- use_boxcox ({True, False, ‘log’, float}, optional)
- remove_bias (bool, optional)
- use_basinhopping (bool, optional)
Fitting the Triple Exponential Smoothing Models
In this grid search, we provide a list of possible combinations of the parameters above. Each combination goes through the model fitting process, and we pick the best combination based on the RMSE and R-square values.
param_grid_TES = {'trend': ['add', 'mul'],
                  'seasonal': ['add', 'mul'],
                  'seasonal_periods': [3, 6, 12],
                  'smoothing_level': [.20, .40, .60, .80],  # extended search grid: [.10, .20, .30, .40, .50, .60, .70, .80, .90]
                  'smoothing_slope': [.20, .40, .60, .80],  # extended search grid: [.10, .20, .30, .40, .50, .60, .70, .80, .90]
                  'damping_slope': [.20, .40, .60, .80],    # extended search grid: [.10, .20, .30, .40, .50, .60, .70, .80, .90]
                  'damped': [True, False],
                  'use_boxcox': [True, False],
                  'remove_bias': [True, False],
                  'use_basinhopping': [True, False]}
pg_TES = list(ParameterGrid(param_grid_TES))
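With the reduced grids above, ParameterGrid enumerates 2 × 2 × 3 × 4 × 4 × 4 × 2 × 2 × 2 × 2 = 12,288 combinations:

print(len(pg_TES))  # 12288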
Let’s fit the model.
df_results_TES = pd.DataFrame(columns=['trend', 'seasonal', 'seasonal_periods', 'smoothing_level',
                                       'smoothing_slope', 'damping_slope', 'damped', 'use_boxcox',
                                       'remove_bias', 'use_basinhopping', 'RMSE', 'R²'])

for b in pg_TES:
    trend = b.get('trend')
    seasonal = b.get('seasonal')  # seasonal component type ('add' or 'mul')
    seasonal_periods = b.get('seasonal_periods')
    smoothing_level = b.get('smoothing_level')
    smoothing_slope = b.get('smoothing_slope')
    damping_slope = b.get('damping_slope')
    damped = b.get('damped')
    use_boxcox = b.get('use_boxcox')
    remove_bias = b.get('remove_bias')
    use_basinhopping = b.get('use_basinhopping')
    fit_ES = ExponentialSmoothing(trainclose, trend=trend, damped=damped, seasonal=seasonal,
                                  seasonal_periods=seasonal_periods).fit(smoothing_level=smoothing_level,
                                                                         smoothing_slope=smoothing_slope,
                                                                         damping_slope=damping_slope,
                                                                         use_boxcox=use_boxcox,
                                                                         remove_bias=remove_bias,
                                                                         use_basinhopping=use_basinhopping,
                                                                         optimized=False)
    fcst_gs_pred_ES = fit_ES.forecast(len(testclose))
    df_pred = pd.DataFrame(fcst_gs_pred_ES, columns=['Forecasted_result'])
    RMSE = np.sqrt(metrics.mean_squared_error(testclose, df_pred.Forecasted_result))
    r2 = metrics.r2_score(testclose, df_pred.Forecasted_result)
    df_results_TES = df_results_TES.append({'trend': trend, 'seasonal': seasonal,
                                            'seasonal_periods': seasonal_periods,
                                            'smoothing_level': smoothing_level,
                                            'smoothing_slope': smoothing_slope,
                                            'damping_slope': damping_slope, 'damped': damped,
                                            'use_boxcox': use_boxcox, 'remove_bias': remove_bias,
                                            'use_basinhopping': use_basinhopping,
                                            'RMSE': RMSE, 'R²': r2}, ignore_index=True)

df_results_TES.sort_values(by=['RMSE', 'R²']).head(10)
Output:
As we can see in the output, the RMSE and R-square values do not change through the first ten rows of the table. We select the parameters at index 6237 as our best combination. Using those parameters, we check the evaluation metric values:
# the *_TES variables below are placeholders for the parameter values read from the selected grid-search row
TES = ExponentialSmoothing(trainclose, trend=trend_setting_TES, damped=damped_setting_TES,
                           seasonal=seasonal_setting_TES,
                           seasonal_periods=seasonal_periods_values_TES)
fit_ES = TES.fit(smoothing_level=smoothing_level_values_TES,
                 smoothing_slope=smoothing_slope_values_TES,
                 damping_slope=damping_slope_values_TES,
                 use_boxcox=use_boxcox_setting_TES,
                 remove_bias=remove_bias_setting_TES,
                 optimized=False)
fcst_gs_pred_ES = fit_ES.forecast(len(testclose))
timeseries_evaluation_metrics(testclose, fcst_gs_pred_ES)
Output:
As before, we can instead let the model search for the best fit of its parameters, as we did for simple and double exponential smoothing. We just pass the data to the model and change the fit argument optimized from False to True.
TES = ExponentialSmoothing(trainclose)
fit_ES_auto = TES.fit(optimized=True, use_brute=True)
fcst_auto_pred_ES = fit_ES_auto.forecast(len(testclose))
timeseries_evaluation_metrics(testclose, fcst_auto_pred_ES)
We can check the summary of the model to see the parameters selected during model fitting.
fit_ES_auto.summary()
Output:
We will make a trend plot to visualize the comparison between the models of triple exponential smoothing.
plt.rcParams["figure.figsize"] = [16, 9]
plt.plot(trainclose, label='Train')
plt.plot(testclose, label='Test')
plt.plot(fcst_gs_pred_ES, label='Triple Exponential Smoothing with custom grid search')
plt.plot(fcst_auto_pred_ES, label='Triple Exponential Smoothing using optimized=True')
plt.legend(loc='best')
plt.show()
Output:
From the graph above, we can see that the prediction on the test data (green line) from the triple exponential smoothing model with custom grid search follows the target test data (orange line) almost exactly. The triple exponential smoothing model with optimized=True (red line) shows a slight deviation in its predicted trend, which might be caused by small seasonality changes or unexpectedly large deviations in the trend at some data points.
There are many time series forecasting methods. This article discussed how to use exponential smoothing methods to predict or forecast time series values that show low deviation over time.
References:
The content of this post was created using the following sources:
- Time series analysis – smoothing methods
- Data source – FB.xlsx
- Colab notebook for codes of exponential smoothing techniques