Time-series forecasting is a crucial machine learning problem in various fields including the stock market, climate, healthcare, business planning, space science, communication engineering and traffic flow. Time-series forecasting is the systematic analysis of historic (past) signal correlations to predict future outcomes. Time-series forecasting can be grouped roughly into two classifications based on the model outputs: probabilistic time-series forecasting and deterministic time-series forecasting.
Probabilistic time-series forecasting aims to produce a distribution of predictions. Because the future is stochastic in nature, a single prediction rarely captures it. Generative models such as cVAEs and GANs mostly follow this probabilistic approach and sample future values from a learnt approximate distribution. They are typically trained with a quantile loss, a mean squared error loss or one of its variants to make diverse predictions. However, these predictions tend to be smooth rather than sharp and can differ markedly from the ground truth.
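To make the quantile loss mentioned above concrete, here is a minimal sketch in plain Python. The function name pinball_loss and the toy numbers are illustrative assumptions, not part of STRIPE or any library. It shows how a high quantile level penalizes forecasts that fall below the target much more than forecasts that overshoot it.

```python
def pinball_loss(y_true, y_pred, q):
    """Average quantile (pinball) loss for a quantile level q in (0, 1)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += max(q * diff, (q - 1.0) * diff)
    return total / len(y_true)


y_true = [10.0, 12.0, 14.0]
low = [8.0, 10.0, 12.0]     # forecast sits 2 below the target everywhere
high = [12.0, 14.0, 16.0]   # forecast sits 2 above the target everywhere

loss_low = pinball_loss(y_true, low, q=0.9)
loss_high = pinball_loss(y_true, high, q=0.9)
print(loss_low, loss_high)
```

Training one output per quantile level with this loss is one common way to obtain a spread of forecasts rather than a single value.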
Deterministic time-series forecasting aims to produce a single, sharp and realistic forecast. Classical statistical models such as ARIMA and deep neural networks such as recurrent neural networks and transformers mostly follow this approach, and the deep models exhibit state-of-the-art results on many tasks. They are good at modeling complex non-linear relationships between variables and time. Recent deep neural network models refine their architectures for realistic and sharp predictions, and some replace the mean squared error loss with alternative training losses to handle blurred predictions. Though deep neural networks provide a single close-to-accurate forecast, they cannot generate a distribution of predictions that reflects the stochastic nature of the future.
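The blurring problem can be illustrated with a toy case. The sketch below assumes two equally likely futures, the same step occurring early or late, and compares the expected mean squared error of two single forecasts. All names and numbers are made up for illustration.

```python
def step_series(length, step_at):
    """Series that is 0 before step_at and 1 from step_at onward."""
    return [0.0 if t < step_at else 1.0 for t in range(length)]


def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Two equally likely futures: the same step, occurring early or late.
early = step_series(10, 3)
late = step_series(10, 7)

# The single forecast that minimizes expected MSE is the pointwise mean.
mean_forecast = [(e + l) / 2 for e, l in zip(early, late)]


def expected_mse(forecast):
    return 0.5 * mse(forecast, early) + 0.5 * mse(forecast, late)


print(mean_forecast)
print(expected_mse(mean_forecast), expected_mse(early))
```

The averaged forecast achieves the lower expected error, yet its plateau at 0.5 matches neither possible future. A single forecast trained on squared error is pulled toward this kind of blurred compromise.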
French researchers Vincent Le Guen and Nicolas Thome have introduced a probabilistic time-series forecasting model that blends the deterministic approach and the probabilistic approach to yield sharp, distributed and diverse forecasts. It is named STRIPE, the acronym for Shape and Time Diversity in Probabilistic Forecasting. STRIPE consists of a structured shape and time diversity mechanism based on determinantal point processes (DPP) that is differentiable. The model also has an iterative sampling mechanism to control the diversity structure. STRIPE's predictions fit the ground truth closely, and the authors report state-of-the-art results in the time-series forecasting domain.
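A determinantal point process scores a set of items by the determinant of their similarity (kernel) matrix, so sets of near-duplicates score close to zero and sets of dissimilar items score higher. The sketch below shows only that general idea. It uses an RBF kernel on toy 2-D points as a stand-in, whereas STRIPE uses its own shape and time kernels, and the function names are hypothetical.

```python
import math


def rbf_kernel(a, b, bandwidth=1.0):
    """Similarity in (0, 1]; equals 1 when a == b."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq_dist / (2 * bandwidth ** 2))


def det(matrix):
    """Determinant by Laplace expansion; fine for the tiny matrices here."""
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    total = 0.0
    for col in range(n):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * matrix[0][col] * det(minor)
    return total


def dpp_score(items):
    """Unnormalized DPP score of a set: determinant of its kernel matrix."""
    kernel = [[rbf_kernel(a, b) for b in items] for a in items]
    return det(kernel)


similar = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]]   # near-duplicates
diverse = [[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]]   # well separated

print(dpp_score(similar), dpp_score(diverse))
```

Maximizing such a score during training pushes a set of sampled forecasts apart instead of letting them collapse onto one another.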
STRIPE is built upon the traditional Seq2Seq (sequence-to-sequence) architecture. It incorporates the DILATE loss function to deliver forecasts for non-stationary series that are diverse in both shape and time. STRIPE receives historical data as input, performs shape and time sampling and produces two disentangled latent representations: one for shape diversity and another for time diversity. These latent representations are fed into their respective decoders to produce two different sets of predictions, one for shape diversity and another for time diversity. By training this set-up end-to-end, the base encoder-decoder learns the pattern precisely in both shape and time.
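DILATE combines a shape term, based on a smoothed dynamic time warping (soft-DTW), with a temporal term. The sketch below covers only the shape term, in plain Python rather than the repository's PyTorch implementation. The function names and toy series are assumptions for illustration, and gamma plays the role of the smoothing parameter.

```python
import math


def soft_min(values, gamma):
    """Smoothed minimum: -gamma * log(sum(exp(-v / gamma)))."""
    finite = [v for v in values if v != math.inf]
    if not finite:
        return math.inf
    lowest = min(finite)
    # Subtract the minimum before exponentiating for numerical stability.
    total = sum(math.exp(-(v - lowest) / gamma) for v in finite)
    return lowest - gamma * math.log(total)


def soft_dtw(pred, target, gamma=0.01):
    """Soft-DTW alignment cost with squared pointwise distances."""
    n, m = len(pred), len(target)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dist = (pred[i - 1] - target[j - 1]) ** 2
            cost[i][j] = dist + soft_min(
                [cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1]], gamma
            )
    return cost[n][m]


def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


target = [0, 0, 0, 1, 1, 1, 0, 0]
shifted = [0, 0, 0, 0, 1, 1, 1, 0]   # same shape, one step late
flat = [0.375] * 8                   # same mean, no shape at all

print(mse(shifted, target), mse(flat, target))
print(soft_dtw(shifted, target), soft_dtw(flat, target))
```

On this toy example the squared error prefers the flat forecast, while the soft-DTW cost prefers the forecast with the right shape. The temporal term of DILATE, not shown here, is what penalizes the one-step delay.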
STRIPE needs a PyTorch environment. GPU is optional but preferred.
Step-1: Install STRIPE
The following command installs STRIPE from its source.
!git clone https://github.com/vincent-leguen/STRIPE.git
Step-2: Install dependencies
The following commands install the libraries tslearn and properscoring, which are required for training and evaluation of the STRIPE model.
!pip install tslearn
!pip install properscoring
Step-3: Create the environment
The following code changes the directory to content/STRIPE/ so that the repository's modules and classes can be imported, and imports PyTorch along with the other required libraries.
%cd content/STRIPE/
import numpy as np
import torch
import random
from torch.utils.data import DataLoader
import warnings; warnings.simplefilter('ignore')
from data.synthetic_dataset import create_synthetic_dataset_multimodal, SyntheticDataset
from models.models import cVAE, STRIPE, STRIPE_conditional, TestSampler_Sequential
from trainer.trainer import train_model, train_STRIPE, eval_model
Step-4: Configure the device
The following code checks whether a GPU is available and selects it as the device if so; otherwise it falls back to the CPU. It also seeds Python's random number generator.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
random.seed(0)
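The step above seeds only Python's random module. If the data generation or model initialization also draws from NumPy or PyTorch generators (an assumption, since the repository internals are not shown here), seeding those as well may help make runs repeatable. The helper name seed_everything is hypothetical; the calls it makes are standard.

```python
import random


def seed_everything(seed):
    """Seed Python's RNG, plus NumPy and PyTorch when they are installed."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass


seed_everything(0)
first = random.random()
seed_everything(0)
second = random.random()
print(first == second)
```

Even with all seeds fixed, some GPU operations can remain non-deterministic, so small run-to-run differences are still possible.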
Step-5: Prepare data and set training parameters
The following code prepares the data, splits it into training and test sets and declares the necessary training parameters. The parameters can be altered to suit the device configuration and memory availability.
batch_size = 100
N = 100
N_input = 20
N_output = 20
sigma = 0.01
gamma = 0.01
X_train_input,X_train_target,X_test_input,X_test_target = create_synthetic_dataset_multimodal(N,N_input,N_output,sigma)
dataset_train = SyntheticDataset(X_train_input,X_train_target)
dataset_test = SyntheticDataset(X_test_input,X_test_target)
trainloader = DataLoader(dataset_train, batch_size=batch_size,shuffle=True, num_workers=0)
testloader = DataLoader(dataset_test, batch_size=10,shuffle=False, num_workers=0)
input_size = 1
rnn_units = 128
nlayers = 1
bidirectional = False
latent_dim = 16
fc_units = 10
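To picture the shape of the data, the sketch below builds toy series and splits each into an input window and a target window of the lengths used above. It is not the repository's create_synthetic_dataset_multimodal; the generator make_series is a hypothetical stand-in that places a noisy step somewhere in the target part.

```python
import random


def make_series(n_input, n_output, sigma, rng):
    """Hypothetical toy series: a noisy step located in the target window."""
    length = n_input + n_output
    step_at = rng.randrange(n_input, length)
    return [
        (1.0 if t >= step_at else 0.0) + rng.gauss(0.0, sigma)
        for t in range(length)
    ]


def split_window(series, n_input):
    """Split one series into (input window, target window)."""
    return series[:n_input], series[n_input:]


rng = random.Random(0)
N, N_input, N_output, sigma = 100, 20, 20, 0.01

pairs = [
    split_window(make_series(N_input, N_output, sigma, rng), N_input)
    for _ in range(N)
]
inputs = [p[0] for p in pairs]
targets = [p[1] for p in pairs]
print(len(inputs), len(inputs[0]), len(targets[0]))
```

Because the step position varies while the input window looks the same, several futures are plausible for one input, which is the situation a probabilistic forecaster has to handle.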
Step-6: Train a Probabilistic model
Train the probabilistic base model with the following code. Here, a conditional variational auto-encoder (cVAE) is trained with the DILATE loss.
model_dilate = cVAE(input_size,rnn_units,nlayers,bidirectional,latent_dim,fc_units,N_output,device).to(device)
train_model(model_dilate, trainloader, testloader, loss_type='dilate',
            nsamples=10, learning_rate=0.001, device=device, epochs=501,
            gamma=gamma, alpha=0.5, print_every=50, eval_every=100, verbose=1)
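A variational auto-encoder usually draws its latent code with the reparameterization trick and regularizes it with a KL term toward a standard normal prior. The sketch below shows those two standard pieces in plain Python. It assumes a diagonal Gaussian latent and does not reproduce the repository's cVAE class; the function names are illustrative.

```python
import math
import random


def sample_latent(mu, log_var, rng):
    """Reparameterization: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mu, log_var)
    ]


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, 1)), summed over latent dimensions."""
    return 0.5 * sum(
        m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, log_var)
    )


rng = random.Random(0)
latent_dim = 16
mu = [0.0] * latent_dim
log_var = [0.0] * latent_dim

# Ten latent draws would give ten candidate futures after decoding.
samples = [sample_latent(mu, log_var, rng) for _ in range(10)]
kl_prior = kl_to_standard_normal(mu, log_var)
kl_shifted = kl_to_standard_normal([1.0] * latent_dim, log_var)
print(kl_prior, kl_shifted)
```

Decoding each latent draw yields one candidate future, which is how a setting such as nsamples=10 can produce ten forecasts for the same input.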
Step-7: Train STRIPE-shape network
Train the STRIPE-shape part end-to-end with the following code.
nshapes = 10
stripe_shape = STRIPE('shape',nshapes, latent_dim, N_output, rnn_units).to(device)
train_STRIPE(cvae=model_dilate, stripe=stripe_shape, trainloader=trainloader,
             testloader=testloader, device=device, mode_stripe='shape',
             nsamples=nshapes, quality='', diversity_kernel='dtw',
             learning_rate=0.001, epochs=16, print_every=2,eval_every=5, alpha=0.5)
Step-8: Train STRIPE-time network
Train the STRIPE-time part end-to-end with the following code.
ntimes = 10
stripe_time = STRIPE_conditional('time',ntimes, latent_dim, N_output, rnn_units).to(device)
train_STRIPE(cvae=model_dilate,stripe=stripe_time, trainloader=trainloader,
             testloader=testloader, device=device, mode_stripe='time',
             nsamples=ntimes, quality='', diversity_kernel='tdi',
             learning_rate=0.001, epochs=1, print_every=16,eval_every=5, alpha=0.5)
Step-9: Evaluate the model
Evaluate the complete model on the test data, based on the DILATE loss, with the following code.
test_sampler = TestSampler_Sequential(model_dilate, stripe_shape, stripe_time)
_,_ = eval_model(test_sampler, testloader,nsamples=10, device=device, gamma=0.01,mode='test_sampler')
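When a model returns several sampled forecasts, a common way to score them against one observed value is the continuous ranked probability score (CRPS). The sketch below implements the standard ensemble estimator in plain Python for a single time step. Whether eval_model reports exactly this quantity is an assumption, and ensemble_crps is an illustrative name rather than a library function.

```python
def ensemble_crps(samples, observation):
    """Ensemble CRPS: E|X - y| - 0.5 * E|X - X'| over the sample set."""
    n = len(samples)
    spread_to_obs = sum(abs(x - observation) for x in samples) / n
    spread_within = sum(abs(a - b) for a in samples for b in samples) / (n * n)
    return spread_to_obs - 0.5 * spread_within


observation = 1.0
sharp_right = [0.9, 1.0, 1.1]   # tight and centred on the truth
wide = [0.0, 1.0, 2.0]          # centred on the truth but spread out
sharp_wrong = [1.9, 2.0, 2.1]   # tight but centred on the wrong value

for name, samples in [("sharp_right", sharp_right),
                      ("wide", wide),
                      ("sharp_wrong", sharp_wrong)]:
    print(name, ensemble_crps(samples, observation))
```

Lower is better: the score rewards sample sets that are both close to the observation and no wider than they need to be, and for a single sample it reduces to the absolute error.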
Performance of STRIPE
The developers evaluated STRIPE on two time-series benchmarks:
- Traffic (hourly road occupancy rates from California Department of Transportation)
- Electricity (hourly electricity consumption measurements)
The evaluation of STRIPE and other competing models is based on the mean squared error (MSE) and the DILATE loss. According to the developers, STRIPE outperforms recent state-of-the-art models such as N-BEATS and the probabilistic DeepAR model. Qualitative analysis shows that STRIPE's forecasts closely follow the ground truth while remaining sharp and diverse.
Read more in the original research paper here.
Find the source code repository at https://github.com/vincent-leguen/STRIPE.