Guide To GluonTS and PytorchTS For Time-Series Forecasting (With Python Implementation)

GluonTS is a toolkit designed specifically for probabilistic time series modeling. It is part of the Gluon ecosystem, an open-source deep-learning interface that lets developers build neural networks without compromising performance and efficiency. AWS and Microsoft first introduced Gluon on October 12th, 2017; it provides many different neural network architectures and leverages deep learning models. It builds on packages such as MXNet, a lightweight, portable, flexible distributed/mobile deep learning framework with bindings for Python, R, Julia, Scala, Go, JavaScript, and more.

Gluon’s goal is to leverage Jupyter notebooks’ strengths to present graphics, equations, and code together in one place.

GluonTS


“GluonTS simplifies the development of and experimentation with time series models for common tasks such as forecasting or anomaly detection. It provides all necessary components and tools that scientists need for quickly building new models, for efficiently running and analyzing experiments and for evaluating model accuracy.”

— GluonTS Arxiv Research paper

GluonTS is a toolkit designed specifically for probabilistic time series modeling. It provides utilities for loading and iterating over time-series datasets, state-of-the-art models for time series forecasting, and building blocks to define your own models and quickly experiment with different solutions. With GluonTS you can:

  • Train and evaluate any of the built-in models on your own dataset
  • Quickly build your own solution from the provided components
  • Create custom models using its abstractions and building blocks
  • Compare your models against multiple baseline algorithms
  • Use its plotting and evaluation facilities
  • Work with both artificial and real datasets

Installation

pip install gluonts
# GluonTS relies on MXNet,
# so install MXNet using pip as well
pip install mxnet
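
After installation, a quick sanity check is to import both packages and print their versions (a minimal check; the exact versions will depend on your environment):

import gluonts
import mxnet
# both packages expose a __version__ attribute
print(gluonts.__version__, mxnet.__version__)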

Getting Started 

We have seen time series forecasting using TensorFlow and PyTorch, but those implementations involve a lot of code and require solid proficiency with the framework. GluonTS keeps the code simple and to the point. Here is an example that uses GluonTS to predict Twitter volume with DeepAR.

You can run the following code in a cloud development environment at: https://github.com/mmaithani/data-science/blob/main/Gluonts_twitter_volume_forecasting.ipynb

#importing gluonTS utilities and pandas
from gluonts.dataset import common
from gluonts.model import deepar
from gluonts.trainer import Trainer
import pandas as pd

#getting the training dataset of Twitter volume
url = "https://raw.githubusercontent.com/numenta/NAB/master/data/realTweets/Twitter_volume_AMZN.csv"
df = pd.read_csv(url, header=0, index_col=0)
data = common.ListDataset([{
    "start": df.index[0],
    "target": df.value[:"2015-04-05 00:00:00"]
}],
                          freq="5min")
#initializing the Trainer and DeepAR estimator
trainer = Trainer(epochs=10)
estimator = deepar.DeepAREstimator(
    freq="5min", prediction_length=12, trainer=trainer)
predictor = estimator.train(training_data=data)

prediction = next(predictor.predict(data))
print(prediction.mean)
prediction.plot(output_file='graph.png')

Let’s take an example dataset and forecast it using GluonTS.

#importing modules
%matplotlib inline
import mxnet as mx
from mxnet import gluon
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import json

Importing the built-in GluonTS datasets

from gluonts.dataset.repository.datasets import get_dataset, dataset_recipes
from gluonts.dataset.util import to_pandas
print(f"Available datasets: {list(dataset_recipes.keys())}")

We are going to use the m4_hourly dataset. The datasets provided by GluonTS are objects with three main attributes, which we will look at next.

dataset = get_dataset("m4_hourly", regenerate=True)

The three main attributes of a GluonTS dataset are dataset.train (the training data), dataset.test (the test data), and dataset.metadata (metadata describing the dataset). Let’s iterate over the training data and plot the first series using the following commands.

entry = next(iter(dataset.train))
train_series = to_pandas(entry)
train_series.plot()
plt.grid(which="both")
plt.legend(["train series"], loc="upper left")
plt.show()

Similarly, you can plot the test dataset:

entry = next(iter(dataset.test))
test_series = to_pandas(entry)
test_series.plot()
plt.axvline(train_series.index[-1], color='r') # end of train dataset
plt.grid(which="both")
plt.legend(["test series", "end of train series"], loc="upper left")
plt.show()
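
The third attribute, dataset.metadata, holds information such as the frequency of the series and the recommended prediction length, which we will reuse below when configuring the estimator:

# frequency of the series and recommended forecast horizon
print(dataset.metadata.freq)
print(dataset.metadata.prediction_length)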

Preprocessing

To show how to bring custom data into GluonTS, let’s first create a small artificial dataset of N random time series of length T:

N = 10  # number of time series
T = 100  # number of timesteps
prediction_length = 24
freq = "1H"
custom_dataset = np.random.normal(size=(N, T))
start = pd.Timestamp("01-01-2019", freq=freq)  # can be different for each time series

Now, to split the dataset and convert it to the GluonTS format, use the following commands:

from gluonts.dataset.common import ListDataset
# train dataset: cut "prediction_length", add "target" and "start" fields
train_ds = ListDataset([{'target': x, 'start': start} 
                        for x in custom_dataset[:, :-prediction_length]],
                       freq=freq)
# test dataset: using whole dataset, add "target" and "start" fields
test_ds = ListDataset([{'target': x, 'start': start} 
                       for x in custom_dataset],
                      freq=freq)
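
To confirm the conversion, you can iterate over the custom dataset exactly as we did with the built-in one. A minimal check, assuming the same "start"/"target" keys we passed in above:

# peek at one training entry
train_entry = next(iter(train_ds))
print(train_entry["start"])          # timestamp of the first observation
print(train_entry["target"].shape)   # (T - prediction_length,) values per series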

Training

GluonTS also comes with its own estimators, such as the feedforward neural network SimpleFeedForwardEstimator, which accepts an input window of length context_length and predicts the distribution of the values that follow. Let’s import the necessary training components and configure the estimator with the following commands:

from gluonts.model.simple_feedforward import SimpleFeedForwardEstimator
from gluonts.trainer import Trainer
estimator = SimpleFeedForwardEstimator(
    num_hidden_dimensions=[10],
    prediction_length=dataset.metadata.prediction_length,
    context_length=100,
    freq=dataset.metadata.freq,
    trainer=Trainer(ctx="cpu", 
                    epochs=5, 
                    learning_rate=1e-3, 
                    num_batches_per_epoch=100
                   )
)
#start training
predictor = estimator.train(dataset.train)

Evaluate

To evaluate a trained model, GluonTS provides the make_evaluation_predictions function, which automates the process of generating predictions and evaluating the model.

from gluonts.evaluation.backtest import make_evaluation_predictions
forecast_it, ts_it = make_evaluation_predictions(
    dataset=dataset.test,  # test dataset
    predictor=predictor,  # predictor
    num_samples=100,  # number of sample paths for evaluation
)

Convert the generators to lists to ease the computations, and examine the first element of each list:

forecasts = list(forecast_it)
tss = list(ts_it)
# first entry of the time series list
ts_entry = tss[0]

Convert the first five values of the time series from pandas to NumPy and grab the first entry of dataset.test:

np.array(ts_entry[:5]).reshape(-1,)
dataset_test_entry = next(iter(dataset.test))

Similarly, look at the first five target values and the first forecast entry:

dataset_test_entry['target'][:5]
forecast_entry = forecasts[0]
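
The forecast object contains more than a point estimate: it stores the sample paths drawn during prediction along with summary statistics. A few attributes worth inspecting (assuming the SampleForecast interface of the GluonTS version used here):

print(forecast_entry.samples.shape)   # (num_samples, prediction_length) sample paths
print(forecast_entry.start_date)      # first timestamp of the forecast window
print(forecast_entry.mean)            # mean forecast per time step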

Output

To visualize the output, use the following commands:

def plot_prob_forecasts(ts_entry, forecast_entry):
    plot_length = 150 
    prediction_intervals = (50.0, 90.0)
    legend = ["observations", "median prediction"] + [f"{k}% prediction interval" for k in prediction_intervals][::-1]

    fig, ax = plt.subplots(1, 1, figsize=(10, 7))
    ts_entry[-plot_length:].plot(ax=ax)  # plot the time series
    forecast_entry.plot(prediction_intervals=prediction_intervals, color='g')
    plt.grid(which="both")
    plt.legend(legend, loc="upper left")
    plt.show()
plot_prob_forecasts(ts_entry, forecast_entry)

You can also evaluate the quality of the forecasts with the Evaluator class, which computes aggregate performance metrics across all series.

from gluonts.evaluation import Evaluator
evaluator = Evaluator(quantiles=[0.1, 0.5, 0.9])
agg_metrics, item_metrics = evaluator(iter(tss), iter(forecasts), num_series=len(dataset.test))
print(json.dumps(agg_metrics, indent=4))
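
Besides the aggregate metrics, the Evaluator also returns per-series metrics as a pandas DataFrame, which you can inspect directly:

# metrics computed individually for each time series
item_metrics.head()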

PyTorch-ts

You can achieve similar results with a third-party framework called PyTorch-TS, built by Zalando Research for PyTorch users. PyTorch-TS is a probabilistic time series forecasting framework that uses GluonTS as its backend; its installation and usage are straightforward, and you can find the source code on the project’s GitHub repository. Very little changes compared to GluonTS: PyTorch-TS implements the time series models in PyTorch while reusing the GluonTS API for loading, transforming, and testing datasets.

Installation

$ pip install pytorchts

Getting Started

We are going to use a dataset of the volume of tweets mentioning the AMZN ticker symbol. To leverage the power of this model, first import the necessary packages using the commands below. The notebook is available at: https://github.com/mmaithani/data-science/blob/main/PyTorch_ts_time_series_forecasting(gluonts).ipynb

import pandas as pd
import torch
import matplotlib.pyplot as plt

from pts.dataset import ListDataset
from pts.model.deepar import DeepAREstimator
from pts import Trainer
from pts.dataset import to_pandas

Import the Amazon tweets dataset and plot the first 100 data points using pandas and matplotlib:

url = "https://raw.githubusercontent.com/numenta/NAB/master/data/realTweets/Twitter_volume_AMZN.csv"
df = pd.read_csv(url, header=0, index_col=0, parse_dates=True)
df[:100].plot(linewidth=2)
plt.grid(which='both')
plt.show()

Train the model with PyTorch-TS. We use the data up to midnight on April 5th, 2015; the series is sampled every 5 minutes, so freq is set to "5min", training runs for 15 epochs, and since we want a prediction for the next hour, prediction_length is set to 12.

training_data = ListDataset(
    [{"start": df.index[0], "target": df.value[:"2015-04-05 00:00:00"]}],
    freq = "5min")
# parameter initialization
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
estimator = DeepAREstimator(freq="5min",
                            prediction_length=12,
                            input_size=43,
                            trainer=Trainer(epochs=15,
                                            device=device))
predictor = estimator.train(training_data=training_data)

The model is trained, so let’s forecast the hour following midnight on April 15th, 2015:

test_data = ListDataset(
    [{"start": df.index[0], "target": df.value[:"2015-04-15 00:00:00"]}],
    freq = "5min")
for test_entry, forecast in zip(test_data, predictor.predict(test_data)):
    to_pandas(test_entry)[-60:].plot(linewidth=2)
    forecast.plot(color='b', prediction_intervals=[50.0, 90.0])
plt.grid(which='both')

Conclusion

We have discussed time series forecasting using GluonTS, a forecasting library made explicitly for probabilistic time series problems, and the outputs were quite satisfactory. We saw the same approach with PyTorch-TS, a PyTorch-based probabilistic time series forecasting framework built on the GluonTS backend, while Gluon itself integrates many other features. Other third-party libraries have also been built on top of GluonTS that are not covered in this article.

Working notebooks used in the above demonstration:

  • GluonTS: https://github.com/mmaithani/data-science/blob/main/Gluonts_twitter_volume_forecasting.ipynb
  • PyTorch-TS: https://github.com/mmaithani/data-science/blob/main/PyTorch_ts_time_series_forecasting(gluonts).ipynb

Mohit Maithani
Mohit is a Data & Technology Enthusiast with good exposure to solving real-world problems in various avenues of IT and Deep learning domain. He believes in solving human's daily problems with the help of technology.
