Although several machine learning and deep learning models have been adopted for time series forecasting tasks, parametric statistical approaches like ARIMA still reign supreme when dealing with low-granularity data. Orbit is a Python framework created by Uber for Bayesian time series forecasting and inference; it is built upon probabilistic programming packages like PyStan and Uber’s own Pyro. Orbit currently supports implementations of the following forecasting models:
- Exponential Smoothing (ETS)
- Damped Local Trend (DLT)
- Local Global Trend (LGT)
It also supports the following sampling methods for model estimation:
- Markov Chain Monte Carlo (MCMC) as a full sampling method
- Maximum a Posteriori (MAP) as a point estimate method
- Variational Inference (VI) as a hybrid sampling method on an approximate distribution
Orbit refines two of these models, namely DLT and LGT. The tweaked models were benchmarked against popular time series models such as SARIMA and Facebook Prophet, with symmetric mean absolute percentage error (SMAPE) used as the forecast accuracy metric.
SMAPE = (1/h) Σ_{t=1..h} 2 |X_t − X̂_t| / (|X_t| + |X̂_t|)

Here X_t is the actual value measured at time t, X̂_t is the forecast value, and h is the forecast horizon.
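As a quick illustration, SMAPE can be implemented in a few lines of NumPy (a minimal sketch for intuition; the benchmark study and Orbit's own utilities have their own evaluation code):

```python
import numpy as np

def smape(actual, forecast):
    """Symmetric mean absolute percentage error, expressed in percent.

    actual, forecast: 1-D arrays covering the forecast horizon h.
    """
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    # element-wise 2|X_t - Xhat_t| / (|X_t| + |Xhat_t|), averaged over h steps
    return 100.0 * np.mean(2.0 * np.abs(forecast - actual)
                           / (np.abs(actual) + np.abs(forecast)))

print(smape([100, 200, 300], [110, 190, 310]))  # small errors -> small SMAPE
```

Because the denominator averages the actual and forecast magnitudes, SMAPE penalises over- and under-forecasting symmetrically, which makes it convenient for comparing models across series with very different scales.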
The benchmark study was conducted on five datasets:
- US and Canada rider first-trips with Uber (20 weekly series by city)
- US and Canada driver weekly first-trips with Uber (20 weekly series by city)
- Worldwide first-orders with Uber Eats (15 daily series by country)
- M3 series (1428 monthly series)
- M4 series (359 weekly series)
Across these benchmarks, Orbit’s refined models consistently delivered lower SMAPE than the other time series models.
Implementing Damped Local Trend models with Orbit
- Install orbit from PyPI
!pip install orbit-ml
For other installation methods, see this.
- Import necessary libraries and classes.
```python
import pandas as pd
import numpy as np
from datetime import timedelta

from orbit.models.dlt import DLTMAP, DLTAggregated, DLTFull
from orbit.diagnostics.plot import plot_predicted_data
from orbit.diagnostics.plot import plot_predicted_components
from orbit.utils.dataset import load_iclaims
```
- Load the data for forecasting.
We are going to use the iclaims_example dataset provided by Orbit. It contains the weekly initial claims for US unemployment benefits along with a few related Google Trends queries (unemploy, filling, and job) from January 2010 to June 2018.
```python
# load data
df = load_iclaims()
date_col = 'week'
response_col = 'claims'

# split the dataset, holding out the last 52 weeks for testing
test_size = 52
train_df = df[:-test_size]
test_df = df[-test_size:]
```
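The trailing-window holdout used above works on any time-indexed DataFrame. A small self-contained sketch with synthetic weekly data (the column names and series length here are illustrative, not the actual iclaims data) shows what the split produces:

```python
import pandas as pd
import numpy as np

# synthetic stand-in for the iclaims data: one value per week
dates = pd.date_range('2010-01-03', periods=443, freq='W')
df = pd.DataFrame({'week': dates, 'claims': np.random.rand(len(dates))})

# hold out the last 52 weeks for testing, as in the Orbit example
test_size = 52
train_df = df[:-test_size]
test_df = df[-test_size:]

assert len(train_df) + len(test_df) == len(df)
assert len(test_df) == 52
# the training window ends strictly before the test window begins
assert train_df['week'].iloc[-1] < test_df['week'].iloc[0]
```

Keeping the holdout at the end of the series (rather than a random split) matters for time series: the model is evaluated only on data that comes after everything it was trained on.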
- Create and train the DLT models.
We will try out the three different wrappers for the Damped Local Trend model provided in Orbit: DLTMAP, DLTAggregated, and DLTFull.
- DLT model for MAP (Maximum a Posteriori) prediction
```python
# define the DLTMAP model
dlt = DLTMAP(
    response_col=response_col,
    date_col=date_col,
    seasonality=52,
    seed=8888,
)

# train the model
dlt.fit(df=train_df)

# make inference
predicted_df = dlt.predict(df=test_df)

# plot the results
_ = plot_predicted_data(training_actual_df=train_df,
                        predicted_df=predicted_df,
                        date_col=date_col,
                        actual_col=response_col,
                        test_actual_df=test_df,
                        title='Prediction with DLTMAP Model')
```
- DLTFull
In full prediction, a prediction is made as a function of each parameter posterior sample, and the prediction results are aggregated afterwards.
```python
# define the DLTFull model
dlt = DLTFull(
    response_col=response_col,
    date_col=date_col,
    seasonality=52,
    seed=8888,
)

# train the model
dlt.fit(df=train_df)

# make inference
predicted_df = dlt.predict(df=test_df)

# plot results
_ = plot_predicted_data(training_actual_df=train_df,
                        predicted_df=predicted_df,
                        date_col=dlt.date_col,
                        actual_col=dlt.response_col,
                        test_actual_df=test_df,
                        title='Prediction with DLTFull Model')
```
- DLTAggregated
In aggregated prediction, the parameter posterior samples are first reduced using aggregate_method ('mean' or 'median'), and a single prediction is then made from the aggregated parameters.
```python
# define the DLTAggregated model
dlt = DLTAggregated(
    response_col=response_col,
    date_col=date_col,
    seasonality=52,
    seed=8888,
)

# train the model
dlt.fit(df=train_df)

# make inference
predicted_df = dlt.predict(df=test_df)

# plot results
plot_predicted_data(training_actual_df=train_df,
                    predicted_df=predicted_df,
                    date_col=dlt.date_col,
                    actual_col=dlt.response_col,
                    test_actual_df=test_df,
                    title='Prediction with DLTAggregated Model')
```
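The difference between the full and aggregated wrappers comes down to the order of operations: DLTFull predicts once per posterior sample and then aggregates the predictions, while DLTAggregated reduces the posterior samples first and predicts once. The toy sketch below uses a hypothetical linear predictor (not Orbit's actual model) purely to illustrate the two orders of operations:

```python
import numpy as np

rng = np.random.default_rng(8888)

# pretend posterior: 1000 samples of a single slope parameter
slope_samples = rng.normal(loc=2.0, scale=0.1, size=1000)
x = np.arange(1, 6, dtype=float)  # 5 future time steps

def predict(slope, x):
    # toy linear predictor standing in for a real forecast model
    return slope * x

# "full" style: predict per posterior sample, then aggregate the predictions
full_pred = np.mean([predict(s, x) for s in slope_samples], axis=0)

# "aggregated" style: reduce the samples first (aggregate_method='mean'),
# then make a single prediction from the point estimate
agg_pred = predict(np.mean(slope_samples), x)

print(np.allclose(full_pred, agg_pred))  # -> True
```

For a linear predictor the two styles coincide, but for nonlinear models they generally differ; full prediction also retains the per-sample spread needed for credible intervals, at a higher computational cost.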
More information about the different models is available here.
The code for the above implementation is taken from the official example notebook available here.
Last Epoch (Endnote)
This article introduced Orbit, Uber’s open-source Python library for time series forecasting. With the help of the underlying probabilistic programming packages, Orbit introduces multiple model refinements, such as additional global trends, a transformation for multiplicative form, alternative noise distributions, and the choice of priors. These refined models outperform other common forecasting models like SARIMA and Facebook Prophet. As per the paper, the developers are working on new models and features such as dual seasonality support and full Pyro integration.
To better understand the mathematics behind the refinements and to know more about the different models and methods provided by Orbit, see: