Hands-On Guide To Orbit: Uber’s Python Framework For Bayesian Forecasting & Inference

Orbit is an open-source Python framework created by Uber for Bayesian time series forecasting and inference.

Although several machine learning and deep learning models have been adopted for time series forecasting, parametric statistical approaches like ARIMA remain a strong choice when dealing with low-granularity data. Orbit is a Python framework created by Uber for Bayesian time series forecasting and inference; it is built upon probabilistic programming packages like PyStan and Uber’s own Pyro. Orbit currently implements the following forecasting models:

  • Exponential Smoothing (ETS)
  • Damped Local Trend (DLT)
  • Local Global Trend (LGT)

It also supports the following sampling methods for model estimation:

  • Markov-Chain Monte Carlo (MCMC) as a full sampling method
  • Maximum a Posteriori (MAP) as a point estimate method
  • Variational Inference (VI) as a hybrid-sampling method on approximate distribution

Orbit refined two of the models, namely DLT and LGT. These tweaked models were compared with popular time series models such as SARIMA and Facebook Prophet. Symmetric mean absolute percentage error (SMAPE) was used as the forecast metric for comparing these models. 

Symmetric mean absolute percentage error (SMAPE)

Here Xt is the value measured at time t, and h is the forecast horizon.
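
A commonly used form of SMAPE averages, over the h forecast steps, the absolute error divided by the mean of the absolute actual and forecast values. The sketch below assumes that form; the function name smape, its argument names, and the handling of zero denominators are illustrative choices, and the exact variant used in the Orbit paper may differ in scaling.

```python
def smape(actual, forecast):
    """SMAPE over a forecast horizon (assumed form, see the lead-in).

    actual   -- measured values X_t for the h forecast steps
    forecast -- predicted values for the same steps
    """
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    h = len(actual)
    if h == 0:
        raise ValueError("the forecast horizon must contain at least one step")

    total = 0.0
    for x, x_hat in zip(actual, forecast):
        denom = (abs(x) + abs(x_hat)) / 2.0
        # Assumed convention: a step where both values are zero adds no error.
        if denom == 0:
            continue
        total += abs(x - x_hat) / denom
    return total / h


# Made-up numbers, purely to show the call
example_actual = [100.0, 200.0, 300.0]
example_forecast = [110.0, 190.0, 300.0]
example_score = smape(example_actual, example_forecast)
print(f"SMAPE: {example_score:.4f}")
```

With this form, a perfect forecast scores 0, and the score stays bounded even when actual values are close to zero, which is one reason the metric is popular for comparing series of very different scales.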

The benchmark study was conducted on five datasets:

  • US and Canada rider first-trips with Uber (20 weekly series by city)
  • US and Canada driver weekly first-trips with Uber (20 weekly series by city)
  • Worldwide first-orders with Uber Eats (15 daily series by country)
  • M3 series (1428 monthly series)
  • M4 series (359 weekly series)

Orbit’s refined models vs Prophet and SARIMA

In this benchmark, Orbit’s refined models consistently deliver lower SMAPE, and therefore better accuracy, than the other time series models.

Implementing Damped Local Trend models with Orbit

  1. Install orbit from PyPI

!pip install orbit-ml

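For other installation methods, see this. Note that the class names used below (DLTMAP, DLTFull, DLTAggregated) follow the Orbit release available when this article was written; later releases may expose a different interface, so check the documentation for your installed version.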

  2. Import necessary libraries and classes.

 import pandas as pd
 import numpy as np
 from datetime import timedelta
 from orbit.models.dlt import DLTMAP, DLTAggregated, DLTFull
 from orbit.diagnostics.plot import plot_predicted_data
 from orbit.diagnostics.plot import plot_predicted_components
 from orbit.utils.dataset import load_iclaims

  3. Load the data for forecasting.

We are going to use the iclaims_example dataset provided by Orbit. It contains weekly initial claims for US unemployment benefits alongside a few related Google Trends queries (unemploy, filling, and job) from January 2010 to June 2018.

 # load data
 df = load_iclaims()
 date_col = 'week'
 response_col = 'claims'
 # split the dataset: hold out the last 52 weeks (one year) as the test set
 test_size = 52
 train_df = df[:-test_size]
 test_df = df[-test_size:] 
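
Slicing by position like this only gives a valid hold-out set when the rows are in chronological order. The sketch below checks that on a small synthetic frame; the frame is a hypothetical stand-in with the same column names, not the iclaims data, and the sort step is an added precaution rather than something the original notebook does.

```python
import pandas as pd

# Hypothetical stand-in data: 156 weekly rows with the same column names
toy_df = pd.DataFrame({
    "week": pd.date_range("2015-01-04", periods=156, freq="W"),
    "claims": range(156),
})

# Sort by date so that positional slicing really means "the last year"
toy_df = toy_df.sort_values("week").reset_index(drop=True)

toy_test_size = 52
toy_train = toy_df[:-toy_test_size]
toy_test = toy_df[-toy_test_size:]

# The two parts should cover every row exactly once, with no date overlap
rows_ok = len(toy_train) + len(toy_test) == len(toy_df)
order_ok = toy_train["week"].max() < toy_test["week"].min()
print(rows_ok, order_ok)
```

Unlike a random split, this keeps every test observation strictly later than the training data, which is what a forecasting evaluation needs.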
  4. Create and train the DLT models.

We will try out the three wrappers for the Damped Local Trend model provided in Orbit: DLTMAP, DLTFull, and DLTAggregated.

  • DLT model for MAP (Maximum a Posteriori) prediction
 #define the DLTMAP model
 dlt = DLTMAP(
     response_col=response_col,
     date_col=date_col,
     seasonality=52,
     seed=8888,
 )

 #train the model
 dlt.fit(df=train_df)

 #make inference
 predicted_df = dlt.predict(df=test_df)

 #Plot the results
 _ = plot_predicted_data(training_actual_df=train_df, predicted_df=predicted_df, 
                     date_col=date_col, actual_col=response_col,  
                     test_actual_df=test_df, title='Prediction with DLTMAP Model')

Prediction with DLTMAP Model

  • DLTFull

In full prediction, a forecast is generated for each posterior sample of the parameters, and the resulting forecasts are aggregated afterwards.

 #define the DLTFull model
 dlt = DLTFull(
     response_col=response_col,
     date_col=date_col,
     seasonality=52,
     seed=8888
 )

 #train the model
 dlt.fit(df=train_df)

 #make inference
 predicted_df = dlt.predict(df=test_df)

 #plot results
 _ = plot_predicted_data(training_actual_df=train_df, predicted_df=predicted_df, 
                     date_col=dlt.date_col, actual_col=dlt.response_col, 
                     test_actual_df=test_df, title='Prediction with DLTFull Model')

Prediction with DLTFull Model

  • DLTAggregated

In aggregated prediction, the posterior samples of the parameters are first reduced to a single set of values using aggregate_method ('mean' or 'median'), and a single prediction is then made from those values.

 #define the DLTAggregated model
 dlt = DLTAggregated(
     response_col=response_col,
     date_col=date_col,
     seasonality=52,
     seed=8888
 )

 #train the model
 dlt.fit(df=train_df)

 #make inference
 predicted_df = dlt.predict(df=test_df)

 #plot results
 plot_predicted_data(training_actual_df=train_df, predicted_df=predicted_df, 
                     date_col=dlt.date_col, actual_col=dlt.response_col, 
                     test_actual_df=test_df, title='Prediction with DLTAggregated Model') 
Prediction with DLTAggregated Model
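
The difference between the full and aggregated modes described above can be illustrated with a toy example. The sketch below is not Orbit's implementation: toy_forecast is a made-up damped-trend formula, and the "posterior samples" are simply random draws, used only to contrast "predict per sample, then aggregate" with "aggregate the parameters, then predict once".

```python
import random
import statistics


def toy_forecast(level, slope, phi, horizon):
    """Hypothetical damped trend: level + slope * (phi + phi^2 + ... + phi^k)."""
    path, damp_sum = [], 0.0
    for k in range(1, horizon + 1):
        damp_sum += phi ** k
        path.append(level + slope * damp_sum)
    return path


rng = random.Random(8888)
horizon = 4

# Stand-in for posterior samples (random draws, not real MCMC output)
samples = [
    {"level": rng.gauss(100.0, 2.0),
     "slope": rng.gauss(1.5, 0.3),
     "phi": rng.uniform(0.7, 0.95)}
    for _ in range(500)
]

# "Full" style: one forecast path per sample, summarised per time step
paths = [toy_forecast(horizon=horizon, **s) for s in samples]
steps = list(zip(*paths))
full_median = [statistics.median(step) for step in steps]
full_lower = [statistics.quantiles(step, n=20)[0] for step in steps]   # ~5th percentile
full_upper = [statistics.quantiles(step, n=20)[-1] for step in steps]  # ~95th percentile

# "Aggregated" style: reduce the parameters first, then forecast once
mean_params = {
    name: statistics.mean(s[name] for s in samples)
    for name in ("level", "slope", "phi")
}
aggregated = toy_forecast(horizon=horizon, **mean_params)

for k in range(horizon):
    print(f"step {k + 1}: full median={full_median[k]:.2f} "
          f"[{full_lower[k]:.2f}, {full_upper[k]:.2f}]  aggregated={aggregated[k]:.2f}")
```

In this toy setting the full approach naturally yields an uncertainty band around the forecast, while the aggregated approach produces a single path from one reduced parameter set.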

More information about the different models is available here.

Code for the above implementation has been taken from the official example notebook available here.

Last Epoch (Endnote)

This article introduced Orbit, Uber’s open-source Python library for Bayesian time series forecasting. Building on the underlying probabilistic programming packages, Orbit introduces several model refinements, such as additional global trends, a transformation for the multiplicative form, and choices of noise distribution and priors. In the benchmark study, these refined models outperformed other common forecasting models like SARIMA and Facebook Prophet. According to the paper, the developers plan to add new models and features such as dual seasonality support and full Pyro integration.

To better understand the mathematics behind the refinements and to know more about the different models and methods provided by Orbit, see:

Aditya Singh
A machine learning enthusiast with a knack for finding patterns. In my free time, I like to delve into the world of non-fiction books and video essays.
