Comprehensive Guide To Facebook’s Prophet With Python Code



Prophet, a Facebook Research project, has earned its place among the tools used by ML and Data Science enthusiasts for time-series forecasting. Open-sourced on February 23, 2017 (blog), it uses an additive model to forecast time-series data. This article gives an overview of this widely used tool along with a demonstration in Python.

Highlighting features of Prophet

  • It performs time-series forecasting “at scale”, which means memory usage and computational complexity are not major concerns for Prophet while making a forecast.
  • It can fit time-series data having non-linearity in trends as well as holiday effects.
  • It works quite well with data having daily, weekly, monthly and/or yearly seasonality and in cases where we have several seasons of recorded historical data for making future forecasts.
  • It has R and Python APIs for time-series forecasting.
  • It can be downloaded as a CRAN or PyPI package.
  • It is robust to missing data, outliers and abrupt changes in time-series data.
  • It makes use of the Stan platform for making forecasts quickly and with easily interpretable parameters.

NOTE: ‘Trend’ in a time series refers to the overall change in the data over time, while ‘seasonality’ refers to the way the data changes over a specific recurring period, e.g. a week, month or year.


Working of Prophet

Image source: Facebook blog

Prophet employs an additive regression model having four constituents at its core:

  • A piecewise linear or logistic growth curve for the trend of the variable being forecast. Prophet detects changes in trend automatically by picking changepoints from the time-series data.
  • A yearly seasonal component (uses Fourier series)
  • A weekly seasonal component
  • A customizable list representing holiday effects in the data
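To make the additive structure concrete, here is a minimal NumPy sketch that builds each of the four constituents by hand and sums them. It is not Prophet's internal code: the helper names (`fourier_features`, `piecewise_linear_trend`), the coefficients, the changepoint position and the holiday days are all made up for illustration, and the weekly term is a simple stand-in.

```python
import numpy as np

def fourier_features(t_days, period, order):
    """Return sin/cos columns of a truncated Fourier series with the given period."""
    t = np.asarray(t_days, dtype=float)
    cols = []
    for n in range(1, order + 1):
        angle = 2.0 * np.pi * n * t / period
        cols.append(np.sin(angle))
        cols.append(np.cos(angle))
    return np.column_stack(cols)

def piecewise_linear_trend(t_days, offset, base_rate, changepoints, rate_deltas):
    """Continuous trend whose slope changes by rate_deltas[i] at changepoints[i]."""
    t = np.asarray(t_days, dtype=float)
    trend = offset + base_rate * t
    for cp, delta in zip(changepoints, rate_deltas):
        trend = trend + delta * np.clip(t - cp, 0.0, None)
    return trend

t = np.arange(730)  # two years of daily time steps

# 1. Trend with one changepoint after the first year (illustrative values)
trend = piecewise_linear_trend(t, offset=1.0, base_rate=0.01,
                               changepoints=[365], rate_deltas=[-0.02])

# 2. Yearly seasonality as a Fourier series of order 3 (illustrative coefficients)
yearly_coefs = np.array([0.3, 0.1, 0.05, 0.0, 0.02, 0.0])
yearly = fourier_features(t, period=365.25, order=3) @ yearly_coefs

# 3. Weekly effect: a stand-in that lifts two days out of every seven
weekly = np.where(t % 7 >= 5, 0.1, 0.0)

# 4. Holiday effect: a fixed bump on two hypothetical holiday days
holidays = np.zeros_like(trend)
holidays[[358, 723]] = 0.5

# The additive model simply sums the four constituents
y_hat = trend + yearly + weekly + holidays
print(y_hat[:5])
```

Because the model is additive, each term can be inspected on its own, which is what the component plots later in this article rely on.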

Practical implementation

Here’s a demonstration of using the Python API to forecast avocado prices with Prophet. The dataset used is available on Kaggle. The code was run on Google Colab with the fbprophet 0.7.1 library. The step-wise implementation is as follows:

  1. Install the fbprophet Python library.

!pip install fbprophet

  1. Import required libraries
 import numpy as np
 import pandas as pd
 import matplotlib.pyplot as plt
 import seaborn as sns
 from fbprophet import Prophet 
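From version 1.0 onward the library is published on PyPI as prophet rather than fbprophet, so the import above may fail on a newer setup. A minimal sketch of an import that tolerates either name is given below; the helper `load_prophet` is a hypothetical name introduced here, not part of the library.

```python
import importlib

def load_prophet():
    """Return the Prophet class from whichever package is installed, else None."""
    for module_name in ("prophet", "fbprophet"):
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # try the next package name
        return getattr(module, "Prophet", None)
    return None

Prophet = load_prophet()
print("Prophet available:", Prophet is not None)
```

If this prints False, neither package is installed in the current environment.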
  1. Load the avocado dataset.

df = pd.read_csv('avocado.csv')
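Read this way, the ‘Date’ column is loaded as plain text. The ISO-style dates still sort correctly as strings, but parsing them up front gives real timestamps for plotting and date arithmetic. The sketch below shows the idea on a tiny in-memory stand-in for avocado.csv; the three column names follow the ones used in this article and the values are made up.

```python
import io
import pandas as pd

# Tiny stand-in for avocado.csv; the values are made up for illustration
sample_csv = io.StringIO(
    "Date,AveragePrice,region\n"
    "2015-01-11,1.24,West\n"
    "2015-01-04,1.10,West\n"
)

# parse_dates converts the listed columns to datetime64 while reading
sample = pd.read_csv(sample_csv, parse_dates=["Date"])
sample = sample.sort_values("Date").reset_index(drop=True)
print(sample.dtypes)
```

With the real file, the equivalent call would be pd.read_csv('avocado.csv', parse_dates=['Date']); the outputs shown in the following steps would then display timestamps instead of strings.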

  1. Display the initial records of the dataset.

df.head()

Output:

Prophet data
  1. Get information about columns, number of entries, data types etc. of the dataset.

df.info()

Output:

Prophet data info
  1. Sort the DataFrame in ascending order of recorded date and create a new DataFrame having sorted records.

df1 = df.sort_values("Date")

Display some initial records of the sorted data.

df1.head()

Output:

Prophet sorted data
  1. Plot the recorded prices and observe the trend.

First, get the minimum and maximum dates in the historical data.

df1['Date'].min()

Output: 2015-01-04

df1['Date'].max()

Output: 2018-03-25

These outputs show that we have records from January 2015 to March 2018. Plot the prices for that period.

 plt.figure(figsize=(25,10))
 plt.plot(df1['Date'],df1['AveragePrice']) 
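Since the data holds several records for each date (one per region, as the region-wise plot in the next step shows), the line above joins many prices per date and can look jagged. One option, sketched here on made-up numbers, is to average the prices per date before plotting; `mean_by_date` is a name introduced for this illustration.

```python
import pandas as pd

# Made-up records: two prices for each of two dates
prices = pd.DataFrame({
    "Date": ["2015-01-04", "2015-01-04", "2015-01-11", "2015-01-11"],
    "AveragePrice": [1.00, 1.50, 1.20, 1.40],
})

# Collapse to one mean price per date
mean_by_date = prices.groupby("Date", as_index=False)["AveragePrice"].mean()
print(mean_by_date)
```

Applied to df1, the resulting frame can be passed to plt.plot in place of the raw columns.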

Output:

Prophet plot1
  1. We can also observe region-wise distribution of the data.
 plt.figure(figsize=(25,12))
 sns.countplot(x='region',data=df1)
 plt.xticks(rotation=45) 

Output:

 (array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
         17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
         34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
         51, 52, 53]), <a list of 54 Text major ticklabel objects>) 
Prophet plot2

The plot shows that the data is balanced i.e. equally distributed region-wise.
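A bar chart can hide small differences in counts, so it may be worth confirming the balance numerically as well. The following sketch does so on a made-up frame; on the real data the same two lines would be applied to df1.

```python
import pandas as pd

# Made-up records: each region appears the same number of times
records = pd.DataFrame(
    {"region": ["West", "West", "Albany", "Albany", "Boston", "Boston"]}
)

counts = records["region"].value_counts()  # rows per region
is_balanced = counts.nunique() == 1        # True only if all counts are equal
print(counts)
print("Balanced:", is_balanced)
```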

  1. Know the year-wise count of records in the data.

sns.countplot(x='year',data=df1)

Output:

Prophet plot3
  1. Prophet expects a DataFrame as input with two columns specifically named ‘ds’ and ‘y’. ‘ds’ is the datestamp column, while ‘y’ is the numeric variable for which the forecast is to be made. So we need to keep only the ‘Date’ and ‘AveragePrice’ columns of the df1 DataFrame and rename them as ‘ds’ and ‘y’ respectively.

Extract the two required columns

 df1 = df1[['Date','AveragePrice']]
 df1 

Output:

Prophet selected columns

Rename the columns

 df1.columns = ['ds','y']
 #Display initial records to check that the columns have been renamed
 df1.head() 
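The select-and-rename step above can also be wrapped in a small reusable function. This is only a sketch: `to_prophet_frame` is a hypothetical helper name, and the sample values are made up.

```python
import pandas as pd

def to_prophet_frame(df, date_col="Date", value_col="AveragePrice"):
    """Return a new two-column frame named 'ds' and 'y', sorted by date."""
    out = df[[date_col, value_col]].rename(columns={date_col: "ds", value_col: "y"})
    out["ds"] = pd.to_datetime(out["ds"])  # make sure 'ds' holds real timestamps
    return out.sort_values("ds").reset_index(drop=True)

# Made-up sample with an extra column that should be dropped
raw = pd.DataFrame({
    "Date": ["2015-01-11", "2015-01-04"],
    "AveragePrice": [1.24, 1.10],
    "region": ["West", "West"],
})
prepared = to_prophet_frame(raw)
print(prepared)
```

The original frame is left untouched, so the same helper can be reused later for the region-specific forecast.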

Output:

Prophet renamed columns
  1. Forecast the future prices using Prophet.

Create a Prophet instance

m = Prophet()

Fit the historical data

m.fit(df1)

Create a DataFrame with future dates for forecast. 

 future = m.make_future_dataframe(periods=365)
 #periods=365 specifies that forecast will be made for next 1 year 
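To see which dates such a call appends, here is a rough pandas stand-in that generates the dates following the last recorded one. It only mimics the future part of the frame; `future_dates` is a hypothetical helper, not a Prophet function.

```python
import pandas as pd

def future_dates(last_date, periods, freq="D"):
    """Dates strictly after last_date, spaced by freq."""
    rng = pd.date_range(start=pd.Timestamp(last_date), periods=periods + 1, freq=freq)
    return rng[1:]  # drop last_date itself

# 365 daily steps after the last recorded date
daily = future_dates("2018-03-25", 365)
print(daily[0].date(), "to", daily[-1].date())

# 52 weekly steps instead, matching data recorded once a week
weekly = future_dates("2018-03-25", 52, freq="7D")
print(weekly[0].date(), "to", weekly[-1].date())
```

make_future_dataframe also accepts a freq argument, so for data recorded weekly a call such as m.make_future_dataframe(periods=52, freq='W') may be a closer match than 365 daily steps.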

df1 has dates till 25/3/2018, so ‘future’ will extend till 25/3/2019. Predict the prices for this new DataFrame, which contains the historical dates as well as the future ones.

forecast = m.predict(future)

Get information on the ‘forecast’ DataFrame created by Prophet.

forecast.info()

Prophet forecast dataframe info

Display a few initial records of ‘forecast’.

forecast.head()

Condensed output:

Prophet forecast dataframe
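Of the many columns in ‘forecast’, the ones usually read first are ‘ds’, ‘yhat’ (the predicted value) and ‘yhat_lower’ / ‘yhat_upper’ (the bounds of the uncertainty interval, 80% wide by default). The sketch below selects them from a stand-in frame that only borrows the column names; the numbers are made up and do not come from the fitted model.

```python
import pandas as pd

# Stand-in with the same column names as Prophet's output; made-up numbers
forecast_like = pd.DataFrame({
    "ds": pd.to_datetime(["2019-03-24", "2019-03-25"]),
    "trend": [1.30, 1.30],
    "yhat_lower": [0.80, 0.82],
    "yhat_upper": [1.90, 1.88],
    "yhat": [1.35, 1.36],
})

# Keep the date, the prediction and its bounds for the last forecast date
summary = forecast_like[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail(1)
interval_width = (summary["yhat_upper"] - summary["yhat_lower"]).iloc[0]
print(summary)
print("Interval width:", round(interval_width, 2))
```

On the real output, the same selection would read forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].tail().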

  1. Plot the data with recorded as well as forecasted prices.

 figure = m.plot(forecast,xlabel='Date',ylabel='Price')

  Output:

Prophet forecast plot

Our original data had records till March 2018. In the above plot, the forecast extends one year beyond that, i.e. till March 2019, and the light-blue shaded band marks the uncertainty interval around the predicted prices.

Actual recorded prices are marked with black dots in the above plot, while the blue non-linear line shows the predicted prices.

  1. Plot the components of the forecast.

figure = m.plot_components(forecast)

Output:

Prophet forecast components
  1. The above forecast is made for all regions in general. We can make a forecast for a specific region as follows:

Extract data of the required region from the original data.

df2 = df[df['region']=='West']

Display initial records.

df2.head()

Output:

  1. Sort the regional data in ascending order of dates.

df2 =  df2.sort_values('Date')

Plot the recorded prices for that specific region.

 plt.figure(figsize=(15,10))
 plt.plot(df2['Date'],df2['AveragePrice']) 

Output:

  1. Extract the ‘Date’ and ‘AveragePrice’ columns and rename them as ‘ds’ and ‘y’ respectively.
 df2 = df2[['Date','AveragePrice']]
 df2.columns = ['ds','y'] 
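The filter, select and rename steps for a region can be combined into one function so that any region is prepared the same way. The sketch below assumes the data also has a ‘type’ column (e.g. conventional vs. organic), which is not used elsewhere in this article; where several rows share a date, it averages them so that each date appears once. `region_series` is a hypothetical helper and the sample rows are made up.

```python
import pandas as pd

def region_series(df, region, avocado_type=None):
    """Return a 'ds'/'y' frame for one region, with one mean price per date."""
    subset = df[df["region"] == region]
    if avocado_type is not None:
        # Assumption: a 'type' column exists in the data
        subset = subset[subset["type"] == avocado_type]
    out = subset[["Date", "AveragePrice"]].rename(
        columns={"Date": "ds", "AveragePrice": "y"}
    )
    out["ds"] = pd.to_datetime(out["ds"])
    # groupby sorts by date and averages rows that share the same date
    return out.groupby("ds", as_index=False)["y"].mean()

# Made-up rows for illustration
rows = pd.DataFrame({
    "Date": ["2015-01-11", "2015-01-04", "2015-01-04", "2015-01-04"],
    "AveragePrice": [1.2, 1.0, 1.6, 0.9],
    "region": ["West", "West", "West", "Albany"],
    "type": ["conventional", "conventional", "organic", "conventional"],
})
west_all = region_series(rows, "West")
west_organic = region_series(rows, "West", "organic")
print(west_all)
print(west_organic)
```

Whether to average across types or forecast each type separately is a modelling choice; the averaging here is just one simple option.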
  1. Create Prophet instance and fit the data
 m = Prophet()
 m.fit(df2) 

Forecast prices for the next one year for that specific region.

 future = m.make_future_dataframe(periods=365)
 forecast = m.predict(future) 
  1. Plot the recorded and forecasted prices for the region.
figure = m.plot(forecast,xlabel='Date',ylabel='Price')

Output:

(Black dots: actual price values, Blue curve: predicted prices)

figure = m.plot_components(forecast)

Output:

  • Check Google colab notebook for the whole code here.
