Giotto-Time is an open-source Python library for time-series forecasting in machine learning. It is built on top of scikit-learn, with a few modifications and wrappers, to support end-to-end time-series analysis in one place. Giotto-Time covers every task associated with time-series analysis. With this library, Giotto extends its collection of open-source tools for machine learning tasks.

Time-series forecasting is a centuries-old method of predicting the future from past data. It finds applications in a variety of domains, including e-commerce, finance, space science, weather forecasting and medical sciences. Unlike time-insensitive structured data, time-series data need extra care at every stage of problem solving. Preprocessing them is one of the harder tasks and requires domain expertise.

Giotto-Time was introduced to simplify time-series modeling. The library offers data preprocessing, data cleaning, feature extraction, data analysis, forecast modeling and causality testing with very few lines of code. Data analysis in Giotto-Time relies on data visualization, for which the library introduces special plots. The visualization module is built on top of the Matplotlib library.

We explore the Giotto-Time library below with some hands-on examples. Giotto-Time is available as a PyPI package, so we can simply pip install it.

    !pip install giotto-time

## Time-Series Forecasting with Giotto-Time

Import the necessary libraries and modules.

    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.linear_model import LinearRegression
    from gtime.preprocessing import TimeSeriesPreparation
    from gtime.compose import FeatureCreation
    from gtime.feature_extraction import Shift, MovingAverage
    from gtime.feature_generation import PeriodicSeasonal, Constant, Calendar
    from gtime.model_selection import horizon_shift, FeatureSplitter
    from gtime.forecasting import GAR

Define a function to generate some synthetic time-series data using Pandas’ testing module.

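    def test_time_series():
        # pandas.util.testing is deprecated and absent from recent pandas
        # releases, so this helper relies on an older pandas version.
        from pandas.util import testing
        testing.N, testing.K = 500, 1  # 500 rows, 1 column
        df = testing.makeTimeDataFrame(freq="D")  # daily frequency
        return df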

Generate synthetic time-series data.

    time_series = test_time_series()
    print(f'Time series shape: {time_series.shape}')
    print(f'Time series index type: {time_series.index.__class__}')

The time-series data should be in `PeriodIndex` format to proceed further. The Giotto-Time library offers a time-series preprocessing module with which we can transform the data from `DatetimeIndex` to `PeriodIndex`.

    time_series_preparation = TimeSeriesPreparation()
    period_index_time_series = time_series_preparation.transform(time_series)
    print(f'Time series index type after the preprocessing: \n{period_index_time_series.index.__class__}')
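
For intuition, the index conversion can be reproduced with plain pandas. The snippet below is only a sketch of the idea, not Giotto-Time’s implementation; the column name `time_series` and the random values are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic daily series with a DatetimeIndex (values are arbitrary).
rng = np.random.default_rng(0)
dates = pd.date_range("2020-01-01", periods=5, freq="D")
raw = pd.DataFrame({"time_series": rng.normal(size=5)}, index=dates)

# to_period() maps each timestamp to the period (here: the day) containing it.
as_periods = raw.to_period("D")

print(type(raw.index).__name__)         # DatetimeIndex
print(type(as_periods.index).__name__)  # PeriodIndex
```

A `PeriodIndex` labels each observation with a span of time rather than an instant, which is a natural fit for values that describe a whole day, month or year.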

Let’s visualize the time-series data.

    period_index_time_series.plot(figsize=(10, 5))
    plt.show()

Extract features and generate new ones using the `FeatureCreation` API of Giotto-Time. Here, a moving average over a window of three time steps is computed and appended as a feature. In addition, temporal shifts of zero and one step generate two more features.

    # Feature generation pipeline
    dft = FeatureCreation(
        [('s0', Shift(0), ['time_series']),
         ('s1', Shift(1), ['time_series']),
         ('ma3', MovingAverage(window_size=3), ['time_series']),
        ])

Fit the time-series data into the feature generation pipeline.

    X = dft.fit_transform(period_index_time_series)
    X.head(6)
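
To see what shift and moving-average features look like, here is a rough pandas analogue on a tiny series. It is a sketch under the assumption that the moving average is a trailing mean over the last three values; the column names `shift_0`, `shift_1` and `ma_3` are made up for this example and are not the names Giotto-Time produces.

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0], name="time_series")

features = pd.DataFrame({
    "shift_0": s.shift(0),                # the current value
    "shift_1": s.shift(1),                # the previous value (undefined on the first row)
    "ma_3": s.rolling(window=3).mean(),   # mean of the current and two previous values
})
print(features)
```

The first rows contain missing values because a lag or a three-step window is not yet defined there, which is why such rows have to be handled before training.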

Generate the ground truth (output variable) using the `horizon_shift` function.

    y = horizon_shift(period_index_time_series, horizon=3)
    y.head()
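
The idea behind a multi-step target can be sketched in plain pandas: for a horizon of three, each row receives the values observed one, two and three steps later. The column names `y_1` to `y_3` and the toy values are assumptions for illustration, not a statement about Giotto-Time’s internals.

```python
import pandas as pd

s = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0], name="time_series")
horizon = 3

# Column y_k holds the value observed k steps after each row.
targets = pd.DataFrame({f"y_{k}": s.shift(-k) for k in range(1, horizon + 1)})
print(targets)
```

The last `horizon` rows have incomplete targets because their future has not been observed yet; these are exactly the rows one wants to forecast.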

Next, split the data into train and test sets using the `FeatureSplitter` class, and inspect a few rows from each part.

    feature_splitter = FeatureSplitter()
    X_train, y_train, X_test, y_test = feature_splitter.transform(X, y)
    X_train.tail()

    X_test

    y_train.tail()

    y_test
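
One plausible splitting rule for this setup, sketched below in pandas, is to drop rows whose features are undefined, train on rows whose targets are fully known, and keep the rows with unknown future values as the test set. This is an assumption about the general approach, written for illustration; it is not taken from the Giotto-Time source.

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
X = pd.DataFrame({"shift_0": s, "shift_1": s.shift(1)})
y = pd.DataFrame({f"y_{k}": s.shift(-k) for k in (1, 2, 3)})

has_features = X.notna().all(axis=1)  # lag features are defined
has_targets = y.notna().all(axis=1)   # all future values are observed

X_train, y_train = X[has_features & has_targets], y[has_features & has_targets]
X_test, y_test = X[has_features & ~has_targets], y[has_features & ~has_targets]

print(len(X_train), len(X_test))  # 4 3
```

With a horizon of three, the test set consists of the last three rows, whose targets are still partly or wholly missing.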

Create a simple linear regression model from scikit-learn. Build a Generalized Auto-Regressive (GAR) model on top of it to perform simple time-series forecasting, and train the model on the training set.

    lr = LinearRegression()
    model = GAR(lr)
    model = model.fit(X_train, y_train)

Once the model is trained, forecast future values from the test features.

    predictions = model.predict(X_test)
    predictions
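
A common way to turn a single-output regressor into a multi-horizon forecaster is to fit one independent copy of it per target column. The sketch below shows that pattern with scikit-learn only; it assumes this is the spirit of wrapping `LinearRegression` for several horizons and should not be read as Giotto-Time’s actual code. The toy series and all variable names are illustrative.

```python
import numpy as np
import pandas as pd
from sklearn.base import clone
from sklearn.linear_model import LinearRegression

# Toy series s_t = t^2, with two lag features and three horizon targets.
s = pd.Series(np.arange(20, dtype=float) ** 2)
X = pd.DataFrame({"shift_0": s, "shift_1": s.shift(1)})
y = pd.DataFrame({f"y_{k}": s.shift(-k) for k in (1, 2, 3)})

train = X.notna().all(axis=1) & y.notna().all(axis=1)
X_train, y_train = X[train], y[train]
X_last = X.iloc[[-1]]  # most recent row, whose future is unknown

# Fit an independent copy of the base estimator for every horizon column.
base = LinearRegression()
models = {col: clone(base).fit(X_train, y_train[col]) for col in y_train}
forecast = {col: float(m.predict(X_last)[0]) for col, m in models.items()}
print(forecast)
```

Because each future value of this toy series is an exact linear function of the two lag features, the three forecasts land on the next squares (400, 441 and 484) up to floating-point error. Real data will not be this cooperative.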

Find the Colab Notebook here with the above code implementation.

## Time-Series Plotting with Giotto-Time

Import the necessary libraries and modules.

    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    %matplotlib inline
    from gtime.preprocessing import TimeSeriesPreparation
    from gtime.plotting import seasonal_plot, seasonal_subplots, lag_plot, acf_plot

Load the Kansas Wheat Index data from Giotto-Time’s official Google Cloud Storage bucket.

    df_sp = pd.read_csv('https://storage.googleapis.com/l2f-open-models/giotto-time/examples/data/WheatTr.csv', sep='\t')
    df_column = df_sp.set_index('Effective date ')['S&P GSCI Kansas Wheat']

Transform the data into `PeriodIndex` format and fill the missing values.

    df_column.index = pd.to_datetime(df_column.index)
    time_series_preparation = TimeSeriesPreparation(output_name='Wheat price index')
    period_index_time_series = time_series_preparation.transform(df_column)
    # Resample to daily frequency and forward-fill days without a quote
    df = period_index_time_series.resample('D').ffill()

Compute the logarithmic returns of the price index, that is, the log of the ratio between consecutive daily values.

    returns = (np.log(df / df.shift(1))).dropna()
    returns.columns = ['Wheat price returns']
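
As a small worked example of the same formula, the snippet below computes log returns for four made-up prices and checks a useful property: log returns add up over time, so their sum equals the log of the overall price ratio. The prices are illustrative and unrelated to the wheat data.

```python
import numpy as np
import pandas as pd

prices = pd.Series([100.0, 110.0, 99.0, 99.0])

# r_t = log(p_t / p_{t-1}); the first row has no predecessor and is dropped.
log_returns = np.log(prices / prices.shift(1)).dropna()
print(log_returns)

# The sum of log returns equals the log of last price over first price.
total = log_returns.sum()
print(np.isclose(total, np.log(prices.iloc[-1] / prices.iloc[0])))  # True
```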

Plot the Wheat price and returns to visualize the data.

    ax = df.plot(figsize=(10, 5))
    ax = returns.plot(ax=ax, secondary_y=True)

Seasonal plots are powerful tools in the Giotto-Time library that give an overall picture of how time-series data vary across seasons (yearly, monthly, weekly, and so on). The following code generates a seasonal plot for the price index data.

    fig = plt.figure(figsize=(6, 6))
    m1 = fig.add_subplot(111, title='Seasonal plot (year/monthly)')
    seasonal_plot(df, 'year', freq='M', agg='last', ax=m1)
    plt.show()
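
The data behind a year/month seasonal plot can be pictured as a small table with one row per month and one column per year, where every column becomes one line in the plot. The pandas sketch below builds such a table from a synthetic daily series, taking the last value of each month; it illustrates the concept only and makes no claim about how `seasonal_plot` is implemented.

```python
import numpy as np
import pandas as pd

# Two years of synthetic daily data on a PeriodIndex (values are just 0, 1, 2, ...).
idx = pd.period_range("2018-01-01", "2019-12-31", freq="D")
prices = pd.Series(np.arange(len(idx), dtype=float), index=idx, name="price")

years = np.asarray(idx.year)
months = np.asarray(idx.month)

# Last observation of every (year, month), then months as rows and years as columns.
monthly_last = prices.groupby([years, months]).last()
table = monthly_last.unstack(level=0)
print(table.shape)  # (12, 2)
```

Calling `table.plot()` on such a table would draw one line per year over the twelve months.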

Plot the monthly returns as a seasonal plot in polar form.

    fig = plt.figure(figsize=(6, 6))
    m2 = fig.add_subplot(111, projection='polar')
    seasonal_plot(returns, 'year', freq='M', agg='last', ax=m2, polar=True)
    m2.set_title('Monthly returns')
    plt.show()

Seasonal plots can also be drawn as box-and-whisker plots. A box plot gives a basic statistical summary of each season: the median, the quartiles and the extreme values.

    seasonal_subplots(returns, 'year', 'M', agg='last', box=True)
    plt.show()

Lag plots have a prominent place in time-series analysis. They compare the data with their own temporal lags. Giotto-Time’s lag plots are simple to produce. Let’s visualize the price index data in a lag plot with three different lags: one day, one month and one year.

    lag_plot(df, lags=[1, 30, 365])
    plt.show()

Let’s visualize the lag plot for the returns data.

    lag_plot(returns, lags=[1, 30, 365])
    plt.show()

The price index remains clearly autocorrelated even at a lag of one month. The returns, in contrast, look random irrespective of the lag.
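
The same contrast can be checked numerically on synthetic data. In the sketch below, a random walk stands in for a price-like series and its increments stand in for returns; both are assumptions for illustration, not the wheat data. `Series.autocorr` computes the correlation between a series and a lagged copy of itself.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
steps = pd.Series(rng.normal(size=2000))  # uncorrelated "returns"
walk = steps.cumsum()                     # "price"-like random walk

for lag in (1, 30):
    print(lag, round(walk.autocorr(lag), 2), round(steps.autocorr(lag), 2))
```

A level series built by accumulating shocks tends to stay close to its recent past, so its lag plot hugs the diagonal, while the shocks themselves form a shapeless cloud.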

Find the Colab Notebook here with the above code implementation.

## Wrapping up

We discussed Giotto-Time, the open-source Python library for time-series forecasting, and went through hands-on Python code for two tasks:

- Time-series forecasting
- Time-series data plotting

Giotto-Time’s full potential can be explored with real-world time-series problems consisting of data cleaning, data analysis, feature generation, forecasting and causality testing.

### Further reading:

- Github repository
- Official documentation
- Simple models with Giotto-Time
- Hierarchical models with Giotto-Time
- Slack Community Workplace

