
A **time series** is a set of data points recorded in sequential time order. Those data points could be an athlete’s performance over a season, a cricketer’s runs across one-day matches, monthly weather readings, or the daily closing price of a company’s stock. **Time series analysis** is a related term: it is concerned with taking those data points and cleaning, understanding, and forecasting them using tools or programming languages. Time series data is closely related to panel data: panel data is the general, multidimensional class of dataset, while a time series dataset is a one-dimensional panel.

Let’s talk about **time series forecasting**. As we already know, time series analysis is all about analyzing time series data and extracting meaningful insights from it. **Time series forecasting** goes a step further: it uses **models** to predict future values based on previously observed, cleaned, and processed time series data.

## Components of Time Series

A time series has four kinds of components: **Trend**, **Seasonal** and **Cyclic Variations**, and **Random or Irregular Movements**. Seasonal changes are the shorter-term ones.

**Trends** show whether the data moves toward higher or lower values over the long run.

**Periodic fluctuations** are components that repeat in the visualization over time. They are of two types:

- **Seasonal Variations:** These periodic fluctuations recur over a regular period, and the change happens within less than a year.
- **Cyclic Variations:** These periodic fluctuations play out over a cycle longer than one year.

**Random Movements** or **Noise:** Here the data points are unpredictable, and it is hard to forecast this kind of data because we can’t easily find patterns in it.

Real-world data, before cleaning, usually contains some noise, trend, and seasonality.
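To make the components concrete, here is a minimal sketch that builds a synthetic series from a trend, a seasonal cycle, and noise. The series, its slope, amplitude, and noise level are all made-up values for illustration and are not taken from the weather dataset used later.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the sketch is repeatable

# Hypothetical hourly series covering 30 days.
hours = np.arange(30 * 24)

trend = 0.01 * hours                                      # slow upward drift
seasonal = 2.0 * np.sin(2 * np.pi * hours / 24)           # repeats every 24 hours
noise = rng.normal(loc=0.0, scale=0.5, size=hours.size)   # irregular movement

series = trend + seasonal + noise

# Averaging over one full seasonal period cancels the daily cycle,
# so the daily means mostly reflect the trend.
daily_mean = series.reshape(30, 24).mean(axis=1)
print(daily_mean[:3], daily_mean[-3:])
```

The daily means rise steadily even though individual readings swing up and down, which is the trend showing through the seasonality and noise.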

## TensorFlow models for forecasting

Time series forecasting, or predictive modeling, can be done with any framework. TensorFlow provides a few different styles of models for it, such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs). You can forecast a single time step using a single feature, or you can forecast multiple steps and make all predictions at once using a single-shot model.
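The difference between single-step and single-shot forecasting shows up in the shape of the targets. The following is a rough NumPy sketch, not TensorFlow code; the helper `make_pairs`, the toy series, and the widths are hypothetical names and values chosen for illustration.

```python
import numpy as np

# Hypothetical univariate series: 10 hourly readings.
series = np.arange(10, dtype=np.float32)

input_width = 3   # time steps the model sees
horizon = 4       # steps predicted at once in the single-shot case


def make_pairs(values, input_width, label_width):
    """Slide a window over `values` and return (inputs, labels) arrays."""
    total = input_width + label_width
    windows = np.stack([values[i:i + total]
                        for i in range(len(values) - total + 1)])
    return windows[:, :input_width], windows[:, input_width:]


# Single step: each input window is paired with the next value only.
x_single, y_single = make_pairs(series, input_width, 1)
# Single shot: each input window is paired with the next `horizon` values.
x_multi, y_multi = make_pairs(series, input_width, horizon)

print(x_single.shape, y_single.shape)
print(x_multi.shape, y_multi.shape)
```

In the single-step case every label holds one value, while in the single-shot case every label holds the whole forecast horizon.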

### Setup

These are the modules you need to import to get started. They help with modeling, visualization, file handling, and data exploration.

```
import os
import datetime
import IPython
import IPython.display
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
import tensorflow as tf
mpl.rcParams['figure.figsize'] = (8, 6)
mpl.rcParams['axes.grid'] = False
```

Let’s take the weather dataset from the Max Planck Institute for Biogeochemistry. This dataset contains 14 different features, such as air temperature, humidity, and atmospheric pressure. The data points have been collected every 10 minutes since 2003; the file used here covers 2009 to 2016. Let’s explore the dataset:

```
# Download the zip file of the dataset
file_path = tf.keras.utils.get_file(
    origin='https://storage.googleapis.com/tensorflow/tf-keras-datasets/jena_climate_2009_2016.csv.zip',
    fname='jena_climate_2009_2016.csv.zip',
    extract=True)
csv_path, _ = os.path.splitext(file_path)

# Explore the dataset
df = pd.read_csv(csv_path)
# Keep one reading per hour: start at index 5 and take every 6th record
df = df[5::6]
date_time = pd.to_datetime(df.pop('Date Time'), format='%d.%m.%Y %H:%M:%S')
df.head()
```

```
#Data visualization over the years with some features
plot_cols = ['T (degC)', 'p (mbar)', 'rho (g/m**3)']
plot_features = df[plot_cols]
plot_features.index = date_time
_ = plot_features.plot(subplots=True)
plot_features = df[plot_cols][:480]
plot_features.index = date_time[:480]
_ = plot_features.plot(subplots=True)
```

Let’s clean data for better modelling and visualization:

```
# Replace the -9999.0 wind velocity readings (flagged as bad) with zero
wv = df['wv (m/s)']
bad_wv = wv == -9999.0
wv[bad_wv] = 0.0

max_wv = df['max. wv (m/s)']
bad_max_wv = max_wv == -9999.0
max_wv[bad_max_wv] = 0.0

df['wv (m/s)'].min()

# Convert the wind direction and velocity columns into a wind vector
wv = df.pop('wv (m/s)')
max_wv = df.pop('max. wv (m/s)')

# Conversion to radians
wd_rad = df.pop('wd (deg)')*np.pi / 180

# Wind x and y components
df['Wx'] = wv*np.cos(wd_rad)
df['Wy'] = wv*np.sin(wd_rad)

# Max wind x and y components
df['max Wx'] = max_wv*np.cos(wd_rad)
df['max Wy'] = max_wv*np.sin(wd_rad)

# Plot the distribution of the wind vectors
plt.hist2d(df['Wx'], df['Wy'], bins=(50, 50), vmax=400)
plt.colorbar()
plt.xlabel('Wind X [m/s]')
plt.ylabel('Wind Y [m/s]')
ax = plt.gca()
ax.axis('tight')
```

Let’s convert the date-time to seconds and then turn it into sine and cosine signals:

```
timestamp_s = date_time.map(datetime.datetime.timestamp)

day = 24*60*60
year = (365.2425)*day

df['Day sin'] = np.sin(timestamp_s * (2 * np.pi / day))
df['Day cos'] = np.cos(timestamp_s * (2 * np.pi / day))
df['Year sin'] = np.sin(timestamp_s * (2 * np.pi / year))
df['Year cos'] = np.cos(timestamp_s * (2 * np.pi / year))
```
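One way to see why the sine and cosine signals help: as raw numbers, 23:00 and 00:00 look far apart, but on the circle they are neighbours. The sketch below illustrates this with a hypothetical helper, `encode_time_of_day`, that is not part of the original code.

```python
import numpy as np

day = 24 * 60 * 60  # seconds per day, as in the code above


def encode_time_of_day(seconds):
    """Map a time in seconds onto the unit circle as (sin, cos)."""
    angle = 2 * np.pi * (seconds % day) / day
    return np.array([np.sin(angle), np.cos(angle)])


late_evening = encode_time_of_day(23 * 3600)  # 23:00
midnight = encode_time_of_day(0)              # 00:00
noon = encode_time_of_day(12 * 3600)          # 12:00

# On the circle, 23:00 sits next to midnight, while noon is on the opposite side.
d_near = np.linalg.norm(late_evening - midnight)
d_far = np.linalg.norm(noon - midnight)
print(d_near, d_far)
```

The encoded distance between 23:00 and midnight is small, and the distance between noon and midnight is the largest possible on the unit circle, which matches the daily periodicity the model should learn.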

Plot the sine and cosine of the time-of-day signal:

```
plt.plot(np.array(df['Day sin'])[:25])
plt.plot(np.array(df['Day cos'])[:25])
plt.xlabel('Time [h]')
plt.title('Time of day signal')
```

Split the data into training (70%), validation (20%), and test (10%) sets:

```
column_indices = {name: i for i, name in enumerate(df.columns)}

n = len(df)
train_df = df[0:int(n*0.7)]
val_df = df[int(n*0.7):int(n*0.9)]
test_df = df[int(n*0.9):]

num_features = df.shape[1]
```

Data normalization is a crucial step before training a neural network. To normalize, we subtract the mean and divide by the standard deviation of each feature, both computed on the training data only.

```
train_mean = train_df.mean()
train_std = train_df.std()

train_df = (train_df - train_mean) / train_std
val_df = (val_df - train_mean) / train_std
test_df = (test_df - train_mean) / train_std
```
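The same idea can be checked on a toy array. This is a minimal sketch with made-up values; note that NumPy’s `std` uses the population formula (`ddof=0`) by default, whereas the pandas `std` used above defaults to `ddof=1`, so the numbers differ slightly from what pandas would give.

```python
import numpy as np

# Hypothetical single-feature series, split chronologically 70/20/10.
values = np.arange(100, dtype=np.float64)
n = len(values)
train = values[:int(n * 0.7)]
val = values[int(n * 0.7):int(n * 0.9)]
test = values[int(n * 0.9):]

# The statistics come from the training split only, so no information
# from the validation or test data leaks into the preprocessing.
mean, std = train.mean(), train.std()

train_norm = (train - mean) / std
val_norm = (val - mean) / std
test_norm = (test - mean) / std

print(train_norm.mean(), train_norm.std())
print(val_norm.mean(), test_norm.mean())
```

Only the training split ends up with zero mean and unit standard deviation. The validation and test splits are shifted by the same amounts, so they keep whatever offset they really have relative to the training data.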

#### Let’s plot a violin plot of all the features to check whether the data is biased

```
df_std = (df - train_mean) / train_std
df_std = df_std.melt(var_name='Column', value_name='Normalized')
plt.figure(figsize=(12, 6))
ax = sns.violinplot(x='Column', y='Normalized', data=df_std)
_ = ax.set_xticklabels(df.keys(), rotation=90)
```

### Data Windowing

In TensorFlow, we have to window the input dataframe so that it can be fed to several models and we can see which one forecasts better. The rest of this section defines a **WindowGenerator** class, which contains all the logic for the input and label indices.

It also handles the indexes and offsets, splits windows of features into (features, labels) pairs, and plots the content of the resulting windows. The class also generates batches of these windows from the training, evaluation, and test data, using *tf.data.Dataset*.

```
class WindowGenerator():
    def __init__(self, input_width, label_width, shift,
                 train_df=train_df, val_df=val_df, test_df=test_df,
                 label_columns=None):
        # Store the raw data.
        self.train_df = train_df
        self.val_df = val_df
        self.test_df = test_df

        # Work out the label column indices.
        self.label_columns = label_columns
        if label_columns is not None:
            self.label_columns_indices = {name: i for i, name in
                                          enumerate(label_columns)}
        self.column_indices = {name: i for i, name in
                               enumerate(train_df.columns)}

        # Work out the window parameters.
        self.input_width = input_width
        self.label_width = label_width
        self.shift = shift

        self.total_window_size = input_width + shift

        self.input_slice = slice(0, input_width)
        self.input_indices = np.arange(self.total_window_size)[self.input_slice]

        self.label_start = self.total_window_size - self.label_width
        self.labels_slice = slice(self.label_start, None)
        self.label_indices = np.arange(self.total_window_size)[self.labels_slice]

    def __repr__(self):
        return '\n'.join([
            f'Total window size: {self.total_window_size}',
            f'Input indices: {self.input_indices}',
            f'Label indices: {self.label_indices}',
            f'Label column name(s): {self.label_columns}'])
```

With the code above you can create a window of your choice. Let’s create a demo window:

```
w1 = WindowGenerator(input_width=6, label_width=1, shift=1,
                     label_columns=['T (degC)'])
w1
```
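The index arithmetic inside the constructor can be followed in isolation. The standalone function below, `window_indices`, is a hypothetical helper written for this illustration; it repeats the same calculations as `WindowGenerator.__init__` without needing the dataframes.

```python
import numpy as np


def window_indices(input_width, label_width, shift):
    """Repeat the index arithmetic from WindowGenerator.__init__."""
    total_window_size = input_width + shift
    input_indices = np.arange(total_window_size)[:input_width]
    label_start = total_window_size - label_width
    label_indices = np.arange(total_window_size)[label_start:]
    return total_window_size, input_indices, label_indices


total, inputs, labels = window_indices(input_width=6, label_width=1, shift=1)
print(total)    # 7
print(inputs)   # [0 1 2 3 4 5]
print(labels)   # [6]
```

For the demo window, six hours of input are followed by a single label one hour later, giving a total window of seven time steps.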

Create a TensorFlow dataset using the **tf.data.Dataset** utilities. The make_dataset function below takes a **time series dataframe** and converts it into batches of windows.

```
def make_dataset(self, data):
    data = np.array(data, dtype=np.float32)
    ds = tf.keras.preprocessing.timeseries_dataset_from_array(
        data=data,
        targets=None,
        sequence_length=self.total_window_size,
        sequence_stride=1,
        shuffle=True,
        batch_size=32,)

    ds = ds.map(self.split_window)

    return ds

WindowGenerator.make_dataset = make_dataset
```
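The `make_dataset` function maps every batch through `self.split_window`, which is not shown in this article. As a rough idea of what such a split does, here is a NumPy sketch under stated assumptions: the function signature, the batch of zeros, the 19 feature columns, and the label column index 1 are all hypothetical, and the real method operates on TensorFlow tensors rather than NumPy arrays.

```python
import numpy as np


def split_window(batch, input_width, label_width, shift,
                 label_column_indices=None):
    """Split windows of shape (batch, time, features) into (inputs, labels)."""
    total_window_size = input_width + shift
    if batch.shape[1] != total_window_size:
        raise ValueError('window length does not match input_width + shift')
    inputs = batch[:, :input_width, :]
    labels = batch[:, total_window_size - label_width:, :]
    if label_column_indices is not None:
        # Keep only the feature columns that should be predicted.
        labels = labels[:, :, label_column_indices]
    return inputs, labels


# Hypothetical batch: 32 windows, 7 time steps each, 19 features.
batch = np.zeros((32, 7, 19), dtype=np.float32)
inputs, labels = split_window(batch, input_width=6, label_width=1, shift=1,
                              label_column_indices=[1])
print(inputs.shape, labels.shape)  # (32, 6, 19) (32, 1, 1)
```

The inputs keep every feature, while the labels keep only the selected column for the time steps that are to be predicted.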

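Now **WindowGenerator** holds the training, validation, and test data. The train, val, test, and example properties and the split_window and plot methods used in the following sections come from the official TensorFlow tutorial and are not reproduced here. Let’s proceed to training.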


### Using TensorFlow Single Step model

*Diagram: the inputs at t=0 are fed to the model, and its prediction is compared with the label.*

A single-step model is the simplest case to forecast: it returns a single predicted value, here one hour into the future.

As we have already set up the **WindowGenerator** class, let’s configure it for a single-step model, i.e. to produce **(input, label) pairs**.

```
# Single-step window: one hour of input and one label one hour ahead.
# The widths are inferred from the description above.
single_step_window = WindowGenerator(
    input_width=1, label_width=1, shift=1,
    label_columns=['T (degC)'])
single_step_window
```

Create a baseline class to compare your model’s outputs against:

```
class Baseline(tf.keras.Model):
    def __init__(self, label_index=None):
        super().__init__()
        self.label_index = label_index

    def call(self, inputs):
        if self.label_index is None:
            return inputs
        result = inputs[:, :, self.label_index]
        return result[:, :, tf.newaxis]
```
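The baseline simply returns the current value of the label feature as its prediction for the next step. A minimal NumPy sketch of that idea follows; the temperature values are made up, and the error is computed by hand rather than through Keras.

```python
import numpy as np

# Hypothetical normalized hourly temperatures.
temps = np.array([0.1, 0.3, 0.2, 0.5, 0.4, 0.6])

inputs = temps[:-1]   # value at time t
labels = temps[1:]    # value at time t + 1

# "No change" baseline: the forecast for t + 1 is simply the value at t.
predictions = inputs

mae = np.mean(np.abs(predictions - labels))
print(mae)
```

Any trained model should achieve a lower error than this on the same data to be worth using.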

Evaluate the model:

```
baseline = Baseline(label_index=column_indices['T (degC)'])

baseline.compile(loss=tf.losses.MeanSquaredError(),
                 metrics=[tf.metrics.MeanAbsoluteError()])

val_performance = {}
performance = {}
val_performance['Baseline'] = baseline.evaluate(single_step_window.val)
performance['Baseline'] = baseline.evaluate(single_step_window.test, verbose=0)
```

Let’s create a wider WindowGenerator that generates windows of 24 hours at a time:

```
wide_window = WindowGenerator(
    input_width=24, label_width=24, shift=1,
    label_columns=['T (degC)'])
wide_window

print('Input shape:', wide_window.example[0].shape)
print('Output shape:', baseline(wide_window.example[0]).shape)
```

#### Plot baseline model forecasting

```
wide_window.plot(baseline)
```

- The **blue “Inputs”** *line* shows the input temperature at each time step.
- The **green “Labels”** *dots* show the target values that the predictions are compared against.
- The **orange “Predictions”** *crosses* are the outputs predicted by our model.

## Conclusion

We discussed time series, time series analysis, the components of a time series, and a code example of time series forecasting on a weather dataset with our **single-step model**, whose results were pretty close to the actual values. There are many other models you can use for time series forecasting, such as a **linear model** (a *layers.Dense* layer with no activation is called a linear model), dense, multi-step dense, convolutional neural network, and recurrent neural network models.

We couldn’t cover the whole tutorial here, as a fully demonstrated explanation does not fit in one article. Now that you have a basic understanding of what time series forecasting is all about, please refer to the official website of TensorFlow here. An extended version of the code is available here.


Mohit is a Data & Technology Enthusiast with good exposure to solving real-world problems in various avenues of IT and Deep learning domain. He believes in solving human's daily problems with the help of technology.