A guide to automated time-series modelling with FEDOT

FEDOT is an open-source framework for automated machine learning that lets us customise the pipeline of modelling procedures.

In recent years, a range of automated machine learning (AutoML) frameworks and libraries has emerged, and their results have impressed many data science practitioners. FEDOT is one such framework. In this article, we discuss what FEDOT offers for automated machine learning and walk through an example of time series modelling with it. The major points covered are listed below.

Table of contents 

  1. What is FEDOT?
  2. Time series modelling using FEDOT 
    1. Importing data
    2. Importing modules from FEDOT
    3. Data processing using FEDOT
    4. Defining task and model
    5. Initiating the modelling process
    6. Making predictions
    7. Defining function for visualization
    8. Visualizing the results
    9. Visualizing pipeline

What is FEDOT?

FEDOT is an open-source framework for automated machine learning. It lets us build and customise pipelines of machine learning operations, and it can be applied to real-world problems in a simple, automated way; under the hood it relies on evolutionary approaches to find a suitable model. The framework can be used for classification, regression, clustering, and time series modelling problems.
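To give a feel for what an evolutionary approach means, here is a toy sketch in plain Python. It is not FEDOT's algorithm: FEDOT searches over modelling pipelines, whereas this example only evolves a pair of numbers. The names evolve, toy_error, and toy_mutate, and the search space itself, are made up for illustration.

```python
import random


def evolve(fitness, mutate, initial, generations=30, population_size=8, seed=0):
    """Toy evolutionary search: mutate candidates and keep the best ones."""
    rng = random.Random(seed)
    population = [initial] + [mutate(initial, rng) for _ in range(population_size - 1)]
    for _ in range(generations):
        # Create one mutated child per parent.
        children = [mutate(parent, rng) for parent in population]
        # Keep the best candidates from parents and children (lower is better).
        population = sorted(population + children, key=fitness)[:population_size]
    return population[0]


# Hypothetical search space: a window size and a regularisation strength.
def toy_error(candidate):
    window, alpha = candidate
    return (window - 24) ** 2 + (alpha - 1.0) ** 2


def toy_mutate(candidate, rng):
    window, alpha = candidate
    new_window = max(1, window + rng.choice([-2, -1, 1, 2]))
    new_alpha = max(0.0, alpha + rng.uniform(-0.5, 0.5))
    return (new_window, new_alpha)


start = (5, 5.0)
best = evolve(toy_error, toy_mutate, initial=start)
print(best, toy_error(best))
```

Because parents compete with their children in every generation, the best candidate never gets worse than the starting point.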

One attraction of this framework is its variety of modules, which together support end-to-end modelling. It provides modules for basic steps such as data preprocessing, feature engineering, and model optimization. We can also build graphs that show the procedure the framework followed to solve a problem. Some other features of the framework are as follows:

  • The architecture of the framework is flexible enough to create machine learning models using various types of data.
  • The framework supports popular machine learning libraries such as scikit-learn, Keras, and statsmodels.
  • Models from specific domains, such as those based on ODEs and PDEs, can be embedded into pipelines.
  • We can use a variety of models and increase the explainability of modelling procedures.

We can install the framework with the following command (the leading ! runs it from a notebook cell).

!pip install fedot

After installation, we are ready to perform any machine learning operation using FEDOT.
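A quick way to confirm that the installation worked is to ask Python for the installed version. The sketch below uses only the standard library; installed_version is a hypothetical helper name, not part of FEDOT.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Prints a version string if FEDOT is installed, otherwise None.
print(installed_version("fedot"))
```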

Time series modelling using FEDOT 

In this section, we will look at an example of how we can perform time series analysis using the FEDOT library.

Importing data 

Before starting, we need some time-series data. For this example we use the traffic data that can be found here. Let’s import the data.

import pandas as pd
df = pd.read_csv('/content/drive/MyDrive/Yugesh/fedot/trafic.csv', parse_dates=['datetime'])
df.head(10)

Output:

In the output above, we can see that the data has two variables, datetime and value, both of which are needed for time series modelling.
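Before modelling, it can be worth checking that the timestamps are evenly spaced, since gaps in the datetime column may affect a forecast. The following is a minimal sketch on made-up sample timestamps, not on the traffic file itself; sampling_gaps is a hypothetical helper.

```python
from datetime import datetime, timedelta


def sampling_gaps(timestamps):
    """Return the set of distinct gaps between consecutive timestamps."""
    ordered = sorted(timestamps)
    return {later - earlier for earlier, later in zip(ordered, ordered[1:])}


# Hypothetical hourly sample with one missing observation (03:00).
sample = [datetime(2021, 1, 1, hour) for hour in (0, 1, 2, 4)]
gaps = sampling_gaps(sample)

# More than one distinct gap means the series is not regularly sampled.
print(len(gaps) > 1)
```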

Let’s plot this data.

import matplotlib.pyplot as plt
from pylab import rcParams
rcParams['figure.figsize'] = 18, 7
df.plot('datetime', 'value',c='magenta')
plt.show()

Output:

Here we can see our time series: traffic values plotted against dates. With the data imported, we are ready to use FEDOT modules for time series modelling.

Importing modules from FEDOT

from fedot.api.main import Fedot
from fedot.core.repository.tasks import Task, TaskTypesEnum, TsForecastingParams
from fedot.core.data.data import InputData
from fedot.core.data.data import train_test_data_setup
from fedot.core.repository.dataset_types import DataTypesEnum

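Above, we have imported the FEDOT API, the classes for defining tasks, the helpers for loading and splitting data, and FEDOT’s data type module. Depending on the installed FEDOT version, train_test_data_setup may need to be imported from fedot.core.data.data_split instead.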

Data processing using FEDOT

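Let’s prepare the data in the form that FEDOT expects.

We can load and split our data using the following lines of code. Note that they use the task object, which is defined in the next section (Defining task and model), so that definition has to be run first.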

input_data = InputData.from_csv_time_series(task, '/content/drive/MyDrive/Yugesh/fedot/trafic.csv', target_column='value')
train_data, test_data = train_test_data_setup(input_data)
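For a forecasting task, a train/test split normally keeps the time order: the last part of the series is held out and everything before it is used for training. The sketch below illustrates that idea in plain Python, using the series length (801) and forecast length (144) that appear later in this article. It is an assumption about the concept, not a copy of what train_test_data_setup does internally; holdout_split is a hypothetical helper.

```python
def holdout_split(series, forecast_length):
    """Split a series so that the last `forecast_length` points form the test part."""
    if not 0 < forecast_length < len(series):
        raise ValueError("forecast_length must be between 1 and len(series) - 1")
    return series[:-forecast_length], series[-forecast_length:]


values = list(range(801))  # stand-in for the 801 traffic observations
train, test = holdout_split(values, 144)
print(len(train), len(test))  # 657 144
```

Unlike a random split, this never lets the model see values that come after the points it is asked to predict.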

Defining task and model

Let’s check the length of the data.

print(f'Length of the time series - {len(df)}')

Output:

Here we can see that the length of our data is 801. A forecast horizon of 144 values should be enough for this series, so next we define a task for modelling using the FEDOT modules.

task = Task(TaskTypesEnum.ts_forecasting,
            TsForecastingParams(forecast_length=144))

Above, we have defined a time series forecasting task with a forecast length of 144, so the model will return 144 predictions.

Initiating the modelling process

Let’s initiate the model using the FEDOT API.

model = Fedot(problem='ts_forecasting', task_params=task.task_params)
chain = model.fit(features=train_data)

Output:

In the output above, we can see the log of the modelling process, which includes hyperparameter tuning.

Making predictions 

Now we can make predictions using the following lines of code.

forecast = model.predict(features=test_data)
forecast

Output:

Here we can see the predictions from our model. To evaluate them properly, we should plot the predictions against the actual values. This can be done using the following lines of code.

Defining function for visualization

import numpy as np
from sklearn.metrics import mean_absolute_error

traffic = np.array(df['value'])

def display_results(actual_time_series, predicted_values, len_train_data, y_name='Traffic volume'):
    """Plot actual vs. predicted values and print the MAE on the forecast part."""
    plt.plot(np.arange(0, len(actual_time_series)),
             actual_time_series, label='Actual values', c='green')
    plt.plot(np.arange(len_train_data, len_train_data + len(predicted_values)),
             predicted_values, label='Predicted', c='blue')
    # Plot a black vertical line that divides the series into train and test parts
    plt.plot([len_train_data, len_train_data],
             [min(actual_time_series), max(actual_time_series)], c='black', linewidth=1)
    plt.ylabel(y_name, fontsize=15)
    plt.xlabel('Time index', fontsize=15)
    plt.legend(fontsize=15, loc='upper left')
    plt.grid()
    plt.show()

    # MAE is computed only on the part of the series that follows the training data
    mae_value = mean_absolute_error(actual_time_series[len_train_data:], predicted_values)
    print(f'MAE value: {mae_value}')
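The mean absolute error is simply the average of the absolute differences between actual and predicted values. A minimal pure-Python sketch of the same metric is shown here for illustration; mean_absolute_error_manual is a hypothetical name, and the numbers are made up.

```python
def mean_absolute_error_manual(actual, predicted):
    """Average absolute difference between two equally long sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Absolute errors are 1, 0 and 2, so the mean is 1.0.
example_mae = mean_absolute_error_manual([1, 2, 3], [2, 2, 5])
print(example_mae)  # 1.0
```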

Visualizing the results

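Calling display_results with the full series (traffic), the forecast, and the length of the training part produces the plot. Here we can see that the predictions are close to the test data. The function also prints the mean absolute error (MAE), which we can see below.

Since the predictions follow the test data closely, the results of this modelling look good.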
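Whether an MAE value is good is easier to judge against a simple baseline. One common choice is a naive forecast that repeats the last observed training value; if the model's MAE is not clearly lower than the baseline's, the model adds little. The sketch below uses made-up numbers, and naive_forecast and mae are hypothetical helpers rather than FEDOT functions.

```python
def naive_forecast(train, horizon):
    """Repeat the last training value `horizon` times."""
    return [train[-1]] * horizon


def mae(actual, predicted):
    """Mean absolute error of two equally long sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up series for illustration only.
train_part = [10, 12, 11, 13]
test_part = [14, 12, 15]

baseline = naive_forecast(train_part, len(test_part))
baseline_mae = mae(test_part, baseline)
print(baseline, baseline_mae)
```

With the real data, the same comparison would use the training and test parts of the traffic series and the MAE printed by display_results.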

Visualizing pipeline 

Let’s check the pipeline that our modelling procedure produced.

chain.show()
 
print('Obtained chain:')
for node in chain.nodes:
    print(f'{node.operation}, params: {node.custom_params}')

Output:

In the output above, we can see that the pipeline uses ridge regression as the time series model, and we can also see the parameters used for modelling.
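One common way to apply a regression model such as ridge to a time series is to turn the series into a table of lagged windows, where each row holds a few past values and the target is the value that follows. Whether the pipeline above was built exactly this way is an assumption; the sketch only illustrates the general idea, and lagged_table is a hypothetical helper.

```python
def lagged_table(series, window):
    """Build (past values, next value) pairs from a series."""
    rows = []
    for end in range(window, len(series)):
        # The `window` values before position `end` predict the value at `end`.
        rows.append((series[end - window:end], series[end]))
    return rows


rows = lagged_table([1, 2, 3, 4, 5], window=3)
for features, target in rows:
    print(features, "->", target)
```

A regression model can then be trained on these rows like on any other tabular dataset.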

Final words

In this article, we have discussed FEDOT, an open-source framework for automated machine learning. Using it, we worked through an example of time series modelling in which the predictions were close to the test data.

Yugesh Verma

Yugesh is a graduate in automobile engineering and worked as a data analyst intern. He completed several Data Science projects. He has a strong interest in Deep Learning and writing blogs on data science and machine learning.