A complete tutorial on Arauto for time-series analysis and modelling

Arauto is an open-source project for time series analysis that lets us perform various analyses on our time series data and fit models from the ARIMA family.

Time-series analysis and modelling are among the more complex parts of data science and machine learning. An expert in time series modelling needs to perform various tests and iterate between models to get optimal results. What if a tool could make this complex procedure easy and explainable for beginners? In this article, we will discuss such a tool, named Arauto. The major points to be discussed in this article are listed below.

Table of Contents

  1. What is Arauto?
  2. Installing Arauto
  3. Using Arauto for Time-Series Modelling

What is Arauto?

Arauto is an open-source project for time series analysis that lets us perform various analyses on our time series data. It also gives us access to various time series models from the ARIMA family, such as AR, MA, ARMA, ARIMA, SARIMA, ARIMAX, and SARIMAX.

We can say that Arauto is a tool that allows us to perform time series analysis and modelling without writing a lot of code, through an intuitive interface that also supports exogenous variables. Using this tool, we can customize the process from choosing a specific transformation function to testing different time series parameters. The features of Arauto that can be utilized during time series analysis are as follows:

  • It supports independent variables, also called exogenous regressors.
  • Using its seasonal decomposition feature, we can analyze the trend, seasonality, and residual components of the time series.
  • It can check the stationarity of the time series using the Augmented Dickey-Fuller test.
  • Various transformations can be applied to the time series, such as first-order differences and seasonal log differences.
  • It provides ACF and PACF plots, which can be used to estimate the model terms.
  • Hyperparameters can be tuned using a grid search.
  • At the end of any analysis or modelling procedure, we can get the Python code for that procedure from Arauto.
  • One of the best features of Arauto is that it suggests a procedure based on the data.

Installing Arauto using Google Colab

We can install this project on the web, in Docker, or on a local system. Since I am using Google Colab, we will also get to know how to make it work in the Google Colab environment.

Let’s start by installing it in the Google Colab environment. Along the way, we will learn how to use conda in Colab and how to create a Python virtual environment there. Since we are using Colab, we first need to mount our drive on the runtime. The code below mounts the drive.

from google.colab import drive
drive.mount('/content/drive')

Output:

By clicking on the link, we get an authorization code, and using that code we mount the drive on the runtime. The code below installs conda in the Google Colab environment.

!pip install -q condacolab
import condacolab
condacolab.install()

Output:

Let’s check the version of conda that was installed.

!conda --version

Output:

Let’s check where conda has been installed.

!which conda

Output:

After this installation, conda is ready to work with the Arauto project. Let’s start the installation process of Arauto. The code below clones the GitHub repository into Google Drive.

!git clone https://github.com/paulozip/arauto.git

Output:

The code below sets the cloned repository as our working directory. Note that in Colab we need the %cd magic, because !cd only changes the directory inside a temporary subshell.

%cd arauto

Let’s make a python virtual environment where we can install the Arauto project.

# creating the environment
!conda create --name arauto_env

Output:

The code below activates the environment we created above. Note that %%bash must be the first line of the Colab cell.

%%bash
# activate the conda environment created above
source activate arauto_env

# Python commands now run inside the environment
python - <<'EOF'
import sys
print("Python version")
print(sys.version)
EOF

Output:

After activating the environment, we are ready to install the project's dependencies using the following line of code:

!pip install -r requirements.txt

Output:

These are the packages listed in requirements.txt.

After installation, we are ready to use Arauto for time series analysis and modelling. Using the below code we can start the Streamlit application.

!streamlit run /content/arauto/run.py

Output:

Using the links, we can access the application. We can also use Arauto's hosted web application directly by clicking on the link.

Using Arauto for time series analysis and modelling 

In the introduction, we talked about Arauto's features. In this section, we will use some of them to build a basic understanding of how the project works. For practising time series analysis and modelling, Arauto provides the following datasets.

Every time series has a time frequency, and depending on the series we can choose from the following frequency options.

After selecting the data and frequency, we are ready for the time series analysis. Scrolling down the panel on the left side shows the following options.

Using these options, we can select the number of observations for the validation set and choose the graphs according to our requirements. Below are some of the graphs and tests produced using monthly_air_passenger.csv:

Historical data

In the above graph, we can see the overall trend of the time series across the years shown.

Seasonal decomposition

In the above image, we can see the different components of the time series (seasonality, trend, and residual).

ACF and PACF plots

The above plots show the ACF and PACF. To make the data stationary, we need to transform it, and we can easily choose and iterate between the following options.

Instead of forcing a transformation manually, Arauto has an automated feature that suggests the best transformation based on the ADF test, like the following.

In the above results, we can see the best option for transformation. Along with this, we also get suggestions for the best model and parameter values to fit the data.

Now, by just choosing the forecasting period and clicking the “Do Your Magic” button, we can train the model on the data and get the results. The images below are the results I got from modelling using only the options Arauto suggested.

Train set prediction

Test set prediction

Out-of-sample Forecast

The above forecast is rendered on Plotly dashboards, and the predictions on the train and test data track the actual values so closely that they look like copies. Along with this, we also get the code for the whole procedure, which can be used to cross-check Arauto’s model and analysis.

To analyse and model our own data, we can send it to Arauto through its REST API. By modifying the code below with the path to our data, we can upload the file.

curl -X POST \
  http://SERVER_ADDRESS:5000/upload_file \
  -H 'content-type: multipart/form-data' \
  -F file=@PATH_TO_YOUR_FILE
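The same upload can be prepared from Python with the requests library. The localhost address and the inline CSV payload here are placeholder assumptions; point them at your running Arauto server and your real file.

```python
import requests

# Placeholder server address and payload; substitute your own
url = "http://localhost:5000/upload_file"
payload = {"file": ("my_series.csv", b"date,value\n2020-01-01,112\n")}

# Build the multipart/form-data POST without sending it yet
req = requests.Request("POST", url, files=payload).prepare()

# To actually upload, send the prepared request:
# response = requests.Session().send(req)
```

Preparing the request first makes it easy to inspect the headers and body before hitting the server.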

Here we have seen the implementation of time series analysis on the Arauto application.

Final words

In this article, we have seen the Arauto package and how its features can make time series analysis and modelling more robust and more accurate. Since only a few lines of code are required for installation, we can call it a low-code tool for time series analysis. I encourage readers to use such tools to make their analyses conceptually stronger and more robust.



Yugesh Verma
Yugesh is a graduate in automobile engineering and worked as a data analyst intern. He completed several Data Science projects. He has a strong interest in Deep Learning and writing blogs on data science and machine learning.
