# A complete tutorial on Arauto for time-series analysis and modelling

Arauto is an open-source tool for time series analysis. It lets us run a range of analyses on time series data and fit models from the ARIMA family.

Time series analysis and modelling is one of the more complex parts of data science and machine learning. A practitioner has to run several tests and iterate between models to get good results. What if a tool could make this procedure easy and explainable, even for beginners? In this article, we will discuss such a tool, named Arauto. The major points to be discussed are listed below.

Table of Contents

1. What is Arauto?
2. Installing Arauto
3. Using Arauto for Time-Series Modelling

What is Arauto?

Arauto is an open-source tool for time series analysis. It lets us run a range of analyses on time series data and fit models from the ARIMA family, such as AR, MA, ARMA, ARIMA, SARIMA, ARIMAX, and SARIMAX.

In short, Arauto lets us perform time series analysis and modelling without writing much code. It provides an intuitive interface and supports exogenous variables. We can customize the whole process, from choosing a specific transformation function to testing different time series parameters. Its main features are as follows:

• It supports independent variables, also called exogenous regressors.
• Seasonal decomposition lets us analyze the trend, seasonality, and residual of the time series.
• It can check the stationarity of the time series using the Dickey-Fuller test.
• It can apply various transformations to the time series, such as first-order differences and seasonal log differences.
• It provides ACF and PACF plots and functions, which can be used to estimate the model terms.
• It can tune hyperparameters using grid search.
• At the end of any analysis or modelling procedure, Arauto gives us the Python code for that procedure.
• One of its best features is that it suggests a procedure suited to the data.
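To make the grid search idea in the list above concrete, here is a minimal sketch in plain Python. It is not Arauto's implementation: `candidate_orders`, `grid_search`, and `toy_score` are hypothetical names, and `toy_score` is a made-up stand-in for "fit a model with this order and read a criterion such as AIC".

```python
from itertools import product


def candidate_orders(max_p, max_d, max_q):
    """Return every (p, d, q) combination within the given upper bounds."""
    return list(product(range(max_p + 1), range(max_d + 1), range(max_q + 1)))


def grid_search(orders, score):
    """Return the order with the lowest score (lower is better, as with AIC)."""
    return min(orders, key=score)


def toy_score(order):
    """Hypothetical score; a real search would fit a model and return its AIC."""
    p, d, q = order
    return (p - 1) ** 2 + (d - 1) ** 2 + (q - 2) ** 2


orders = candidate_orders(max_p=2, max_d=1, max_q=2)
best = grid_search(orders, toy_score)
print(len(orders), best)
```

The search itself is just an exhaustive loop over the candidates, so its cost grows with the product of the bounds.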

Installing Arauto using Google Colab

We can run this project on the web, in Docker, or on a local system. Since I am using Google Colab here, we will also see how to make it work in the Colab environment.

Let’s start by installing it in Google Colab. Along the way, we will see how to use conda in Colab and how to create a Python virtual environment there. Since we are using Colab, we first need to mount our drive on the runtime. The code below mounts the drive.

```
from google.colab import drive
drive.mount('/content/drive')
```

Running this cell prints an authorization link. Clicking it gives us an authorization code, which we use to mount the drive on the runtime. Next, the code below installs conda in the Google Colab environment.

```
!pip install -q condacolab
import condacolab
condacolab.install()
```

Let’s check which version of conda was installed.

`!conda --version`

Let’s check where conda was installed.

`!which conda`

With conda installed, we can set up the Arauto project. The code below clones its GitHub repository into the current working directory.

`!git clone https://github.com/paulozip/arauto.git`

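The code below sets the cloned repository as our working directory. In a notebook this needs the `%cd` magic, because a `!cd` command runs in a temporary subshell and does not persist to later cells.

`%cd arauto`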

Let’s create a conda environment in which we can install the Arauto project.

```
# create the environment (--yes skips the confirmation prompt)
!conda create --yes --name arauto_env
```

The code below activates the environment we have just created.

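```
%%bash
# %%bash has to be the first line of the cell
# activate your conda environment
source activate arauto_env
# start Python; the lines after this one are passed to the interpreter
python

# python commands are ready to run within your environment
import sys
print("Python version")
print(sys.version)
```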

With the environment set up, we can install the project’s dependencies from requirements.txt. Note the `-r` flag, which tells pip to read the package names from the file:

`!pip install -r requirements.txt`

This installs the packages listed in requirements.txt.

After installation, we are ready to use Arauto for time series analysis and modelling. The code below starts the Streamlit application.

`!streamlit run /content/arauto/run.py`

The command prints the links through which we can access the application. Alternatively, we can use the hosted web version of Arauto directly.

Using Arauto for time series analysis and modelling

We covered Arauto’s features in the introduction. In this section, we will use some of them to build a basic understanding of how the project works. For practice, Arauto offers a set of example datasets to choose from.

Every time series is recorded at some frequency, so we also select the frequency option that matches our series.

After selecting the data and the frequency, we are ready for the analysis. Scrolling down the panel on the left side reveals further options.

Using these options, we can set the number of data points in the validation set and choose which graphs to display. Below are some of the graphs and tests produced with monthly_air_passenger.csv:
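Choosing the number of validation points amounts to holding out the most recent observations. The sketch below illustrates that idea in plain Python; `split_series` is a hypothetical helper, not Arauto's own code, and the data is made up.

```python
def split_series(values, validation_size):
    """Hold out the last `validation_size` observations for validation."""
    if not 0 < validation_size < len(values):
        raise ValueError("validation_size must be between 1 and len(values) - 1")
    return values[:-validation_size], values[-validation_size:]


monthly = list(range(1, 25))  # made-up stand-in for two years of monthly data
train, validation = split_series(monthly, validation_size=6)
print(len(train), len(validation))  # 18 6
```

The split keeps the time order intact: the model only ever sees observations that come before the ones it is validated on.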

Historical data

The graph of the historical data shows the overall trend of the time series over the years it covers.

Seasonal decomposition

The decomposition plot shows the individual components of the time series: seasonality, trend, and residual.
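For intuition about what a seasonal decomposition computes, here is a simplified additive decomposition in plain Python. It is an illustrative sketch, not the routine Arauto uses: the function names are hypothetical, the series is synthetic, and the series is assumed to span at least two full seasonal periods.

```python
def centered_moving_average(values, period):
    """Estimate the trend with a centered moving average.

    For an even period the two end points of the window get half weight.
    """
    n = len(values)
    half = period // 2
    trend = [None] * n  # the edges have no complete window
    for t in range(half, n - half):
        window = values[t - half:t + half + 1]
        total = sum(window)
        if period % 2 == 0:
            total -= 0.5 * (window[0] + window[-1])
        trend[t] = total / period
    return trend


def decompose_additive(values, period):
    """Split a series into trend + seasonal + residual components."""
    trend = centered_moving_average(values, period)
    detrended = [v - t if t is not None else None for v, t in zip(values, trend)]
    # Average the detrended values at each position of the seasonal cycle.
    means = []
    for k in range(period):
        bucket = [d for i, d in enumerate(detrended)
                  if d is not None and i % period == k]
        means.append(sum(bucket) / len(bucket))
    offset = sum(means) / period  # center the seasonal component on zero
    seasonal = [means[i % period] - offset for i in range(len(values))]
    residual = [v - t - s if t is not None else None
                for v, t, s in zip(values, trend, seasonal)]
    return trend, seasonal, residual


# Synthetic series: a linear trend plus a repeating quarterly pattern.
pattern = [2, -1, -3, 2]
series = [i + pattern[i % 4] for i in range(16)]
trend, seasonal, residual = decompose_additive(series, period=4)
print(trend[2:6], seasonal[:4])
```

On this synthetic input the recovered trend is the straight line and the seasonal component is the repeating pattern, so the residual is zero wherever the trend is defined.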

ACF and PACF plots

These plots show the ACF and PACF of the series. To make the data stationary, we need to transform it, and Arauto lets us choose between several transformation options and iterate over them easily.
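As a rough illustration of what an ACF plot displays, the sample autocorrelation can be computed in a few lines of plain Python. This is a sketch on made-up data, and `autocorrelation` is a hypothetical helper rather than the function Arauto calls; the PACF is not covered here.

```python
def autocorrelation(values, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    variance = sum(c * c for c in centered)
    return [
        sum(centered[t] * centered[t + lag] for t in range(n - lag)) / variance
        for lag in range(max_lag + 1)
    ]


# Made-up series that repeats every 8 observations.
series = [1, 2, 3, 4, 5, 4, 3, 2, 1, 2, 3, 4, 5, 4, 3, 2]
acf = autocorrelation(series, max_lag=8)
print([round(value, 2) for value in acf])
```

The value at lag 8 is clearly positive while the value at lag 4 is negative, which is the kind of pattern that points to a seasonal period of 8 in this toy series.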

Arauto can also choose the transformation for us: an automated feature uses the ADF test to suggest the best transformation for the data.
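The transformations mentioned earlier, first-order differences and seasonal log differences, are simple to express. The following plain-Python sketch shows what they compute; the function names are hypothetical, the numbers are made up, and the log transform assumes strictly positive values.

```python
import math


def difference(values, lag=1):
    """Subtract from each value the one observed `lag` steps earlier."""
    return [values[i] - values[i - lag] for i in range(lag, len(values))]


def seasonal_log_difference(values, period):
    """Take logs, then difference at the seasonal lag."""
    logged = [math.log(v) for v in values]  # requires values > 0
    return difference(logged, lag=period)


series = [10, 12, 15, 19, 24]
print(difference(series))  # [2, 3, 4, 5]

# A quarterly pattern that doubles from one year to the next.
quarterly = [1, 2, 3, 4, 2, 4, 6, 8]
print(seasonal_log_difference(quarterly, period=4))
```

Each seasonal log difference here equals log(2), reflecting the constant year-over-year doubling: the seasonal pattern and the growth are both removed, leaving a flat series.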

These results show which transformation works best. Along with it, Arauto suggests the best model and parameter values to fit the data.

Now, by just choosing the forecasting period and clicking the “Do Your Magic” button, we can train the model on the data and get the results. The images below show the results I got from modelling with only the options Arauto suggested.

Train set prediction

Test set prediction

Out-of-sample Forecast

The forecasts above are rendered as Plotly charts, and the predictions on the train and test data follow the actual values so closely that they almost look like copies of them. Arauto also gives us the code for the whole procedure, which we can use to cross-check Arauto’s model and analysis.
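One simple way to cross-check how closely predictions follow the actual values is to compute an error metric on the test set. The sketch below uses RMSE and MAPE on made-up numbers; these metrics and function names are my own illustration and are not taken from Arauto's generated code.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100 * sum(ratios) / len(ratios)


actual = [100, 200]      # made-up test-set values
predicted = [110, 180]   # made-up model predictions
print(round(rmse(actual, predicted), 2), round(mape(actual, predicted), 2))
```

A low error on the train set combined with a much higher one on the test set would suggest the model is overfitting rather than forecasting well.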

To analyse and model our own data, we can send it to Arauto through its REST API. After replacing the server address and the file path in the command below, we can use it to upload the data.

```
curl -X POST \
http://SERVER_ADDRESS:5000/upload_file \
-H 'content-type: multipart/form-data' \
-F file=@PATH_TO_YOUR_FILE
```
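For readers who prefer to stay in Python, the same upload can be sketched with the standard library only. This mirrors the curl command above (endpoint `/upload_file`, port 5000, form field `file`); the helper name `build_upload_request` and the `application/octet-stream` part type are my assumptions, and the sketch has not been checked against a running Arauto server.

```python
import uuid
import urllib.request
from pathlib import Path


def build_upload_request(server_address, file_path):
    """Build a multipart/form-data POST that mirrors the curl command."""
    path = Path(file_path)
    boundary = uuid.uuid4().hex
    disposition = (
        f'Content-Disposition: form-data; name="file"; filename="{path.name}"\r\n'
    )
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        disposition.encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        path.read_bytes(),
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        url=f"http://{server_address}:5000/upload_file",
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


# Sending the request (placeholders as in the curl command):
# request = build_upload_request("SERVER_ADDRESS", "PATH_TO_YOUR_FILE")
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

Building the request separately from sending it makes it possible to inspect the URL, headers, and body before anything goes over the network.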

Here we have seen how to carry out time series analysis with the Arauto application.

Final words

In this article, we looked at the Arauto package and how its features can simplify time series analysis and modelling. Since code is needed only for the installation, we can call it a low-code tool for time series analysis. I encourage readers to try such tools to strengthen their conceptual understanding of the analysis.


Yugesh is a graduate in automobile engineering and worked as a data analyst intern. He completed several Data Science projects. He has a strong interest in Deep Learning and writing blogs on data science and machine learning.
