Perform Time Series Analysis And Forecasting Using R Programming Language

This article illustrates how to perform time-series analysis and forecasting using the R programming language.

Time series analysis is a statistical technique for studying the trends and characteristics of a collection of data points indexed in chronological order. Time series forecasting, in turn, involves extracting insights from recorded time series data and making predictions about the future based on them.

If you are unfamiliar with R and its basic concepts, check out this article before proceeding. We have already published several articles on time series analysis and time series forecasting, but with Python code.

Practical implementation using R

The code here has been implemented using the RStudio IDE (version 1.2.1335). You can download RStudio from here. A step-wise explanation of the code follows:

Time-Series Analysis

  1. Install the rmeta R package.

install.packages("rmeta")

Check whether the package has been installed by displaying the full list of packages installed in the library.

installed.packages()

Output: 

 package ‘rmeta’ successfully unpacked and MD5 sums checked
 The downloaded binary packages are in
 C:\Users\Lenovo\AppData\Local\Temp\RtmpUJ2vKU\downloaded_packages 
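
Scanning the full output of installed.packages() for one name can be tedious. As a minimal sketch, a single package can be checked directly. The helper name is_installed is hypothetical and not part of the original walkthrough.

```r
# Hypothetical helper: TRUE if the named package is present in the library.
is_installed <- function(pkg) {
  pkg %in% rownames(installed.packages())
}

is_installed("stats")   # 'stats' ships with every R installation
is_installed("rmeta")   # TRUE only if the install step above succeeded
```
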
  1. Read the data containing the ages at which various kings died. The dataset is available here.
 kings_data <- scan("http://robjhyndman.com/tsdldata/misc/kings.dat",skip=3)
 kings_data 

Output:

R data1
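
The skip argument tells scan() how many leading lines to ignore, which is how header lines in a data file are dropped. The following is a small sketch that uses made-up inline text in place of the remote file; the header lines and values are illustrative only.

```r
# Made-up file contents: three header lines followed by the numbers.
raw_text <- "Header line 1
Header line 2
Header line 3
60
43
67
50"

# text= reads from the string instead of a file or URL;
# quiet = TRUE hides the "Read n items" message.
values <- scan(text = raw_text, skip = 3, quiet = TRUE)
values
```
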
  1. Store the kings’ data into a time series object for performing time series analysis.
 kings_ts_data <- ts(kings_data) 
 kings_ts_data 

Output:

R time series1
  1. Similarly, read the New York City (NYC) data containing the monthly number of births in the city. (Raw data is available here).
 NYC_data <- scan("http://robjhyndman.com/tsdldata/data/nybirths.dat")
 NYC_data 

Output:

R data2
  1. Store the NYC data in an R time series object. The ‘frequency’ parameter of the ts() function should be set to 12 for monthly data. We also pass start=c(1950,1) to ts(), indicating that the first observation corresponds to the first month (January) of the year 1950.
 NYC_ts_data <- ts(NYC_data, frequency=12, start=c(1950,1))
 NYC_ts_data 

Sample condensed output:

R time series2
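
To see what frequency and start actually do, it can help to inspect a short series. The sketch below uses made-up values, not the births data, and queries the resulting time index with start(), end(), frequency() and window() from base R.

```r
# Fourteen made-up monthly observations.
x <- c(26.7, 23.6, 26.9, 24.7, 25.8, 24.4, 24.5,
       23.9, 23.2, 23.3, 21.9, 21.8, 21.4, 21.1)
monthly <- ts(x, frequency = 12, start = c(1950, 1))

start(monthly)      # 1950 1  (January 1950)
end(monthly)        # 1951 2  (14 months later: February 1951)
frequency(monthly)  # 12 observations per year

# Extract December 1950 to January 1951.
window(monthly, start = c(1950, 12), end = c(1951, 1))

# Without frequency/start, ts() assumes one observation per time unit, starting at 1.
yearly <- ts(x)
start(yearly)       # 1 1
```
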
  1. Read the data containing monthly sales records of a souvenir shop in Australia. Find the data here.
 shop_data <- scan("http://robjhyndman.com/tsdldata/data/fancy.dat")
 shop_data 

Output:

R data3
  1. Convert the souvenir shop’s data into a time series object.
 shop_ts_data <- ts(shop_data, frequency=12, start=c(1980,1))
 shop_ts_data 

Output:

R time series3
  1. Plot the time series for all three datasets (the kings’ data, the NYC data and the souvenir shop’s data).

plot.ts(kings_ts_data)

Output:

time series plot 1

plot.ts(NYC_ts_data)

Output:

time series plot 2

plot.ts(shop_ts_data)

Output:

time series plot 3
  1. Time series decomposition is the process of splitting time series data into its components, viz. a trend component and an irregular component. Seasonal data additionally has a seasonal component.

We first smooth the kings’ time series data using the SMA() function of the TTR package (where SMA stands for ‘simple moving average’) to estimate the trend component.

 install.packages("TTR")
 library("TTR")
 kings_ts_SMA2 <- SMA(kings_ts_data,n=2) 

(The ‘n’ parameter here specifies the order of the SMA, i.e. the number of consecutive observations averaged.)

plot.ts(kings_ts_SMA2)

Output:

Similarly, we can change the order to, say, 8 and observe the change in the trend.

 kings_ts_SMA8 <- SMA(kings_ts_data,n=8)
 plot.ts(kings_ts_SMA8) 

Output:

It can be observed that increasing the order of the SMA smooths the plot more, i.e. reduces fluctuations in the trend.
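
The effect of the order can also be checked numerically. The sketch below does not use TTR; it assumes that a simple moving average of order n is the mean of the current and the previous n - 1 observations, and computes it with stats::filter() from base R. The helper name sma and the data values are illustrative only.

```r
# Hypothetical helper: trailing simple moving average of order n.
# The first n - 1 results are NA because a full window is not yet available.
sma <- function(x, n) {
  as.numeric(stats::filter(x, rep(1 / n, n), sides = 1))
}

# Made-up values, for illustration only.
x <- c(60, 43, 67, 50, 56, 42, 50, 65, 68, 43, 65, 34)

sma2 <- sma(x, 2)
sma4 <- sma(x, 4)

sma2[2]   # (60 + 43) / 2 = 51.5
sma4[4]   # (60 + 43 + 67 + 50) / 4 = 55

# A higher order averages over more points, so the smoothed series varies less.
sd(sma2, na.rm = TRUE)
sd(sma4, na.rm = TRUE)
```
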

  1. For seasonal data such as the births in NYC, the decomposition can be carried out using the decompose() function, since the series has a seasonal component in addition to the trend and irregular ones.

NYC_ts_comp <- decompose(NYC_ts_data)

Now, we can separately obtain each of the three components of the decomposed NYC data.

Seasonal component:

NYC_ts_comp$seasonal

Sample condensed output:

R time series decomposition 1

Trend component:

NYC_ts_comp$trend

Sample condensed output:

R time series decomposition 2

Irregular component:

NYC_ts_comp$random

Sample condensed output:

R time series decomposition 3

Plot all the components of the NYC time series in a single plot.

plot(NYC_ts_comp)

Sample condensed output:

R time series decomposition plot
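
With the default additive model, the three components returned by decompose() should add back up to the original series. A minimal sketch on a synthetic monthly series (the numbers below are made up and are not the NYC data):

```r
# Synthetic monthly series: linear trend + fixed seasonal pattern + small wiggle.
months <- 1:48
season <- rep(c(-2, -3, 0, -1, 1, 1, 3, 3, 1, 0, -2, -1), times = 4)
x <- ts(20 + 0.1 * months + season + 0.3 * sin(1.7 * months),
        frequency = 12, start = c(1950, 1))

comp <- decompose(x)   # additive model by default
comp$type

# decompose() estimates the trend with a centred moving average, so the trend
# (and hence the irregular part) is NA for the first and last six months.
sum(is.na(comp$trend))

# Where all components are defined, they sum to the original series.
ok <- !is.na(comp$random)
all.equal(as.numeric(comp$trend + comp$seasonal + comp$random)[ok],
          as.numeric(x)[ok])
```
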
  1. The seasonal time series data of NYC can be seasonally adjusted by subtracting its seasonal component from the original time series.
 NYC_ts_adjusted <- NYC_ts_data - NYC_ts_comp$seasonal
 plot(NYC_ts_adjusted) 

Output:

R time series adjusted
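
To illustrate what the subtraction achieves, the sketch below builds a synthetic series whose trend and seasonal pattern are known exactly (made-up numbers, not the NYC data) and checks that the seasonally adjusted series recovers the underlying trend line.

```r
# Synthetic monthly series: a straight-line trend plus a fixed seasonal pattern.
months <- 1:48
season <- rep(c(-2, -3, 0, -1, 1, 1, 3, 3, 1, 0, -2, -1), times = 4)
x <- ts(20 + 0.1 * months + season, frequency = 12, start = c(1950, 1))

comp <- decompose(x)
adjusted <- x - comp$seasonal

# The estimated seasonal figure (one value per month) matches the pattern we put in.
round(comp$figure, 2)

# After adjustment only the straight line remains.
all.equal(as.numeric(adjusted), 20 + 0.1 * months)

# Month-to-month changes are far less variable once seasonality is removed.
sd(diff(x))
sd(diff(adjusted))
```
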

Time-Series Forecasting

  1. Read the data containing records of annual precipitation (in inches) in London (data is available here). 
 rain_data <- scan("http://robjhyndman.com/tsdldata/hurst/precip1.dat",skip=1)
 rain_data 

Output:

time series forecasting data
  1. Create a time series object of the precipitation data, starting from the year 1815, and plot that time series.
 rain_ts_data <- ts(rain_data, start=c(1815))
 plot.ts(rain_ts_data) 

Output:

time series forecasting plot
  1. We can forecast future rainfall using the simple exponential smoothing technique, since the rain data is an additive time series with no seasonality (the technique also assumes there is no trend).

 The HoltWinters() function with the ‘beta’ and ‘gamma’ parameters set to FALSE can be used for simple exponential smoothing of the data. It estimates the value of the smoothing factor ‘alpha’, which lies between 0 and 1. The larger the value of alpha, the more weight is given to the most recent observations when making predictions; values close to 0 give more weight to older observations.

 rain_fc <- HoltWinters(rain_ts_data, beta=FALSE, gamma=FALSE)
 rain_fc 

Output:

forecast result
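
The role of alpha is easier to see when the smoothing recursion is written out by hand. The sketch below is an illustration under stated assumptions, not the internals of HoltWinters(): the helper name ses_fitted and the data values are made up, and the level is assumed to start at the first observation.

```r
# Hypothetical helper: one-step-ahead fitted values from simple exponential smoothing.
# The level starts at the first observation and is updated as
#   level <- alpha * x[t] + (1 - alpha) * level
ses_fitted <- function(x, alpha) {
  x <- as.numeric(x)
  level <- x[1]
  fitted <- numeric(length(x) - 1)
  for (t in 2:length(x)) {
    fitted[t - 1] <- level                        # forecast for time t, made at t - 1
    level <- alpha * x[t] + (1 - alpha) * level   # update once x[t] is observed
  }
  fitted
}

# Made-up annual values, for illustration only.
x <- ts(c(23.6, 26.1, 21.7, 21.2, 25.8, 24.9, 23.2, 27.0, 22.4, 24.3), start = 1815)

ses_fitted(x, alpha = 1)   # alpha = 1: each forecast is simply the previous observation
ses_fitted(x, alpha = 0)   # alpha = 0: the forecast never moves from the first observation

# With alpha fixed rather than estimated, the recursion should agree with HoltWinters().
hw <- HoltWinters(x, alpha = 0.3, beta = FALSE, gamma = FALSE)
all.equal(ses_fitted(x, 0.3), as.numeric(hw$fitted[, "xhat"]))
```

Passing alpha explicitly skips the estimation step, which makes the comparison deterministic; in the walkthrough itself alpha is left for HoltWinters() to estimate.
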

The forecasts made by the HoltWinters() function can be obtained from the named element called ‘fitted’ of the rain_fc variable. It is in this element that the HoltWinters() function stores its fitted values.

rain_fc$fitted

Output:

The complete output will show that HoltWinters() makes predictions only for the period covered by our original time series.
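
To go beyond the observed period, one option is the predict() method that base R provides for HoltWinters objects, with n.ahead giving the number of future periods. A minimal sketch on made-up annual values (not the London data); the choice of 8 periods is arbitrary.

```r
# Made-up annual values, for illustration only.
x <- ts(c(23.6, 26.1, 21.7, 21.2, 25.8, 24.9, 23.2, 27.0, 22.4, 24.3), start = 1815)
fit <- HoltWinters(x, beta = FALSE, gamma = FALSE)

# Fitted values stop at the last observed year...
end(fit$fitted)

# ...while predict() continues for n.ahead further years.
fc <- predict(fit, n.ahead = 8)
fc
```

Because simple exponential smoothing has neither a trend nor a seasonal term, every future forecast equals the last estimated level, so the forecast is a flat line.
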

  1. Plot the forecasted as well as the original time series in a single plot.

plot(rain_fc)

Output:

The red line in the above plot shows the forecasted time series which, as can be seen, has far fewer fluctuations than the original time series (shown in black).

Nikita Shiledarbaxi

A zealous learner aspiring to advance in the domain of AI/ML. Eager to grasp emerging techniques to get insights from data and hence explore realistic Data Science applications as well.