
PPNR Modeling – OLS, Co-Integration and ARIMAX


The objective of this article is to evaluate three techniques for time series forecasting: the OLS model, the co-integration model and the ARIMAX model.

  • Business problem: To forecast the different components of PPNR. These components include Non-interest Income and Non-interest Expense.
  • Proposed solution
| Criterion | OLS Model | Co-Int Model | ARIMAX Model | Notes |
| --- | --- | --- | --- | --- |
| Preference | High | Medium | Low | |
| Complexity | Low | Medium | High | |
| Dependent variable is stationary; independent variable is stationary | OLS should be used | | ARIMAX should be used | For ARIMAX both (dependent and independent variables) should be stationary together |
| Dependent variable is non-stationary; independent variable is non-stationary | | Co-Int should be used | ARIMAX should be used | For ARIMAX both (dependent and independent variables) should be non-stationary together |
| Auto-correlation | DW test close to 2 | DW test close to 2 | DW test close to 2 | If for OLS or Co-integration DW fails then ARIMAX should be used |
| Variable significance | p-value < 0.05 | p-value < 0.05 | p-value < 0.05 | For ARIMAX the AR, MA and exogenous terms should be significant |
| Multi co-linearity | VIF < 5 | VIF < 5 | VIF < 5 | |
| Residual is stationary | ADF test should pass | ADF test should pass | ADF test should pass | |
| Residual is non-stationary | | | | For all the three approaches, the residual should be stationary |
| Normality and homoscedasticity of residual | Should pass | Should pass | Should pass | |
  • OLS
    • Advantages – easy to develop / test and easy to explain 
    • Disadvantages – difficult to find a strong correlation between dependent and independent variables
  • Co-Integration
    • Advantages – easy to find strong correlations between dependent and independent variables
    • Disadvantages – difficult to pass all the tests / assumptions of co-integration 
  • ARIMAX
    • Advantages – very powerful modeling technique to overcome the shortcomings of OLS and co-integration models 
    • Disadvantages – complex to develop because there are two stages: an OLS model is developed in stage 1, and the ARIMAX model is developed in stage 2 once the AR and MA terms have been identified

2. Introduction

2.1 PPNR

  • Pre-provision net revenue (PPNR), under the Federal Reserve’s Comprehensive Capital Analysis and Review (CCAR), measures net revenue forecast from asset-liability spreads and non-trading fees of banks.
    • Pre-provision Net Revenue (PPNR) = Net Interest Income + Non-interest Income – Non-interest Expense 
    • Interest Income: Loans and Securities
    • Interest Expense: Deposits and Bonds
    • Non-Interest Income: Credit Related Fees and Non-Credit Related
    • Non-Interest Expense: Employee Compensation, Processing / Software, Occupancy, Credit / Collections and Residential Mortgage Repurchase
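
The PPNR identity above can be checked with a few lines of arithmetic. This is a minimal sketch: the function name `ppnr` and the sample figures are hypothetical, and net interest income is taken as interest income minus interest expense.

```python
def ppnr(interest_income, interest_expense, non_interest_income, non_interest_expense):
    """Pre-provision net revenue built from its four components."""
    # Net interest income is the spread between interest earned and interest paid.
    net_interest_income = interest_income - interest_expense
    return net_interest_income + non_interest_income - non_interest_expense


# Hypothetical quarterly figures, all in the same currency unit.
example = ppnr(
    interest_income=120.0,
    interest_expense=45.0,
    non_interest_income=60.0,
    non_interest_expense=85.0,
)
print(example)  # 50.0
```

Only the two non-interest components are modeled in this article; the interest components appear here only to complete the identity.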

2.2 Modeling Approaches 

  • If the dependent and independent variables are stationary 
    • ADF test is done on the independent variables. Only the stationary variables are kept.
    • The correlation between each independent variable and the dependent variable is computed. Only variables that have a high correlation with the dependent variable are kept.
    • OLS Model is developed. 
  • If the dependent and independent variables are non-stationary 
    • ADF test is done on the independent variables. Only the non-stationary variables are kept.
    • Co-integration between each independent variable and the dependent variable is tested. Only variables that are co-integrated with the dependent variable are kept.
    • The correlation between each independent variable and the dependent variable is computed. Only variables that have a high correlation with the dependent variable are kept.
    • OLS Model is developed 
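
The two branches above amount to a routing rule on ADF p-values. The sketch below is illustrative only: the function name `choose_approach` is hypothetical, the 0.10 threshold is the one given later in section 2.5, and the mixed case (one series stationary, the other not) is flagged because the article does not describe it.

```python
def choose_approach(dep_pvalue, indep_pvalue, threshold=0.10):
    """Route a (dependent, independent) pair by its ADF p-values.

    A p-value <= threshold is read as "stationary", as in section 2.5.
    """
    dep_stationary = dep_pvalue <= threshold
    indep_stationary = indep_pvalue <= threshold
    if dep_stationary and indep_stationary:
        return "OLS"
    if not dep_stationary and not indep_stationary:
        return "co-integration"
    # Mixed case: not covered by the article, so it is flagged rather than guessed.
    return "not covered"


print(choose_approach(0.00, 0.03))  # OLS
print(choose_approach(0.23, 0.40))  # co-integration
print(choose_approach(0.00, 0.40))  # not covered
```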

2.3 Independent Variables 

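| Variable | Raw | Diff QoQ | Diff YoY | Pct Diff QoQ | Pct Diff YoY |
| --- | --- | --- | --- | --- | --- |
| Lags | 0, 1 and 2 | 0, 1 and 2 | 0, 1 and 2 | 0, 1 and 2 | 0, 1 and 2 |
| GDP growth | Yes | No | No | No | No |
| Income growth | Yes | No | No | No | No |
| CPI growth | Yes | No | No | No | No |
| Unemp rate | Yes | Yes | Yes | No | No |
| 3mT rate | Yes | Yes | Yes | No | No |
| 5yT rate | Yes | Yes | Yes | No | No |
| 10yT rate | Yes | Yes | Yes | No | No |
| BBB rate | Yes | Yes | Yes | No | No |
| Prime rate | Yes | Yes | Yes | No | No |
| HPI | Yes | No | No | Yes | Yes |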
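
The table above can be turned into a candidate set mechanically. The following is a sketch on synthetic data, assuming quarterly series so that a year-over-year change spans 4 periods. The `dqoq` and `dyoy` suffixes follow the variable names used later in the article, while `pqoq`, `pyoy`, `TRANSFORMS` and `build_candidates` are names made up for this example.

```python
import numpy as np
import pandas as pd

# Transforms applied to each driver, following the table above.
TRANSFORMS = {
    "GDP growth": ["raw"],
    "Income growth": ["raw"],
    "CPI growth": ["raw"],
    "Unemp rate": ["raw", "dqoq", "dyoy"],
    "3mT rate": ["raw", "dqoq", "dyoy"],
    "5yT rate": ["raw", "dqoq", "dyoy"],
    "10yT rate": ["raw", "dqoq", "dyoy"],
    "BBB rate": ["raw", "dqoq", "dyoy"],
    "Prime rate": ["raw", "dqoq", "dyoy"],
    "HPI": ["raw", "pqoq", "pyoy"],
}


def transform(series, kind):
    """Apply one transform to a quarterly series."""
    if kind == "raw":
        return series
    periods = 1 if kind.endswith("qoq") else 4  # YoY = 4 quarters back
    if kind.startswith("d"):
        return series - series.shift(periods)  # simple difference
    return series / series.shift(periods) - 1.0  # percentage difference


def build_candidates(macro, lags=(0, 1, 2)):
    """Expand every driver into its transforms and lags."""
    cols = {}
    for name, kinds in TRANSFORMS.items():
        for kind in kinds:
            base = transform(macro[name], kind)
            for lag in lags:
                parts = [name]
                if kind != "raw":
                    parts.append(kind)
                if lag:
                    parts.append(str(lag))
                cols[" ".join(parts)] = base.shift(lag)
    return pd.DataFrame(cols)


# Synthetic quarterly data standing in for the real macro-economic series.
rng = np.random.default_rng(0)
index = pd.period_range("2005Q1", "2015Q4", freq="Q")
macro = pd.DataFrame(
    rng.normal(size=(len(index), len(TRANSFORMS))).cumsum(axis=0) + 100.0,
    index=index,
    columns=list(TRANSFORMS),
)
candidates = build_candidates(macro)
print(candidates.shape)
```

The expansion gives 3 x 3 + 6 x 9 + 1 x 9 = 72 columns, which matches the 72 independent variables screened in sections 3 and 4.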

2.4 Model Outputs 

  • Time Period 
    • Historical – 44 data points (from 2005Q1 to 2015Q4)
    • Forecasted – 13 data points (from 2016Q1 to 2019Q1)
  • Non-interest Income and Non-interest Expense are modeled 
    • Non-interest Expense is modeled using the stationary-series approach (section 3)
    • Non-interest Income is modeled using the non-stationary-series approach (section 4)
[Figure panels: Non-interest Expense (stationary approach) and Non-interest Income (non-stationary approach)]

2.5 Model Tests 

  • Stationarity of dependent and independent variables: 
    • ADF test is done 
    • If the p-value <= 0.10 then the series is stationary 
    • If the p-value > 0.10 then the series is non-stationary 
  • Multi co-linearity: 
    • Correlation matrix is used to test multi co-linearity 
    • If the correlation between variables is between -0.30 and 0.30 then there is low multi co-linearity 
    • If the correlation between variables is more than 0.70 or less than -0.70 then there is high multi co-linearity 
  • Significance: 
    • If the p-value <= 0.05 then the coefficient is statistically significant 
    • If the p-value > 0.05 then the coefficient is statistically insignificant
  • Auto correlation:
    • Durbin-Watson test is done
    • If the DW statistic is less than 1 then there is positive auto correlation
    • If the DW statistic is close to 2 then there is no auto correlation 
    • If the DW statistic is more than 3 then there is negative auto correlation 
  • Stationarity of residual: 
    • ADF test is done 
    • If the p-value <= 0.10 then the series is stationary 
    • If the p-value > 0.10 then the series is non-stationary 
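
The Durbin-Watson reading above can be made concrete with a small sketch. The statistic is the sum of squared successive residual differences divided by the sum of squared residuals. The helper names and the `tol` parameter (how near 2 counts as "close") are assumptions of this example, not part of the article.

```python
def durbin_watson(residuals):
    """DW = sum((e_t - e_{t-1})^2) / sum(e_t^2)."""
    num = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    den = sum(e ** 2 for e in residuals)
    return num / den


def read_dw(dw, tol=0.5):
    """Map a DW statistic to the three readings listed above.

    `tol` is an assumed band around 2; values outside every band are
    reported as inconclusive rather than forced into a category.
    """
    if dw < 1:
        return "positive autocorrelation"
    if dw > 3:
        return "negative autocorrelation"
    if abs(dw - 2) <= tol:
        return "no autocorrelation"
    return "inconclusive"


# Residuals that flip sign every period: strong negative autocorrelation.
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
print(read_dw(durbin_watson(alternating)))  # negative autocorrelation

# The three DW values reported later in this article.
print(read_dw(2.36), read_dw(0.85), read_dw(1.77), sep=" | ")
```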

3. Stationary Series 

3.1 Process

  • ADF test is done on the independent variables. Only stationary variables are kept (23 out of 72 variables are selected). 
  • The correlation between each independent variable and the dependent variable is computed. Only variables that have a high correlation with the dependent variable are kept (2 out of 23 variables are selected).
  • An OLS model is developed, and checks on multi co-linearity, variable significance and stationarity of the residuals are done (2 out of 2 variables are selected). 

3.2 Dependent Variables 

  • It is observed that the dependent variables (Non-Interest Income 1st Difference and Non-Interest Expense 1st Difference) are stationary 
    • Non-Interest Income 1st Diff = Non-Interest Income (t) – Non-Interest Income (t-1)
    • Non-Interest Expense 1st Diff = Non-Interest Expense (t) – Non-Interest Expense (t-1)
| Var | ADF | Pval |
| --- | --- | --- |
| NonInt Inc diff | -5.20 | 0.00 |
| NonInt Exp diff | -5.98 | 0.00 |

3.3 Independent Variables 

  • It is observed that out of 72 independent variables, 23 independent variables are stationary. 
    • If the p-value <= 0.10 then the series is stationary 
    • If the p-value > 0.10 then the series is non-stationary 
  • It is observed that no macro-economic variable has a high correlation with Non-Interest Income 1st Diff. However, a few macro-economic variables have a high correlation with Non-Interest Expense 1st Diff.
    • If correlation is more than 0.30 or less than -0.30 then it is marked as high
    • It is observed that out of 23 independent variables, 2 independent variables have high correlation with Non-Interest Expense 1st Diff.
| Var | NonInt Exp diff |
| --- | --- |
| CPI growth | 0.31 |
| GDP growth 2 | 0.43 |
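
The correlation screen can be sketched as a simple filter on the absolute Pearson correlation. The function name `screen_by_correlation` and the toy series are hypothetical; only the 0.30 cut-off comes from the text above.

```python
import numpy as np


def screen_by_correlation(target, candidates, cutoff=0.30):
    """Keep candidates whose |Pearson correlation| with target exceeds cutoff."""
    kept = {}
    for name, values in candidates.items():
        corr = float(np.corrcoef(target, values)[0, 1])
        if abs(corr) > cutoff:
            kept[name] = round(corr, 2)
    return kept


# Toy data: two series that track the target and one that does not.
target = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
toy_candidates = {
    "moves_with_target": np.array([2.0, 4.1, 5.9, 8.2, 9.8, 12.1]),
    "moves_against_target": np.array([12.0, 10.0, 8.0, 6.0, 4.0, 2.0]),
    "unrelated": np.array([1.0, -1.0, -1.0, 1.0, 1.0, -1.0]),
}
kept = screen_by_correlation(target, toy_candidates)
print(kept)
```

Both strongly positive and strongly negative correlations pass the screen, since the rule is stated on both sides of zero.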

3.4 Model Development 

  • It is observed that the model has low R-Sq and Adj R-Sq. 
No. Obs: 43.00 | R-squared: 0.29
Df Model: 2.00 | Adj. R-squared: 0.26
  • There are 2 variables in the model. 
    • CPI growth and GDP growth (lag 2) 
    • The p-value for both the variables is less than 0.05
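| Var | coef | std err | t | P>\|t\| |
| --- | --- | --- | --- | --- |
| const | -377,300.00 | 134,000.00 | -2.82 | 0.01 |
| CPI growth | 86,320.00 | 35,500.00 | 2.43 | 0.02 |
| GDP growth 2 | 122,800.00 | 36,600.00 | 3.36 | 0.00 |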
  • It is observed that there is very low multi co-linearity in the model 
    • Correlation between the variables is between -0.30 and 0.30
| Var | CPI growth | GDP growth 2 |
| --- | --- | --- |
| CPI growth | | -0.04 |
| GDP growth 2 | -0.04 | |
  • It is observed that there is no auto-correlation in the model and the residual is stationary
    • The DW test statistic is close to 2 
    • The p-value of the ADF test is less than 0.10 
Durbin-Watson: 2.36

| Var | ADF | Pval |
| --- | --- | --- |
| RESI | -8.30 | 0.00 |

3.5 Projection 

  • The projection is done for 13 Quarters  
    • If t = 1: Predicted Non-Interest Expense (t) = Actual Non-Interest Expense (t)
    • If t > 1: Predicted Non-Interest Expense (t) = Predicted Non-Interest Expense (t-1) + Predicted Non-Interest Expense 1st Diff (t)
    • The severely adverse projection is done for the forecasted period
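
The recursion above is a running sum of predicted differences, anchored on the first actual value. A minimal sketch follows, with a hypothetical function name and made-up numbers.

```python
from itertools import accumulate


def rebuild_levels(anchor_actual, predicted_diffs):
    """Turn predicted first differences back into levels.

    The first element is the anchor (t = 1, the actual value); each later
    level is the previous predicted level plus the predicted difference.
    """
    return list(accumulate(predicted_diffs, initial=anchor_actual))


# Made-up anchor and predicted differences for t = 2, 3, 4.
levels = rebuild_levels(100.0, [5.0, -2.0, 3.0])
print(levels)  # [100.0, 105.0, 103.0, 106.0]
```

Because each level builds on the previous prediction, any error in a predicted difference carries forward into every later quarter.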

4. Non-stationary Series 

4.1 Process

  • ADF test is done on the independent variables. Only non-stationary variables are kept (49 out of 72 variables are selected). 
  • Co-integration between each independent variable and the dependent variable is tested. Only variables that are co-integrated with the dependent variable are kept (6 out of 49 variables are selected).
  • An OLS model is developed, and checks on multi co-linearity, variable significance and stationarity of the residuals are done (1 out of 6 variables is selected). 

4.2 Dependent Variables 

  • It is observed that the dependent variables are non-stationary 
| Var | ADF | Pval |
| --- | --- | --- |
| NonInt Inc | -2.14 | 0.23 |
| NonInt Exp | -1.49 | 0.54 |

4.3 Independent Variables 

  • It is observed that out of 72 independent variables, 49 independent variables are non-stationary. 
    • If the p-value <= 0.10 then the series is stationary 
    • If the p-value > 0.10 then the series is non-stationary 
  • It is observed that no macro-economic variable is co-integrated with Non-Interest Expense. However, a few macro-economic variables are co-integrated with Non-Interest Income.
    • If the p-value <= 0.10 then the series is co-integrated 
    • If the p-value > 0.10 then the series is not co-integrated 
| Var | Coint_Inc | Pval_Inc |
| --- | --- | --- |
| 3mT rate dyoy | -3.37 | 0.05 |
| 3mT rate dyoy 1 | -3.24 | 0.06 |
| 5yT rate dyoy | -3.21 | 0.07 |
| 5yT rate dyoy 1 | -3.38 | 0.04 |
| Prime rate dqoq 2 | -3.39 | 0.04 |
| Prime rate dyoy | -3.31 | 0.05 |

4.4 Model Development 

  • It is observed that the model has high R-Sq and Adj R-Sq. 
No. Obs: 44.00 | R-squared: 0.66
Df Model: 1.00 | Adj. R-squared: 0.65
  • There is 1 variable in the model. 
    • 3mT rate (difference YoY) 
    • The p-value for the variable is less than 0.05
| Var | coef | std err | t | P>\|t\| |
| --- | --- | --- | --- | --- |
| const | 7,656,000.00 | 249,000.00 | 30.80 | 0.00 |
| 3mT rate dyoy | 1,786,000.00 | 199,000.00 | 8.99 | 0.00 |
  • It is observed that there is positive auto-correlation in the model and the residual is stationary
    • The DW test statistic is less than 1 
    • The p-value of the ADF test is less than 0.10 
Durbin-Watson: 0.85

| Var | ADF | Pval |
| --- | --- | --- |
| RESI | -3.33 | 0.01 |
  • Since there is positive auto-correlation in the model, ARIMAX model is developed 
    • The ACF and PACF plots are generated for the OLS residual 
    • Based on the ACF and PACF plot, AR(1) model is developed 
    • Reference: Time Series Modeling and Forecasting—An Application to Bank’s Stress Testing, SAS Global Forum 2015, Paper 3338-2015
  • ARIMAX model specifications 
    • P, D, Q = 1, 0, 0 
    • X = 3mT rate dyoy
    • When an AR(2) term was introduced in the model, it was found to be insignificant; hence higher AR lags are not included in the model
No. Obs: 44.00 | AIC: 1,380.03
Sample: 0.00 | BIC: 1,375.54
  • There are 2 variables in the model. 
    • AR(1) term and 3mT rate (difference YoY) 
    • The p-value for both the variables is less than 0.05
    • The sigma2 in the coefficients table is the estimate of the variance of the error term. 
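| Var | coef | std err | t | P>\|t\| |
| --- | --- | --- | --- | --- |
| const | 7,656,000.00 | 563,000.00 | 13.59 | 0.00 |
| 3mT rate dyoy | 1,786,000.00 | 264,000.00 | 6.76 | 0.00 |
| ar.L1 | 0.56 | 0.13 | 4.24 | 0.00 |
| sigma2 | 1.75E+12 | 0.17 | 1.05E+13 | 0.00 |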
  • It is observed that there is no auto-correlation in the model and the residual is stationary
    • The DW test statistic is close to 2 
    • The p-value of the ADF test is less than 0.10 
Durbin-Watson: 1.77

| Var | ADF | Pval |
| --- | --- | --- |
| RESI | -5.74 | 0.00 |

4.5 Projection 

  • The projection is done for 13 Quarters 
    • The dip in 2008-2009 is captured well by the model 
    • The severely adverse projection is done for the forecasted period 
  • Graph
    • Predicted (Blue line) – OLS model 
    • Forecasted (Red line) – ARIMAX model 

Rohit Garg

Rohit Garg has close to 7 years of work experience in the field of data analytics and machine learning. He has worked extensively in the areas of predictive modeling, time series analysis and segmentation techniques. Rohit holds a BE from BITS Pilani and a PGDM from IIM Raipur.