# PPNR Modeling – OLS, Co-Integration and ARIMAX

• Business problem: To forecast the different components of PPNR. These components include Non-interest Income and Non-interest Expense.
• Proposed solution
• OLS
• Advantages – easy to develop / test and easy to explain
• Disadvantages – difficult to find a strong correlation between dependent and independent variables
• Co-Integration
• Advantages – easy to find strong correlations between dependent and independent variables
• Disadvantages – difficult to pass all the tests / assumptions of co-integration
• ARIMAX
• Advantages – very powerful modeling technique to overcome the shortcomings of OLS and co-integration models
• Disadvantages – complex to develop because there are two stages: in stage 1 an OLS model is developed, and in stage 2 an ARIMAX model is developed after the AR and MA terms are identified

## 2. Introduction

2.1 PPNR

• Pre-provision net revenue (PPNR), under the Federal Reserve’s Comprehensive Capital Analysis and Review (CCAR), measures net revenue forecast from asset-liability spreads and non-trading fees of banks.
• Pre-provision Net Revenue (PPNR) = Net Interest Income + Non-interest Income – Non-interest Expense
• Interest Income: Loans and Securities
• Interest Expense: Deposits and Bonds
• Non-Interest Income: Credit Related Fees and Non-Credit Related
• Non-Interest Expense: Employee Compensation, Processing / Software, Occupancy, Credit / Collections and Residential Mortgage Repurchase
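
The PPNR identity above can be checked with a small worked example. This is a minimal sketch: the function name `ppnr` and all figures are hypothetical, and net interest income is taken as interest income minus interest expense.

```python
def ppnr(interest_income, interest_expense, non_interest_income, non_interest_expense):
    """Pre-provision net revenue built from its four components."""
    net_interest_income = interest_income - interest_expense
    return net_interest_income + non_interest_income - non_interest_expense


# Hypothetical quarterly figures in USD millions (illustrative only).
example = ppnr(
    interest_income=900.0,
    interest_expense=250.0,
    non_interest_income=400.0,
    non_interest_expense=700.0,
)
print(example)  # 350.0
```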

2.2 Modeling Approaches

• If the dependent and independent variables are stationary
• An ADF test is run on the independent variables. Only the stationary variables are kept.
• The correlation between each independent variable and the dependent variable is computed. Only the variables that are highly correlated with the dependent variable are kept.
• An OLS model is developed.
• If the dependent and independent variables are non-stationary
• An ADF test is run on the independent variables. Only the non-stationary variables are kept.
• Each independent variable is tested for co-integration with the dependent variable. Only the variables that are co-integrated with the dependent variable are kept.
• The correlation between each independent variable and the dependent variable is computed. Only the variables that are highly correlated with the dependent variable are kept.
• An OLS model is developed.
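
The first screening step in both branches is a split of the candidate variables by ADF p-value. A minimal sketch of that split follows, assuming the p-values have already been computed (for example with `adfuller` from `statsmodels.tsa.stattools`, which returns the p-value as its second element). The function name, variable names and p-values are hypothetical.

```python
def split_by_stationarity(adf_p_values, threshold=0.10):
    """Partition candidate variables using the ADF p-value rule (p <= threshold means stationary)."""
    stationary = [name for name, p in adf_p_values.items() if p <= threshold]
    non_stationary = [name for name, p in adf_p_values.items() if p > threshold]
    return stationary, non_stationary


# Hypothetical ADF p-values for four candidate macro-economic variables.
p_values = {
    "gdp_growth": 0.01,
    "cpi_growth": 0.04,
    "treasury_3m": 0.62,
    "unemployment": 0.35,
}

stationary, non_stationary = split_by_stationarity(p_values)
print(stationary)      # ['gdp_growth', 'cpi_growth']
print(non_stationary)  # ['treasury_3m', 'unemployment']
```

The stationary list would feed the correlation screen and OLS model, while the non-stationary list would go on to the co-integration test.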

2.3 Independent Variables

2.4 Model Outputs

• Time Period
• Historical – 44 data points (from 2005Q1 to 2015Q4)
• Forecasted – 13 data points (from 2016Q1 to 2019Q1)
• Non-interest Income and Non-interest Expense are modeled
• Non-interest Expense is modeled using the approach for stationary series
• Non-interest Income is modeled using the approach for non-stationary series

2.5 Model Tests

• Stationarity of dependent and independent variables:
• If the p-value <= 0.10 then the series is stationary
• If the p-value > 0.10 then the series is non-stationary
• Multicollinearity:
• A correlation matrix is used to test for multicollinearity
• If the correlation between variables is between -0.30 and 0.30 then there is low multicollinearity
• If the correlation between variables is more than 0.70 or less than -0.70 then there is high multicollinearity
• Significance:
• If the p-value <= 0.05 then the coefficient is statistically significant
• If the p-value > 0.05 then the coefficient is statistically insignificant
• Auto-correlation:
• The Durbin-Watson (DW) test is used
• If the DW statistic is less than 1 then there is positive auto-correlation
• If the DW statistic is close to 2 then there is no auto-correlation
• If the DW statistic is more than 3 then there is negative auto-correlation
• Stationarity of residual:
• If the p-value <= 0.10 then the series is stationary
• If the p-value > 0.10 then the series is non-stationary
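
The Durbin-Watson rule above can be sketched in a few lines. The statistic is the standard ratio of squared successive residual differences to squared residuals. The label returned for values between 1 and 3 is an assumption, since the rules above only name the two extremes and "close to 2". The residual values are hypothetical.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


def classify_dw(dw):
    """Apply the cut-offs of 1 and 3; the middle-band label is an assumption."""
    if dw < 1.0:
        return "positive auto-correlation"
    if dw > 3.0:
        return "negative auto-correlation"
    return "no strong auto-correlation"


# Hypothetical residuals: a smooth drift and a strict alternation.
drifting = [1.0, 2.0, 3.0, 4.0, 5.0]
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

print(classify_dw(durbin_watson(drifting)))     # positive auto-correlation
print(classify_dw(durbin_watson(alternating)))  # negative auto-correlation
```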

## 3. Stationary Series

3.1 Process

• An ADF test is run on the independent variables. Only stationary variables are kept (23 out of 72 variables are selected).
• The correlation between each independent variable and the dependent variable is computed. Only the variables that have a high correlation with the dependent variable are kept (2 out of 23 variables are selected).
• An OLS model is developed, and checks on multicollinearity, significance of the variables and stationarity of the residuals are done (2 out of 2 variables are selected).

3.2 Dependent Variables

• It is observed that the dependent variables (Non-Interest Income 1st Difference and Non-Interest Expense 1st Difference) are stationary
• Non-Interest Income 1st Diff = Non-Interest Income (t) – Non-Interest Income (t-1)
• Non-Interest Expense 1st Diff = Non-Interest Expense (t) – Non-Interest Expense (t-1)
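
The first-difference transformation defined above is a one-liner in code. This is a minimal sketch with hypothetical quarterly levels; note that differencing shortens the series by one observation.

```python
def first_difference(series):
    """Return series[t] - series[t-1] for t = 1 .. len(series) - 1."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]


# Hypothetical quarterly Non-Interest Expense levels (illustrative only).
levels = [700.0, 705.0, 702.0, 704.0]
diffs = first_difference(levels)
print(diffs)  # [5.0, -3.0, 2.0]
```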

3.3 Independent Variables

• It is observed that out of 72 independent variables, 23 independent variables are stationary.
• If the p-value <= 0.10 then the series is stationary
• If the p-value > 0.10 then the series is non-stationary
• It is observed that no macro-economic variable has a high correlation with Non-Interest Income 1st Diff. However, a few macro-economic variables have a high correlation with Non-Interest Expense 1st Diff.
• If correlation is more than 0.30 or less than -0.30 then it is marked as high
• It is observed that out of 23 independent variables, 2 independent variables have high correlation with Non-Interest Expense 1st Diff.
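
The correlation screen described above can be sketched as follows, assuming a plain Pearson correlation (the article does not name the correlation measure). The helper names, the candidate names and all data values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def screen_by_correlation(target, candidates, threshold=0.30):
    """Keep candidates whose absolute correlation with the target exceeds the threshold."""
    kept = {}
    for name, series in candidates.items():
        r = pearson(series, target)
        if abs(r) > threshold:
            kept[name] = r
    return kept


# Hypothetical differenced target and three candidate drivers.
target = [1.0, 2.0, 3.0, 4.0, 5.0]
candidates = {
    "driver_a": [2.0, 4.1, 5.9, 8.2, 10.0],  # moves with the target
    "driver_b": [1.0, -1.0, 0.0, -1.0, 1.0],  # unrelated to the target
    "driver_c": [5.0, 4.0, 3.0, 2.0, 1.0],    # moves against the target
}

selected = screen_by_correlation(target, candidates)
print(sorted(selected))  # ['driver_a', 'driver_c']
```

Because the rule uses the absolute value, strongly negative relationships such as `driver_c` pass the screen as well.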

3.4 Model Development

• It is observed that the model has low R-Sq and Adj R-Sq.
• There are 2 variables in the model.
• CPI growth and GDP growth (lag 2)
• The p-value for both the variables is less than 0.05
• It is observed that there is very low multicollinearity in the model
• Correlation between the variables is between -0.30 and 0.30
• It is observed that there is no auto-correlation in the model and the residual is stationary
• The DW test statistic is close to 2
• The p-value of the ADF test is less than 0.10
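
Written out, the two-variable specification described above has the following form. The coefficient symbols are placeholders, since the estimates are not reported here, and the presence of an intercept is an assumption. NIE stands for Non-Interest Expense.

```latex
\Delta \mathrm{NIE}_t
  = \beta_0
  + \beta_1 \,\mathrm{CPI\ growth}_t
  + \beta_2 \,\mathrm{GDP\ growth}_{t-2}
  + \varepsilon_t,
\qquad
\Delta \mathrm{NIE}_t = \mathrm{NIE}_t - \mathrm{NIE}_{t-1}
```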

3.5 Projection

• The projection is done for 13 Quarters
• If t = 1: Predicted Non-Interest Expense (t) = Actual Non-Interest Expense (t)
• If t > 1: Predicted Non-Interest Expense (t) = Predicted Non-Interest Expense (t-1) + Predicted Non-Interest Expense 1st Diff (t)
• The projection for the forecast period is done under the severely adverse scenario
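
The recursion above turns predicted first differences back into levels. A minimal sketch, with a hypothetical function name and hypothetical numbers:

```python
def project_levels(first_actual, predicted_diffs):
    """Rebuild level forecasts from predicted first differences."""
    levels = [first_actual]       # t = 1: start from the actual value
    for diff in predicted_diffs:  # t > 1: previous predicted level plus predicted change
        levels.append(levels[-1] + diff)
    return levels


# Hypothetical starting level and predicted quarterly changes.
projection = project_levels(700.0, [5.0, -3.0, 2.0])
print(projection)  # [700.0, 705.0, 702.0, 704.0]
```

Since each level builds on the previous predicted level, errors in the predicted differences accumulate over the projection horizon.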

## 4. Non-stationary Series

4.1 Process

• An ADF test is run on the independent variables. Only non-stationary variables are kept (49 out of 72 variables are selected).
• Each independent variable is tested for co-integration with the dependent variable. Only the variables that are co-integrated with the dependent variable are kept (6 out of 49 variables are selected).
• An OLS model is developed, and checks on multicollinearity, significance of the variables and stationarity of the residuals are done (1 out of 6 variables is selected).

4.2 Dependent Variables

• It is observed that the dependent variables are non-stationary

4.3 Independent Variables

• It is observed that out of 72 independent variables, 49 independent variables are non-stationary.
• If the p-value <= 0.10 then the series is stationary
• If the p-value > 0.10 then the series is non-stationary
• It is observed that no macro-economic variable is co-integrated with Non-Interest Expense. However, a few macro-economic variables are co-integrated with Non-Interest Income.
• If the p-value <= 0.10 then the series is co-integrated
• If the p-value > 0.10 then the series is not co-integrated

4.4 Model Development

• It is observed that the model has high R-Sq and Adj R-Sq.
• There is 1 variable in the model.
• 3-month Treasury (3mT) rate (difference YoY)
• The p-value for the variable is less than 0.05
• It is observed that there is positive auto-correlation in the model and the residual is stationary
• The DW test statistic is less than 1
• The p-value of the ADF test is less than 0.10
• Since there is positive auto-correlation in the model, an ARIMAX model is developed
• The ACF and PACF plots are generated for the OLS residual
• Based on the ACF and PACF plots, an AR(1) model is developed
• Reference: Time Series Modeling and Forecasting—An Application to Bank’s Stress Testing, SAS Global Forum 2015, Paper 3338-2015
• ARIMAX model specifications
• (p, d, q) = (1, 0, 0)
• X = 3mT rate dyoy
• When an AR(2) term was introduced in the model, it was found to be insignificant; hence higher-order AR lags are not included in the model
• There are 2 variables in the model.
• AR(1) term and 3mT rate (difference YoY)
• The p-value for both the variables is less than 0.05
• The sigma2 in the coefficients table is the estimate of the variance of the error term.
• It is observed that there is no auto-correlation in the model and the residual is stationary
• The DW test statistic is close to 2
• The p-value of the ADF test is less than 0.10

4.5 Projection

• The projection is done for 13 Quarters
• The dip in 2008-2009 is captured well by the model
• The projection for the forecast period is done under the severely adverse scenario
• Graph
• Predicted (Blue line) – OLS model
• Forecasted (Red line) – ARIMAX model