
Deep Dive Into Scorecard Development for Banking Industry

A scorecard is a risk scoring tool used to evaluate the level of risk associated with applicants.

The credit risk team of a Canadian private label credit card (PLCC) issuer wanted a scorecard to estimate the risk associated with new applicants.

  • About the issuer: Businesses that run private label credit card programs partner with an issuer to manage the program. The issuer (a financial institution) performs several functions, including the issuance of cards, the funding of credit, and the collection of payments from customers.
  • About the product: A private label credit card is a store-branded credit card that is intended for use at a specific store. The private label credit program allows retailers to offer more lenient and extended terms to customers than they could otherwise. Many stores offer private label credit cards to their customers to encourage them to spend more by offering the convenience of a credit card and deferred payment.


The data scientists developed the approval scorecard. 


  • A scorecard consists of a group of characteristics that are statistically determined to be predictive in separating good applicants from bad ones.


Scorecard Format: A scorecard should be easy to interpret and explain. The development process should be transparent, and it should be easy to diagnose and monitor. 

Data Review: The performance window is the time frame over which the performance of the accounts is monitored. The sample window is the time frame from which known good and bad cases are selected for the development sample.

Selection of Characteristics: Characteristics are selected based on their predictive power, reliability and robustness, ease of collection and future availability, interpretability and business reasoning, and legal aspects.

  • Segmentation: Segments are defined from experience and business knowledge, or by using statistical techniques. They are generally based on demographics, product type, sources of business, and applicant type. Segments are selected based on cost of development and implementation, ease of processing and development, and monitoring strategies.
  • Grouping: Missing values are grouped separately, each bin holds at least 5% of observations, no group is created where the count of good = 0 or the count of bad = 0, and the weight of evidence (WOE) is sufficiently different from one group to the next.

Strength of Characteristics: 

  • Correlation: Principal component analysis is done to identify the groups of characteristics that are highly correlated. From each group one or more characteristics are selected. 
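  • Information Value: Information value (IV) measures the predictive power of a characteristic. It is computed from the weight of evidence (WOE) of each group, where "Dist of Good" and "Dist of Bad" are the group's shares of all good and all bad cases.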

WOE = LN(Dist of Good / Dist of Bad)

IV = Σ (Dist of Good – Dist of Bad) x WOE
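The two formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the issuer's implementation: the function name `woe_iv`, the bin labels, and the counts are all hypothetical, and the zero-count check mirrors the grouping rule that no group may have zero goods or zero bads.

```python
import math

def woe_iv(bins):
    """Compute WOE per group and the total IV for one characteristic.

    bins: list of (label, good_count, bad_count) tuples, one per group.
    """
    total_good = sum(good for _, good, _ in bins)
    total_bad = sum(bad for _, _, bad in bins)
    rows = []
    iv = 0.0
    for label, good, bad in bins:
        if good == 0 or bad == 0:
            # WOE is undefined here; such a group should be merged with a neighbour.
            raise ValueError(f"group {label!r} has zero goods or zero bads")
        dist_good = good / total_good   # share of all goods falling in this group
        dist_bad = bad / total_bad      # share of all bads falling in this group
        woe = math.log(dist_good / dist_bad)
        contribution = (dist_good - dist_bad) * woe
        iv += contribution
        rows.append((label, woe, contribution))
    return rows, iv

# Hypothetical counts, for illustration only.
example_bins = [
    ("low", 100, 50),
    ("mid", 400, 100),
    ("high", 500, 50),
]
woe_rows, iv_total = woe_iv(example_bins)
for label, woe, contribution in woe_rows:
    print(f"{label:>5}  WOE={woe:+.4f}  IV part={contribution:.4f}")
print(f"IV = {iv_total:.4f}")
```

Each term of the IV sum is non-negative, because the sign of (Dist of Good – Dist of Bad) always matches the sign of the WOE for that group.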

Score Card 

A credit score indicates the creditworthiness of a borrower. The score is calculated by summing the points assigned to each dummy variable (attribute) that applies to the borrower.

A cut-off is used to decide whether or not to approve a loan application:

  • It impacts the quality of loans that the bank grants 
  • Pre-determines the total number of borrowers that will be approved or rejected
  • Higher cut off means less business but overall greater quality of approved loans.
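The trade-off described above can be sketched as follows. Everything here is hypothetical and for illustration only: the points table, the attribute names, the scored sample, and the function names are assumptions, not values from the issuer's scorecard.

```python
# Hypothetical points table: characteristic -> attribute -> points.
POINTS = {
    "age": {"under_25": 10, "25_to_40": 25, "over_40": 40},
    "residence": {"rent": 15, "own": 35},
}

def score_applicant(applicant, points=POINTS):
    """Sum the points of the attribute that applies for each characteristic."""
    return sum(points[name][applicant[name]] for name in points)

def approval_and_bad_rate(scored, cutoff):
    """Approval rate and bad rate among approved accounts at a given cut-off.

    scored: list of (score, is_bad) pairs with known outcomes (is_bad is 0 or 1).
    """
    approved = [is_bad for score, is_bad in scored if score >= cutoff]
    approval_rate = len(approved) / len(scored)
    bad_rate = sum(approved) / len(approved) if approved else 0.0
    return approval_rate, bad_rate

applicant_score = score_applicant({"age": "25_to_40", "residence": "own"})

# Hypothetical scored sample with known outcomes.
sample = [(450, 1), (480, 1), (500, 0), (520, 1),
          (560, 0), (600, 0), (640, 0), (700, 0)]
for cutoff in (500, 600):
    approval, bad = approval_and_bad_rate(sample, cutoff)
    print(f"cut-off {cutoff}: approval rate {approval:.1%}, bad rate {bad:.1%}")
```

In this made-up sample, raising the cut-off from 500 to 600 approves fewer applicants but lowers the bad rate among those approved.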

Formulas to calculate score: 

  • Intercept Score = IS* + Min Score
    • Max Score = 850 
    • Min Score = 300 
    • Max sum of coeff = Sum of max coefficient for each variable (including intercept) 
    • Min sum of coeff = Sum of min coefficient for each variable (including intercept) 
    • C = coefficient and IC = intercept 

Model Summary 

Model: The model should make business sense and should align with business experience. The correlation between characteristics should be minimal, the characteristics should be easy to interpret, the model should be easy to implement, and the methodology should be transparent.

Scaling Calculations:

  • Score = Offset + Factor x LN(odds)
  • Score + PDO (points to double odds) = Offset + Factor x LN(2 x odds)
  • Subtracting the first equation from the second gives PDO = Factor x LN(2), so Factor = PDO / LN(2)
  • Offset = Score – (Factor x LN(odds))

  • The cut-off is selected either to maintain the current approval rate or to maintain the current bad rate.

Scaling Example:

  • Given: 

Odds of 50:1 at 613 points

PDO is 20 

  • Calculations: 

Factor = 20 / LN(2) = 28.85

Offset = 613 – (28.85 x LN(50)) ≈ 500

Score = 500 + 28.85 x LN(odds) 
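The scaling example can be reproduced with a short Python sketch. The function names `scaling_params` and `score_from_odds` are illustrative assumptions; the inputs (PDO of 20, odds of 50:1 at 613 points) come from the example above.

```python
import math

def scaling_params(pdo, ref_score, ref_odds):
    """Derive Factor and Offset from the PDO and one reference (score, odds) point."""
    factor = pdo / math.log(2)
    offset = ref_score - factor * math.log(ref_odds)
    return factor, offset

def score_from_odds(odds, factor, offset):
    """Score = Offset + Factor x LN(odds)."""
    return offset + factor * math.log(odds)

factor, offset = scaling_params(pdo=20, ref_score=613, ref_odds=50)
print(f"Factor = {factor:.2f}, Offset = {offset:.2f}")

# Doubling the odds from 50:1 to 100:1 should add exactly PDO = 20 points.
print(round(score_from_odds(50, factor, offset), 2))
print(round(score_from_odds(100, factor, offset), 2))
```

Without intermediate rounding the offset comes out slightly above 500, which the example rounds to 500.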

Scorecard Strength:

  • Confusion Matrix: It is an N x N matrix, where N is the number of classes being predicted. For dichotomous output N = 2. Accuracy is the proportion of the total number of predictions that were correct. Precision is the proportion of predicted positive cases that were correctly identified. Recall or Sensitivity is the proportion of actual positive cases which are correctly identified. Specificity is the proportion of actual negative cases which are correctly identified.
  • Gain and Lift Chart: Gain and lift charts are mainly used to check the rank ordering of the predicted probabilities. Gain is the percentage of targets (events) covered at a given decile level. Lift is the ratio of the gain percentage to the random expectation percentage at a given decile level.
  • Kolmogorov-Smirnov (KS): The KS chart measures the performance of classification models. The KS statistic gives the separation power of the model. It is calculated as the maximum absolute difference between the cumulative distribution of non-events and the cumulative distribution of events. A good model will have a KS > 30 (on a 0 to 100 scale). A very high KS may indicate that the model is overfitted.
  • Area Under the Receiver Operating Characteristic (AUROC): The ROC curve is a fundamental tool for diagnostic test evaluation. It plots sensitivity against 1 – specificity, both of which come from the confusion matrix at each threshold. An ideal model will have an AUROC very close to 1. Lift depends on the total response rate of the population, whereas the ROC curve is almost independent of it.
  • Gini: The Gini coefficient is the ratio of the area between the ROC curve and the diagonal line to the area of the triangle above the diagonal. Equivalently, Gini = 2 x AUROC – 1. A Gini above 60% indicates a good model.
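The measures in this list can be sketched in plain Python. This is a minimal illustration under stated assumptions: the function names and the toy labels and scores are hypothetical, label 1 denotes the event (positive class), and scores are predicted event probabilities. KS is returned on a 0 to 1 scale, so multiply by 100 to compare with the KS > 30 rule of thumb.

```python
def confusion_metrics(y_true, y_pred):
    """Accuracy, precision, recall (sensitivity) and specificity for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }

def ks_statistic(y_true, scores):
    """Maximum gap between cumulative event and non-event shares, by descending score."""
    n_event = sum(y_true)
    n_non_event = len(y_true) - n_event
    pairs = sorted(zip(scores, y_true), reverse=True)
    cum_event = cum_non_event = 0
    ks = 0.0
    i = 0
    while i < len(pairs):
        j = i
        # Process tied scores together so the gap is only measured between distinct scores.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            cum_event += pairs[j][1]
            cum_non_event += 1 - pairs[j][1]
            j += 1
        ks = max(ks, abs(cum_event / n_event - cum_non_event / n_non_event))
        i = j
    return ks

def auroc(y_true, scores):
    """Share of (event, non-event) pairs where the event scores higher; ties count half."""
    events = [s for s, t in zip(scores, y_true) if t == 1]
    non_events = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum(1.0 if e > n else 0.5 if e == n else 0.0
               for e in events for n in non_events)
    return wins / (len(events) * len(non_events))

def gini(y_true, scores):
    return 2 * auroc(y_true, scores) - 1

# Toy data, for illustration only.
y_true = [1, 1, 0, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = confusion_metrics(y_true, y_pred)
print(metrics)
print("KS:", ks_statistic(y_true, scores))
print("AUROC:", auroc(y_true, scores), "Gini:", gini(y_true, scores))
```

The pairwise AUROC used here is quadratic in the sample size, which is fine for a small illustration; a rank-based computation would be the usual choice for a full development sample.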


Rohit Garg
Rohit Garg has close to 7 years of work experience in field of data analytics and machine learning. He has worked extensively in the areas of predictive modeling, time series analysis and segmentation techniques. Rohit holds BE from BITS Pilani and PGDM from IIM Raipur.
