The concept of decision support is overhyped

Published on March 25, 2014

by Feroz D Silva

Business people often expect that the analytics system will make the decision and that they just need to act on it. The analytics team, for its part, creates models that generate a score indicating the likelihood of a certain event, and expects the business team to accept this output as the basis for some preferential action. Business teams push back on this. Eventually, under compulsion, the analytics team provides a flag for the predicted event for each entity.

Let me explain this with an example. A churn prediction model provides a score for the likelihood of a customer terminating their account. But the business team often pushes back on accepting this output, stating that they want a list of customers who will churn so they can initiate proactive retention activities for them. The analytics team then takes a call on the score cutoff that defines a predicted churner and thereby generates the list required by the business. So when the list is not accurate, the business team is quick to blame the analytical process as a failure.

This is where the flaw lies. The responsibility of the analytics activity is to generate the most appropriate score and rank the customers on the likelihood of churn. The selection of the cutoff score is a decision for the business team. And there is a quantitative method for arriving at this cutoff. It is known by various terms: cost matrix, mis-classification matrix or Type I / II error matrix.

Consider a churn scoring of 10 customers. The following table gives the scores and the actual event of whether the customer eventually churned or not.

Identifier	Churn Score	Actual Churn (1 = churned)
Customer01	100	1
Customer02	90	1
Customer03	80	0
Customer04	70	1
Customer05	60	1
Customer06	50	0
Customer07	40	0
Customer08	30	1
Customer09	20	0
Customer10	10	0

The mis-classification matrix compares the predicted decision with the actual outcome. But this is dependent on the cutoff score. Let us say this cutoff was decided as 50. That is, any customer with a score higher than 50 is treated as a predicted churner. In the above list, customers 01 through 05 will be labelled as predicted churners. Based on this, the mis-classification matrix, worked out from the table above, will be as follows:

Prediction	Actually churned	Did not churn
Predicted churn	4	1
Predicted no churn	1	4

The cells where the prediction and the actual outcome disagree show the error in judgement, or the mis-classification. In this case, where the cutoff is 50, the mis-classification is 2 out of 10 cases (Customer03 and Customer08), so 20%. Now if we take the cutoff score as 60, the matrix would look as follows:

Prediction	Actually churned	Did not churn
Predicted churn	3	1
Predicted no churn	2	4

In this case, the mis-classification is 3 out of 10 cases (Customer03, Customer05 and Customer08), so 30%. It is evident that the selection of the cutoff score is critical to evaluating the accuracy of a model. This is the decision that the business team has to make. The cost matrix is the model available to assist the business team in this decision.
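The counting behind the two cutoffs can be reproduced with a short script. This is only a minimal sketch: the scores and outcomes are copied from the table above, while the function names `confusion_counts` and `misclassification_rate` are illustrative and not part of any library. A customer is flagged when the score is strictly higher than the cutoff, as in the text.

```python
# Scores and actual outcomes from the table above (1 = churned).
scores = [100, 90, 80, 70, 60, 50, 40, 30, 20, 10]
actual = [1, 1, 0, 1, 1, 0, 0, 1, 0, 0]


def confusion_counts(scores, actual, cutoff):
    """Count the four outcomes when score > cutoff is flagged as churn."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for score, churned in zip(scores, actual):
        predicted = score > cutoff
        if predicted and churned:
            counts["tp"] += 1      # flagged, and did churn
        elif predicted:
            counts["fp"] += 1      # flagged, but did not churn
        elif churned:
            counts["fn"] += 1      # not flagged, but did churn
        else:
            counts["tn"] += 1      # not flagged, and did not churn
    return counts


def misclassification_rate(counts):
    """Share of customers where prediction and outcome disagree."""
    return (counts["fp"] + counts["fn"]) / sum(counts.values())


for cutoff in (50, 60):
    counts = confusion_counts(scores, actual, cutoff)
    print(cutoff, counts, f"{misclassification_rate(counts):.0%}")
```

Running it prints a 20% mis-classification rate for the cutoff of 50 and 30% for the cutoff of 60, matching the figures in the text.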

The cost matrix calculates the cost of the activities implemented as a result of the analytical model. In the mis-classification matrix, there are two types of misclassification. One is where the analysis predicted that the customer will churn and the customer did not churn. The other is where the analysis predicted that the customer will not churn and the customer did churn. Taking churn as the positive event, the first type is a false positive, conventionally known as a Type I error, and the second is a false negative, known as a Type II error. The second type, the missed churner, is the one critical to business. It should be noted that both types of error will generally exist and that they limit each other: as the cutoff moves, reducing one tends to increase the other. The reader can try this using different values of the cutoff score.

The cost matrix puts a cost on each of the matrix cells. The cost is the loss to the business from each mis-classification. When the customer is tagged as a churner, the business initiates retention activities. Let's say this activity costs Rs. 200 per customer. A churned customer results in a loss of Rs. 500 of potential revenue.

So now we have the Type I error, where the model stated the customer will churn but the customer did not. Since the model predicted a churn, the business spent money on retention activities. Thus, the business spent Rs. 200 on each customer who would not have churned. This is the cost of a Type I error per customer. On the other side, in a Type II error, the model stated the customer will not churn, so the business did not initiate retention activities, and the customer actually churned. This decision cost the business Rs. 500 of potential revenue per Type II error.

Now we calculate the cost of decision on the earlier two scenarios.

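Cost matrix where the cutoff score was 50: 1 non-churner flagged at Rs. 200, plus 1 churner missed at Rs. 500, for a total of Rs. 700.

Cost matrix where the cutoff score was 60: 1 non-churner flagged at Rs. 200, plus 2 churners missed at Rs. 500 each, for a total of Rs. 1,200.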

The cost of the decision in the first scenario, cutoff score = 50, is lower. Hence, between these two scenarios, the first is better. So, of the two, the cutoff score should be 50.
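The arithmetic for the two scenarios can be checked with a few lines of code. This is a sketch under the article's assumptions only: Rs. 200 is charged for each flagged customer who did not churn, Rs. 500 for each churner who was not flagged, and correctly classified customers are given no cost. The name `decision_cost` and the two constants are illustrative.

```python
# Scores and actual outcomes from the table above (1 = churned).
scores = [100, 90, 80, 70, 60, 50, 40, 30, 20, 10]
actual = [1, 1, 0, 1, 1, 0, 0, 1, 0, 0]

RETENTION_COST = 200  # Rs. spent on a flagged customer who did not churn
LOST_REVENUE = 500    # Rs. lost when a churner was not flagged


def decision_cost(scores, actual, cutoff):
    """Total mis-classification cost when score > cutoff is flagged as churn."""
    wrongly_flagged = sum(
        1 for s, a in zip(scores, actual) if s > cutoff and a == 0
    )
    missed_churners = sum(
        1 for s, a in zip(scores, actual) if s <= cutoff and a == 1
    )
    return wrongly_flagged * RETENTION_COST + missed_churners * LOST_REVENUE


print(decision_cost(scores, actual, 50))  # 700
print(decision_cost(scores, actual, 60))  # 1200
```

The two per-error costs are the business inputs here; changing either constant can change which cutoff comes out cheaper.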

We could try this exercise with different values of the cutoff score and select the one where the total cost of the decision is the lowest.
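One way to run this exercise is to sweep a set of candidate cutoffs and keep the cheapest. The sketch below assumes candidate cutoffs from 0 to 100 in steps of 10 and the same per-error costs used in the example (Rs. 200 and Rs. 500); the function name `cost_by_cutoff` and its parameters are illustrative.

```python
# Scores and actual outcomes from the table above (1 = churned).
scores = [100, 90, 80, 70, 60, 50, 40, 30, 20, 10]
actual = [1, 1, 0, 1, 1, 0, 0, 1, 0, 0]


def cost_by_cutoff(scores, actual, cutoffs, flag_cost, miss_cost):
    """Map each cutoff to its total mis-classification cost.

    flag_cost: cost of flagging a customer who did not churn.
    miss_cost: cost of not flagging a customer who did churn.
    """
    costs = {}
    for cutoff in cutoffs:
        total = 0
        for score, churned in zip(scores, actual):
            flagged = score > cutoff
            if flagged and not churned:
                total += flag_cost
            elif not flagged and churned:
                total += miss_cost
        costs[cutoff] = total
    return costs


costs = cost_by_cutoff(scores, actual, range(0, 101, 10),
                       flag_cost=200, miss_cost=500)
best_cutoff = min(costs, key=costs.get)  # cutoff with the lowest total cost

for cutoff, cost in costs.items():
    print(cutoff, cost)
print("lowest cost at cutoff", best_cutoff, "=", costs[best_cutoff])
```

On these ten customers and these assumed costs, the sweep puts the lowest cost at a cutoff of 20 (Rs. 600), below the Rs. 700 at a cutoff of 50, because missing a churner is priced at more than twice a wasted retention effort. Ten customers is far too small a sample to generalise from; the point is only that the answer follows from the costs the business supplies.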

The key assumption in this exercise is the cost of each Type I and Type II error. This is the call that the business needs to take. The analytics team can provide the cost of the decision at various cutoff scores. But they should not take the call on the Type I or Type II error costs.

It is very important to make the business team understand this and take ownership of the decisions on costs. It also helps in calculating the ROI of the model and deciding whether the model is beneficial or not. The judgement of which customers to treat as predicted churners is not for the analytics team to make.

PS: The story was written using a keyboard.


Feroz D Silva

Feroz has close to two decades of experience in customer relationship management. He has consulted for businesses across the globe on their CRM strategy and processes. He was instrumental in setting up the Customer Intelligence practice for SAS India, the leader in analytics. He loves experimenting with applying practices and principles from fields such as theology, philosophy and mythology to statistical processes. His strong belief in the "keep it simple, stupid" paradigm has helped his customers gain benefits from adopting analytics in a comfortable and controlled manner. He is a graduate in Statistics and has an MBA in Marketing.
