In machine learning, we rarely build just one model for a classification task. Different algorithms behave differently on different data sets, so we usually train several candidate models and keep the one that performs best. For this comparison we cannot always rely on a metric like accuracy: on an imbalanced data set, a model can score a high accuracy simply by predicting the majority class most of the time. What matters is whether the model predicts the positive class as positive and the negative class as negative.

To check this, we use the Receiver Operating Characteristic (ROC) curve, which plots the true positive rate against the false positive rate, together with the Area Under the Curve (AUC). In this article, we will learn more about the ROC-AUC curve and how to use it to compare different machine learning models and select the best performing one. For this experiment, we will use the Pima Indians Diabetes data set, which can be downloaded from Kaggle.
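To make the concern about accuracy concrete, here is a minimal sketch on made-up labels (the 95/5 class split is an assumption chosen for illustration, not taken from the Pima data). It shows how a model that only ever predicts the majority class can still look good on accuracy while finding none of the positives.

```python
# Hypothetical toy labels: 95 negatives (0) and 5 positives (1).
y_true = [0] * 95 + [1] * 5

# A "model" that ignores its input and always predicts the majority class.
y_pred = [0] * len(y_true)

# Accuracy: fraction of predictions that match the true label.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Recall on the positive class: fraction of actual positives that were found.
n_positives = sum(y_true)
n_found = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
recall = n_found / n_positives

print("accuracy:", accuracy)
print("recall on positive class:", recall)
```

The accuracy looks strong even though no positive example is ever detected, which is why a threshold-independent view such as the ROC curve is useful.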

**What will we learn from this article?**

- What is the ROC-AUC Curve? How does it work?
- How to compare the performance of different models using the ROC-AUC curve?

**What is the ROC-AUC Curve? How does it work?**

The ROC curve is a graph used to evaluate the performance of classification models. It plots the true positive rate (the fraction of actual positives that the model predicts as positive) against the false positive rate (the fraction of actual negatives that the model wrongly predicts as positive) as the classification threshold varies. The area under the curve (AUC) summarizes this curve in a single number that tells how well the model separates the two classes. When comparing models, the one with the larger AUC is generally better at distinguishing actual positives from actual negatives.
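As a rough illustration of how the points on such a curve arise, the sketch below sweeps a threshold over a handful of made-up scores and records the false positive rate and true positive rate at each step. The helper name `roc_points` and the toy data are assumptions for this example; it is a simplified stand-in, not how any particular library implements it.

```python
def roc_points(y_true, scores):
    """Return (fpr, tpr) pairs, one per threshold, from strictest to loosest."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    # Threshold above every score: nothing is predicted positive.
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        # Everything scoring at or above the threshold is predicted positive.
        tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


# Hypothetical labels and predicted scores for four examples.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

for fpr, tpr in roc_points(y_true, scores):
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Lowering the threshold can only add predicted positives, so both rates move from (0, 0) towards (1, 1) as the sweep proceeds.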

If AUC = 1, the model ranks every positive example above every negative one, so it separates the classes perfectly. If AUC = 0, the model gets the ranking exactly backwards: it predicts positives as negatives and negatives as positives. If AUC = 0.5, the model cannot differentiate between the positive and negative classes and does no better than random guessing. A value between 0.5 and 1 means the model is more likely than not to score a positive example higher than a negative one.
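One way to build intuition for these values is to read AUC as the fraction of (positive, negative) pairs in which the positive example receives the higher score, counting ties as half. The sketch below uses that pairwise view on made-up scores; `pairwise_auc` is a hypothetical helper written for this illustration and is far slower than a library routine on real data.

```python
def pairwise_auc(y_true, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


y_true = [0, 0, 1, 1]

# Positives always score higher than negatives.
perfect = pairwise_auc(y_true, [0.1, 0.2, 0.8, 0.9])
# Positives always score lower than negatives.
inverted = pairwise_auc(y_true, [0.9, 0.8, 0.2, 0.1])
# Every example gets the same score, so the model carries no information.
uninformative = pairwise_auc(y_true, [0.5, 0.5, 0.5, 0.5])

print(perfect, inverted, uninformative)
```

The three toy cases correspond to the AUC values of 1, 0 and 0.5 described above.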

**How to compare the performance of different models using the ROC-AUC curve?**

Let us now see in practice how to plot this graph and compare the performance of different models. We will first build 4 different classification models using different machine learning algorithms and then plot the ROC-AUC graph to find the best performing model. We will now import the required libraries and the Pima Indians Diabetes data set. Refer to the below code for the same.

```python
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
import numpy as np
import pandas as pd
from sklearn import svm
from sklearn.metrics import roc_curve, auc

df = pd.read_csv('pima.csv')
print(df)
```

Now we will separate the independent features X from the dependent variable Y, and then split the data set into training and testing sets. Use the below code for the same.

`X = df.values[:,0:8]`

`Y = df.values[:,8]`

`X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.50, random_state=1)`

We have divided the data into training and testing sets. Now we will build four different models for classifying whether a patient is diabetic or not. Use the below code to build the respective models.

`clf1 = LogisticRegression()`

`clf2 = svm.SVC(kernel='linear', probability=True)`

`clf3 = RandomForestClassifier()`

`clf4 = DecisionTreeClassifier()`

Now that we have defined the four classifiers, we will fit each of them on the training data and predict class probabilities for the testing data. Use the below code for the same.

`probas1_ = clf1.fit(X_train, y_train).predict_proba(X_test)`

`probas2_ = clf2.fit(X_train, y_train).predict_proba(X_test)`

`probas3_ = clf3.fit(X_train, y_train).predict_proba(X_test)`

`probas4_ = clf4.fit(X_train, y_train).predict_proba(X_test)`

Now we will compute the ROC curve and AUC score for all these classifiers. Use the below code for the same.

```python
fp1, tp1, thresholds1 = roc_curve(y_test, probas1_[:, 1])
roc_auc_model1 = auc(fp1, tp1)

fp2, tp2, thresholds2 = roc_curve(y_test, probas2_[:, 1])
roc_auc_model2 = auc(fp2, tp2)

fp3, tp3, thresholds3 = roc_curve(y_test, probas3_[:, 1])
roc_auc_model3 = auc(fp3, tp3)

fp4, tp4, thresholds4 = roc_curve(y_test, probas4_[:, 1])
roc_auc_model4 = auc(fp4, tp4)

print("AUC for Logistic Regression Model :", roc_auc_model1)
print("AUC for SVM Model :", roc_auc_model2)
print("AUC for Random Forest Model :", roc_auc_model3)
print("AUC for Decision Tree model :", roc_auc_model4)
```
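For readers curious about what the area computation amounts to, scikit-learn documents `auc` as applying the trapezoidal rule to the given x and y points. The sketch below reproduces that idea by hand on a small set of made-up ROC points; `trapezoid_area` is a hypothetical helper for illustration only.

```python
def trapezoid_area(x, y):
    """Area under the piecewise-linear curve through the points (x[i], y[i])."""
    area = 0.0
    for i in range(1, len(x)):
        width = x[i] - x[i - 1]
        mean_height = (y[i] + y[i - 1]) / 2.0
        area += width * mean_height
    return area


# Hypothetical ROC points, ordered by increasing false positive rate.
fpr = [0.0, 0.0, 0.5, 0.5, 1.0]
tpr = [0.0, 0.5, 0.5, 1.0, 1.0]

print("AUC by trapezoidal rule:", trapezoid_area(fpr, tpr))
```

Vertical segments of the curve have zero width and contribute nothing, so only the horizontal moves in false positive rate add area.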

Now that we have the AUC scores, we will plot the ROC curves to visualize the performance of all 4 models. Use the below code to do the same.

```python
import matplotlib.pyplot as pl

pl.clf()
# One curve per model, using the rates and AUC values computed in the previous step
pl.plot(fp1, tp1, label='Logistic Model (area = %0.2f)' % roc_auc_model1)
pl.plot(fp2, tp2, label='SVC Model (area = %0.2f)' % roc_auc_model2)
pl.plot(fp3, tp3, label='Random Forest Model (area = %0.2f)' % roc_auc_model3)
pl.plot(fp4, tp4, label='Decision Tree Model (area = %0.2f)' % roc_auc_model4)
# Diagonal reference line: a classifier that guesses at random
pl.plot([0, 1], [0, 1], 'k--')
pl.xlim([0.0, 1.0])
pl.ylim([0.0, 1.0])
pl.xlabel('False Positive Rate')
pl.ylabel('True Positive Rate')
pl.title('Receiver operating characteristic example')
pl.legend(loc="lower right")
pl.show()
```

We can see from the above graph that the SVC model captures the highest AUC, so it can be considered the best performing of the four models. This is how we can compute and compare different predictive models. We did this for binary classification, but the same can be done for multi-class classification using a one-vs-rest approach. Suppose we have three classes X, Y, and Z. To plot the curve for class X, we treat it as a binary problem of class X against all the other classes, i.e. Y and Z, and we do the same for each of the other classes.
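The one-vs-rest idea can be sketched in a few lines: for each class, relabel the data as "this class" versus "everything else" and compute a binary AUC from that class's scores. The labels, scores, and helper names below are all made up for illustration, and the pairwise AUC used here is a simple stand-in for a library routine.

```python
def pairwise_auc(y_true, scores):
    """Binary AUC as the share of positive/negative pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(labels, class_scores):
    """Compute one binary AUC per class, treating that class as positive."""
    result = {}
    for cls, scores in class_scores.items():
        # Relabel: 1 for the current class, 0 for every other class.
        binary = [1 if label == cls else 0 for label in labels]
        result[cls] = pairwise_auc(binary, scores)
    return result


# Hypothetical true labels and per-class predicted probabilities for six examples.
labels = ["X", "Y", "Z", "X", "Y", "Z"]
class_scores = {
    "X": [0.7, 0.2, 0.1, 0.6, 0.3, 0.2],
    "Y": [0.2, 0.5, 0.3, 0.3, 0.3, 0.5],
    "Z": [0.1, 0.3, 0.6, 0.1, 0.4, 0.3],
}

for cls, value in one_vs_rest_auc(labels, class_scores).items():
    print(f"AUC for class {cls} vs rest: {value}")
```

Each class gets its own curve and AUC, so a model can separate one class well while still confusing the other two.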

**Conclusion**

In this article, we discussed how to compare different classification models using the ROC-AUC curve. We first explored what the ROC-AUC curve is and why it can be a better basis than an accuracy score for comparing models. Finally, we built 4 different classification models on the Pima Indians Diabetes data set and plotted the ROC-AUC curve to pick the best performing model.


