ROC Curve in Multivariate Logistic Regression


So far you have learned about some evaluation metrics and seen why they're important for evaluating a logistic regression model. Now, recall that the sensitivity that you got (~53.768%) was quite low and clearly needs to be dealt with. But what was the cause of such a low sensitivity in the first place?

 

If you remember, when you assigned 0s and 1s to the customers after building the model, you arbitrarily chose a cut-off of 0.5: if the probability of churning for a customer was greater than 0.5, you classified that customer as 'Churn', and if the probability was less than 0.5, you classified that customer as 'Non-churn'.

 

Now, this cut-off was chosen arbitrarily, with no particular logic behind it. So it might not be the ideal cut-off point for classification, which may be why the sensitivity is so low. So how do you find the ideal cut-off point? Let's start by watching the following video. For a more intuitive understanding, this part has been demonstrated in Excel. You can download the Excel file below and follow along with the lecture.


So you saw that the predicted labels depend entirely on the cut-off, or threshold, that you have chosen. For low values of the threshold, you'd have a higher number of customers predicted as 1 (Churn). This is because every customer with a probability above the threshold is labelled 1 and every customer with a probability below it is labelled 0. So naturally, a lower cut-off means more customers are identified as 'Churn'. Similarly, for high values of the threshold, you'd have a higher number of customers predicted as 0 (Non-churn) and a lower number of customers predicted as 1 (Churn).
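To see the effect of the cut-off concretely, here is a minimal Python sketch. The probabilities are made up for illustration (they are not the churn model's actual output), and `predict_labels` is a hypothetical helper name, not part of any library.

```python
# Hypothetical predicted churn probabilities for eight customers (illustrative only).
churn_prob = [0.05, 0.20, 0.35, 0.45, 0.55, 0.62, 0.80, 0.95]


def predict_labels(probabilities, cutoff):
    """Return 1 (Churn) where the probability exceeds the cutoff, else 0 (Non-churn)."""
    return [1 if p > cutoff else 0 for p in probabilities]


# Sweep a low, a middle, and a high cut-off and count the predicted labels.
for cutoff in (0.3, 0.5, 0.7):
    labels = predict_labels(churn_prob, cutoff)
    n_churn = sum(labels)
    print(f"cutoff={cutoff}: {n_churn} Churn, {len(labels) - n_churn} Non-churn")
```

With these made-up numbers, the count of customers predicted as 'Churn' falls from 6 to 4 to 2 as the cut-off rises from 0.3 to 0.5 to 0.7.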

 

Now, let's move forward with our discussion on how to choose an optimal threshold point. For that, you'd first need a few basic terms (some of which you have seen in earlier sections). So let's hear what these terms are.


So you learned about the following two terms:

 

True Positive Rate (TPR)

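This value gives you the number of positives correctly predicted divided by the total number of positives. Its formula, as shown in the video, is:

$$\text{True Positive Rate (TPR)} = \frac{\text{True Positives}}{\text{Total Number of Actual Positives}}$$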

Now, recall the labels in the confusion matrix,

[Figure: Confusion Matrix Highlighting the Total Number of Actual Positives]


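As you can see, the highlighted portion shows the row containing the total number of actual positives. Therefore, the denominator in the formula for TPR, i.e. the total number of actual positives, is nothing but:

$$\text{Total Number of Actual Positives} = \text{True Positives} + \text{False Negatives}$$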

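So, the formula for True Positive Rate (TPR) becomes:

$$\text{True Positive Rate (TPR)} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Negatives}}$$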

As you might remember, the above formula is nothing but the formula for sensitivity. Hence, the term True Positive Rate that you just learnt about is nothing but sensitivity.

 

The second term you saw was:

 

False Positive Rate (FPR)

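This term gives you the number of false positives (0s predicted as 1s) divided by the total number of negatives. The formula was:

$$\text{False Positive Rate (FPR)} = \frac{\text{False Positives}}{\text{Total Number of Actual Negatives}}$$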

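Again, using the confusion matrix, you can easily see that the denominator here is nothing but the first row, which holds the actual negatives. Hence, it can be written as:

$$\text{Total Number of Actual Negatives} = \text{True Negatives} + \text{False Positives}$$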

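Therefore, the formula now becomes:

$$\text{False Positive Rate (FPR)} = \frac{\text{False Positives}}{\text{True Negatives} + \text{False Positives}}$$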

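Again, if you recall the formula for specificity, it is given by:

$$\text{Specificity} = \frac{\text{True Negatives}}{\text{True Negatives} + \text{False Positives}}$$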

Hence, you can see that the formula for False Positive Rate (FPR) is nothing but (1 - Specificity). You can easily verify it yourself.
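The two identities (TPR equals sensitivity, and FPR equals 1 - specificity) can be checked numerically. The sketch below uses a small set of made-up actual and predicted labels, not the churn data; `confusion_counts` is a hypothetical helper written for this example.

```python
def confusion_counts(actual, predicted):
    """Count TP, FN, FP and TN from paired 0/1 labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # 1s predicted as 1s
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # 1s predicted as 0s
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # 0s predicted as 1s
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # 0s predicted as 0s
    return tp, fn, fp, tn


# Made-up labels: 4 actual positives followed by 8 actual negatives.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0]

tp, fn, fp, tn = confusion_counts(actual, predicted)

tpr = tp / (tp + fn)          # True Positive Rate, i.e. sensitivity
fpr = fp / (tn + fp)          # False Positive Rate
specificity = tn / (tn + fp)

print(f"TPR (sensitivity) = {tpr}")
print(f"FPR = {fpr}")
print(f"1 - specificity = {1 - specificity}")
```

Here the counts are TP = 3, FN = 1, FP = 2 and TN = 6, so TPR is 0.75 and both FPR and (1 - specificity) come out as 0.25.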

 


 

Now that you have understood what these terms are, you'll learn about ROC curves, which show the tradeoff between the True Positive Rate (TPR) and the False Positive Rate (FPR). And as was established from the formulas above, TPR and FPR are nothing but sensitivity and (1 - specificity), so the ROC curve can also be looked at as a tradeoff between sensitivity and specificity.


So you can clearly see that there is a tradeoff between the True Positive Rate and the False Positive Rate, or simply, a tradeoff between sensitivity and specificity. When you plot the true positive rate against the false positive rate, you get a graph which shows the trade-off between them and this curve is known as the ROC curve. The following image shows the ROC curve that you plotted in Excel.

[Figure: ROC Curve]

As you can see, higher values of TPR come with higher values of FPR, which is not desirable. So it's all about finding a balance between these two metrics, and that's what the ROC curve helps you find. You also learnt that a good ROC curve is one that passes close to the upper-left corner of the graph; so the larger the area under the ROC curve, the better your model.
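The Excel exercise can be approximated in a few lines of Python. This is only a sketch under assumptions: the labels and probabilities are invented for illustration, the cut-offs are swept in steps of 0.1, and `roc_points` and `area_under_curve` are hypothetical helpers (the area uses the trapezoidal rule, which may differ from how the lecture's spreadsheet computes it).

```python
def roc_points(actual, probabilities, cutoffs):
    """Return one (FPR, TPR) pair per cutoff."""
    positives = sum(actual)
    negatives = len(actual) - positives
    points = []
    for cutoff in cutoffs:
        predicted = [1 if p > cutoff else 0 for p in probabilities]
        tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
        fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Approximate the area under the (FPR, TPR) points with the trapezoidal rule."""
    pts = sorted(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up data: actual churn flags and the model's predicted probabilities.
actual = [0, 0, 0, 0, 1, 0, 1, 1]
probabilities = [0.05, 0.20, 0.35, 0.45, 0.55, 0.62, 0.80, 0.95]
cutoffs = [i / 10 for i in range(11)]  # 0.0, 0.1, ..., 1.0

points = roc_points(actual, probabilities, cutoffs)
for cutoff, (fpr, tpr) in zip(cutoffs, points):
    print(f"cutoff={cutoff:.1f}  FPR={fpr:.2f}  TPR={tpr:.2f}")

print(f"Area under the ROC curve: {area_under_curve(points):.3f}")
```

At a cut-off of 0 every customer is predicted as 'Churn', so both TPR and FPR are 1; at a cut-off of 1 nobody is, so both are 0. The points in between trace the curve, and an area closer to 1 indicates a curve that sits nearer the upper-left corner.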

 

You'll learn more on ROC curves in the coming segments but first, try out some questions.