You chose a cutoff of 0.5 to classify the customers into 'Churn' and 'Non-Churn'. Since you're sorting customers into two classes, some of them will inevitably be misclassified. There are two classes of errors: 'Churn' customers incorrectly classified as 'Non-Churn', and 'Non-Churn' customers incorrectly classified as 'Churn'.
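As a rough illustration, the sketch below applies a 0.5 cutoff to predicted probabilities and counts the two kinds of misclassification. The data is made up, the column name `Churn_Prob` is an assumption (it does not appear in this section), and treating a probability strictly greater than 0.5 as churn is one possible convention rather than a fixed rule.

```python
import pandas as pd

# Toy stand-in for the training predictions; all values are made up for illustration.
# 'Churn_Prob' is an assumed name for the predicted churn probability.
y_train_pred_final = pd.DataFrame({
    "Churn":      [0, 0, 1, 1, 0, 1],
    "Churn_Prob": [0.10, 0.62, 0.80, 0.35, 0.45, 0.55],
})

cutoff = 0.5
# Label a customer as churn (1) when the predicted probability exceeds the cutoff.
y_train_pred_final["predicted"] = (y_train_pred_final["Churn_Prob"] > cutoff).astype(int)

actual = y_train_pred_final["Churn"]
pred = y_train_pred_final["predicted"]

# Non-churn customers predicted as churn.
false_positives = int(((actual == 0) & (pred == 1)).sum())
# Churn customers predicted as non-churn.
false_negatives = int(((actual == 1) & (pred == 0)).sum())

print("Non-churn predicted as churn:", false_positives)
print("Churn predicted as non-churn:", false_negatives)
```

Moving the cutoff up or down changes how many customers land in each error class, which is why both counts are worth tracking separately.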
To capture these errors and to evaluate how well the model performs, you'll use something known as the 'Confusion Matrix'. A typical confusion matrix would look like the following:
This table shows a comparison of the predicted and actual labels. The actual labels are along the vertical axis, while the predicted labels are along the horizontal axis. Thus, the cell in the second row and first column (263) is the number of customers who have actually ‘churned’ but whom the model has predicted as non-churn.
Similarly, the cell in the second row and second column (298) is the number of customers who are actually ‘churn’ and are also predicted as ‘churn’.
Note that this is an example table and not what you'll get in Python for the model you've built so far. It is used only as an example to illustrate the concept.
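To see the same layout in Python, here is a minimal sketch using scikit-learn's `confusion_matrix` on made-up labels (not the figures from the example table). For labels coded 0 (non-churn) and 1 (churn), the rows correspond to actual labels and the columns to predicted labels.

```python
from sklearn.metrics import confusion_matrix

# Toy labels, made up for illustration: 1 = churn, 0 = non-churn.
actual    = [0, 0, 0, 1, 1, 1, 1, 0]
predicted = [0, 1, 0, 1, 0, 1, 1, 0]

cm = confusion_matrix(actual, predicted)

# Row index = actual label, column index = predicted label.
# Flattening row by row gives the four cells in this order.
tn, fp, fn, tp = cm.ravel()

print(cm)
print("Actual churn, predicted non-churn (second row, first column):", cm[1, 0])
print("Actual churn, predicted churn (second row, second column):", cm[1, 1])
```

Indexing with `cm[1, 0]` and `cm[1, 1]` matches the "second row, first column" and "second row, second column" cells described above, since Python indices start at 0.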
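Now, the simplest model evaluation metric for classification models is accuracy: the percentage of correctly predicted labels. So what would the correctly predicted labels be? They would be 'Churn' customers who are predicted as 'Churn', and 'Non-Churn' customers who are predicted as 'Non-Churn'.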
As you can see from the table above, the correctly predicted labels are in the first row and first column and in the last row and last column, as highlighted in the table below:
Now, accuracy is defined as:
Accuracy = (Number of correctly predicted labels) / (Total number of labels)
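In code, this definition amounts to dividing the sum of the diagonal cells of the confusion matrix by the sum of all cells. The sketch below assumes a square matrix with actual labels on the rows and predicted labels on the columns; the helper name `accuracy_from_confusion` and the counts are hypothetical and are not the figures from the example table.

```python
import numpy as np

def accuracy_from_confusion(cm):
    """Return (correct predictions) / (all predictions) for a square confusion matrix."""
    cm = np.asarray(cm)
    # Correct predictions sit on the main diagonal
    # (top-left and bottom-right cells in the two-class case).
    return np.trace(cm) / cm.sum()

# Hypothetical counts, chosen only to keep the arithmetic easy to follow.
toy_cm = [[50, 10],
          [5, 35]]

acc = accuracy_from_confusion(toy_cm)
print(f"Accuracy: {acc:.2%}")
```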
Hence, using the table, we can say that the accuracy for this table would be:
Now that you know about the confusion matrix and accuracy, let's see how good the model you've built so far is, based on its accuracy. But first, answer a couple of questions.
So, using the confusion matrix, you got an accuracy of about 80.8%, which seems to be a good number to begin with. The steps you need to calculate accuracy are: create the confusion matrix, then calculate the accuracy from the actual and predicted labels.
The code used to do this was:
```python
from sklearn import metrics

# Create the confusion matrix (rows: actual labels, columns: predicted labels)
confusion = metrics.confusion_matrix(y_train_pred_final.Churn, y_train_pred_final.predicted)

# Calculate accuracy
print(metrics.accuracy_score(y_train_pred_final.Churn, y_train_pred_final.predicted))
```