What is a Confusion Matrix? Dive in with me below as we explore the meaning and use of Confusion Matrices.
Data Classification
One of the most well-known and well-studied problems in Machine Learning is that of data classification.
The problem is simple — we are given a set of input data, and we have a class associated with each data point. Now, we are given a new data point and must predict the class associated with it.
Example 1
Let’s take an example. Say that we have a bunch of emails. Some of them are spam emails. Others are not spam (called “ham”). So each email is a data point, and the associated classes are spam and ham. We can now train a Machine Learning-based classifier on this data.
We are given a new data point (a new email) and need to find out whether it is a spam or non-spam (ham) email. Clearly, this is a classification problem. More specifically, it is a binary classification problem.
Measuring the Accuracy of a Classifier
When scientists developed these classifiers, one obvious challenge they faced was how to measure the performance of a classifier. One obvious way is to measure accuracy: the fraction of new data points that the algorithm correctly classifies.
For instance, if the algorithm was tested on 100 new data points, and the algorithm correctly classified 86 of them — then we know that the accuracy is 86%.
However, measuring the performance of a classifier this way is not recommended. Consider the following example:

Suppose we have test data where, out of the 100 test emails, 90 are non-spam and only 10 are spam. Now imagine a classifier that always classifies every email it gets as non-spam. If this classifier were run on the above 100 test emails, it would clearly classify 90 out of 100 emails correctly. Measuring accuracy tells us that this classifier is 90% accurate. However, this makes no sense, since the classifier simply labels every email as non-spam!
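The skew problem above can be sketched in a few lines of Python. The labels are made up to match the example (0 for non-spam, 1 for spam):

```python
# Hypothetical test labels matching the example: 90 ham (0), 10 spam (1).
y_true = [0] * 90 + [1] * 10

# A "classifier" that always predicts ham, no matter the input.
y_pred = [0] * 100

# Accuracy = fraction of correct predictions.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy)  # 0.9 -- 90% "accurate", yet not a single spam email is caught
```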
So Now What? Enter Confusion Matrix!
Scientists therefore developed a more powerful way of measuring the performance of such classifiers: the Confusion Matrix. A Confusion Matrix is a matrix that measures the accuracy of a classifier in a more robust way, by recording how many objects of each actual class were assigned to each predicted class. Here is an example of a Confusion Matrix of a classifier (rows are actual classes, columns are predicted classes):

                    Predicted spam   Predicted non-spam
  Actual spam             12                  3
  Actual non-spam          4                 81
As you can see, there are 12 + 3 = 15 spam emails, and 4 + 81 = 85 non-spam emails in the data set. The interpretation of the above matrix is as follows:
- Of the 12 + 3 = 15 spam emails, 12 have been correctly classified as spam and the remaining 3 have been misclassified as non-spam.
- Of the 4 + 81 = 85 non-spam emails, 4 have been misclassified as spam while 81 have been correctly classified as non-spam.
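One way to see where these counts come from is to tally (actual, predicted) pairs directly. The sketch below uses synthetic labels chosen to reproduce the counts above; real code would tally the labels from an actual test set (libraries such as scikit-learn provide a `confusion_matrix` helper that does the same thing):

```python
from collections import Counter

# Synthetic labels reproducing the example: 15 actual spam, 85 actual ham.
y_true = ["spam"] * 15 + ["ham"] * 85
# Predictions: 12 spam caught, 3 missed; 4 ham wrongly flagged, 81 correct.
y_pred = ["spam"] * 12 + ["ham"] * 3 + ["spam"] * 4 + ["ham"] * 81

classes = ["spam", "ham"]  # row/column order: spam first, then non-spam
counts = Counter(zip(y_true, y_pred))

# Entry (i, j): objects of actual class i predicted as class j.
matrix = [[counts[(actual, predicted)] for predicted in classes]
          for actual in classes]
print(matrix)  # [[12, 3], [4, 81]]
```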
Ideal Classifiers
In an ideal classifier (a classifier which classifies perfectly, without any mistake), the Confusion Matrix should look something like this:

                    Predicted spam   Predicted non-spam
  Actual spam             15                  0
  Actual non-spam          0                 85
Basically, all the 15 spam emails have been correctly identified as spam, and all the 85 non-spam emails have been correctly identified as non-spam. Let’s now revisit our original Confusion Matrix:

                    Predicted spam   Predicted non-spam
  Actual spam             12                  3
  Actual non-spam          4                 81
We know that of the 15 spam emails, 12 have been correctly classified as spam. In other words, a fraction 12/(12 + 3) = 0.800 of the spam emails is correctly classified, while 3/(12 + 3) = 0.200 is incorrectly classified.
Similarly, a fraction 4/(4 + 81) = 0.047 of the non-spam emails is incorrectly classified, while 81/(4 + 81) = 0.953 is correctly classified. Replacing the absolute numbers with the above fractions, we get an updated matrix that looks like this:

                    Predicted spam   Predicted non-spam
  Actual spam           0.800               0.200
  Actual non-spam       0.047               0.953
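The conversion from counts to fractions is just a row-wise normalization, which can be sketched as:

```python
# Counts from the example: rows = actual (spam, non-spam), columns = predicted.
matrix = [[12, 3], [4, 81]]

# Divide each cell by its row total, so each row expresses fractions of a class.
normalized = [[round(cell / sum(row), 3) for cell in row] for row in matrix]
print(normalized)  # [[0.8, 0.2], [0.047, 0.953]]
```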
A Few Numbers of Interest
Now, let’s call non-spam class 0 and spam class 1.
A few numbers are of interest here:
- False positive rate: the fraction of objects of class 0 that were incorrectly flagged as class 1. In this case, the false positive rate is 0.047. Basically, 4.7% of the objects (emails) of class 0 (non-spam) were flagged as belonging to class 1 (spam).
- False negative rate: the fraction of objects of class 1 that were incorrectly flagged as belonging to class 0. In this case, the false negative rate is 0.200. Basically, 20% of the spam emails (class 1) were not identified as spam by the classifier.
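Both rates fall straight out of the four cells of the matrix. A minimal sketch, using the counts from the example (class 0 = non-spam, class 1 = spam):

```python
# Cells of the example matrix, named in the usual way.
tp = 12  # spam correctly classified as spam (true positives)
fn = 3   # spam misclassified as non-spam (false negatives)
fp = 4   # non-spam misclassified as spam (false positives)
tn = 81  # non-spam correctly classified as non-spam (true negatives)

false_positive_rate = fp / (fp + tn)  # class 0 wrongly flagged as class 1
false_negative_rate = fn / (fn + tp)  # class 1 wrongly flagged as class 0

print(round(false_positive_rate, 3))  # 0.047
print(round(false_negative_rate, 3))  # 0.2
```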
Identity Matrix
In an ideal classifier, the Confusion Matrix expressed as fractions would look like the following:

                    Predicted spam   Predicted non-spam
  Actual spam           1.000               0.000
  Actual non-spam       0.000               1.000
Basically, for an ideal classifier, the Confusion Matrix expressed as fractions is an Identity Matrix. Here, 100% of spam emails were identified as spam and 100% of non-spam emails were identified as non-spam.
Clearly, the Confusion Matrix is a much more accurate performance indicator, since it captures those cases where the data set is skewed, that is, where objects of one class far outnumber the objects of the other classes.
Confusion Matrix for Three Classes
The concept of a Confusion Matrix is generic and can be extended to multiple classes as well. For three classes, the Confusion Matrix is a 3 × 3 grid in which the entry in row i and column j counts the objects of class i that were classified as class j; the diagonal entries are the correct classifications.
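Since the original three-class matrix is not reproduced here, the sketch below builds one from made-up labels for classes 0, 1, and 2; the construction is identical to the binary case:

```python
from collections import Counter

# Hypothetical labels for a three-class problem (classes 0, 1, 2).
y_true = [0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 0, 1, 1, 1, 2, 2, 2, 2]

classes = [0, 1, 2]
counts = Counter(zip(y_true, y_pred))

# Entry (i, j): objects of class i classified as class j.
matrix = [[counts[(i, j)] for j in classes] for i in classes]
for row in matrix:
    print(row)
# The diagonal holds the correct classifications; off-diagonal entries
# show which pairs of classes the classifier confuses.
```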
As I hope you can see, the Confusion Matrix is a great tool to measure the performance of a classifier, be it a binary or a multi-class classifier. It remains informative even when data sets are skewed, and is therefore an ideal tool to evaluate the performance of a classifier.