Confusion matrix evaluation and examples for students

[Image: confusion matrix positives and negatives]

What does a confusion matrix tell you?

A confusion matrix tells you how good your predictions are compared to the actual results. In machine learning terms, it shows how your model performs when making predictions.

What is a Confusion Matrix?

A confusion matrix is a grid of four counts that separates true (correct) predictions from false (incorrect) predictions for a binary classification such as yes or no, positive or negative, or 0 or 1.

A confusion matrix can also be used to calculate performance metrics such as accuracy, precision, recall and the F1 score, which combines the precision and recall measures equally.

These metrics are used, for example, to evaluate the performance of machine learning models.

In supervised learning, a machine learning algorithm uses data examples that have a label to train a model. The label is the correct answer that the model should give for that data.

To understand the strengths and weaknesses of a model, and to see how it is performing, a confusion matrix shows how the predictions compare to the actual results, i.e. the given labels.

[Image: confusion matrix in Orange]

An accompanying video series on machine learning using Orange includes a video on the confusion matrix. If you are interested in this video, please follow this link – confusion matrix in Orange.

What is Binary Classification?

A confusion matrix shows the performance of a binary classification, meaning a task with one of two possible results; these outcomes are labelled positive and negative.

An example in healthcare would be a test to indicate if a patient was suffering from an illness (e.g. covid). A positive test result would indicate the patient has the illness and a negative test result would indicate they do not.

When there are more than two outcomes then a multiple class confusion matrix can be used. Details, explanations, images and examples of both binary and multiple class confusion matrices are given below.

What is a Dataset?

A dataset is a set of records, examples or cases that can act as a test set. For each instance in the set there is a collection of details, called features, and a label.
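
To picture this, here is a minimal sketch of a labelled dataset in Python. The feature names (temperature, cough) are hypothetical, chosen purely for illustration.

```python
# A toy labelled dataset: each record is a set of features plus a label.
dataset = [
    {"temperature": 38.5, "cough": True,  "label": "positive"},
    {"temperature": 36.8, "cough": False, "label": "negative"},
    {"temperature": 37.9, "cough": True,  "label": "positive"},
]

# Separate the features from the labels (the "correct answers").
features = [{k: v for k, v in r.items() if k != "label"} for r in dataset]
labels = [r["label"] for r in dataset]

print(labels)  # ['positive', 'negative', 'positive']
```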

Why use a Confusion Matrix?

We think of a yes / no question as having one of two answers, but evaluating predictions needs more information than a single right-or-wrong count.

We make predictions about stock market prices or whether it will rain. We could predict rain every day and be statistically 100% correct on the days that it does rain. But what about the days that it does not rain?

So to evaluate predictions we need to see when a positive prediction (e.g. rain) is correct or wrong, and, when a negative prediction (e.g. no rain) is correct or wrong.
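
To make the rain example concrete, here is a minimal sketch in Python with made-up weather data. It shows why the "always predict rain" strategy looks perfect on rainy days but fails on every dry day.

```python
# Hypothetical week of weather: True = it rained, False = it did not.
actual = [True, False, True, True, False, False, True]

# The "always rain" strategy predicts rain every single day.
predicted = [True] * len(actual)

rainy_correct = sum(1 for p, a in zip(predicted, actual) if a and p == a)
dry_wrong = sum(1 for p, a in zip(predicted, actual) if not a and p != a)

print(rainy_correct)  # 4 -- right on every rainy day
print(dry_wrong)      # 3 -- wrong on every dry day
```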

A confusion matrix has four metrics: true positives, false positives, true negatives and false negatives.

What are True Positives, False Positives, True Negatives and False Negatives?

A confusion matrix always consists of four elements:

  • True Positive (TP) a correct positive test
  • True Negative (TN) a correct negative test
  • False Positive (FP) an incorrect positive test
  • False Negative (FN) an incorrect negative test

A false positive is also known as a type I error, and a false negative is also called a type II error.
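
The four outcomes follow mechanically from comparing a prediction with the actual label. Here is a minimal sketch in Python (the function name is ours, purely for illustration):

```python
def outcome(actual: bool, predicted: bool) -> str:
    """Classify one prediction against the actual label."""
    if predicted and actual:
        return "TP"  # predicted positive, actually positive
    if predicted and not actual:
        return "FP"  # predicted positive, actually negative (type I error)
    if not predicted and not actual:
        return "TN"  # predicted negative, actually negative
    return "FN"      # predicted negative, actually positive (type II error)

print(outcome(actual=True, predicted=False))  # FN
```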

In the example of a covid test, a positive test indicates the patient has covid, while a negative test indicates that the patient does not have covid.

In a confusion matrix example evaluating a covid test the four confusion metrics have the following meanings:

  • True Positive (TP) – the patient has covid, test correct.
  • True Negative (TN) – the patient does not have covid, test correct.
  • False Positive (FP)  – the patient does not have covid despite the test indicating that they do have covid.
  • False Negative (FN) – the patient has covid, but the test incorrectly indicates that they don’t have covid.

Create a Confusion Matrix

To create a confusion matrix draw a 2 by 2 grid. A simple confusion matrix has the predictions horizontally and the actual results vertically, such as in the following diagram.

[Image: a simple confusion matrix diagram]

There are alternative formats (see below), but the simple confusion matrix is a four cell grid and these four cells contain the four metrics of true positives, false positives, true negatives and false negatives.

Confusion Matrix formats

To clarify: there are different formats for the confusion matrix, but they are all equivalent. They contain the same four metrics.

[Image: alternative confusion matrix format]

Be careful when using a confusion matrix with a different format, as the false positives and false negatives are placed differently.

Another alternative format for a confusion matrix uses the digits 0 and 1 to indicate negative and positive (see below). Once again the metrics are the same but the placement is different.

[Image: a binary (0/1) confusion matrix]

In summary, there are several ways to show the four metrics in a confusion matrix, but they are all the same.

Make sure you understand which metric is in which grid cell before you use the figures to calculate the performance statistics that we will now explain.
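
The same caution applies when you build confusion matrices in code. As an illustration (scikit-learn is not used elsewhere in this article; it is just one common Python tool), scikit-learn's confusion_matrix puts actual labels in rows and predictions in columns, sorted by label value, so for 0/1 labels the layout is [[TN, FP], [FN, TP]]:

```python
from sklearn.metrics import confusion_matrix

# Small made-up example: 1 = positive, 0 = negative.
actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]

# Rows are actual classes, columns are predictions: [[TN, FP], [FN, TP]].
print(confusion_matrix(actual, predicted))
# [[3 1]
#  [1 3]]
```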

Confusion Matrix example

Here is a covid test example written like an exam question:

In a Covid test of 1000 patients, there were 45 positive tests, of which 30 patients had covid and 15 were falsely tested positive.

Of the 955 negative tests, 5 were incorrect: these patients had covid but were tested negative.

Draw the confusion matrix and calculate the accuracy, precision, recall, sensitivity and F1 score from the matrix.

The confusion matrix evaluation metrics will be explained in the next section, but let's first focus on the confusion matrix. Remember, a confusion matrix always consists of four elements:

  • True Positive (TP) a correct positive test – 30
  • True Negative (TN) a correct negative test – 950
  • False Positive (FP) an incorrect positive test – 15
  • False Negative (FN) an incorrect negative test – 5

The total of 1000 cases consists of 45 positive tests (TP + FP), which are correct (30) and incorrect (15). The other 955 negative tests (TN + FN) contain 5 incorrect tests and 950 correct tests.

We enter these in our confusion matrix as seen below.

[Image: confusion matrix for the covid test example]

The simple approach to answer our example question would be to plug each value into the formulas to get the results.

Confusion Matrix Evaluation Metrics

The covid test example question requires the accuracy, precision, recall, sensitivity and F1 score from the confusion matrix. So how do we get these?

We will give explanations of these evaluation metrics, but to simply answer the question the numbers from our confusion matrix above can be entered into the following equations.

Evaluation Metric Equations

Confusion matrix evaluation metrics including accuracy, precision, recall, sensitivity and F1 score are seen in the following equations:

  • Accuracy = (TP + TN) / (TP + TN + FP + FN) – the percentage of correct predictions.
  • Precision = TP / (TP + FP) – the percentage of positive predictions that are correct.
  • Recall = TP / (TP + FN) – the percentage of actual positive cases that the test has correctly identified.
  • Sensitivity – the same as recall.
  • F1 score = 2 × (Precision × Recall) / (Precision + Recall) – a measure that combines precision and recall equally.
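
These equations translate directly into code. Here is a minimal sketch in plain Python; the function names are ours, for illustration only.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

def precision(tp, fp):
    """Fraction of positive predictions that are correct."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Fraction of actual positives correctly identified (sensitivity)."""
    return tp / (tp + fn)

def f1(tp, fp, fn):
    """Combines precision and recall equally (their harmonic mean)."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * (p * r) / (p + r)
```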

In the covid test example, these evaluation measures mean the following:

  • Accuracy – the percentage of correct predictions, covid or no covid.
  • Precision – the percentage of positive tests that were correct, out of all the positive predictions.
  • Recall – the percentage of covid sufferers that were correctly identified by a positive result.

Confusion Matrix Evaluation - Covid example

We have the metrics from the confusion matrix and the evaluation equations, therefore we can easily calculate the requirements from our original question.

[Image: confusion matrix for the covid test example]
  • True Positive (TP) a correct positive test – 30
  • True Negative (TN) a correct negative test – 950
  • False Positive (FP) an incorrect positive test – 15
  • False Negative (FN) an incorrect negative test – 5
[Image: evaluation metric formulas for accuracy, precision, recall, sensitivity and F1 score]

Calculate the accuracy, precision, recall, sensitivity and F1 score from the matrix.

Accuracy

  • number of correct predictions / total number of predictions
  • (30 + 950) / (30 + 15 + 950 + 5)
  • = 980/1000
  • = 49/50 or 98%

Precision

  • true positives / (true positives + false positives)
  • 30 / (30 + 15)
  • = 30/45
  • = 2/3 or 66.7%

Recall (and sensitivity)

  • true positives / (true positives + false negatives)
  • 30 / (30 + 5)
  • = 30/35
  • = 0.857 or 85.7%

F1 score

  • 2 × (precision × recall) / (precision + recall)
  • = 2 × (0.571/1.524)
  • = 2 × 0.375
  • = 0.75 or 75%
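
As a quick check, the same arithmetic can be done in a few lines of Python:

```python
tp, tn, fp, fn = 30, 950, 15, 5

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)  # also the sensitivity
f1 = 2 * (precision * recall) / (precision + recall)

print(accuracy)   # 0.98
print(precision)  # 0.666...
print(recall)     # 0.857...
print(f1)         # 0.75
```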

Confusion Matrix Evaluation - ball example

Imagine we are playing a game where you have to guess whether the next item is a ball or not. You are aware it happens around half the time.

You make a guess, the item turns out to be a ball or not a ball, and you are awarded one point or zero points. Here is an example dataset of the results of 10 guesses, with the correct answers and points awarded.

  1. item = ball, correct guess, 1 pt
  2. item = ball, incorrect guess, 0 pts
  3. item = no ball, correct guess, 1 pt
  4. item = ball, correct guess, 1 pt
  5. item = ball, incorrect guess, 0 pts
  6. item = ball, incorrect guess, 0 pts
  7. item = no ball, correct guess, 1 pt
  8. item = no ball, incorrect guess, 0 pts
  9. item = ball, correct guess, 1 pt
  10. item = no ball, correct guess, 1 pt

The prediction ball was made 4 times and it was correct 3 out of those 4 times. The prediction no ball was made 6 times, with 3 out of those 6 attempts correct.

Here is the resulting confusion matrix:

[Image: confusion matrix for predicting a ball ten times]

Calculate the accuracy, precision, recall, sensitivity and F1 score from the matrix.

Accuracy

  • number of correct predictions / total number of predictions
  • 6/10 or 60%

Precision

  • true positives / (true positives + false positives)
  • 3/4 or 75%

Recall

  • true positives / (true positives + false negatives)
  • 3/6 or 50%

F1 score

  • 2 × (precision × recall) / (precision + recall)
  • = 2 × (0.75 × 0.5) / 1.25
  • = 2 × (0.375/1.25)
  • = 2 × 0.3
  • = 0.6 or 60%

Sensitivity is the same as recall.
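
The ball game can also be scored in code. Here is a minimal sketch using scikit-learn (shown purely for illustration; our encoding is 1 = ball, 0 = no ball):

```python
from sklearn.metrics import (confusion_matrix, precision_score,
                             recall_score, f1_score)

# The ten rounds from the list above: 1 = ball, 0 = no ball.
actual    = [1, 1, 0, 1, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 0, 0, 1, 1, 0]

print(confusion_matrix(actual, predicted))  # [[3 1]   rows: [TN FP]
                                            #  [3 3]]        [FN TP]
print(precision_score(actual, predicted))   # 0.75
print(recall_score(actual, predicted))      # 0.5
print(f1_score(actual, predicted))          # 0.6
```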

Confusion Matrix for Multiple Classes

A confusion matrix with multiple classes has more than two outcomes, such as group A, B, C or D, or single, married, divorced or widowed.

The matrix is similar to the binary class examples, although the only format requirement is that the grid is of equal size both horizontally and vertically (3 by 3, 4 by 4, etc.).

Multiple Class Confusion Matrix example

Similar to the binary classification example of predicting a ball, in the multiple class example we have a selection of three colours: red, green and blue.

Predicting the colour of the next ball, we have the following example results:

  1. colour = green, guess=green 1 pt
  2. colour = blue, guess=red 0 pts
  3. colour = blue, guess=blue 1 pt
  4. colour = green, guess=red 0 pts
  5. colour = red, guess=red 1 pt
  6. colour = green, guess=green 1 pt
  7. colour = red, guess=blue 0 pts
  8. colour = red, guess=red 1 pt
  9. colour = blue, guess=blue 1 pt
  10. colour = green, guess=red 0 pts

The confusion matrix for the multiple classes is a 3 by 3 grid.

[Image: multiple class confusion matrix]
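
Here is a minimal sketch building the same 3 by 3 matrix with scikit-learn (again purely illustrative; the labels argument fixes the row and column order):

```python
from sklearn.metrics import confusion_matrix

# The ten colour guesses from the list above.
actual    = ["green", "blue", "blue", "green", "red",
             "green", "red", "red", "blue", "green"]
predicted = ["green", "red", "blue", "red", "red",
             "green", "blue", "red", "blue", "red"]

# Rows are actual colours, columns are predictions, ordered red/green/blue.
print(confusion_matrix(actual, predicted, labels=["red", "green", "blue"]))
# [[2 0 1]
#  [2 2 0]
#  [1 0 2]]
```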

Confusion Matrix in Orange

The confusion matrix seen in Orange is formatted slightly differently, with added colouring, and the row and column totals are also given.

[Image: confusion matrix as seen in Orange]

We can calculate the evaluation metrics from this confusion matrix in the same way as in the previous examples.

Accuracy

  • number of correct predictions / total number of predictions
  • 3041/4119 = 73.8%

Precision

  • true positives / (true positives + false positives)
  • 2719/3668 or 74.1%

Recall

  • true positives / (true positives + false negatives)
  • 2719/2848 or 95.5%

F1 score

  • 2 × (precision × recall) / (precision + recall)
  • = 2 × (0.7077/1.696)
  • = 0.835 or 83.5%

Sensitivity is the same as recall.
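
Because the four counts are not quoted directly in the text, we can recover them from the totals used above and re-check the arithmetic. This is a sketch assuming those row, column and diagonal totals:

```python
tp = 2719          # correctly predicted positives
pred_pos = 3668    # column total: all positive predictions
actual_pos = 2848  # row total: all actual positives
correct = 3041     # sum of the diagonal (all correct predictions)
total = 4119       # all cases

fp = pred_pos - tp    # 949
fn = actual_pos - tp  # 129
tn = correct - tp     # 322 (and tp + fp + fn + tn == total)

precision = tp / (tp + fp)
recall = tp / (tp + fn)

print(correct / total)                                # 0.738... accuracy
print(precision)                                      # 0.741...
print(recall)                                         # 0.954...
print(2 * precision * recall / (precision + recall))  # 0.835 F1 score
```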

Practice Exercises

Step 1. Create the confusion matrices for the following exercises. The solutions will be given at the end of the article.

Step 2. Calculate the accuracy, precision, recall, sensitivity and F1 score from the confusion matrices from step 1. Again, the solutions will be given at the end of the article.

Exercise 1 - 20 cases

In exercise 1 there are only 20 cases, where 8 patients are diagnosed correctly as positive and 4 incorrectly.

There were 8 patients diagnosed with a negative result, 5 correctly and 3 incorrectly.

Complete the confusion matrix first, check it against the solution given below, and then calculate the evaluation metrics.

Exercise 2 - 100 cases

In exercise 2 there are 100 cases where only 5 positive cases were found, and there was only one case that was incorrectly negative.

Of the five positive cases, the actual results showed 3 correct and 2 incorrect cases.

Exercise 3 - 100 samples

In exercise 3 there are also 100 samples. Of the 60 positive tests there were 45 correctly identified positive cases, whilst there were also 35 correctly identified negative cases.

Exercise 4 - 128 tests

In exercise 4, of the 75 positive tests, 9 were false, whilst the 53 negative tests included 22 that were false.

Exercise 5 - 200 tests

In exercise 5, of the 122 positive tests, only 2 were false, whilst the 78 negative tests included only 5 that were false.

Good luck with all the exercises. Scroll down to see the solutions. The confusion matrices are given first, and the other evaluation metrics are given after.

Terms

Confusion Matrix

A confusion matrix is a grid of four counts that separates true (correct) predictions from false (incorrect) predictions for a binary classification such as yes or no, positive or negative, or 0 or 1.

Binary Classification

A binary classification assigns the result to one of two classes, such as positive or negative.

True Positive (TP)

True Positive (TP) a correct positive test

True Negative (TN)

True Negative (TN) a correct negative test

False Positive (FP)

False Positive (FP) an incorrect positive test

False Negative (FN)

False Negative (FN) an incorrect negative test

Multiple Class Confusion Matrix

A confusion matrix with multiple classes has more than two outcomes, such as group A, B, C or D, or single, married, divorced or widowed.

FAQ

Here are some short answers to popular or frequently asked questions:

What is the advantage of a confusion matrix?

The advantage of a confusion matrix is it is a visual tool that can show the performance of a model in a simple and effective way.

What is a binary confusion matrix?

A binary confusion matrix is a grid of four counts that separates true (correct) predictions from false (incorrect) predictions for a binary classification such as yes or no, positive or negative, or 0 or 1.

How does a confusion matrix work?

A confusion matrix simply takes the figures for the positive and negative results of a test and shows how many of these are correct and incorrect.

What are confusion matrix metrics?
  1. True Positive (TP) a correct positive test
  2. True Negative (TN) a correct negative test
  3. False Positive (FP) an incorrect positive test
  4. False Negative (FN) an incorrect negative test
What is a 3x3 confusion matrix?

A 3x3 confusion matrix is a type of multiple class matrix that has nine cells, three columns by three rows. This indicates there are three classes, such as red, green and blue in the example above.

What is confusion matrix with example?

The explanation and examples of a confusion matrix are in the article above.

How do you write a confusion matrix?

Although there are several ways to format a confusion matrix the standard way to write a confusion matrix is as follows:

[Image: confusion matrix for predicting a ball ten times]

Practice Solutions

The practice exercise solutions start with the confusion matrices as follows:

Confusion Matrix Exercise 1
[Image: confusion matrix solution for exercise 1]
Confusion Matrix Exercise 2
[Image: confusion matrix solution for exercise 2]
Confusion Matrix Exercise 3
[Image: confusion matrix solution for exercise 3]
Confusion Matrix Exercise 4
[Image: confusion matrix solution for exercise 4]
Confusion Matrix Exercise 5
[Image: confusion matrix solution for exercise 5]

Calculate the accuracy, precision, recall, sensitivity and F1 score from the confusion matrices in these exercises. The metric solutions are given below.

Confusion Matrix Evaluation Metrics - Exercise 1

We can calculate the evaluation metrics from the confusion matrix solutions given above.

  • Accuracy  13/20 = 65%
  • Precision  8/12 or 66.7%
  • Recall (& sensitivity)  8/11 or 72.7%
  • F1 score  2 × (0.485/1.394) = 0.696 or 69.6%

Confusion Matrix Evaluation Metrics - Exercise 2

  • Accuracy  97/100 = 97%
  • Precision  3/5 or 60%
  • Recall (& sensitivity)  3/4 or 75%
  • F1 score  2 × (0.45/1.35) = 0.667 or 66.7%

Confusion Matrix Evaluation Metrics - Exercise 3

  • Accuracy  80/100 = 80%
  • Precision  45/60 or 75%
  • Recall (& sensitivity)  45/50 or 90%
  • F1 score  2 × (0.675/1.65) = 0.818 or 81.8%

Confusion Matrix Evaluation Metrics - Exercise 4

  • Accuracy  97/128 = 75.8%
  • Precision  66/75 or 88%
  • Recall (& sensitivity)  66/88 or 75%
  • F1 score  2 × (0.66/1.63) = 0.810 or 81.0%

Confusion Matrix Evaluation Metrics - Exercise 5

  • Accuracy  193/200 = 96.5%
  • Precision  120/122 or 98.4%
  • Recall (& sensitivity)  120/125 or 96%
  • F1 score  2 × (0.944/1.944) = 0.972 or 97.2%
