Confusion matrix

		Predicted condition		Sources:^[1]^[2]^[3]^[4]^[5]^[6]^[7]^[8]
	Total population $= P + N$	Predicted Positive (PP)	Predicted Negative (PN)	Informedness, bookmaker informedness (BM) $= TPR + TNR - 1$	Prevalence threshold (PT) $= \frac{\sqrt{TPR \times FPR} - FPR}{TPR - FPR}$
Actual condition	Positive (P)^[a]	True positive (TP), hit^[b]	False negative (FN), miss, underestimation	True positive rate (TPR), recall, sensitivity (SEN), probability of detection, hit rate, power $= \frac{TP}{P} = 1 - FNR$	False negative rate (FNR), miss rate, type II error^[c] $= \frac{FN}{P} = 1 - TPR$
Actual condition	Negative (N)^[d]	False positive (FP), false alarm, overestimation	True negative (TN), correct rejection^[e]	False positive rate (FPR), probability of false alarm, fall-out, type I error^[f] $= \frac{FP}{N} = 1 - TNR$	True negative rate (TNR), specificity (SPC), selectivity $= \frac{TN}{N} = 1 - FPR$
	Prevalence $= \frac{P}{P + N}$	Positive predictive value (PPV), precision $= \frac{TP}{PP} = 1 - FDR$	False omission rate (FOR) $= \frac{FN}{PN} = 1 - NPV$	Positive likelihood ratio (LR+) $= \frac{TPR}{FPR}$	Negative likelihood ratio (LR−) $= \frac{FNR}{TNR}$
	Accuracy (ACC) $= \frac{TP + TN}{P + N}$	False discovery rate (FDR) $= \frac{FP}{PP} = 1 - PPV$	Negative predictive value (NPV) $= \frac{TN}{PN} = 1 - FOR$	Markedness (MK), deltaP (Δp) $= PPV + NPV - 1$	Diagnostic odds ratio (DOR) $= \frac{LR+}{LR-}$
	Balanced accuracy (BA) $= \frac{TPR + TNR}{2}$	F₁ score $= \frac{2 \times PPV \times TPR}{PPV + TPR} = \frac{2 TP}{2 TP + FP + FN}$	Fowlkes–Mallows index (FM) $= \sqrt{PPV \times TPR}$	Matthews correlation coefficient (MCC) $= \sqrt{TPR \times TNR \times PPV \times NPV} - \sqrt{FNR \times FPR \times FOR \times FDR}$	Threat score (TS), critical success index (CSI), Jaccard index $= \frac{TP}{TP + FN + FP}$

^[a] The number of real positive cases in the data.
^[b] A test result that correctly indicates the presence of a condition or characteristic.
^[c] Type II error: a test result that wrongly indicates that a particular condition or attribute is absent.
^[d] The number of real negative cases in the data.
^[e] A test result that correctly indicates the absence of a condition or characteristic.
^[f] Type I error: a test result that wrongly indicates that a particular condition or attribute is present.

In the field of machine learning and specifically the problem of statistical classification, a confusion matrix, also known as an error matrix,^[9] is a specific table layout that allows visualization of the performance of an algorithm, typically a supervised learning one (in unsupervised learning it is usually called a matching matrix). Each row of the matrix represents the instances in a predicted class while each column represents the instances in an actual class (or vice versa).^[10] The name stems from the fact that it makes it easy to see if the system is confusing two classes (i.e. commonly mislabeling one as another).

It is a special kind of contingency table, with two dimensions ("actual" and "predicted"), and identical sets of "classes" in both dimensions (each combination of dimension and class is a variable in the contingency table).

Example

Given a sample of 13 pictures, 8 of cats and 5 of dogs, where cats belong to class 1 and dogs belong to class 0,

actual = [1,1,1,1,1,1,1,1,0,0,0,0,0],

assume that a classifier trained to distinguish between cats and dogs is run on the 13 pictures and makes 8 correct predictions and 5 errors: 3 cats wrongly predicted as dogs (the first 3 predictions) and 2 dogs wrongly predicted as cats (the last 2 predictions).

prediction = [0,0,0,1,1,1,1,1,0,0,0,1,1]

With these two labelled sets (actual and predictions) we can create a confusion matrix that will summarize the results of testing the classifier:

		Actual class
		Cat	Dog
Predicted class	Cat	5	2
Predicted class	Dog	3	3

In this confusion matrix, of the 8 cat pictures, the system judged that 3 were dogs, and of the 5 dog pictures, it predicted that 2 were cats. All correct predictions lie on the diagonal of the table (5 cats and 3 dogs), so the table is easy to inspect visually for prediction errors, which appear as values off the diagonal.
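
The counting behind this table can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the helper name `confusion_matrix` and its `labels` parameter are chosen here for the example and do not refer to any particular library's API. Rows are predicted classes and columns are actual classes, matching the layout above.

```python
from collections import Counter

# Labels from the example: 1 = cat, 0 = dog.
actual     = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
prediction = [0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1]


def confusion_matrix(actual, predicted, labels):
    """Return m where m[i][j] counts items predicted as labels[i]
    whose actual class is labels[j] (rows: predicted, columns: actual)."""
    counts = Counter(zip(predicted, actual))
    return [[counts[(p, a)] for a in labels] for p in labels]


# Order the labels cat (1) first, dog (0) second, as in the table.
matrix = confusion_matrix(actual, prediction, labels=[1, 0])
for name, row in zip(["Cat", "Dog"], matrix):
    print(name, row)
# Cat [5, 2]
# Dog [3, 3]
```

Some libraries use the transposed convention (actual classes in rows, predicted classes in columns), so it is worth checking the orientation before reading off false positives and false negatives.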

In abstract terms, the confusion matrix is as follows:

		Actual class
		P	N
Predicted class	P	TP	FP
Predicted class	N	FN	TN

where: P = Positive; N = Negative; TP = True Positive; FP = False Positive; TN = True Negative; FN = False Negative.

Table of confusion

In predictive analytics, a table of confusion (sometimes also called a confusion matrix) is a table with two rows and two columns that reports the number of false positives, false negatives, true positives, and true negatives. This allows more detailed analysis than mere proportion of correct classifications (accuracy). Accuracy will yield misleading results if the data set is unbalanced; that is, when the numbers of observations in different classes vary greatly. For example, if there were 95 cats and only 5 dogs in the data, a particular classifier might classify all the observations as cats. The overall accuracy would be 95%, but in more detail the classifier would have a 100% recognition rate (sensitivity) for the cat class but a 0% recognition rate for the dog class. F1 score is even more unreliable in such cases, and here would yield over 97.4%, whereas informedness removes such bias and yields 0 as the probability of an informed decision for any form of guessing (here always guessing cat).
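
The figures in this example can be checked directly. The sketch below assumes cats are treated as the positive class and that the classifier labels every observation as a cat; the variable names are illustrative only.

```python
# Imbalanced data set from the text: 95 cats (positive) and 5 dogs (negative),
# with a classifier that labels every observation as a cat.
tp, fn = 95, 0  # every cat is predicted as a cat
fp, tn = 5, 0   # every dog is also predicted as a cat

accuracy = (tp + tn) / (tp + tn + fp + fn)
tpr = tp / (tp + fn)              # sensitivity for the cat class
tnr = tn / (tn + fp)              # recognition rate for the dog class
f1 = 2 * tp / (2 * tp + fp + fn)
informedness = tpr + tnr - 1

print(f"accuracy     = {accuracy:.3f}")      # 0.950
print(f"TPR          = {tpr:.3f}")           # 1.000
print(f"TNR          = {tnr:.3f}")           # 0.000
print(f"F1           = {f1:.4f}")            # 0.9744
print(f"informedness = {informedness:.3f}")  # 0.000
```

Accuracy and F₁ both look strong here, while informedness is 0 because the true negative rate is 0.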

According to Davide Chicco and Giuseppe Jurman, the most informative metric to evaluate a confusion matrix is the Matthews correlation coefficient (MCC).^[11]

Assuming the confusion matrix above, its corresponding table of confusion, for the cat class, would be:

		Actual class
		Cat	Non-cat
Predicted class	Cat	5 True Positives	2 False Positives
	Non-cat	3 False Negatives	3 True Negatives

The final table of confusion would contain the average values for all classes combined.
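
As a rough sketch of this per-class view, the code below derives a table of confusion for each class from the cat/dog confusion matrix and then averages them. Two assumptions are made here that the text does not spell out: the matrix has predicted classes in rows and actual classes in columns, and "average" is read as the element-wise mean of the per-class tables. The function name `table_of_confusion` is hypothetical.

```python
def table_of_confusion(matrix, k):
    """One-vs-rest counts for class index k.

    Assumes matrix[i][j] counts items predicted as class i
    whose actual class is j.
    """
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]
    fp = sum(matrix[k]) - tp                  # predicted k, actually another class
    fn = sum(row[k] for row in matrix) - tp   # actually k, predicted another class
    tn = total - tp - fp - fn
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


# Cat/dog confusion matrix from the example (index 0 = cat, 1 = dog).
matrix = [[5, 2],
          [3, 3]]

tables = [table_of_confusion(matrix, k) for k in range(len(matrix))]

# Assumption: "average" means the element-wise mean over the per-class tables.
average = {key: sum(t[key] for t in tables) / len(tables) for key in tables[0]}

print(tables[0])  # cat: {'TP': 5, 'FP': 2, 'FN': 3, 'TN': 3}
print(tables[1])  # dog: {'TP': 3, 'FP': 3, 'FN': 2, 'TN': 5}
print(average)    # {'TP': 4.0, 'FP': 2.5, 'FN': 2.5, 'TN': 4.0}
```

The same function works unchanged for matrices with more than two classes, since each class is compared against all others combined.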

Let us define an experiment from P positive instances and N negative instances for some condition. The four outcomes can be formulated in a 2×2 confusion matrix, as follows:

		True condition
	Total population	Condition positive	Condition negative	Prevalence = Σ Condition positive / Σ Total population		Accuracy (ACC) = (Σ True positive + Σ True negative) / Σ Total population
Predicted condition	Predicted condition positive	True positive	False positive, Type I error	Positive predictive value (PPV), Precision = Σ True positive / Σ Predicted condition positive		False discovery rate (FDR) = Σ False positive / Σ Predicted condition positive
	Predicted condition negative	False negative, Type II error	True negative	False omission rate (FOR) = Σ False negative / Σ Predicted condition negative		Negative predictive value (NPV) = Σ True negative / Σ Predicted condition negative
		True positive rate (TPR), Recall, Sensitivity (SEN), probability of detection, Power = Σ True positive / Σ Condition positive	False positive rate (FPR), Fall-out, probability of false alarm = Σ False positive / Σ Condition negative	Positive likelihood ratio (LR+) = TPR / FPR	Diagnostic odds ratio (DOR) = LR+ / LR−	Matthews correlation coefficient (MCC) = √(TPR·TNR·PPV·NPV) − √(FNR·FPR·FOR·FDR)	F₁ score = 2 · PPV · TPR / (PPV + TPR) = 2 · Precision · Recall / (Precision + Recall)
		False negative rate (FNR), Miss rate = Σ False negative / Σ Condition positive	Specificity (SPC), Selectivity, True negative rate (TNR) = Σ True negative / Σ Condition negative	Negative likelihood ratio (LR−) = FNR / TNR
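
To make these definitions concrete, the sketch below evaluates them for the cat-class table of confusion given earlier (5 true positives, 2 false positives, 3 false negatives, 3 true negatives). Variable names are illustrative. The count-based MCC expression at the end does not appear in the table; it is the standard definition of the coefficient and is included only as a cross-check of the rate-based form.

```python
from math import isclose, sqrt

# Cat-class table of confusion from the example.
tp, fp, fn, tn = 5, 2, 3, 3

p, n = tp + fn, fp + tn    # condition positive / condition negative
pp, pn = tp + fp, fn + tn  # predicted positive / predicted negative

tpr, fnr = tp / p, fn / p
tnr, fpr = tn / n, fp / n
ppv, fdr = tp / pp, fp / pp
npv, for_ = tn / pn, fn / pn  # trailing underscore: `for` is a Python keyword

prevalence = p / (p + n)
accuracy = (tp + tn) / (p + n)
f1 = 2 * ppv * tpr / (ppv + tpr)
lr_plus, lr_minus = tpr / fpr, fnr / tnr
dor = lr_plus / lr_minus

# MCC in the rate-based form used in the table.
mcc = sqrt(tpr * tnr * ppv * npv) - sqrt(fnr * fpr * for_ * fdr)

# Cross-check with the count-based definition (not taken from the table).
mcc_counts = (tp * tn - fp * fn) / sqrt(pp * p * n * pn)
assert isclose(mcc, mcc_counts)

print(f"ACC = {accuracy:.4f}, F1 = {f1:.4f}, DOR = {dor:.2f}, MCC = {mcc:.4f}")
```

The rates divide by P, N, PP, and PN, so any of them is undefined when the corresponding count is zero; a practical implementation would need to decide how to report that case.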

References

[1] Fawcett, Tom (2006). "An Introduction to ROC Analysis" (PDF). Pattern Recognition Letters. 27 (8): 861–874. doi:10.1016/j.patrec.2005.10.010. S2CID 2027090.
[2] Provost, Foster; Fawcett, Tom (2013-08-01). "Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking". O'Reilly Media, Inc.
[3] Powers, David M. W. (2011). "Evaluation: From Precision, Recall and F-Measure to ROC, Informedness, Markedness & Correlation". Journal of Machine Learning Technologies. 2 (1): 37–63.
[4] Ting, Kai Ming (2011). Sammut, Claude; Webb, Geoffrey I. (eds.). Encyclopedia of machine learning. Springer. doi:10.1007/978-0-387-30164-8. ISBN 978-0-387-30164-8.
[5] Brooks, Harold; Brown, Barb; Ebert, Beth; Ferro, Chris; Jolliffe, Ian; Koh, Tieh-Yong; Roebber, Paul; Stephenson, David (2015-01-26). "WWRP/WGNE Joint Working Group on Forecast Verification Research". Collaboration for Australian Weather and Climate Research. World Meteorological Organisation. Retrieved 2019-07-17.
[6] Chicco D, Jurman G (January 2020). "The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation". BMC Genomics. 21 (1): 6-1–6-13. doi:10.1186/s12864-019-6413-7. PMC 6941312. PMID 31898477.
[7] Chicco D, Toetsch N, Jurman G (February 2021). "The Matthews correlation coefficient (MCC) is more reliable than balanced accuracy, bookmaker informedness, and markedness in two-class confusion matrix evaluation". BioData Mining. 14 (13): 13. doi:10.1186/s13040-021-00244-z. PMC 7863449. PMID 33541410.
[8] Tharwat A. (August 2018). "Classification assessment methods". Applied Computing and Informatics. 17: 168–192. doi:10.1016/j.aci.2018.08.003.
[9] Stehman, Stephen V. (1997). "Selecting and interpreting measures of thematic classification accuracy". Remote Sensing of Environment. 62 (1): 77–89. Bibcode:1997RSEnv..62...77S. doi:10.1016/S0034-4257(97)00083-7.
[10] Powers, David M. W. (2011). "Evaluation: From Precision, Recall and F-Measure to ROC, Informedness, Markedness & Correlation". Journal of Machine Learning Technologies. 2 (1): 37–63. S2CID 55767944.
[11] Chicco D, Jurman G (January 2020). "The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation". BMC Genomics. 21 (1): 6-1–6-13. doi:10.1186/s12864-019-6413-7. PMC 6941312. PMID 31898477.

