Total operating characteristic
The Total Operating Characteristic (TOC) is a statistical method that compares a Boolean variable with a rank variable. TOC can measure the ability of an index variable to diagnose either presence or absence of a characteristic. The diagnosis of presence or absence depends on whether the value of the index is above a threshold. TOC considers multiple possible thresholds. Each threshold generates a two-by-two contingency table, which contains four entries: hits, misses, false alarms, and correct rejections.
The Receiver Operating Characteristic (ROC) also characterizes diagnostic ability, although ROC reveals less information than the TOC. For each threshold, ROC reveals two ratios, hits/(hits + misses) and false alarms/(false alarms + correct rejections), while TOC shows the total information in the contingency table for each threshold. The TOC method reveals all of the information that the ROC method provides, plus additional important information that ROC does not reveal, i.e. the size of every entry in the contingency table for each threshold. TOC also provides the popular area under the curve (AUC) of the ROC.
The procedure to construct the TOC curve compares the Boolean variable to the index variable by diagnosing each observation as either presence or absence, depending on how the index relates to various thresholds. If an observation’s index is greater than or equal to a threshold, then the observation is diagnosed as presence; otherwise the observation is diagnosed as absence. The contingency table that results from the comparison between the Boolean variable and the diagnosis for a single threshold has four central entries: hits (H), misses (M), false alarms (F), and correct rejections (C). The total number of observations is P + Q, where P is the number of observations with Boolean presence and Q is the number with Boolean absence. The terms “true positives”, “false negatives”, “false positives” and “true negatives” are equivalent to hits, misses, false alarms and correct rejections, respectively. The entries can be formulated in a two-by-two contingency table or confusion matrix, as follows:
| ||Boolean Presence||Boolean Absence||Diagnosis Total|
|Diagnosis Presence||Hits (H)||False Alarms (F)||H + F|
|Diagnosis Absence||Misses (M)||Correct Rejections (C)||M + C|
|Boolean Total||H + M = P||F + C = Q||P + Q|
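The diagnosis rule and the resulting table for a single threshold can be sketched in a few lines of Python. This is an illustration only: the function name `contingency_table` and the six-observation toy dataset are hypothetical and do not come from the cited paper.

```python
def contingency_table(boolean, index, threshold):
    """Return (hits, misses, false_alarms, correct_rejections) for one threshold.

    An observation is diagnosed as presence when its index value is
    greater than or equal to the threshold, and as absence otherwise.
    """
    hits = misses = false_alarms = correct_rejections = 0
    for present, value in zip(boolean, index):
        diagnosed_present = value >= threshold
        if present and diagnosed_present:
            hits += 1
        elif present:
            misses += 1
        elif diagnosed_present:
            false_alarms += 1
        else:
            correct_rejections += 1
    return hits, misses, false_alarms, correct_rejections


# Hypothetical toy data: 1 = Boolean presence, 0 = Boolean absence.
boolean = [1, 0, 1, 1, 0, 0]
index = [90, 85, 70, 40, 30, 10]

H, M, F, C = contingency_table(boolean, index, threshold=70)
P, Q = H + M, F + C  # Boolean totals (column sums of the table)
print(H, M, F, C, P, Q)
```

For the toy data, a threshold of 70 diagnoses three observations as presence, two of which are hits and one of which is a false alarm.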
Four pieces of information determine all the entries in the contingency table, including its marginal totals. For example, if we know H, M, F, and C, then we can compute all the marginal totals for any threshold. Alternatively, if we know H/P, F/Q, P, and Q, then we can compute all the entries in the table. Two pieces of information are not sufficient to complete the contingency table. For example, if we know only H/P and F/Q, which is what ROC shows, then it is impossible to know all the entries in the table.
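As a short worked derivation, write h = H/P and f = F/Q for the two ratios (the lowercase symbols are introduced here only for illustration). The four central entries then follow from the ratios together with P and Q:

```latex
\[
\begin{aligned}
H &= h\,P, & M &= P - H = (1 - h)\,P,\\
F &= f\,Q, & C &= Q - F = (1 - f)\,Q.
\end{aligned}
\]
```

Without P and Q, the same pair (h, f) is consistent with infinitely many tables, which is why the two ratios alone cannot complete the contingency table.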
The TOC curve with four boxes indicates how a point on the TOC curve reveals the hits, misses, false alarms, and correct rejections. The TOC curve is an effective way to show the total information in the contingency table for all thresholds. The dataset used to create this TOC curve has 30 observations, each of which consists of values for a Boolean variable and an index variable. The observations are ranked from the greatest to the least value of the index. There are 31 thresholds: the 30 values of the index plus one additional threshold that is greater than all the index values, which creates the point at the origin (0,0). Each point is labeled with the value of its threshold. The horizontal axis ranges from 0 to 30, which is the number of observations in the dataset (P + Q). The vertical axis ranges from 0 to 10, which is the Boolean variable’s number of presence observations P (i.e. hits + misses). TOC curves also show the threshold at which the diagnosed amount of presence matches the Boolean amount of presence, which is the threshold point that lies directly under the point where the Maximum line meets the hits + misses line, as the TOC curve on the left illustrates. For a more detailed explanation of the construction of the TOC curve, see Pontius Jr, Robert Gilmore; Si, Kangping (2014). "The total operating characteristic to measure diagnostic ability for multiple thresholds." International Journal of Geographical Information Science 28 (3): 570–583.
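The construction of the curve can be sketched as follows, assuming the usual TOC axes in which the horizontal coordinate is hits + false alarms and the vertical coordinate is hits. The function name `toc_points` and the toy data are hypothetical, and `None` stands in for the extra threshold that exceeds every index value.

```python
def toc_points(boolean, index):
    """Return a list of (threshold, hits + false_alarms, hits) tuples.

    Thresholds run from highest to lowest. The first tuple uses a
    threshold above every index value, which yields the origin (0, 0).
    """
    points = [(None, 0, 0)]
    for threshold in sorted(set(index), reverse=True):
        # Observations diagnosed as presence at this threshold.
        diagnosed = sum(1 for value in index if value >= threshold)
        # Diagnosed presence that is also Boolean presence.
        hits = sum(1 for present, value in zip(boolean, index)
                   if present and value >= threshold)
        points.append((threshold, diagnosed, hits))
    return points


# Hypothetical toy data: 1 = Boolean presence, 0 = Boolean absence.
boolean = [1, 0, 1, 1, 0, 0]
index = [90, 85, 70, 40, 30, 10]

points = toc_points(boolean, index)
for threshold, x, y in points:
    print(threshold, x, y)
```

Six distinct index values give seven thresholds, in the same way that the 30 observations described above give 31. The final point always lies at (P + Q, P), the upper right corner of the TOC space.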
The following four pieces of information are the central entries in the contingency table for each threshold:
- The number of hits at each threshold is the distance between the threshold’s point and the horizontal axis.
- The number of misses at each threshold is the distance between the threshold’s point and the hits + misses horizontal line across the top of the graph.
- The number of false alarms at each threshold is the distance between the threshold’s point and the blue dashed Maximum line that bounds the left side of the TOC space.
- The number of correct rejections at each threshold is the distance between the threshold’s point and the purple dashed Minimum line that bounds the right side of the TOC space.
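The four distances listed above reduce to simple arithmetic on a point's coordinates. The sketch below assumes the horizontal coordinate x is hits + false alarms and the vertical coordinate y is hits; the function name `entries_from_toc_point` is hypothetical.

```python
def entries_from_toc_point(x, y, P, Q):
    """Recover (hits, misses, false_alarms, correct_rejections) from a TOC point.

    x is hits + false alarms, y is hits, P is the number of Boolean
    presence observations and Q the number of Boolean absence observations.
    """
    hits = y                        # vertical distance to the horizontal axis
    misses = P - y                  # vertical distance to the hits + misses line
    false_alarms = x - y            # horizontal distance to the Maximum line
    correct_rejections = Q - (x - y)  # horizontal distance to the Minimum line
    return hits, misses, false_alarms, correct_rejections


# A point at x = 7, y = 3 in a TOC space with P = 10 and Q = 20.
print(entries_from_toc_point(7, 3, P=10, Q=20))
```

With P = 10 and Q = 20, the point (7, 3) corresponds to 3 hits, 7 misses, 4 false alarms, and 16 correct rejections.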
TOC vs. ROC curves
These figures are the TOC and ROC curves using the same data and thresholds. Consider the point that corresponds to a threshold of 74. The TOC curve shows the number of hits, which is 3, and hence the number of misses, which is 7. Additionally, the TOC curve shows that the number of false alarms is 4 and the number of correct rejections is 16. At any given point in the ROC curve, it is possible to glean values for the ratios of false alarms/(false alarms + correct rejections) and hits/(hits + misses). For example, at threshold 74, the x coordinate is 4/20 = 0.2 and the y coordinate is 3/10 = 0.3. However, these two values are insufficient to construct all entries of the underlying two-by-two contingency table.
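The loss of information can be made concrete with a small sketch. The function name `roc_point` is hypothetical; the first table uses the entries reported for threshold 74, while the second table is an invented example with every entry multiplied by ten.

```python
def roc_point(hits, misses, false_alarms, correct_rejections):
    """Return the ROC coordinates (x, y) for one contingency table."""
    x = false_alarms / (false_alarms + correct_rejections)  # false alarms / Q
    y = hits / (hits + misses)                              # hits / P
    return x, y


# Entries at threshold 74: H = 3, M = 7, F = 4, C = 16.
table_74 = (3, 7, 4, 16)
# Hypothetical table with the same ratios but ten times as many observations.
table_scaled = (30, 70, 40, 160)

print(roc_point(*table_74))
print(roc_point(*table_scaled))
```

Both tables map to the same ROC point, so the ROC point alone cannot distinguish between them, whereas their TOC points, (7, 3) and (70, 30), differ.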
Interpreting TOC curves
It is common to report the area under the curve (AUC) to summarize a TOC or ROC curve. However, condensing diagnostic ability into a single number fails to appreciate the shape of the curve. The following three TOC curves all have an AUC of 0.75 but have different shapes.
This TOC curve on the left exemplifies an instance in which the index variable has a high diagnostic ability at high thresholds near the origin, but random diagnostic ability at low thresholds near the upper right of the curve. The curve shows accurate diagnosis of presence until the curve reaches a threshold of 86. The curve then levels off and predicts around the random line.
This TOC curve exemplifies an instance in which the index variable has a medium diagnostic ability at all thresholds. The curve is consistently above the random line.
This TOC curve exemplifies an instance in which the index variable has random diagnostic ability at high thresholds and high diagnostic ability at low thresholds. The curve follows the random line at the highest thresholds near the origin, then the index variable diagnoses absence correctly as thresholds decrease near the upper right corner.
Area under the curve
When measuring diagnostic ability, a commonly reported measure is the Area Under the Curve (AUC). The AUC can be calculated from either the TOC or the ROC. The AUC indicates the probability that the diagnosis ranks a randomly chosen observation of Boolean presence higher than a randomly chosen observation of Boolean absence. The AUC is appealing to many researchers because it summarizes diagnostic ability in a single number. However, the AUC has come under critique as a potentially misleading measure, especially for spatially explicit analyses. Some features of the AUC that draw criticism include the fact that 1) AUC ignores the thresholds; 2) AUC summarizes the test performance over regions of the TOC or ROC space in which one would rarely operate; 3) AUC weighs omission and commission errors equally; 4) AUC does not give information about the spatial distribution of model errors; and 5) the selection of spatial extent highly influences the rate of accurately diagnosed absences and the AUC scores. However, most of those criticisms apply to many other metrics.
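The two routes to the AUC can be compared in a short sketch. The first function counts ranked pairs directly, with ties counted as one half (an assumption made here). The second integrates the TOC curve with the trapezoid rule, subtracts the triangle of area P²/2 that lies below the Minimum line, and divides by the area P·Q of the TOC parallelogram. That normalization is derived from the geometry of the TOC space rather than quoted from the cited papers, and both function names and the toy data are hypothetical.

```python
def auc_by_ranking(boolean, index):
    """Share of (presence, absence) pairs in which presence has the higher index."""
    presence = [v for b, v in zip(boolean, index) if b]
    absence = [v for b, v in zip(boolean, index) if not b]
    wins = 0.0
    for p in presence:
        for a in absence:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5  # assumption: a tie counts as half a win
    return wins / (len(presence) * len(absence))


def auc_from_toc(boolean, index):
    """AUC from the trapezoidal area under the TOC curve."""
    P = sum(1 for b in boolean if b)
    Q = len(boolean) - P
    points = [(0, 0)]  # threshold above every index value
    for threshold in sorted(set(index), reverse=True):
        x = sum(1 for v in index if v >= threshold)  # hits + false alarms
        y = sum(1 for b, v in zip(boolean, index) if b and v >= threshold)  # hits
        points.append((x, y))
    area = sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))
    return (area - P * P / 2) / (P * Q)


# Hypothetical toy data: 1 = Boolean presence, 0 = Boolean absence.
boolean = [1, 0, 1, 1, 0, 0]
index = [90, 85, 70, 40, 30, 10]

print(auc_by_ranking(boolean, index), auc_from_toc(boolean, index))
```

For the toy data, 7 of the 9 presence/absence pairs are ranked correctly, and the TOC-based calculation gives the same value of 7/9.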
- Pontius Jr, Robert Gilmore; Si, Kangping (2014). "The total operating characteristic to measure diagnostic ability for multiple thresholds". International Journal of Geographical Information Science. 28 (3): 570–583.
- Pontius Jr, Robert Gilmore; Parmentier, Benoit (2014). "Recommendations for using the Relative Operating Characteristic (ROC)". Landscape Ecology. 29 (3): 367–382.
- Halligan, Steve; Altman, Douglas; Mallett, Susan (2015). "Disadvantages of using the area under the receiver operating characteristic curve to assess imaging tests: A discussion and proposal for an alternative approach". European Radiology.
- Powers, David Martin Ward (2012). "The Problem of Area Under the Curve". Flinders University.
- Lobo, Jorge; Jiménez-Valverde, Alberto; Real, Raimundo (2007). "AUC: a misleading measure of the performance of predictive distribution models". Global Ecology and Biogeography.
- Mas, Jean-François; Filho, Britaldo Soares; Pontius Jr, Robert Gilmore; Gutiérrez, Michelle Farfán; Rodrigues, Hermann (2013). "A suite of tools for ROC analysis of spatial models". ISPRS International Journal of Geo-Information. 2 (3): 869–887.
- Pontius Jr, Robert Gilmore; Pacheco, Pablo (2004). "Calibration and validation of a model of forest disturbance in the Western Ghats, India 1920–1990". GeoJournal. 61 (4): 325–334.
- Pontius Jr, Robert Gilmore; Batchu, Kiran (2003). "Using the relative operating characteristic to quantify certainty in prediction of location of land cover change in India". Transactions in GIS. 7 (4): 467–484.
- Pontius Jr, Robert Gilmore; Schneider, Laura (2001). "Land-use change model validation by a ROC method for the Ipswich watershed, Massachusetts, USA". Agriculture, Ecosystems & Environment. 85 (1–3): 239–248.