Talk:Receiver operating characteristic

WikiProject Statistics (Rated B-class, High-importance)

This article is within the scope of the WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. This article has been rated as B-Class on the quality scale and as High-importance on the importance scale.

WikiProject Computational Biology (Rated B-class, Mid-importance)

This article is within the scope of WikiProject Computational Biology, a collaborative effort to improve the coverage of Computational Biology on Wikipedia. This article has been rated as B-Class on the quality scale and as Mid-importance on the importance scale.

Area under the curve

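The minus sign in the derivation of the AUC right before the derivative of the FPR seems to be wrong. Let $y(x)$ be the True Positive Rate as a function of the False Positive Rate $x$. The AUC is then given by:

$$A = \int_0^1 y(x)\,dx$$

Let $x(T) = \mathrm{FPR}(T)$ be a parametrization of $x$ with $x(-\infty) = 1$ and $x(+\infty) = 0$, so that $y(x(T)) = \mathrm{TPR}(T)$. According to integration by substitution:

$$A = \int_{+\infty}^{-\infty} y(x(T))\,x'(T)\,dT = \int_{+\infty}^{-\infty} \mathrm{TPR}(T)\,\mathrm{FPR}'(T)\,dT$$

In terms of the True Negative Rate $\mathrm{TNR}(T) = 1 - \mathrm{FPR}(T)$ and its density $f_0(T) = \mathrm{TNR}'(T)$:

$$A = \int_{+\infty}^{-\infty} \mathrm{TPR}(T)\,\bigl(-f_0(T)\bigr)\,dT = \int_{-\infty}^{+\infty} \mathrm{TPR}(T)\,f_0(T)\,dT$$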
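The sign question can be checked numerically. The sketch below assumes two unit-variance normal score distributions (an illustrative choice, not something stated in the article) and integrates TPR(T) times the density of the negative scores over the threshold T, with no extra minus sign. The helper names are hypothetical. If the sign convention is right, the result should agree with the known closed form Phi((mu_pos - mu_neg) / sqrt(2)) for this binormal case.

```python
import math

def normal_pdf(t, mu, sigma=1.0):
    """Density of a normal distribution at t."""
    z = (t - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_sf(t, mu, sigma=1.0):
    """Survival function P(score > t) of a normal distribution."""
    return 0.5 * math.erfc((t - mu) / (sigma * math.sqrt(2.0)))

def auc_by_substitution(mu_neg, mu_pos, lo=-12.0, hi=12.0, steps=20000):
    """Trapezoid rule for the integral of TPR(T) * f0(T) over the threshold T."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        t = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        # TPR(T) = P(positive score > T); f0 = density of negative scores
        total += weight * normal_sf(t, mu_pos) * normal_pdf(t, mu_neg)
    return total * h

mu_neg, mu_pos = 0.0, 1.0  # assumed class means, for illustration only
auc_numeric = auc_by_substitution(mu_neg, mu_pos)
# Closed form for two unit-variance normals: Phi((mu_pos - mu_neg) / sqrt(2))
auc_closed = 0.5 * math.erfc(-(mu_pos - mu_neg) / 2.0)
print(auc_numeric, auc_closed)
```

With the opposite sign the integral would come out negative, which cannot be an area.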

Clinical scenarios

Can anyone add information about the use of this method in clinical scenarios (e.g., examination of risk factors for disease outcomes)? -- Preceding unsigned comment added by (talkcontribs) 23:08, 28 September 2006 (UTC)

The Guyatt et al. paper on iron-deficiency anemia is a classic.
Guyatt G, Patterson C, Ali M, Singer J, Levine M, Turpie I, Meyer R (1990). "Diagnosis of iron-deficiency anemia in the elderly.". Am J Med 88 (3): 205-9. PMID 2178409.
You're welcome to add an example you've come across (this is the encyclopedia anyone can edit)... or write a section after digesting Guyatt's paper -- which makes use of the concept. Nephron  T|C 19:00, 14 December 2006 (UTC)

Merging of articles

I suggest that the articles "Receiver operating characteristic" and "Detection theory" (Signal detection theory) should be merged. The merged article should seek a middle path in terms of technical formality and jargon (as simple as possible, but not simpler). Merged or not, the two articles should be more clearly compatible, if not entirely consistent, since they are basically trying to explain the same underlying thing. Rbfuld (talk) 23:02, 27 January 2008 (UTC)rbfuld

I agree with the suggestion that the articles ROC and "Receiver operator characteristic" should be merged. Wicked Maven 20:00, 28 January 2007 (UTC)

I disagree with the suggestion that "Receiver operating characteristic" and "Detection theory" (Signal detection theory) should be merged. Signal detection theory appears to be a larger topic containing many concepts and methods. The combined page would have to be huge in order to treat all of them in sufficient detail. It is easier to maintain a collection of small pages that treat each component of the topic. Anecdote: I arrived at this page seeking to clarify my understanding of ROC curves, and in particular, the use of the area under the ROC curve. I already know about most of the stuff on the current Detection Theory page, and was only seeking information relating to ROC curves. This page is the correct level of granularity for what I wanted to learn. Note: Maven is suggesting that "ROC" be merged with this article? But it seems there is no other ROC article on this topic; the ROC disambiguation page points here. Bayle Shanks (talk) 22:01, 6 November 2009 (UTC)


For whom was this written? The author displays considerable erudition, but no desire to make his subject palatable to beginners. The second paragraph was enough to choke and die on. I'm not here to be stymied by a pedant - I need to understand ROC curves. I swear - if I ever learn enough about the subject to do so, I will join Wikipedia and rewrite this #$&* entry. 02:55, 8 February 2007 (UTC)

ROC example misleading or wrong

I think that the example plot with points A, B, C, C' is misleading or wrong. C' is intended (I think) to be an example of the effects of inverting the output of the worse-than-random classifier C. If this is actually what it's meant to represent, the plot is wrong: inverting the output of the classifier doesn't correspond to a mirror reflection across the diagonal, but to a mirroring through the point (0.5,0.5). (Inverting the test set labels corresponds to a mirror reflection across the diagonal.) I don't have the file used to create the diagram, or I would fix it myself, so I leave this to whoever posted the diagram. 22:27, 23 April 2007 (UTC)

Hmm, mirroring through the point (0.5, 0.5) is not the same as mirroring across the diagonal line. I believe the example means that the output of a worse-than-random classifier can simply be mirrored across the diagonal line to get a point above the diagonal line. In the table you can see that C' is an inverted classification of C. Perhaps you can read the source here: [1]. -- Indon (reply) -- 08:09, 24 April 2007 (UTC)
I believe the critique above is correct. A careful reading of your cited source confirms it. Fawcett has the mirroring wrong when he says it is across the diagonal, though his explanation of reversing the decision is correct: true positives become false negatives and vice versa. The contingency table on the wiki page for C' is not an inverted classification of C, because the inversion must occur on the columns in the example and not the rows, due to the conjunctive equations. Reversing all decisions would then swap the values in the first row with the values in the second row. Therefore, the mirroring is through the point (0.5,0.5), and a true C' which is a reversed decision of C would be at the point (.12,.76), which is still "better than" point A. The explanation after the words about mirroring is what is causing confusion, but it would be useful to inform the reader how the mirroring really takes place. Snthor 15:19, 9 August 2007 (UTC)

No matter whether we have to mirror with the point (0.5, 0.5) or at the diagonal line, if we agree that any point under the diagonal line can be mirrored onto the other side then the lower right corner also represents a perfect classification. Therefore IMHO the two arrows labeled "better" and "worse" are misleading too. The closer towards the diagonal line, the worse; the closer towards either the left top edge or the right bottom edge, the better. One solution might be making both arrows two-headed. The heads pointing towards the edges should be labeled "better", the heads pointing towards the diagonal line should be labeled "worse". I am just afraid that people new to the topic will still be confused. Different approach: points under the diagonal line must be mirrored first before they can be compared. Put only one two-headed arrow in the upper triangle. Stevemiller 03:58, 9 October 2007 (UTC)

The lower right triangle corresponds to worse-than-random classifiers, so imho the arrow labels are correct. One might add arrows labeled "higher classification power" pointing away from the diagonal. I agree that the presented contingency table for C' is incorrect: instead of interchanging the columns, one has to interchange the rows, since the row index represents the suggested classification. When calculating the TPR and FPR for the modified matrix, one finds that TPR changes to 1-TPR and FPR changes to 1-FPR, so the presented numbers below the matrix are also wrong, as is the position of C' in the plot. The replacement of (x,y) by (1-x,1-y) corresponds to a mirroring through (0.5, 0.5). Kero6581 (talk) 08:12, 6 May 2009 (UTC)
OK, I corrected the text so that the squares are now correct. The only thing left is to correct the diagram such that the point C' is at (.12,.76). I hope Indon still has his original file; that would make it much easier to correct the figure. Greetings --hroest 12:08, 10 July 2009 (UTC)
Ok, new figure done. Contact me if modifications are necessary. Kai walz (talk) 20:57, 8 November 2009 (UTC)
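For anyone who wants to verify the mirroring claim, here is a minimal sketch. The counts are assumed (chosen so that C sits at FPR 0.88, TPR 0.24, which is consistent with the C' position of (.12,.76) quoted in this thread), and the function names are hypothetical. Flipping every decision turns true positives into false negatives and false positives into true negatives, so the ROC point moves from (x, y) to (1-x, 1-y).

```python
def roc_point(tp, fn, fp, tn):
    """Return (FPR, TPR) for a 2x2 confusion matrix given as counts."""
    return fp / (fp + tn), tp / (tp + fn)

def invert_decisions(tp, fn, fp, tn):
    """Flip every prediction; returns the new (tp, fn, fp, tn)."""
    # Actual positives: old "yes" answers (tp) become misses, old misses become hits.
    # Actual negatives: old false alarms (fp) become correct rejections, and vice versa.
    return fn, tp, tn, fp

# Assumed counts for classifier C (100 actual positives, 100 actual negatives)
c = dict(tp=24, fn=76, fp=88, tn=12)
c_point = roc_point(**c)
c_prime_point = roc_point(*invert_decisions(**c))
diagonal_reflection = (c_point[1], c_point[0])  # what a mirror across y = x would give

print(c_point, c_prime_point, diagonal_reflection)
```

The inverted classifier lands at the point reflection through (0.5, 0.5), while the reflection across the diagonal gives a different point.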

Inconsistent Notation

The notation used in the figure ("How a ROC curve can be interpreted") in the fourth section ("Further interpretations") is inconsistent with the notation introduced in the first two sections ("Basic concept" and "ROC Space"). The figure uses TP, FP, TN and FN instead of TPR, FPR, TNR and FNR. Aside from being confusing, it is actually misleading since TP, FP, etc. were already introduced in the earlier sections as having different meanings than TPR, FPR, etc. What is more, the figure's notation isn't even internally consistent. The axis labels on the ROC graph should be "TP" and "FP", not "P(TP)" and "P(FP)". Alternatively, to show explicit dependence of the true positive rate and false positive rate on the threshold value, the axis labels could be "TP(θ)" and "FP(θ)", where the threshold value θ needs then to be introduced in the graph of the probability density curves for the detection statistic. And while I'm picking nits, why aren't the axes of the probability density graph labelled, and for that matter, why don't any of the three subfigures in this image have titles?

Don't get me wrong, I don't want to get rid of this image. For me it is the one illustration that allowed me to "get" what the ROC curve actually quantifies. Which is why I think it is important that it be brought into conformance with the rest of the article. I can see from the article's history that the image itself predates the discussion in the "Basic concept" and "ROC space" sections, so I imagine that the original creator of the image (Kku?) might be resistant to having it replaced with an updated version. However, our collective goal is for the overall article to be as clear as possible, and the best way that I can see to do that is to maintain the notation of the "Basic concept" section and to update the figure accordingly.

Here are the changes that I would propose to the figure:

- Titles for each of the three subfigures (these could be placed in the figure caption as long as the subfigures are labelled with a), b) and c))
- Axis labels where appropriate
- Replace TP, FP, etc. with TPR, FPR, etc.
- Introduce θ as the threshold value and replace P(TP) and P(FP) with TPR(θ) and FPR(θ)

One other thing worth mentioning on the topic of consistency, is that the confusion matrix in the figure is of a different form than that introduced in the "Basic concept" section, having its columns sum to 1 rather than to the respective probabilities of the underlying event occurring or not. I don't think that this should be changed in the name of consistency, however, because as it is, it provides a direct link between the two other subfigures in the image. If the confusion matrix were altered so that the columns sum respectively to P and N, then this link would be lost and the subfigure would only serve to introduce (dare I say it?) confusion.

Personally, if effort will be taken to update this figure, I think it might be worthwhile to introduce one more subfigure at the top showing the information flow (underlying two-state process --> observable data --> detection statistic --> decision), but this may not be the best choice of language if my stated goal is to enforce consistency with the rest of the article.

JanRu 20:26, 26 April 2007 (UTC)

ROC space and metrics

In the section, "ROC Space", the info-box is referenced as containing evaluation metrics. Perhaps inadvertently, the word metric is hyperlinked to the wiki page on metrics, as in metric space distances. The reader may be inclined to believe from this that the info-box contains metrics. This is not the case. None of the "evaluation" metrics listed are true metrics in the mathematical sense. I would recommend deleting the hyperlink.

I agree, I removed that link Bayle Shanks (talk) 22:12, 6 November 2009 (UTC)

In a similar discussion, the notion of a ROC space is incorrect. What is meant by space? It is neither a vector space nor a topological space and so the verbiage is abused, even though it appears in some of the cited literature. A ROC graph is what is presented and its limitations are made clear, but the notion of a space is ill advised. I recommend titling the section as ROC graphs.

Snthor 14:21, 9 August 2007 (UTC)

d' (d-prime)

The article says about d' "... under the assumption that both these distributions are normal with the SAME standard deviation" (my emphasis). But the article about d' uses "the standard deviation of the noise distribution". Stevemiller 04:46, 10 October 2007 (UTC)

Number of observations - irrelevant?

Does the number of observations affect the ROC curve at all? With only one observation isn't it possible to have 100% sensitivity (no false negatives) and 100% specificity (no false positives)? Presumably I'm missing something because if that's the case having a good point on a ROC curve doesn't guarantee a good classifier. pgr94 (talk) 18:55, 24 January 2009 (UTC)

You'd have a good ROC curve, but no statistical reason to believe that this curve is representative of actual behaviour. -- (talk) 01:27, 25 August 2009 (UTC)

Terminology and derivations from a confusion matrix

This is an excellent addition to this article - very helpful for people wanting to dive deeper. Thanks so much. (talk) 00:48, 9 March 2009 (UTC)

Perhaps we should move this table into either the Confusion matrix or Binary classification article. As of right now, various concepts such as sensitivity and specificity, positive and negative predictive value, and accuracy each have their own articles, each repeating the same information about the various relationships between true positives and false positives. Perhaps we should consolidate some of this information into a more comprehensive article on a more general/introductory page. Many of these basic definitions (specificity, selectivity, positive/negative predictive value) are useful basic information and right now the only place to find a handy table is in the Receiver operating characteristic page, which is a little too obscure/advanced to warrant being the main location for this information. The sensitivity and specificity page has already started to become a more general page: unlike positive and negative predictive value which have separate articles, sensitivity and specificity has attempted to introduce much of this terminology together. For now I have added this table to the sensitivity and specificity article. (talk) 17:51, 30 October 2009 (UTC)

What does "eqv." stand for?

I assume "eqv." means "equivalent" (talk) 17:44, 30 October 2009 (UTC)

Math Parser Error

From Revision #314192656 by "Failed to parse (unknown function\MCC): \MCC = (TPTN - FPFN)/ \sqrt{P N P' N'}" ... reverted to working formula, however, the formula is rendered as a PNG - anyone who knows how to enforce text rendering, please be my guest. - Dlefree-loc-work (talk) 08:55, 21 September 2009 (UTC)
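For reference, a sketch of how the formula in the error message could be written in standard LaTeX, assuming that \MCC was meant as an upright operator name and that P, N, P' and N' denote the actual-positive, actual-negative, predicted-positive and predicted-negative totals (these definitions follow the usual confusion-matrix conventions and are assumptions, not quoted from the revision):

```latex
\mathrm{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{P \, N \, P' \, N'}},
\qquad P = TP + FN,\quad N = FP + TN,\quad P' = TP + FP,\quad N' = TN + FN
```

The parse failure itself is explained by \MCC not being a defined command; \mathrm{MCC} is one common way around that.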


Perhaps something should be added to note that the further you move towards the upper-right in the ROC graph, the more often the classifier gives you a positive answer. So, I think that movement in a diagonal direction towards the upper right corresponds to biasing the classifier to return a positive answer more often, without improving its accuracy. I'd add this myself, but I'd like someone more knowledgeable to double-check it first. Bayle Shanks (talk) 22:07, 6 November 2009 (UTC)

The holy grail is the upper left, not the upper right. You are correct about the upper right. It's always easy to get the upper right: Just label every point as a "yes". (Or lower left, for "no".) All ROCs pass through those two points. If you think the average reader would benefit, and you would like to add a few words to that effect, do it! Jmacwiki (talk) 15:07, 10 July 2011 (UTC)

"Lift Curve"

ROC curve is also called a "lift curve" according to the book "Mastering Data Mining" by Berry and Linoff. —Preceding unsigned comment added by AndrewHZ (talkcontribs) 03:48, 6 December 2009 (UTC)

Yes, in data mining the same approach is used to indicate the impact of using a predictive model in a real world marketing environment. It is known as a lift curve or a gains curve, and somewhat less often as an ROC curve. Duncan (talk) 15:16, 10 December 2009 (UTC)

I thought that a lift chart had a different, but similar, X-axis than an ROC curve. The x-axis is the false positive rate in an ROC curve but it is the subset size (% of data tested) in a lift chart (see Data Mining: Practical Machine Learning Tools and Techniques by Witten and Frank). Mickeyg13 (talk) 22:27, 9 June 2010 (UTC)
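To make the difference in x-axes concrete, here is a small sketch with made-up scores and labels (all values and names are hypothetical). Both curves are built by cutting the same ranked list; the y-value (fraction of positives captured) is identical, but the ROC x-value is the fraction of negatives flagged, while the gains/lift-style x-value is the fraction of all cases flagged.

```python
def ranked_curves(scores, labels):
    """Return (roc_points, gains_points) from one ranked list.

    labels: 1 = positive, 0 = negative. Assumes no tied scores, for simplicity.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    tp = fp = 0
    roc = [(0.0, 0.0)]
    gains = [(0.0, 0.0)]
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            tp += 1
        else:
            fp += 1
        roc.append((fp / n_neg, tp / n_pos))              # x = false positive rate
        gains.append((rank / len(labels), tp / n_pos))    # x = fraction of data selected
    return roc, gains

scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0, 0, 0]
roc, gains = ranked_curves(scores, labels)
for r, g in zip(roc, gains):
    print(r, g)
```

When positives are rare the two x-axes are numerically close, which may be why the names get used interchangeably in marketing applications.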

Discrimination summary statistic

Just gone looking for a source for the following summary statistic:

  1. the area between the ROC curve and the no-discrimination line

If you've got a reference for this one, it'd be much appreciated. —Preceding unsigned comment added by Noogz (talkcontribs) 06:23, 8 March 2010 (UTC)

If I understand the question, this is the same as the Gini coefficient, which is already referenced in the article. (Or maybe 1-Gini, I don't recall.) Jmacwiki (talk) 15:09, 10 July 2011 (UTC)
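A short worked relation may help here. It assumes the convention, common in the ROC literature but an assumption in this thread, that the Gini coefficient is defined as G = 2 AUC - 1. Since the area under the no-discrimination line is 1/2, the area between the ROC curve and that line is:

```latex
A_{\text{between}} = \mathrm{AUC} - \tfrac{1}{2} = \frac{G}{2},
\qquad G = 2\,\mathrm{AUC} - 1
```

Under that convention the statistic is half the Gini coefficient rather than 1 - Gini.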

Is the ROC Curve really a curve and is the AUC a meaningful measure?

As I understand, the ROC curve is created by plotting quotients of integers against each other. But this means that on both dimensions of the ROC space the irrational numbers do not have a ROC point. If this is the case, the ROC "curve" is defined for pairs of rational numbers only, and is then majorised by the Dirichlet function, whose Lebesgue integral is 0. Thus the area under the curve for a ROC "curve" is at most 0, if it exists at all. In consequence, this means that the AUC increases with every additional data point, but the convergence of the appropriate "measure" for the area under the curve is by no means guaranteed. It follows that the AUC is a meaningless number, because it will rise with any additional observation and might reach any given number, provided that enough observations for the classifier are available. Please point out my error, and I would happily accept that I am wrong. (talk) 14:33, 4 June 2010 (UTC)

Good question, a reference would help. It could be a misnomer, unless you accept that all plots are curves. ROCs are frequency based, so the assumption must be that the frequencies are continuous probability functions. The sample size is relevant. The AUC is like a performance index itself, so as long as it correlates, as a practical matter, it has discriminatory meaning. However, that meaning can be overinterpreted, because the AUC does not account for economic utility or the dreaded Type 3 error. Too many focus on increasing AUC performance and neglect increasing the economic efficiency of a diagnostic. Zulu Papa 5 * (talk) 14:48, 4 June 2010 (UTC)
In Decision Curve Analysis, the "Net Benefit" is a simple, meaningful alternative performance measure to the AUC. It is NB = (True Positives - (w)(False Positives)) / N, where w is the economic ratio (Good / (1 - Good)) [2]. Zulu Papa 5 * (talk) 15:42, 4 June 2010 (UTC)
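A minimal sketch of that formula, for illustration only. The parameter p stands in for "Good" above; reading it as the threshold probability used in decision curve analysis is an assumption, and the counts are made up.

```python
def net_benefit(tp, fp, n, p):
    """NB = (TP - w * FP) / N with weight w = p / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in [0, 1)")
    w = p / (1.0 - p)  # how many true positives one false positive "costs"
    return (tp - w * fp) / n

# Hypothetical counts: 30 true positives and 20 false positives among 200 cases
nb = net_benefit(tp=30, fp=20, n=200, p=0.2)
print(nb)
```

Unlike the AUC, this number depends on the chosen operating point and on the stated trade-off between the two kinds of error.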
As the original writer: Why does authority help with a mathematical argument? Even if there are a lot of possible quotations, this does not establish a logically true argument. In other words, authority does not replace logic. except for the case of machine learning may be. (talk) 21:04, 4 June 2010 (UTC)
Ok .. well we probably should not go off topic; however, I believe math and reality is defined by authoritative convention, and well ... how it progresses from there can be delusional. Besides, the Wikipedia authorities require verification without WP:SYN except for the most nominal and trivial math calculations. Original Research must go some where else, like a blog, to have a voice. Zulu Papa 5 * (talk) 21:19, 4 June 2010 (UTC)
I think Zulu Papa missed the IP's point. It doesn't have anything to do with actual economic benefit; it's purely a mathematical point (and even though Wikipedia requires sources, mathematics does not). If your data points are empirically derived, then yes, of course, strictly speaking the Lebesgue measure of the support of an ROC curve is 0 and it's not technically a continuous curve. That doesn't mean that it is such a terrible thing to calculate the area using some sort of interpolation. Also, if you really feel like getting into the esoterically technical, a finite sample of, say, false positives is just a sample of the underlying "actual" false positive rate. Although the samples will always be rational, it is conceivable that the underlying false positive rate is irrational. So there could be some process with a nonzero true positive rate for all false positive rates in (0, 1), yielding a continuous, Lebesgue integrable function. However, our finite sampling will not reflect that, so we interpolate. That we must interpolate for real-world measurements does not render the concept meaningless. Mickeyg13 (talk) 18:54, 31 August 2010 (UTC)
I am having some trouble understanding the topic -- what is the continuous variable in this space to make the curve a "curve"? One cannot alter the FP rate directly, so there must be some "hidden" parameter. User A1 (talk) 20:58, 10 August 2010 (UTC)
A very, very brief scan would indicate the OP is correct, and the curve is not a curve at all, and according to this, interpretation of the AUC is tricky. I'll not pretend to understand all this. User A1 (talk) 21:04, 10 August 2010 (UTC)
If the data are being modeled as coming from two continuous distributions, then the ROC curve can actually be calculated and is a continuous curve. In other situations with discrete variables or when finitely sampling from distributions, you can interpolate to make a curve and estimate the area under it, so no big deal. It's still useful even if it's not a "curve" in the technical sense.
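As a concrete illustration of the interpolation point, here is a sketch with made-up scores (the data and function names are hypothetical). The "hidden" parameter is the decision threshold: sweeping it over the observed scores gives a finite set of ROC points, and joining them with straight lines gives an area that coincides with the rank statistic P(positive score > negative score) + 0.5 P(tie). That area is bounded by 1 and does not grow without limit as observations are added.

```python
def empirical_roc(pos_scores, neg_scores):
    """ROC points from sweeping a threshold over all observed scores."""
    thresholds = sorted(set(pos_scores) | set(neg_scores), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        tpr = sum(s >= t for s in pos_scores) / len(pos_scores)
        fpr = sum(s >= t for s in neg_scores) / len(neg_scores)
        points.append((fpr, tpr))
    return points

def trapezoid_auc(points):
    """Area under the piecewise-linear interpolation of the ROC points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

def rank_auc(pos_scores, neg_scores):
    """P(pos > neg) + 0.5 * P(pos == neg), estimated over all pairs."""
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            wins += 1.0 if p > q else 0.5 if p == q else 0.0
    return wins / (len(pos_scores) * len(neg_scores))

pos = [0.9, 0.8, 0.6, 0.55, 0.4]  # made-up scores of actual positives
neg = [0.7, 0.5, 0.4, 0.3, 0.1]   # made-up scores of actual negatives
points = empirical_roc(pos, neg)
auc_trap = trapezoid_auc(points)
auc_rank = rank_auc(pos, neg)
print(points)
print(auc_trap, auc_rank)
```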

Which perpendicular line?

Under the Further Interpretations section, a statistic is defined as: "the intercept of the ROC curve with the line at 90 degrees to the no-discrimination line"

But there are an infinite number of lines perpendicular to any line. So this intercept can arrive at any point. Is this meant to be more specifically the intercept of the ROC curve with a line at 90 degrees to the no-discrimination line intersecting at its midpoint? —Preceding unsigned comment added by (talk) 16:01, 4 November 2010 (UTC)

Interesting question ... "the intercept" would define a point along the no-discrimination line. This could be a normalized statistic of the ROC curve, such that you could fit nearly infinite ROC curves by knowing the no-discrimination line point. I doubt many have explored this concept. Would be good to search on "ROC curve no-discrimination lines", to find sources. Zulu Papa 5 * (talk) 19:53, 4 November 2010 (UTC)
Most likely whoever wrote that meant the line going from (0,1) to (1,0). But actually any of the lines perpendicular to the diagonal can give a summary statistic with equivalent information, it's just that the (0,1) -> (1,0) line is probably the most useful. The line going through (0.9,0.9), for instance, will have an intercept that changes very little as the curve changes. —Preceding unsigned comment added by (talk) 04:09, 5 December 2010 (UTC)
This makes sense. (It's nice to have a name for it -- I just always thought of it as the "100%" value, since it's the unique point whose coordinates sum to 1.)
Note that there is another distinguished point: the point of maximum (perpendicular) deviation from the no-discrimination line. Its Euclidean distance from the line is the Kolmogorov statistic D, divided by sqrt(2). (Equivalently, its distance measured in the sum-of-coordinates norm is D, and in the max-norm is D/2.)
This observation addresses another point raised on this page: The Kolmogorov-Smirnov test, which compares two populations of samples -- the two axes here -- yields a probability after combining D with the number of observations (roughly, D*sqrt(N)). As a result, the test properly recognizes that a single-point ROC cannot yield a discriminator in which you can have any confidence. Jmacwiki (talk) 15:27, 10 July 2011 (UTC)
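A small sketch of the distinguished point, using made-up ROC points (hypothetical values and names). D is taken as the largest vertical gap TPR - FPR, and the Euclidean distance from that point to the no-discrimination line is then D / sqrt(2).

```python
import math

def max_deviation(points):
    """Return (D, point), where D = max(TPR - FPR) over the given ROC points."""
    best = max(points, key=lambda p: p[1] - p[0])
    return best[1] - best[0], best

def euclidean_distance_to_diagonal(x, y):
    """Perpendicular distance from (x, y) to the line y = x."""
    return abs(y - x) / math.sqrt(2.0)

roc_points = [(0.0, 0.0), (0.1, 0.5), (0.3, 0.8), (0.6, 0.95), (1.0, 1.0)]
d_stat, (x_star, y_star) = max_deviation(roc_points)
distance = euclidean_distance_to_diagonal(x_star, y_star)
print(d_stat, (x_star, y_star), distance)
```

Turning D into a significance level additionally needs the sample sizes, which is where the number of observations enters.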

Cut offs

How about discussing the meaning and use of cut offs as illustrated here? (The cut offs are the labels 0.1, 0.2, etc on the curve.) AndrewHZ (talk) 16:55, 6 December 2010 (UTC)

incorrect definition of false positive rate

The false positive rate is the conditional probability that a person truly has a disease given that they test positive. In other words, P(D|+). In signal detection theory there is something called the false alarm rate, which is being incorrectly used in this article as the false positive rate. This is all made clear in Statistical Methods for Rates and Proportions by Joseph L. Fleiss.

This misunderstanding of the false positive rate is unfortunately fairly widespread, and having it misstated in this article is no help. —Preceding unsigned comment added by (talk) 14:06, 24 January 2011 (UTC)

No, the conditional probability that a person truly has a disease given that they test positive is the positive predictive value. The false positive rate is the conditional probability that a person tests positive given that they don't have the disease. --Qwfp (talk) 14:36, 24 January 2011 (UTC)
Quite so. And the latter is identical to the false alarm rate. (The names really aren't opaque. They mean what they say, and they say equivalent things!) Jmacwiki (talk) 07:04, 2 February 2013 (UTC)
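The distinction is easy to see on a single table. The counts below are made up (1000 people, 100 of them with the disease) and the function name is hypothetical; the point is only that the quantities condition on different things.

```python
def rates(tp, fp, fn, tn):
    """A few conditional probabilities read off a 2x2 confusion matrix."""
    return {
        "true_positive_rate": tp / (tp + fn),          # P(test + | disease)
        "false_positive_rate": fp / (fp + tn),         # P(test + | no disease)
        "positive_predictive_value": tp / (tp + fp),   # P(disease | test +)
        "false_discovery_rate": fp / (tp + fp),        # P(no disease | test +)
    }

# Hypothetical counts
r = rates(tp=90, fp=45, fn=10, tn=855)
for name, value in r.items():
    print(name, round(value, 4))
```

Only the first two rates appear on the ROC axes; the last two also depend on how common the disease is.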


Is there a ref for this [3] thanks. Zulu Papa 5 * (talk) 02:10, 7 March 2011 (UTC)

Confusing matrix

Just noted that the Confusion matrix page gives the columns as predicted values and rows as actual values, whereas the confusing matrix used on this page has swapped the meaning of columns and rows. —Preceding unsigned comment added by (talk) 20:49, 10 April 2011 (UTC)


"If a z-transformation is applied to the ROC curve, the curve will be transformed into a straight line." - How so? I don't see how that can follow, in the general case. (I assume we're talking about z-score and not the Z transform.) Converting the data to z-scores is a linear transform. --mcld (talk) 09:37, 23 November 2011 (UTC)

That section headed Z-transformation was added on 9 May 2011 by, who has no other edits. I can't follow it either. If no-one comes forward to offer to clarify it, I suggest we delete the section. Qwfp (talk) 12:27, 23 November 2011 (UTC)
The sections Detection error tradeoff graph and Z-transformation were inserted at the wrong place, interrupting the original Further interpretations section. I've moved them so that the Area under curve and Other measures subsections are filed correctly under Further interpretations.
The "Z-transformation" might be a misnomer. It doesn't refer to the Z-transform in signal processing. Though somewhat related to the Z-score, it's not the same thing either. The transformation is a warping of the axes by the inverse of the normal cumulative distribution function, so that 0.5 becomes 0, etc. But I'm not ready to edit it because I don't know where the name "zROC" came from.
Since there is already a page for the Detection error tradeoff curve in Wikipedia, I suggest merging the sections Detection error tradeoff graph and Z-transformation to that page. MaigoAkisame (talk) 20:02, 8 August 2012 (UTC)
Re: "z score" vs. "z transform": This is overly pedantic. It is true that "z-transform" is used differently in discrete signal processing. However, "z-transform" is also regularly used this way in statistics. The two fields use the term differently, and Wikipedia is not going to define that behavior out of existence, nor should it. The term "Z transform" should be retained here; "Z standardization" is much less widely used. (talk) A statistical signal processing expert -- Preceding undated comment added 16:56, 19 July 2013 (UTC)
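To illustrate when the straight-line claim holds, here is a sketch under the assumption that both score distributions are normal (the binormal model); the means and standard deviations are made up and the helper names are hypothetical. Warping both axes by the inverse normal CDF then gives points on a line whose slope is the ratio of the two standard deviations. For other distributions the transformed curve need not be straight.

```python
from statistics import NormalDist

def binormal_roc_point(threshold, neg, pos):
    """(FPR, TPR) when scores above `threshold` are called positive."""
    return 1.0 - neg.cdf(threshold), 1.0 - pos.cdf(threshold)

def z_transform(fpr, tpr):
    """Warp both ROC axes by the inverse standard normal CDF."""
    std = NormalDist()
    return std.inv_cdf(fpr), std.inv_cdf(tpr)

neg = NormalDist(mu=0.0, sigma=1.0)  # assumed distribution of negative scores
pos = NormalDist(mu=1.5, sigma=2.0)  # assumed distribution of positive scores
z_points = [
    z_transform(*binormal_roc_point(t, neg, pos))
    for t in (-1.0, 0.0, 1.0, 2.0)
]
# Slopes between consecutive z-points; expected to equal sigma_neg / sigma_pos
slopes = [
    (z1[1] - z0[1]) / (z1[0] - z0[0])
    for z0, z1 in zip(z_points, z_points[1:])
]
print(z_points)
print(slopes)
```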


Mistakes in:

A completely random guess would give a point along a diagonal line (the so-called line of no-discrimination) from the left bottom to the top right corners (regardless of the positive and negative base rates). An intuitive example of random guessing is a decision by flipping coins (heads or tails).

This should be a biased coin. And more is not good. — Preceding unsigned comment added by (talk) 10:34, 16 April 2012 (UTC)

I came to the discussion page to look for an explanation of this, and I agree that the wording is unclear at best---it seems to suggest that for a small data set you would end up somewhere along the diagonal, and as the number of data points increases, you converge to (0.5,0.5). This is not what happens, you end up on the diagonal for large data sets, and where you end up is determined by the bias of the coin. --passerby — Preceding unsigned comment added by (talk) 22:08, 9 July 2014 (UTC)
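A quick simulation of the point made above, with made-up numbers (20% base rate, three coin biases; function name hypothetical). A coin that says "positive" with probability p, ignoring the data, should land near (p, p) on the diagonal for a large data set, whatever the base rate.

```python
import random

def coin_classifier_point(labels, p_positive, rng):
    """Call each case positive with probability p_positive, ignoring the data."""
    tp = fp = 0
    for y in labels:
        says_positive = rng.random() < p_positive
        if says_positive and y == 1:
            tp += 1
        elif says_positive:
            fp += 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return fp / n_neg, tp / n_pos  # (FPR, TPR)

rng = random.Random(0)               # fixed seed so the run is repeatable
labels = [1] * 20000 + [0] * 80000   # assumed 20% base rate, for illustration
coin_points = {p: coin_classifier_point(labels, p, rng) for p in (0.1, 0.5, 0.9)}
for p, point in coin_points.items():
    print(p, point)
```

A fair coin is just the special case p = 0.5, which ends up near the middle of the diagonal.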

Misinterpretation of epitope detection result?

In the section "Other measures": The graph shows that if one detects at least 60% of the epitopes in a virus protein, at least 30% of the output is falsely marked as epitopes. I wonder if the second half of the sentence is wrong -- we should say "at least 30% of non-epitopes are falsely detected". MaigoAkisame (talk) 19:41, 8 August 2012 (UTC)

Threshold choice

Can someone add some comments/references on the choice of the threshold? Indeed, after measuring the ROC curve of a system, one may want to optimize the system by choosing a specific point on the curve, i.e. a specific threshold, either by:

  • minimizing the distance with the upper-left corner
  • maximizing the distance with the random line (orthogonal projection)
  • staying below a given FPR or above a given TPR (to comply with project goals)

Lagaffe (talk) 10:42, 12 November 2012 (UTC)
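The three criteria listed above can be sketched as follows, assuming the measured curve is available as a list of (threshold, FPR, TPR) tuples. The values and the function name are made up for illustration. Note that the perpendicular distance to the random line is (TPR - FPR) / sqrt(2), so maximizing it is the same as maximizing TPR - FPR.

```python
import math

def choose_thresholds(roc, max_fpr=None):
    """roc: list of (threshold, fpr, tpr). Returns one pick per criterion."""
    # 1. Closest to the upper-left corner (0, 1)
    closest_corner = min(roc, key=lambda r: math.hypot(r[1], 1.0 - r[2]))
    # 2. Farthest from the random line, i.e. largest TPR - FPR
    farthest_from_diagonal = max(roc, key=lambda r: r[2] - r[1])
    # 3. Best TPR among points that respect an FPR cap (None if nothing qualifies)
    constrained = None
    if max_fpr is not None:
        feasible = [r for r in roc if r[1] <= max_fpr]
        if feasible:
            constrained = max(feasible, key=lambda r: r[2])
    return closest_corner, farthest_from_diagonal, constrained

# Hypothetical measured curve: (threshold, FPR, TPR)
roc = [
    (0.9, 0.00, 0.30),
    (0.7, 0.05, 0.60),
    (0.5, 0.20, 0.85),
    (0.3, 0.45, 0.95),
    (0.1, 1.00, 1.00),
]
corner, youden, capped = choose_thresholds(roc, max_fpr=0.10)
print(corner, youden, capped)
```

The first two criteria often agree but are not guaranteed to, and the third can pick a quite different threshold when the cap is tight.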

Undefined Variables in Formula

The AUC section contains a formula with undefined variables, viz., X, Y and k. This is not acceptable in an article. I presume it was just copied from some text somewhere in which the definitions were given. If the author of that section is monitoring this talk page, I implore him/her to please define the variables. This is a bad (and annoying) practice and we should try to force clear definitions in every instance on wikipedia. Formulas are nearly useless without the definitions of their elements. Chafe66 (talk) 22:21, 30 January 2013 (UTC)

Bad link for Z-transformation?

In the paragraph headed "Z-transformation", I think the blue link on the first occurrence of the phrase "Z-transformation" is in error. The linked page seems to have nothing to do with the transformation being referred to here. The reference should be to some transformation associated with the Normal distribution. Stephen Robertson (talk) 09:42, 24 April 2013 (UTC)

Possible copyvio

The Basic concept section (which begins "A classification model (classifier or diagnosis) is...") contains a piece of text identical to text I've found in page 3 of this document. That document has a publishing date of 2003, while the text here seems to have been added in 2007 by this edit. As a side note, while I initially thought the paragraphs following it were also a copyvio, they have in fact apparently been reproduced in this book published in 2010. AdventurousSquirrel (talk) 13:09, 13 January 2014 (UTC)

External links

This article has been tagged for external link cleanup since March 2010. In looking over the existing external links, most of them are very borderline; they're mostly to personal pages of mathematicians working in the field or mathematicians' collections of academic papers on the subject. I can see an argument for each one to be included, but taken as a whole, they do feel kind of crufty and excessive.

I fully expect for at least part of this particular removal to be (justifiably) overridden by the regular editors -- I'm working through cleanup backlog of the external-link-template categories; I'm not a regular here -- because unlike the more egregious instances of "holy 73 links to personal webpages, Batman!" that the external-link-template tends to collect, I'm really on the fence about this one. I decided, however, to remove the external links section entirely, toss them here onto the talk page, and ask that instead of y'all reverting my edit wholesale that you instead take the chance to consider each link carefully and manually add back only the most justified and useful links to the article.

For what it's worth, and as an external opinion, the ones I really was this close to retaining were Kelly Zou's bibliography and the web-based calculator by John Eng.

--rahaeli (talk) 15:46, 18 May 2014 (UTC)

(PS: as someone with a nasty case of undiagnosed dyscalculia, leading to a strongly conditioned math-anxiety response, I have to commend the regular editors here: this is a wonderful example of specialist subject matter being written in such a way as to be crystal clear to the layperson, i.e., me. I think this is one of the articles where Wikipedia really shines. --rahaeli (talk) 15:51, 18 May 2014 (UTC))

Etymology of "receiver operating characteristic"?

Just curious how the hell the name came about, as it seems a bit flaunty given that it's just true positives vs false positives. — Preceding unsigned comment added by Mmkstarr (talkcontribs) 00:56, 18 March 2016 (UTC) See last paragraph. Hous21 (talk) 00:59, 18 March 2016 (UTC)

Criticism of AUC debunked in recent research.

The article here holds a relatively critical view of AUC for estimating classifier performance based on the paper by Hand. This research seems to be mostly superseded by the more recent paper by Flach cited later. "Small-sample precision of ROC-related estimates" explicitly talks about small sample estimates, which is not relevant for machine learning applications. Therefore the paragraph seems inconsistent to me. There seems to be no current literature suggesting that AUC actually is not a good measure. Still, the article goes on to say "One recent explanation of the problem with ROC AUC", which implies that there is a problem. Andreas Mueller (talk) 20:47, 29 April 2016 (UTC)