Talk:Theil–Sen estimator

WikiProject Statistics (Rated C-class)

This article is within the scope of the WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion.

This article has been rated as C-Class on the quality scale.
This article has not yet received a rating on the importance scale.


Quote: "As Sen observed, this estimator is the value that makes the Kendall tau rank correlation coefficient comparing the sample data values yi with their estimated values mxi + b become approximately zero."

Really? Then the method gives an estimation (mxi + b) completely uncorrelated with the estimated variable (yi)? Olaf (talk) 00:49, 27 April 2014 (UTC)

No, it means that roughly half the yi are greater than the corresponding mxi+b, and roughly half are less. Deltahedron (talk) 19:49, 27 April 2014 (UTC)
No, it's not the median error that is supposed to be zero, as it would be under your interpretation; it's Kendall's tau rank correlation. Counterexample: if yi = xi, then the estimator is mxi + b = 1xi + 0 = xi = yi, and thus the tau correlation between the estimate mxi + b and the original value yi is equal to one, not zero. Olaf (talk) 20:07, 27 April 2014 (UTC)
That's not a particularly good counterexample, since the number of concordant and the number of discordant pairs are both zero, and hence tau=0. Deltahedron (talk) 20:11, 27 April 2014 (UTC)
Let's check: y1=1, y2=2, y3=3.
Estimations: Y1=1, Y2=2, Y3=3
Concordant pairs:
y1 < y2 and Y1 < Y2
y1 < y3 and Y1 < Y3
y2 < y3 and Y2 < Y3
Tied pairs: none
Discordant pairs: none.
Tau = 1
In the absence of tied ranks, the tau correlation has the same property as Pearson's correlation: tau(A,A) = 1. There are no tied ranks when ai <> aj for all i <> j, as is the case here.
Olaf (talk) 20:23, 27 April 2014 (UTC)
No, it's the residuals that are all equal and hence uncorrelated. Deltahedron (talk) 20:37, 27 April 2014 (UTC)
Yes, and the article said it was the estimated values, not their residuals. Now it's fixed ([1]). Thank you for the references. Olaf (talk) 20:43, 27 April 2014 (UTC)
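For anyone following the arithmetic, here is a minimal sketch in Python of the pair counting in this exchange. It assumes the tau-a form of Kendall's coefficient (no tie correction) and takes xi = 1, 2, 3, as in the yi = xi counterexample. The name kendall_tau_a is made up for this note and is not a library function.

```python
from itertools import combinations

def kendall_tau_a(a, b):
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(a)), 2):
        product = (a[i] - a[j]) * (b[i] - b[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
        # product == 0 means a tie in a or in b; it counts for neither.
    n_pairs = len(a) * (len(a) - 1) // 2
    return (concordant - discordant) / n_pairs

x = [1, 2, 3]
y = [1, 2, 3]        # observed values, yi = xi
fitted = [1, 2, 3]   # estimates m*xi + b with m = 1, b = 0
residuals = [yi - fi for yi, fi in zip(y, fitted)]  # all zero

tau_fitted = kendall_tau_a(y, fitted)         # 3 concordant, 0 discordant
tau_residuals = kendall_tau_a(residuals, x)   # every pair is tied
print(tau_fitted, tau_residuals)
```

Both readings come out as stated above: tau between yi and the fitted values is 1, while tau between the residuals and xi is 0 because every pair is tied.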
However, what's important is what independent reliable sources say. Searching "Theil Sen" "Kendall tau" in Google Books gave me: [2], [3], [4] which support the assertion of the text (unlike the reference to Rousseeuw & Leroy (2003), pp. 67, 164 which did not). Deltahedron (talk) 20:19, 27 April 2014 (UTC)
Ok, so it's the tau correlation between the estimation error and the X value that is equal to zero, not the one between the estimator and the estimated value (see the second reference). Olaf (talk) 20:26, 27 April 2014 (UTC)
Thanks for clearing this up. —David Eppstein (talk) 22:36, 27 April 2014 (UTC)
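A small numerical check of the corrected statement, as a sketch only. The data are invented for illustration and do not come from the article or the cited books. The slope is taken as the median of all pairwise slopes, the coefficient is again tau-a (no tie correction), and exact fractions are used so that ties are not decided by rounding.

```python
from fractions import Fraction
from itertools import combinations
from statistics import median

# Illustrative data only (an assumption for this note).
x = [1, 2, 3, 4, 5]
y = [2, 3, 7, 8, 12]

pairs = list(combinations(range(len(x)), 2))

# Theil-Sen slope: median of the slopes through all pairs of points.
slopes = [Fraction(y[j] - y[i], x[j] - x[i]) for i, j in pairs]
m = median(slopes)

# Residuals from the slope alone. The intercept b shifts every residual
# by the same amount, so it cannot change any pairwise comparison.
r = [y[i] - m * x[i] for i in range(len(x))]

def sign(v):
    return (v > 0) - (v < 0)

def tau_a(a, b):
    # (concordant - discordant) / number of pairs
    total = sum(sign(a[j] - a[i]) * sign(b[j] - b[i]) for i, j in pairs)
    return total / len(pairs)

tau_resid_x = tau_a(r, x)                       # residuals against x
tau_y_fitted = tau_a(y, [m * xi for xi in x])   # y against fitted values
above = sum(s > m for s in slopes)   # pairs steeper than the median slope
below = sum(s < m for s in slopes)   # pairs shallower than the median slope
print(m, above, below, tau_resid_x, tau_y_fitted)
```

On these data the median slope is 5/2, with three pair slopes above it and three below. For a pair with xi < xj the residual difference has the same sign as (pair slope - m), so the concordant and discordant counts cancel and tau between the residuals and x is 0, while tau between y and the fitted values is 1.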