Talk:Regression toward the mean

WikiProject Statistics (Rated C-class, High-importance)

This article is within the scope of the WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion.

This article has been rated as C-Class on the quality scale.
This article has been rated as High-importance on the importance scale.
WikiProject Mathematics (Rated C-class, High-importance)
This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of Mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
Mathematics rating: C Class, High Importance. Field: Probability and statistics

Confusion of terms and concepts

In the *Other Examples* section the author appears to consistently confuse probability and statistics. They claim: "If your favorite sport team won the championship last year, what does that mean for their chances for winning next season? To the extent this is due to skill (the team is in good condition, with a top coach etc.), their win signals that it's more likely they'll win next year." This confuses the statistics of the victorious match (1 win out of 1 trial, implying a 100% likelihood) with the probability of future matches, and then compares that probability to the statistical sample. How could it be more likely to win than 100%? The entire section needs to be revised or removed. Preceding unsigned comment added by 184.54.35.21 (talk) 08:05, 15 April 2018 (UTC)

Issue with some phrasing

In the *Misunderstandings* section, there is this phrase "So for every individual, we expect the second score to be closer to the mean than the first score." This is not true. For example, individuals who have a score of exactly the mean should experience regression away from the mean. Regression toward the mean is specifically a phenomenon that affects the 'highest'/'lowest' performers. Everyone else expects to experience some sort of motion with respect to the mean, but not necessarily towards it.

In fact, there are a few statements surrounding that one that are related and misleading. I will now go and attempt to edit to clarify this statement in the page myself. --Ihearthonduras (talk) 18:27, 14 November 2017 (UTC)
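
A minimal sketch of the point above, under a toy model that is purely an assumption for illustration: each score is a stable "ability" plus independent "luck", both standard normal, with a population mean of 0. The function names and the score bands are hypothetical. It compares the average distance from the mean on the second test for people whose first score was almost exactly at the mean and for people whose first score was extreme.

```python
import random
import statistics

def simulate(n=100_000, seed=0):
    """Simulate two noisy tests per person: score = ability + luck."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        ability = rng.gauss(0.0, 1.0)           # stable component, population mean 0
        first = ability + rng.gauss(0.0, 1.0)   # independent luck on test 1
        second = ability + rng.gauss(0.0, 1.0)  # independent luck on test 2
        pairs.append((first, second))
    return pairs

def mean_abs_second(pairs, lo, hi):
    """Average |second score| among people whose first score fell in [lo, hi)."""
    return statistics.fmean([abs(s) for f, s in pairs if lo <= f < hi])

pairs = simulate()
near_mean = mean_abs_second(pairs, -0.1, 0.1)  # first score within 0.1 of the mean
extreme = mean_abs_second(pairs, 2.0, 2.5)     # first score at least 2.0 above the mean
print(f"first score near mean: average second-test distance {near_mean:.2f}")
print(f"first score extreme:   average second-test distance {extreme:.2f}")
```

Under these assumptions the near-mean group ends up, on average, farther from the mean than the 0.1 it started within, while the extreme group ends up closer than the 2.0 it started beyond. The expected second score of someone exactly at the mean is still the mean; it is the expected distance that grows.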

Different use in finance

I added a note to the end of the introduction, because I'm almost certain that "mean reversion" as used in finance is fundamentally different from "reversion to the mean" or "regression to the mean" as described here. I don't think the Wikipedia article on Mean reversion (finance) is clear on this. As I understand it, as used in science and statistics, mean reversion is an effect that shows up when genuinely independent random samples are drawn successively from a fixed population having a constant frequency distribution.

As used in finance, it seems to refer to a situation in which performance over successive time periods is not independent, but shows a negative correlation from one time period to the next. A fluke period of low returns is not simply followed by a typical period of average returns, as it would be in a purely random process. On the contrary, a period of low returns has an actual tendency to be followed by a compensating period of high returns. Thus, as holding periods increase, the variability of the average return decreases faster than it would if the process were a random walk.

The law of large numbers says that if you throw 10 heads in a row, then flip the coin 100 more times, the proportion of heads for the whole 110 throws will be closer to 50%, not because there's any tendency to throw more tails after a long series of heads, but simply because the most likely outcome is that the 100 additional throws will be split 50/50, so the percentage for the whole series declines from 10/10 = 100% heads to (10 + 50) / 110 ≈ 55%. I've talked to a couple of financial specialists who have been quite definite that in finance "mean reversion" does not just mean swamping out an unusual run with a series that simply has the mean value; it means active compensation: a run of low stock returns will (supposedly) tend to be followed, not by a run with mean-value stock returns, but by a run of higher-than-mean stock returns.

In the article, I'm doing my best to present this by paraphrasing what Jeremy Siegel says, but I admit that I'm going just a little farther by using the word "compensation." Dpbsmith (talk) 15:33, 22 December 2011 (UTC)
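
The coin arithmetic above can be checked exactly. This is only a sketch: the function name and its parameters are hypothetical, and it assumes the additional flips are independent of the initial run, which is exactly the "no compensation" case.

```python
from fractions import Fraction

def expected_pooled_fraction(heads_so_far, flips_so_far, extra_flips, p=Fraction(1, 2)):
    """Expected fraction of heads over all flips, given an observed initial run.

    The extra flips are independent of the past, so each contributes p heads
    on average; the initial run is only diluted, never compensated for.
    """
    expected_heads = heads_so_far + p * extra_flips
    return expected_heads / (flips_so_far + extra_flips)

after_100 = expected_pooled_fraction(10, 10, 100)        # 60/110 = 6/11
after_10000 = expected_pooled_fraction(10, 10, 10_000)   # 5010/10010 = 501/1001
print(after_100, float(after_100))
print(after_10000, float(after_10000))
```

The expected proportion stays above one half however many flips are added; it only approaches it. Active compensation, as described for finance, would instead require the later flips themselves to average fewer than 50% heads.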

I think, with regard to finance, the random parts of the time series are generally modeled as a stationary process. I would conjecture with 95% confidence that stationary processes exhibit a "regression toward the mean" sort of phenomenon. This is probably the missing link that you want between these two articles. --Ihearthonduras (talk) 18:33, 14 November 2017 (UTC)
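
One way to make that conjecture concrete, offered as a sketch rather than a general proof: for a weakly stationary process with mean $\mu$ and lag-one autocorrelation $\rho$ (symbols introduced here for illustration, not taken from either article), the best linear predictor of the next value pulls toward the mean.

```latex
\hat{X}_{t+1} = \mu + \rho\,(X_t - \mu), \qquad |\rho| \le 1
\quad\Longrightarrow\quad
\bigl|\hat{X}_{t+1} - \mu\bigr| = |\rho|\,\bigl|X_t - \mu\bigr| \le \bigl|X_t - \mu\bigr| .
```

With $\rho = 0$ (independent draws) the prediction is simply $\mu$, which is the statistical sense of regression toward the mean. With $\rho < 0$ the prediction lies on the opposite side of $\mu$, which matches the "compensation" described for finance.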

Examples

"If your favorite sport team won the championship last year, what does that mean for their chances for winning next season? To the extent this is due to skill (the team is in good condition, with a top coach etc.), their win signals that it's more likely they'll win next year. But the greater the extent this is due to luck (other teams embroiled in a drug scandal, favourable draw, draft picks turned out well etc.), the less likely it is they'll win next year."

I don't see this as a good example of regression to the mean at all. There are so many feedback loops at work (increased investment, morale, attracting better-quality people, etc.) that they will override any theoretical underlying probability based on 'normal conditions'. Also, winning a championship is binary (you either do or you don't); it might be more useful to talk about whether the final rank is higher or lower than the team's average rank. Btljs (talk) 11:53, 19 January 2018 (UTC)

Confusion about what is being implied

The "Other statistical phenomena" section of the article says: "For example, following a run of 10 heads on a flip of a fair coin (a rare, extreme event), regression to the mean states that the next run of heads will likely be less than 10..." But what does that actually mean? The odds of throwing 10 heads in a row haven't changed at all because of that rare first event. They are still, as they always were, 1/1024, and they don't change over time; previous experience is completely irrelevant to any future probability. There is no God evening things out, trying to make things fairer. It is true that a run of 10 heads with a fair coin is a rare event, so if you keep throwing the coin it is highly probable that you will not throw another 10 heads in a row. But the probability of that same rare event happening again has not changed at all: the odds of a second run of 10 heads are exactly the same as they were for the first. When you look at all the throws, say a couple of hundred later, the ratio of heads to tails is highly likely, but only highly likely, to be close to 50/50. That definitely does not mean that the second run of 10 throws was less likely to end up as 10 heads than the first, which seems to be what is being subtly implied in this Wikipedia article. It is true that following a run of 10 heads the next run will likely contain fewer than 10 heads, but any run of 10 throws was always likely to contain fewer than 10 heads, even before the first run, so what of significance is being said there?! The probability for the second run hasn't changed; the odds are still 1/1024 of all ten throws turning out heads.
Even if you did by chance throw another 10 heads the second time, the calculated odds for a third run of 10 throws are still exactly the same as they were for the very first run: a 1/1024 chance of getting 10 heads in a row! Once an extreme pure-chance event happens, it does not lessen the likelihood of another one happening. It is less likely that you'll get two extreme pure-chance events than just one, but once one has happened it has absolutely no effect on the future probability of another. The "Other statistical phenomena" section confuses me; what on Earth is it implying!? It almost sounds like a confidence trick, as if a scam is being marketed.
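
The numbers in this comment can be checked exactly. A minimal sketch, assuming a fair coin and independent flips; the function name is hypothetical.

```python
from fractions import Fraction
from math import comb

def prob_heads(k, n=10):
    """Probability of exactly k heads in n independent fair flips."""
    return Fraction(comb(n, k), 2 ** n)

p_all_heads = prob_heads(10)             # chance that a run of 10 flips is all heads
p_fewer = 1 - p_all_heads                # chance that a run has fewer than 10 heads
p_two_runs_in_advance = p_all_heads ** 2 # both runs all heads, judged before any flip
# Conditional probability of a second all-heads run once the first has happened:
p_second_given_first = p_two_runs_in_advance / p_all_heads

print(p_all_heads, p_fewer, p_two_runs_in_advance, p_second_given_first)
```

So "the next run will likely have fewer than 10 heads" holds with probability 1023/1024 whether or not a run of 10 heads came first. The quoted sentence is true of every run of 10 flips, and conditioning on the first run changes nothing.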