Talk:Prediction interval

From Wikipedia, the free encyclopedia
Which percentile to use

I'm not sure about the line "where Ta is the 100(1 − (p/2))th percentile of Student's t-distribution..." for a 100p% prediction interval. For example, for a 90% prediction interval that would be the 55th percentile, which doesn't sound right - or am I missing something?

Perhaps it should instead read "where Ta is the 100(1 − (α/2))th percentile of Student's t-distribution..." for a 100(1-α)% interval, also replacing p by 1-α in the line above (i.e. α is the error rate in the prediction, whereas p was the success rate). For a 90% prediction interval (α=0.1) that would mean using the 95th percentile, which sounds more reasonable.

For possible support for this formulation see which defines α in the same way and uses the 100(α/2)th and 100(1-(α/2))th percentiles of a general distribution. Also, which uses the 100(α/2)th percentile of the t-distribution - I assume that the choice of 100(α/2)th or 100(1-(α/2))th percentile depends on how your t-distribution tables are written.

Alternatively, the definition of p as a success rate in the article could be retained by referring to the 100((1+p)/2)th percentile of the t-distribution, in which case the error rate α would not need to be introduced.

Richard J Price 10:54, 22 March 2007 (UTC)

In agreement with the table on the Student's t page: if T_a is the 100((1+p)/2)th percentile, then P(T < T_a) = (1+p)/2 = (1+(1−α))/2 = 1 − α/2, with T Student-t distributed, which is the correct tail probability for a two-sided interval (see confidence interval). Gummif (talk) 00:49, 3 August 2013 (UTC)
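The equivalence of the 100((1+p)/2)th and 100(1 − (α/2))th percentiles can be checked numerically. A minimal Python sketch using the standard normal (the stdlib has no t quantiles, but the identity between the two percentile expressions is the same for any distribution):

```python
from statistics import NormalDist

p = 0.90           # desired two-sided coverage
alpha = 1 - p      # error rate
nd = NormalDist()
q_alpha = nd.inv_cdf(1 - alpha / 2)  # 100(1 - alpha/2)th percentile
q_p = nd.inv_cdf((1 + p) / 2)        # 100((1 + p)/2)th percentile
# both give the 95th percentile (~1.645), not the 55th (~0.126)
print(q_alpha, q_p)
```

Either formulation picks out the same critical value; only the bookkeeping (success rate p versus error rate α) differs.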


Could we please get another example, with a population variable such as apple width or orange peel thickness, instead of a bunch of abstract equations? Thanks in advance. 21:41, 19 April 2007 (UTC)

I’ve elaborated and given some simpler and clearer examples, notably the simple non-parametric estimation – hope it’s clearer now!
—Nils von Barth (nbarth) (talk) 17:21, 19 April 2009 (UTC)

Bayesian Statistics

Why exactly is this stated --- "In Bayesian statistics, one can compute (Bayesian) prediction intervals from the posterior probability of the random variable, as a credible interval. In theoretical work, credible intervals are not often calculated for the prediction of future events, but for inference of parameters – i.e., credible intervals of a parameter, not for the outcomes of the variable itself"? --- It's quite common in practice to create a posterior predictive distribution which gives you an interval for the actual outcome of the variable itself. (talk) 19:23, 30 April 2011 (UTC)
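To illustrate that point, a posterior predictive interval for the next observation can be computed directly. A minimal sketch assuming a conjugate normal model with known observation variance (all numbers here are made up for illustration):

```python
from statistics import NormalDist

sigma = 2.0               # known observation sd (assumed)
mu0, tau0 = 0.0, 10.0     # prior mean/sd on the unknown mean mu
data = [4.1, 5.0, 3.7, 4.6]
n, xbar = len(data), sum(data) / len(data)

# posterior for mu: Normal(mu_n, tau_n2)
tau_n2 = 1 / (1 / tau0**2 + n / sigma**2)
mu_n = tau_n2 * (mu0 / tau0**2 + n * xbar / sigma**2)

# posterior predictive for a NEW observation: Normal(mu_n, sigma^2 + tau_n2),
# so the interval is for the outcome itself, not just the parameter
pred_sd = (sigma**2 + tau_n2) ** 0.5
z = NormalDist().inv_cdf(0.975)
lo, hi = mu_n - z * pred_sd, mu_n + z * pred_sd
```

The predictive sd combines parameter uncertainty (tau_n2) with sampling noise (sigma^2), which is exactly what distinguishes a predictive interval from a credible interval for the parameter alone.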

unclear on scope

The article could possibly be clarified by relating a prediction interval to a tolerance interval. The intro currently uses language that a prediction interval is not normally appropriate for, although terminology in this area can be a bit inconsistent:

an estimate of an interval in which future observations will fall, with a certain probability, given what has already been observed

If not read carefully, this could imply that a prediction interval is an interval bounding n% of all future samples from a process, which would be equivalent to n% population coverage, which is not typically what a prediction interval gives you (except on average). However I'm not entirely sure where to go in clarifying this article. I've started by expanding tolerance interval instead. -- (talk) 14:14, 26 August 2011 (UTC)
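The "except on average" caveat can be made concrete with a Monte Carlo sketch (illustrative numbers; it uses a normal-quantile approximation rather than the exact t-based interval): each sample's estimated 90% prediction interval covers a different fraction of the population, and only the average over repeated samples lands near 90%.

```python
import random
from statistics import NormalDist

random.seed(1)
z = NormalDist().inv_cdf(0.95)        # two-sided 90% -> 95th percentile
n, trials = 10, 2000
d = NormalDist()                      # true population: N(0, 1)
coverages = []
for _ in range(trials):
    xs = [random.gauss(0, 1) for _ in range(n)]
    m = sum(xs) / n
    s = (sum((x - m) ** 2 for x in xs) / (n - 1)) ** 0.5
    half = z * s * (1 + 1 / n) ** 0.5   # normal approximation to the t interval
    coverages.append(d.cdf(m + half) - d.cdf(m - half))

avg = sum(coverages) / trials             # near 0.90 on average
spread = max(coverages) - min(coverages)  # individual samples vary widely
```

A tolerance interval, by contrast, is constructed so that the population coverage itself meets a target with stated confidence, not merely on average.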

Standard score

The source of confusion is clearly explained by Melcombe on the project page. A prediction interval [L, U] is an interval such that, for a future observation X, P(L < X < U) has a given value. For the standard score Z of X this gives:

P\left( \frac{L-\mu}{\sigma} < Z < \frac{U-\mu}{\sigma} \right) = \gamma

By determining the quantile z such that

P\left( -z < Z < z \right) = \gamma

it follows:

L=\mu-z\sigma,\  U=\mu+z\sigma

Notice Z is a standard score, z is not. Actually I don't think the use of the term standard score is much of a help. Nijdam (talk) 08:16, 13 May 2012 (UTC)
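The known-parameters derivation above can be checked directly in a few lines (illustrative numbers):

```python
from statistics import NormalDist

mu, sigma = 10.0, 2.0    # assumed known mean and sd (illustrative)
gamma = 0.95             # desired coverage
z = NormalDist().inv_cdf((1 + gamma) / 2)  # quantile with P(-z < Z < z) = gamma
L, U = mu - z * sigma, mu + z * sigma
# check: P(L < X < U) = gamma for X ~ N(mu, sigma^2)
cov = NormalDist(mu, sigma).cdf(U) - NormalDist(mu, sigma).cdf(L)
```

Here z is a fixed quantile of the standard normal, while Z = (X − μ)/σ is the random standard score, which is the distinction Nijdam draws.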

I still think it's necessary to mention standard score in the article. Let's continue the issue at that project page: Wikipedia_talk:WikiProject_Statistics#Standard_score. Mikael Häggström (talk) 18:57, 13 May 2012 (UTC)

If Known mean and known variance then it is not a prediction interval but a tolerance interval

Maybe this is just about semantics, but if you agree then we should remove the example for "Known mean, known variance" and just link this case to Tolerance interval. What do you think?

There should be a link to [[tolerance interval]], but the material should stay, as the structure of the intervals needs to be compared across the cases where the parameters need to be estimated or not. However, overall the univariate normal example is somewhat long and written at a textbook level, so perhaps this part can be reduced. Melcombe (talk) 07:03, 18 July 2012 (UTC)

On known mean, unknown variance

In this case, for normal population we have:

s^2=\frac{1}{n-1}\sum\limits_{i=1}^n(X_i- \mu)^2,

and, therefore

\frac{(n-1)s^2}{\sigma^2}

is chi-squared distributed with n degrees of freedom;

\sqrt{\frac{n}{n-1}}\frac{X- \mu}{s}\sim t_n

is t-distributed with n degrees of freedom, i.e. the statistic

\frac{X-\mu}{s}

is scaled Student-t_n distributed. J. Angelova — Preceding unsigned comment added by (talk) 20:34, 12 October 2012 (UTC)

Your point being? Fgnievinski (talk) 20:31, 6 February 2013 (UTC)
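For what it's worth, the claimed pivot distribution can be checked by simulation. A Monte Carlo sketch with made-up parameters, whose empirical 97.5% quantile should approach the t distribution's critical value with n = 5 degrees of freedom (about 2.571):

```python
import math
import random

random.seed(0)
mu, sigma, n = 0.0, 1.0, 5

def pivot():
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    s2 = sum((x - mu) ** 2 for x in xs) / (n - 1)  # known-mean variance estimate
    x_new = random.gauss(mu, sigma)                # independent future observation
    return math.sqrt(n / (n - 1)) * (x_new - mu) / math.sqrt(s2)

draws = sorted(pivot() for _ in range(200_000))
# near the t_5 value ~2.571, well above the normal value ~1.960
q975 = draws[int(0.975 * len(draws))]
```

That the pivot has n (not n − 1) degrees of freedom is the practical content of the comment: with a known mean, the sum of squares loses no degree of freedom to estimating it.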


I don't follow!

When forecasting a growth curve (x_1, x_2, ..., x_n), then P(x_i < x_{i+1}) > P(x_i > x_{i+1}).

In fact, P(x_i < x_{i+1}) = 1 − e, where e is of the order of magnitude of the error in the data.

Please explain or cite references. — Preceding unsigned comment added by AlainD (talkcontribs) 18:09, 25 January 2014 (UTC)


Looking in my textbook, I see the best estimate for x_t has an expectation of \bar{y}_t = \beta + \alpha x_t and standard deviation \sqrt{ MS_E \left(1+\frac 1 n + \frac{(x_t - \bar x)^2}{S_{xx}}\right) }.

This implies that the error on the forecast estimate is minimal for x_0 = \bar x and widens as |x_0 - \bar x| increases. It also implies that the interval for a new observation at x_0 is always wider than the confidence interval for the best estimate at x_0.

Is it the same concept? If so, is there a reason not to include the complete formula? AlainD (talk) 21:00, 25 January 2014 (UTC)
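The quoted formula can be checked numerically. A quick sketch with made-up data showing that the forecast sd is narrowest at \bar x and widens away from it:

```python
# ordinary least squares on made-up data, then the forecast sd from the textbook formula
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n
Sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / Sxx
b0 = ybar - b1 * xbar
MSE = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2)

def pred_sd(x0):
    # sd of the forecast error at x0; the (x0 - xbar)^2 term widens it away from xbar
    return (MSE * (1 + 1 / n + (x0 - xbar) ** 2 / Sxx)) ** 0.5
```

Here pred_sd(3.0) (the sample mean of x) is smaller than pred_sd(1.0) or pred_sd(5.0). The leading 1 inside the parentheses is what makes this a prediction interval for a new observation rather than a confidence interval for the fitted mean.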