# Talk:Student's t-distribution

WikiProject Statistics (Rated B-class, Top-importance)

This article is within the scope of the WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion.

WikiProject Mathematics (Rated B+ class, Top-importance)
This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of Mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
Field: Probability and statistics
One of the 500 most frequently viewed mathematics articles.

## Miscellany

The mgf diverges; a note has been made in the summary table. WT

This page contains special entities (characters) that cannot be displayed on many browsers. See Maxwell's equations for a way to fix this. David 21:17 Oct 15, 2002 (UTC)

As of the beginning of 2003, there is a much better way to fix that problem, and this page now (in displayed math, as opposed to symbols embedded in lines of text) takes advantage of that advance. Michael Hardy 01:14 Feb 9, 2003 (UTC)

There is a possible error in the PDF, compared with, for example, http://www.wolframalpha.com/input/?i=t+distribution or my statistics textbook.

--- —Preceding unsigned comment added by 77.75.167.74 (talk) 13:36, 23 January 2010 (UTC)

## Should we put an index n on our sample variance?

Should we put an index n on our sample variance $S^2$? AxelBoldt 07:58 Feb 9, 2003 (UTC)

I agree with AxelBoldt - the n subscript is confusing. I prefer n-1 or no subscript. The n subscript is usually used when you want to highlight the fact that you divided by n instead of n-1 when calculating the sample variance. It's especially confusing if you are trying to cross-reference the Wikipedia article on variance; an extract from that article follows:
$s_n^2 = \frac 1n \sum_{i=1}^n \left(y_i - \overline{y} \right)^ 2 = \left(\frac{1}{n} \sum_{i=1}^{n}y_i^2\right) - \overline{y}^2,$
and
$s^2 = \frac{1}{n-1} \sum_{i=1}^n\left(y_i - \overline{y} \right)^ 2 = \frac{1}{n-1}\sum_{i=1}^n y_i^2 - \frac{n}{n-1} \overline{y}^2,$
Both are referred to as sample variance. Most advanced electronic calculators can calculate both $s_n^2$ and $s^2$ at the press of a button, in which case that button is usually labeled $\sigma^2$ or $\sigma_n^2$ for $s_n^2$ and $\sigma_{n-1}^2$ for $s^2$.

--Jconnolly 02:20, 27 February 2007 (UTC)
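For anyone cross-checking the two definitions quoted above, here is a minimal Python sketch using only the standard library. The data values are made up purely for illustration; `pvariance` divides by n and `variance` divides by n-1.

```python
import statistics

# Hypothetical data, chosen only for illustration.
y = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(y)

s_n_sq = statistics.pvariance(y)  # divides by n      (the s_n^2 above)
s_sq = statistics.variance(y)     # divides by n - 1  (the s^2 above)

print(s_n_sq)  # 4.0
print(s_sq)    # about 4.5714

# The two estimates differ only by the factor n / (n - 1).
assert abs(s_sq - s_n_sq * n / (n - 1)) < 1e-12
```

The factor n/(n-1) is largest for small n, which is the regime where Student's distribution matters most.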

Now I've done that. There are times when that matters; those are when the dependence on the sample-size n is important, especially when a limit as n approaches infinity is to be mentioned. Perhaps this article is not such an occasion, especially considering that Student's distribution is most important when n is small. But I've already put the subscript on the sample mean, so consistency makes it preferable to put it on the sample variance too. And it might also not hurt to mention the limiting distribution as the number of degrees of freedom grows. Michael Hardy 21:29 Feb 9, 2003 (UTC)

I'm a French user named Thorin. You might be interested in reading my article on Student's distribution: fr:Loi de Student.

I have noticed in the history that someone mentions he found some wrong value on the table. Related or not, I personally think the table isn't wrong, but the example below is not quite right.

The definition of t_k,x is that if T is a variable following Student(k), then the probability that T>t_k,x is equal to x.

Symmetrically, the probability that T<-t_k,x is also equal to x.

This means that the probability that -t_k,x<T<t_k,x is equal to 1-2x (and not to 1-x, as the example assumes). It's only the probability that T<t_k,x that is equal to 1-x.

This means the confidence level shown on the table is the confidence level for having: mean < Xn + A sqrt(S/n). But it is not the confidence level for the interval [Xn - A sqrt(S/n),Xn + A sqrt(S/n)] (hence the passage from 90% to 80% that I added in the article).
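The one-sided versus two-sided point can be checked numerically. The sketch below is not taken from the article: it integrates the standard t density with Simpson's rule (the helper names `t_pdf` and `t_cdf` are my own) and uses the table entry 1.37218 for 10 degrees of freedom.

```python
import math

def t_pdf(x, nu):
    """Density of Student's t with nu degrees of freedom."""
    c = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return c * (1 + x * x / nu) ** (-(nu + 1) / 2)

def t_cdf(x, nu, steps=2000):
    """P(T <= x) via Simpson's rule on [0, |x|] plus symmetry (steps must be even)."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, nu) + t_pdf(a, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

k, t = 10, 1.37218                        # table entry, 10 degrees of freedom
one_sided = t_cdf(t, k)                   # P(T < t), i.e. 1 - x
two_sided = t_cdf(t, k) - t_cdf(-t, k)    # P(-t < T < t), i.e. 1 - 2x
print(round(one_sided, 3), round(two_sided, 3))  # roughly 0.9 and 0.8
```

So the same table value gives 90% for the one-sided bound but only 80% for the symmetric interval.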

I just did a few minor edits to fr:Loi de Student. My feeble grasp of French is such that I would not attempt more than very minor edits. Please note that in TeX you don't need to write ">=". I was identified in the edit history only by the IP number 128.101.152.68. Michael Hardy 19:48, 19 October 2005 (UTC)

THORIN writes: I'll give a few references to back me up (sorry, I'm not yet familiar with external links).

The first reference is in English:

http://www.itl.nist.gov/div898/handbook/eda/section3/eda3672.htm

I draw your attention to the following quote, and to the graph you will find above the table:

"The most commonly used significance level is alpha = 0.05. For a two-sided test, we compute the percent point function at alpha/2 (0.025)."

Their formalism, as well as their variable names, seems to be pretty close to yours.

Here's the second reference (in French, sorry):

http://rfv.insa-lyon.fr/~jolion/STAT/node144.html

It counts confidence level the other way around, but that is not the only difference. This alternative table is built for a formalism that is different from ours, so you'll notice that it is their .20 column which corresponds to your .90 column. In fact their table counts two times the probability corresponding to UCL_1-alpha (.20 = 2(1-.9)).

The third reference is the one I've been using as my main source.

http://newton.mat.ulaval.ca/pages/belisle/Notes-tableaux/Lois-khi2-t-F.pdf

It confirms the two others.

THORIN writes: thanks for the tip on how to adjust bracket size, Michael. It seems that now we should focus on the graphs.

What is the meaning of the psi and B functions in the entropy? I haven't seen them defined anywhere. User:ThorinMuglindir

Those denote the digamma function and Beta function, respectively. --MarkSweep 23:49, 23 October 2005 (UTC)

Thanks Mark, I'll just say that in the article (ThorinMuglindir 22:01, 27 October 2005 (UTC))

## Definition of $\psi$ and $B$

Mark, I have to disagree with you reverting my change in the table. First, we simply can't put a formula in an article that uses undefined items, so links to the definitions of psi and B have to appear somewhere. I'll thus revert the change.

If you really think that these can't be in the table, then I undertake to move them to some other part of the article rather than deleting them outright. If you wish to do this, I draw your attention to two points: putting that in the article's body will make the expression less straightforward for a reader, and will force you to add a "see text for definitions of psi and B" note in the table, which will hardly take less room than the links themselves. Second, if this table is intended as a summary, then it should be self-sufficient to the greatest extent possible, which for me speaks in favor of putting links to the two functions inside the table... ThorinMuglindir 08:22, 28 October 2005 (UTC)

Could the confidence limit section be made clearer with regard to confusion of one-tailed and two-tailed confidence limits? Under the heading "Confidence intervals derived from Student's t-distribution" it currently says "The interval whose endpoints are ... where A is an appropriate percentage-point of the t-distribution, is a confidence interval for μ." I think if you want a 95% confidence interval, then the "appropriate percentage-point" is 97.5% (right?) But the sentence as it currently stands can very easily be misinterpreted as meaning that where A is the 95% percentage point, then the given interval is a 95% confidence interval. If I'm right, how about putting immediately after this sentence, "For example, if A is the value of the t-distribution for the 97.5% percentage point, then the interval is a 95% confidence interval." Cathy Woodgold 2005 Nov 8, 00:02 UT

I've taken the liberty of changing the sentence so it reads "Therefore the interval whose endpoints are ... is a 90% confidence interval for mu." because that is what the previous equation means: the probability that mu lies in the stated interval is 0.9. Simon Duane 2005 Nov 29, 14:53 UT
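To make the "appropriate percentage-point" concrete, here is a rough sketch that inverts the t CDF by bisection. Nothing here comes from the article: the helper names are my own, the CDF is obtained by numerical integration of the standard density, and 10 degrees of freedom is just an example.

```python
import math

def t_pdf(x, nu):
    """Density of Student's t with nu degrees of freedom."""
    c = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return c * (1 + x * x / nu) ** (-(nu + 1) / 2)

def t_cdf(x, nu, steps=2000):
    """P(T <= x) via Simpson's rule on [0, |x|] plus symmetry (steps must be even)."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, nu) + t_pdf(a, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_ppf(p, nu):
    """Percentage point: the value a with P(T <= a) = p, found by bisection."""
    lo, hi = -50.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, nu) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

nu = 10
A = t_ppf(0.975, nu)                      # 97.5% point
coverage = t_cdf(A, nu) - t_cdf(-A, nu)   # mass between -A and A
print(round(A, 3), round(coverage, 3))    # about 2.228 and 0.95
```

That is, a two-sided interval at level 1 - alpha needs the 1 - alpha/2 percentage point, which is the distinction raised above.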

## Table

OK, the table was not as grossly wrong as I thought, but it was badly explained. I will replace it with a further edited version. Michael Hardy 00:46, 8 November 2005 (UTC)

In the example for the table it says:

So that at 90% confidence, we have a true mean lying between

$10\pm1.37218 \frac{\sqrt{2}}{\sqrt{11}}=[9.41490,10.58510]$

But in the case of a double-sided confidence interval it should change from 90% to 80% if one still uses the value r for the 90% one-sided confidence interval. Please revise this. 137.132.3.12 14:14, 3 May 2006 (UTC) Konrad
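The arithmetic of the quoted example is easy to reproduce. This is only a sketch of the calculation with the numbers given in the example (sample mean 10, sample variance 2, n = 11, A = 1.37218); the variable names are mine.

```python
import math

mean, var, n = 10.0, 2.0, 11   # sample mean, sample variance, sample size
A = 1.37218                    # one-sided 90% point for 10 degrees of freedom

half_width = A * math.sqrt(var) / math.sqrt(n)
lower, upper = mean - half_width, mean + half_width
print(f"[{lower:.5f}, {upper:.5f}]")  # [9.41490, 10.58510]
```

The endpoints agree with the article; the open question is only the label, since A leaves 10% in each tail and hence 20% outside the two-sided interval.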

## Missing definition of F_1 in table

Hi, the cdf stated in the summary table uses the function F_1, for which I couldn't find a definition anywhere in the article. Thanks for any explanation. Maybe someone knows what the cdf should look like. Thanks, --ISee 13:13, 2 February 2006 (UTC)

I added the link to the hypergeometric function. Pdbailey 20:32, 20 April 2006 (UTC)

## Expected Mean

I'm not entirely sure, but isn't the expected mean only defined for degrees of freedom > 1? Student's w/one DoF is supposed to be equivalent to standard Cauchy, which definitely has undefined mean.

I would update the page (it's only a minor edit) if I were more sure about my math. Pcastine 18:06, 30 March 2006 (UTC)
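A quick numerical check of the Cauchy connection, as a sketch only (the function names are my own). It compares the t density at one degree of freedom with the standard Cauchy density, and uses the closed form of the truncated integral of |x| f(x) to show why the mean fails to exist.

```python
import math

def t_pdf(x, nu):
    """Density of Student's t with nu degrees of freedom."""
    c = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return c * (1 + x * x / nu) ** (-(nu + 1) / 2)

def cauchy_pdf(x):
    """Standard Cauchy density."""
    return 1 / (math.pi * (1 + x * x))

# With one degree of freedom the two densities coincide.
for x in (-3.0, 0.0, 0.5, 2.0):
    assert math.isclose(t_pdf(x, 1), cauchy_pdf(x), rel_tol=1e-12)

def truncated_abs_mean(R):
    """Integral of |x| * cauchy_pdf(x) over [-R, R], which equals ln(1 + R^2) / pi."""
    return math.log1p(R * R) / math.pi

# The truncated integral keeps growing with R, so E|T| is infinite for nu = 1.
for R in (1e1, 1e3, 1e6):
    print(R, round(truncated_abs_mean(R), 3))
```

Since the integral grows like 2 ln(R)/pi, the mean is undefined at one degree of freedom, which supports restricting the stated mean to more than one degree of freedom.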

## reasons for my recent reversion

The edits by user:129.137.87.25 that I just reverted don't make sense.

• The central limit theorem is not in fact involved in the way it said, because it was given that the random variables involved were already normally distributed, and moreover, if one did use the central limit theorem, one would have to speak of convergence to the normal distribution, rather than of being exactly normally distributed.
• One should not use the same symbol, the lower-case t, to refer both to the random variable and to the argument to the density function.

Michael Hardy 01:40, 3 April 2006 (UTC)

## Reason for reversion on 11-Jun-2006

The text you wrote looks like an advertisement; you'll forgive me for mistaking it for one. You toned down the text quite a bit when you updated it; please do so with the other links you made to your page. In addition, I'm not sure that the link adds something, since there are already links on many of these pages that do exactly the same thing. 128.135.133.123 18:17, 20 June 2006 (UTC)

Greetings all,

Recently User:128.135.17.105 removed an external link (Free Student t Calculator) that I had placed on this page to an online Student t calculator that is available for free on my website. The reason given for this removal by User:128.135.17.105 is that the calculator adds no value to the page. I would like to hear whether or not this is the majority opinion, as I believe that a link to the free Student t calculator provides a great deal of value to the page. Here's why:

1. The other online calculator that is linked to from this page (VassarStats) cannot supply t-values when there are fewer than 10 degrees of freedom.
2. The other external link (Distribution Calculator) can only perform Student t calculations if the user downloads and installs the software on their computer.
3. Well over 300 people from Wikipedia have used the free Student t calculator on my website since I posted the link two weeks ago -- a clear indication of the value of the free calculator to the readers of this page.

Out of respect for the opinion of User:128.135.17.105, I will not repost the link right away. If anyone agrees that there is value in the external link that User:128.135.17.105 removed, please let the community know by posting your thoughts here. I would particularly enjoy discussing this issue further with User:128.135.17.105, as I believe that (in the spirit of Wikipedia) we can resolve this issue amicably.  :-)

--DanSoper 23:49, 22 June 2006 (UTC)

the main discussion is at Talk:Chi-squared distribution. 128.135.226.222 00:36, 28 June 2006 (UTC)

## Alternate forms of the t-table

It would be very useful to explain the alternate form of the t-table such as the one found here. If my memory serves me correctly it is based on the population standard deviation rather than the sample standard deviation. I think it is okay to not address the tables that use the residual (alpha) as that can be easily figured out. But the table presented here and these other tables are not obvious.--Nick Y. 23:25, 11 June 2006 (UTC)

Your memory serves you very badly in this case. The table you cite is 2-sided, as opposed to the 1-sided table given here. The population-versus-sample issue has nothing at all to do with it. The explanation is essentially here in this article, where it talks about 1-sided versus 2-sided, but the explanation is missing from the page you cite. Michael Hardy 23:41, 11 June 2006 (UTC)

## comment moved from article

user:193.11.239.45 put the following comment into the article; I've deleted it from there and pasted it here:

Comment: We are novices but strongly believe that the value of v is wrong, it should be 0,879 which corresponds to v=10 and 80% in the table above. /Emil and Christian

I don't understand which "v" is being referred to. Michael Hardy 18:36, 20 September 2006 (UTC)

## Better explanation

In the first example of how to calculate we have: "We can determine that at 90% confidence, we have a true mean lying below 10+1.37218+(squareroot of 11)/(squareroot of 2)=10.58510"

In the second example with 80% confidence we have: "So that at 80% confidence, we have a true mean lying between 10+-1.37218+(squareroot of 11)/(squareroot of 2)=9,41490 and 10.58510"

We have the same value of "a" in both cases, which is wrong. In the 90% case it is correct, 1.37218. In the 80% case it is wrong, it should be 0.879 taken from the table.

So the correct calculation of 80% case is: 10+-0.879(squareroot of 11)/(squareroot of 2)=7,9386 and 12,0624

Hope this explanation is better! :) /The swedes

## plots under student's t are not comprehensible

Both plots suffer from missing labels on their x-axes. One has to guess at the x-axis variable (t).

And the formula for probability density function has no variable k in it. So are the various plots in the graphs done for different values of ν instead?

Same for the cumulative distribution function...

## Redirect for t-value?

I think there should be a redirect to this article from t-value and t value. Those two are common names for the t statistic as well. --Big Wang 11:51, 14 November 2006 (UTC)

Done! Next time, just go ahead and put in redirects yourself, if you think they're appropriate and helpful. See Wikipedia:Redirect and Help:Starting a new page. Just now, when trying to find the "Starting a new page" instructions for you, I had a little trouble finding it, so after I did find it I also put in a few redirects such as Wikipedia:Create --> Help:Starting a new page.
You can do things like that, too. New users are the best ones to know what redirects are needed in the help pages. It's good to have lots of redirects; one advantage is that then someone won't go ahead and write a new article not realizing that an equivalent article already exists. If it turns out that the redirect is inappropriate somehow, it can be deleted later or expanded into a full article -- so you can be bold. --Coppertwig 12:23, 14 November 2006 (UTC)

## Explanation of 80% confidence interval

This is to discuss recent disagreements in the last sentence or two of the section "Table of selected values".

Recently, someone changed "80%" to "90%" and I changed it back. I then figured that other readers might also have trouble understanding why this should be 80% rather than 90%, so I inserted a sentence to try to explain this: "(It has a 10% chance of being above that range, and a 10% chance of being below that range, so it has a total of 20% chance of being outside that range, either above or below.)" It's possible I didn't use the correct statistical terminology in this sentence, but I'm pretty confident :-) that the general idea I'm trying to get across is correct. If there is no objection, I'll re-insert the same sentence. If someone can improve the sentence, that's even better. Please discuss. --Coppertwig 14:45, 17 November 2006 (UTC)

I object. I am not a probability theorist, but I don't think your statement is accurate. We need an expert on the subject. – Chris53516 (Talk) 14:49, 17 November 2006 (UTC)
Is it acceptable to use a Wikipedia page as a citation (Confidence interval)? I believe I can do that while also rewording the sentence to be more correct. Or does it have to actually come from a book? Do the examples have to have all of the same numbers as examples in cited material, or can Wikipedia pages use their own examples? Remember, stuff is not supposed to be copied verbatim from books. --Coppertwig 15:04, 17 November 2006 (UTC)
NO. Internal references are not acceptable. If they made a reference in that article, use that reference. When referencing, never plagiarize. – Chris53516 (Talk) 15:22, 17 November 2006 (UTC)
Another question: I believe you stated in the edit summary "I don't think that is what it means." Would you please explain here what you do think it (the sentence and formula immediately above my edit which you reverted) means? Thanks for watching a statistics page and helping make sure it's right! --Coppertwig 15:16, 17 November 2006 (UTC)
Your sentence is an extrapolation from the "80% confidence interval". The interval may or may not be evenly distributed. Therefore, your extrapolation that 10% must be above and 10% must be below is most likely incorrect. However, we need an expert on the topic to provide an answer. Additionally, you provided no source for verification of your statement. – Chris53516 (Talk) 15:24, 17 November 2006 (UTC)
Look a little further back: if you look at the two previous sentences, (starting from the second sentence after the table), they do state that 10% is above and 10% is below. I should have said "probability" rather than "chances". The Confidence interval page indicates that probability is what is meant. --Coppertwig 15:31, 17 November 2006 (UTC)
So? Got a reference? You have no evidence that what you say is true other than another Wiki page. – Chris53516 (Talk) 15:57, 17 November 2006 (UTC)

How does the following look? I've put the sentences I want to insert in italics here so you can see what I'm adding, but they would be normal text in the article. I think I understand Chris53516's objection. In place of the original sentence I had inserted, I inserted a different sentence which IMO achieves my goal of making the text more understandable for the non-expert reader, but which I believe does not give rise to what Chris53516 was objecting to. I've also inserted two other sentences. Each of the three additions has the purpose of clarifying for the non-expert what was just said in the sentence before it. What do you think, Chris53516?

For example, given a sample with a sample variance 2 and sample mean of 10, taken from a sample set of 11 (10 degrees of freedom), using the formula:

$\overline{X}_n\pm A\frac{S_n}{\sqrt{n}}$

We can determine that at 90% confidence, we have a true mean lying below:

$10+1.37218 \frac{\sqrt{2}}{\sqrt{11}}=10.58510$

(In other words, the probability that the true mean is higher than 10.58510 is 0.10.) And, still at 90% confidence, we have a true mean lying over:

$10-1.37218 \frac{\sqrt{2}}{\sqrt{11}}=9.41490$

(In other words, the probability that the true mean is lower than 9.41490 is 0.10.) So that at 80% confidence, we have a true mean lying between

$10\pm1.37218 \frac{\sqrt{2}}{\sqrt{11}}=[9.41490,10.58510]$

(In other words, the probability that the true mean is outside that interval, either above it or below it, is 0.20.)

Wait! Wait! No, the words in italics above which I was going to put in are wrong. Sorry. That was the prosecutor's fallacy. Let me try again -- how about the following?

For example, given a sample with a sample variance 2 and sample mean of 10, taken from a sample set of 11 (10 degrees of freedom), using the formula:

$\overline{X}_n\pm A\frac{S_n}{\sqrt{n}}$

We can determine that at 90% confidence, we have a true mean lying below:

$10+1.37218 \frac{\sqrt{2}}{\sqrt{11}}=10.58510$

(In other words, on average, 90% of the times that an upper threshold is calculated by this method, the true mean lies below this upper threshold.) And, still at 90% confidence, we have a true mean lying over:

$10-1.37218 \frac{\sqrt{2}}{\sqrt{11}}=9.41490$

(In other words, on average, 90% of the times that a lower threshold is calculated by this method, the true mean lies above this lower threshold.) So that at 80% confidence, we have a true mean lying between

$10\pm1.37218 \frac{\sqrt{2}}{\sqrt{11}}=[9.41490,10.58510]$

(In other words, on average, 80% of the times that upper and lower thresholds are calculated by this method, the true mean is both below the upper threshold and above the lower threshold. This is not the same thing as saying that there is an 80% probability that the true mean lies between a particular pair of upper and lower thresholds that have been calculated by this method -- see confidence interval and prosecutor's fallacy.)

Please let me know what you think of adding the italicized words above to the article (though I would not italicize them in the article). Thanks --Coppertwig 23:12, 18 November 2006 (UTC)
I have just read the article on Prosecutor's Fallacy and I don't think it applies in this case. I think that the following two sentences are actually identical:
1) On average, 80% of the times that upper and lower thresholds are calculated by this method, the true mean is both below the upper threshold and above the lower threshold [which I would write more simply as "within the interval"]. and
2) There is an 80% probability that the true mean lies between a particular pair of upper and lower thresholds that have been calculated by this method.
Also the following sentence would be identical:
3) of 100 intervals calculated in this way, the true mean would lie within 80 of them.
I am no expert of probabilities but please reconsider -- anonymous
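The long-run reading in sentences 1) and 3) can at least be checked by simulation. The sketch below is an illustration under stated assumptions, not a proof of either position: the true mean and variance (10 and 2) are hypothetical values picked to echo the example, the data are drawn from a normal distribution, and the function name `coverage` is my own.

```python
import math
import random

def coverage(trials=20000, n=11, mu=10.0, sigma=math.sqrt(2.0), A=1.37218, seed=0):
    """Fraction of simulated samples whose interval xbar +/- A*s/sqrt(n) contains mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(xs) / n
        s = math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))  # divisor n - 1
        half = A * s / math.sqrt(n)
        if xbar - half < mu < xbar + half:
            hits += 1
    return hits / trials

print(coverage())  # close to 0.80
```

The 80% here is a property of the procedure over repeated samples; whether it can also be attached to one particular computed interval is exactly the point under dispute, and the simulation does not settle that.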

## Confidence interval: 80% or 90%

On 24 September 2007, user 141.212.137.29 performed the following change (under the header 'Confidence intervals derived from Student's t-distribution'):

$\overline{X}_n\pm A\frac{S_n}{\sqrt{n}}$

is a 90-percent confidence interval for μ

He changed it to:

$\overline{X}_n\pm A\frac{S_n}{\sqrt{n}}$

is a 95-percent confidence interval for μ.

I think that the old version was correct; however, I'm not sure. Can anyone respond to this? Basten 09:35, 11 October 2007 (UTC)

## Used Template:Abramowitz_Stegun_ref to make the A&S reference clickable

Since this page refers to the famous work of Abramowitz and Stegun, I went ahead and replaced the citation of A&S in this article's reference list with a template invocation. The template was originally created by User:William Ackerman and explained at [1] because Abramowitz and Stegun are cited so often in the physics pages. A sample invocation is {{Abramowitz_Stegun_ref|26|985}}. This expands into a cite of chapter 26, where the number 26 is clickable so that it opens up page 985 of the online version of A&S. In this case I had to replace the original '26.7' with '26' because the template won't take decimal points in the chapter field. If this bothers you, revert the change. Up till now the template has always been used without subst. EdJohnston 16:20, 17 November 2006 (UTC)

Thanks for the references! I didn't know there was an online copy of Abramowitz! That will sure come in handy for lots of things, not just editing this page! And a link to the original work by Gosset -- that's great! I still see some problems with this article; I hope you and Chris53516 will stick around and help work them out. I'll probably make some more comments and/or changes soon. --Coppertwig 03:23, 18 November 2006 (UTC)

## Slash in denominator: double division? Is the formula correct?

Consider the following formula from the article (3rd formula under the heading "Occurrence and specification of Student's t-distribution".)

be the sample variance. It is readily shown that the quantity

$Z=\frac{\overline{X}_n-\mu}{\sigma/\sqrt{n}}$

Note that there is a slash in the denominator, immediately after sigma. What does this mean? It seems to mean double division; that is, it seems to mean that the rhs is (xnbar - mu)/(sigma/sqrt(n)), which simplifies to:

$(\sqrt{n})\left(\frac{\overline{X}_n-\mu}{\sigma}\right)$

However, I believe that this formula is wrong and that the slash should simply be deleted. Later when I have time to think more clearly I may figure this out one way or the other. Meanwhile maybe someone else can figure it out. Either the slash is correct, in which case the expression should be simplified by putting the sqrt(n) in the denominator, or (as I believe) the slash is wrong and should be deleted. Or possibly the slash has some meaning I don't know, in which case it should be explained in the text.

Exactly the same problem (a slash, possibly spurious, in the denominator) also occurs in the following places: (2) The very next formula after the one I mentioned; and (3) the third formula in the section "Confidence intervals derived from Student's t-distribution"; and (4) possibly the first formula in the section "Further theory", although it may be correct even if the others are wrong; if so, the latter formula could possibly be improved by putting the material inside the square root sign into parentheses, or perhaps it's fine as-is.

Thanks in advance for anyone shedding light on this. --Coppertwig 03:52, 18 November 2006 (UTC)

The above formulas are both correct, and the slash is ordinary division. See for example Casella and Berger, Statistical Inference, 2nd ed., page 222. EdJohnston 05:20, 18 November 2006 (UTC)
Now that I look at it again, I see that you're right. The formulas are mathematically correct. However, they need to be simplified down to standard form. If you look in Gosset's paper, you don't see any slashes in the denominators. He has horizontal division lines in denominators, but almost always only when necessary, for example as part of multiterm expressions inside an integral or square root sign. I think maybe he has one other one in the middle of a calculation. Other than that, he presents formulas in the conventional way, which means you don't have division going on in the denominator if you can help it. So, I think the formula needs to be edited to be one of the following forms or something similar (if I've done this right, these are all mathematically equivalent to each other and to the formula currently in the article, but are in a more standard form):
$Z = (\sqrt{n})\left(\frac{\overline{X}_n-\mu}{\sigma}\right)$
$Z = \sqrt{n}\left(\frac{\overline{X}_n-\mu}{\sigma}\right)$
$Z = \frac{\sqrt{n}(\overline{X}_n-\mu)}{\sigma}$

--Coppertwig 23:33, 18 November 2006 (UTC)

I disagree. The form with the fraction in the denominator is easier to understand, because σ/√n is the standard deviation of the random variable $\overline{X}_n-\mu\,$, so it makes it obvious that you're just subtracting the rv's expected value from it and then dividing by its SD, the usual standardization. Michael Hardy 02:46, 19 November 2006 (UTC)
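For what it's worth, the forms discussed in this thread are numerically identical, so the choice is purely one of presentation. A small sketch with made-up values (none of these numbers come from the article):

```python
import math

# Hypothetical values, for illustration only.
xbar, mu, sigma, n = 10.4, 10.0, 1.5, 11

# Form in the article: deviation divided by the standard deviation of the sample mean.
z_article = (xbar - mu) / (sigma / math.sqrt(n))

# Proposed rearrangements.
z_factor_out = math.sqrt(n) * ((xbar - mu) / sigma)
z_single_fraction = (math.sqrt(n) * (xbar - mu)) / sigma

assert math.isclose(z_article, z_factor_out)
assert math.isclose(z_article, z_single_fraction)
print(round(z_article, 4))
```

The first form keeps sigma/sqrt(n) visible as a single quantity, which is the standardization argument made just above.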

## Remove external link to Shaw's paper about the capital-T statistic?

The last item in our external links section is a paper by William T. Shaw which concerns drawing random samples from what he calls the univariate 'T' distribution, when working in a multivariate setting. This paper seems rather esoteric for the current article, and the non-standard usage of capital 'T' could be confusing. The Shaw paper is not explicitly mentioned in the text of the article and no reference to its subject matter is made. If no-one objects I'll remove this paper from the reference list. EdJohnston 06:02, 18 November 2006 (UTC)

I have just noted Ed's removal of the reference to my paper on this, which is now published in the Journal of Computational Finance. There seems to be a little confusion here about relevance, though this is partly my fault for not going through with a planned edit of the main article to discuss quantile functions (i.e. the inverses of the CDFs) and explain their relevance, and indeed to post some other discussions on quantiles for other distributions in the relevant bits of Wiki. These are fundamental to the sampling of the univariate case for any distribution. In the case of Student's t, the inverse CDF, which makes it trivial to do sampling, was not understood for many decades, and now it is. There are some nice special cases which will go in the special cases section as well. I will do a tentative edit presently and let people make their own judgement. I must confess to regarding the comments about capital T vs lower case t as rather bizarre - quite why anybody would be confused by this is beyond me - William Shaw

## Clear definitions of the quantities in the table

(I thought I had already made this comment but don't see it; I must have forgotten to click "save changes".) Re the (large) table in the section "Table of selected values": I would like to have clear explanations in the article of the definitions of the three types of quantities in the table: the integers along the left, the percentages along the top, and the numbers in the main body of the table. Note that one of the problems I see is that it's not clear whether the $\nu$ in the corner refers to the numbers along the left or to the percentages along the top. I suggest something like the following. Please critique. I would put this just below the table:

The number at the beginning of each row in the table above is $\nu$ which has been defined above as $n-1$. The percentage along the top is $100\%(1 - \alpha)$. The numbers in the main body of the table are $t_{\alpha,\nu}$. If a quantity $T$ is distributed as a Student's t distribution with $\nu$ degrees of freedom, then there is a probability $1 - \alpha$ that $T$ will be less than $t_{\alpha,\nu}$.

--Coppertwig 23:52, 18 November 2006 (UTC)

That seems less clear to me than the present explanation. Perhaps it will help to specify which entry is intended? Septentrionalis 00:25, 19 November 2006 (UTC)
Sorry; I don't understand what you mean by "which entry is intended" and I don't know what part of the article you're referring to by "the present explanation". Maybe you mean the example shown immediately above the table? An example is fine, but I would like to see also a concise definition that applies to all elements of the table and that someone can refer to, in order to get the meaning of the table, without having to work through an example again. An example is not a substitute for a definition. --Coppertwig 03:58, 19 November 2006 (UTC)

## Student did not actually present the t-statistic in 1908

I think the intro might need to be reworded slightly, since the t-statistic was actually first defined by R.A.Fisher in a paper of 1924. The quantity whose distribution was discussed by Student was 'z', where t = z sqrt(n-1) is the relation between the old and new definitions. This 'z' was not the one we use nowadays with the normal distribution. The external link to 'Earliest known uses...' makes this evident. When time permits I'd like to make a try at rewording the opening paragraph, and will offer it here on the Talk page for discussion. Also I think it's NOT desirable to capitalize 't', which happens further down on the page. Regular statistics books don't do that. Moreover I think Student DID NOT write the formula that is attributed to him in the article where t is related to the gamma function. He was not a mathematician, but Fisher was.

Here is the actual quote from the 'Earliest known uses of some of the words of mathematics' web site which makes clear that Student did not call it 't':

In his 1908 paper, "The Probable Error of a Mean", Biometrika, 6, 1-25 Gosset introduced the statistic, z, for testing hypotheses on the mean of the normal distribution. Gosset used the divisor n, not the modern (n - 1), when he estimated and his z is proportional to the modern t with t = z sqrt (n - 1). Fisher introduced the t form because it fitted in with his theory of degrees of freedom (q.v.). Fisher used the t symbol and described Student's distribution (and others based on the normal distribution) and the role of degrees of freedom in "On a Distribution Yielding the Error Functions of Several well Known Statistics", Proceedings of the International Congress of Mathematics, Toronto, 2, 805-813. Although the paper was presented in 1924, it was not published until 1928 (Tankard, page 103; David, 1995). According to the OED2, the letter t was chosen arbitrarily. A new symbol suited Fisher for he was already using z for a statistic of his own (see entry for F). -- EdJohnston 21:24, 20 November 2006 (UTC)
If you've got the references for it, change the article and cite your sources. – Chris53516 (Talk) 21:31, 20 November 2006 (UTC)

This seems like a pretty minor point compared to what the section heading might lead one to suspect. So Student's statistic introduced in 1908 was not exactly identical to the version that is now conventional. Nonetheless, the hypothesis tests and confidence intervals would be exactly the same. So it's worth mentioning, but not as big a deal as one might expect after reading "Student did not introduce the T-statistic." Michael Hardy 23:52, 20 November 2006 (UTC)

Perhaps not an earth-shaking issue, but the first version of this article, as created in 2002, got the terminology and the attribution correct. Somewhere along the way Student morphed into Fisher. Another issue with the article is that the term 'statistic' never gets defined, so the presentation seems incomplete. For modern readers it is most natural to describe Student's t-distribution as the distribution of the t-statistic, even though Student did not use that terminology. (Fisher introduced the term 'statistic' in 1922). I think that only minor rewording would be enough to make this clear. EdJohnston 06:34, 21 November 2006 (UTC)
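For anyone who wants to check the relation t = z sqrt(n-1) quoted above, here is a minimal Python sketch. It assumes Gosset's z is the deviation of the sample mean divided by the standard deviation computed with divisor n, as the quote describes; the function names and the data are made up for illustration.

```python
import math
import statistics

def gosset_z(sample, mu):
    """Student's 1908 statistic (assumed form): mean deviation over the SD with divisor n."""
    return (statistics.fmean(sample) - mu) / statistics.pstdev(sample)

def modern_t(sample, mu):
    """Fisher's t: mean deviation over the standard error, using the SD with divisor n - 1."""
    n = len(sample)
    return (statistics.fmean(sample) - mu) / (statistics.stdev(sample) / math.sqrt(n))

sample = [9.1, 10.4, 11.2, 9.8, 10.9, 10.1]  # made-up data, for illustration only
mu = 10.0                                    # hypothesised population mean

z = gosset_z(sample, mu)
t = modern_t(sample, mu)
# The last two printed numbers should agree up to rounding error.
print(z, t, z * math.sqrt(len(sample) - 1))
```

Since the two statistics differ only by the constant factor sqrt(n-1) for a fixed sample size, any test or interval based on one can be restated in terms of the other.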

## Please comment if you have an opinion on the opening section

I'm proposing this new version, to (1) state what the t-distribution really IS in the first two sentences, (2) get the sequence of events right, so Student isn't credited for Fisher's work. As you see, I've reused most of the existing language, but changed the order. Please give me your comments on this alleged improvement. If I don't hear anything back, I'll make the change in a few days. I'll also add the necessary references. EdJohnston 04:14, 22 November 2006 (UTC)

In probability and statistics, the t-distribution or Student's t-distribution is the probability distribution of the t-statistic for samples of a fixed size repeatedly drawn from a normal population. The t-statistic is the difference between the sample mean and the true population mean, divided by a standard deviation computed from the sample, and multiplied by the square root of the sample size.
Student's distribution arises when (as in nearly all practical statistical work) the population standard deviation is unknown and has to be estimated from the data. Textbook problems treating the standard deviation as if it were known are of two kinds: (1) those in which the sample size is so large that one may treat a data-based estimate of the variance as if it were certain, and (2) those that illustrate mathematical reasoning, in which the problem of estimating the standard deviation is temporarily ignored because that is not the point that the author or instructor is then explaining.
The t-distribution can also be generalized to the case of two samples drawn from related populations, and be employed to compute confidence intervals. The Student's t-distribution is a special case of the generalised hyperbolic distribution.
The mathematical form of what is now called the t-distribution was presented in 1908 by William Sealy Gosset, while he worked at a Guinness brewery in Dublin. He was not allowed to publish under his own name, so the paper was written under the pseudonym Student. The t-test and the associated theory became well-known through the work of R.A. Fisher, who called the distribution "Student's distribution". Student himself called it 'the frequency distribution of a quantity z', where z was an expression for a certain kind of a normalized deviation in a small sample. Fisher later introduced the quantity 't', a deviation normalized in a slightly different way, and established all its mathematical properties.
A t-test is any statistical hypothesis test in which the test statistic has a Student's t-distribution if the null hypothesis is true.

EdJohnston 04:14, 22 November 2006 (UTC)

This proposal says:

The t-distribution can also be generalized to the case of two samples drawn from related populations, and be employed to compute confidence intervals. The Student's t-distribution is a special case of the generalised hyperbolic distribution.

That is not a generalization of the t-distribution. It's still exactly the same distribution. It's a different statistical test, but the same distribution. Michael Hardy 19:18, 22 November 2006 (UTC)

You're right. I'll try to come up with a correct version. EdJohnston 03:39, 23 November 2006 (UTC)

I am not impressed by this new lead. It leaves the third paragraph of the lead untouched, which I felt was one of the weak points of the lead. It explains in words something that is already explained much more clearly using equations in the first section of the article. It uses more technical terms than the previous lead. It uses two very short paragraphs. And it clarifies the origins of the subject in a way that would be better done in a separate section in the main body of the article. In essence, I don't feel the changes are compatible with the lead section guidelines. Remember a lead is meant to provide an accessible overview, not be longer than three paragraphs for an article of this size and establish context. My strong preference is to stick with the old lead. Cedars 00:54, 23 November 2006 (UTC)

Thanks for your reply. My concern was that the article took so long to get to the point (saying what the t-distribution really is). If no-one else thinks the intro is too slow, I may reduce my proposal just to clarifying the history. At present I think there are (minor) factual errors regarding the attribution of who discovered what, between Student and Fisher, and I have the references needed to be sure of the accuracy. (Some of them listed here [2] on my Talk page).
You mention the third paragraph, the one that starts 'Student's distribution arises when..'. The value that I saw in that paragraph was it is the only place currently where the point of small-sample statistics is explained. Do you have any ideas for revising or replacing that paragraph? EdJohnston 04:02, 23 November 2006 (UTC)

## t-table

The t-table is incorrect, every value should be moved horizontally to the left one box. —The preceding unsigned comment was added by 134.10.2.125 (talk)

Yep, I agree. It's incorrect. 72.142.195.237 00:55, 17 July 2007 (UTC)

## Incorrect pdf Formula

I have just fixed a t-pdf function in the info-box, that was different from the t-pdf function in the text. They were both correct in essence, but the difference is confusing.

This is the formula in the text:

$f(t) = \frac{\Gamma((\nu+1)/2)}{\sqrt{\nu\pi\,}\,\Gamma(\nu/2)} (1+t^2/\nu)^{-(\nu+1)/2}$

This is the one I fixed:

$f(t) = \frac{\Gamma((\nu+1)/2)}{\sqrt{\nu\pi\,}\,\Gamma(\nu/2)\,(1+t^2/\nu)^{(\nu+1)/2}}$

Note the (-) sign in the exponent of the first function.

Dyaka 05:46, 14 March 2007 (UTC)

The pdf function in the text was not different from the one in the info-box; only the notation was different (just barely). There was certainly nothing "incorrect" in either (unless both were incorrect, in which case they still are---I'll look closely later). Michael Hardy 18:34, 14 March 2007 (UTC)
Yes, bad title, but I changed one of the formulas to avoid confusion. Dyaka 04:15, 22 March 2007 (UTC)

## what does "t" stand for?

can anyone add the definition of t?

Jfermiller 17:25, 13 May 2007 (UTC)

Well, basically, t stands for the domain of the probability distribution: ${-\infty} < t < \infty$ ... much in the same way z stands for values of the domain of the [standardized] normal CDF and PDF, or ${\chi }^2$ stands for domain values for the Chi-squared distributions.

It is a dummy variable, in the sense that g or q could be used instead; the usage of "t", however, immediately suggests that the pertaining distribution is precisely Student's.
Pallida Mors 76 00:20, 5 November 2007 (UTC).

## standardization

When v>2, the variance is defined and equals v/(v-2). How can it be standardized to one? Is this its pdf: $f(t) = \frac{\Gamma((\nu+1)/2)}{\sqrt{(\nu-2)\pi\,}\,\Gamma(\nu/2)} (1+t^2/(\nu-2))^{-(\nu+1)/2}$ I hope someone can verify this and put it in the article. Jackzhp 15:05, 13 July 2007 (UTC)

Change scale on t. I trust this is covered under Normalization. Septentrionalis PMAnderson 15:37, 13 July 2007 (UTC)
Normalization is a disambiguation page with lots of entries: normalization in metallurgy, normalization in sociology, text normalization, maybe even normalization of diplomatic relations. Try normalizing constant. Michael Hardy 01:40, 17 July 2007 (UTC)
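The density proposed above can be checked numerically. The following is a rough Python sketch, not a proof: it integrates the proposed formula with Simpson's rule for one arbitrarily chosen value (ν = 10) and reports the total mass and the variance. The function names, the integration range and the step count are my own assumptions.

```python
import math

def standardized_t_pdf(x, nu):
    """Density proposed above: Student's t rescaled to unit variance (requires nu > 2)."""
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log((nu - 2) * math.pi))
    return math.exp(log_c - (nu + 1) / 2 * math.log1p(x * x / (nu - 2)))

def simpson(f, a, b, n=100_000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

nu = 10  # arbitrary choice for the check
mass = simpson(lambda x: standardized_t_pdf(x, nu), -100.0, 100.0)
variance = simpson(lambda x: x * x * standardized_t_pdf(x, nu), -100.0, 100.0)
print(mass, variance)  # both should be very close to 1
```

This is the scale change suggested in the reply: if T has the usual t density, then X = T sqrt((ν-2)/ν) has the density above.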

## p-value and A(t | ν) ambiguous

I spent hours trying to figure out why the p-value equation has a 2 in the denominator, and finally concluded that the p-value here is a single-sided/single-tailed p-value while A(t|v) is a double-sided/double-tailed probability. I would argue that a cumulative distribution integrated from negative infinity is much closer to the convention, or at least that it should be used to introduce the two-sided A(t|v). The sentence "For the statistic t, with ν degrees of freedom, A(t | ν) is the probability that t would be less than the observed value if the two means were the same" made it even more confusing, because the sentence assumes t to be positive and does not give a definition of t. I assumed t was the same as the T in the previous sections, but it is actually its absolute value. I would also argue that a two-sided p-value is the "default" for many people, as in this article: [3] (in PDF). If an absolute value of t, counting both sides, is used, why would you make the p-value single-sided? —Preceding unsigned comment added by 128.227.105.227 (talk) 23:19, 27 September 2007 (UTC)

## A(t|nu) discussion not very enlightening

I've edited the page to try to explain the relation between A(t|nu) and f(t). As noted above, it still needs t defined. This is actually done on the "Student t test" page, so it could do with a reference. The whole business about using the absolute value of t for the test based on A(t|nu) is still untidy, not helped by absence of discussion that 1-A(t|nu) is two-sided probability. But I don't have time for more. Also see Abramowitz and Stegun for definition of A(t|nu) and condition on relation to beta function. JohnPhysicist. Sorry, no Wikipedia account (or I've forgotten its name)77.99.31.195 21:58, 4 November 2007 (UTC)

## Examples are always helpful

Can someone add a few examples, perhaps using the TTEST() function from MS Excel or with a simple set of data? I think this would add greatly to the usability of your content.

## Error in example after table?

Am I mistaken or is there an error in the example given of how to use the values from the table under "Table of Selected Values"? The text says the following:

For example, given a sample with a sample variance 2 and sample mean of 10, taken from a sample set of 11 (10 degrees of freedom), using the formula
$\overline{X}_n\pm A\frac{S_n}{\sqrt{n}}.$
We can determine that at 90% confidence, we have a true mean lying below
$10+1.37218 \frac{\sqrt{2}}{\sqrt{11}}=10.58510.$

But when I look at the table, the value (A) for 90% with 10 degrees of freedom should be 1.812, not 1.37218. The 1.372 value appears to correspond to 80%, not 90%.

I'm certainly not an expert in this area, so it's possible I'm misunderstanding something.. Am I reading the table wrong, or is this example using the wrong number? -- Foogod (talk) 00:21, 22 November 2007 (UTC)

Yes, the table is wrong, the T value quoted in the text is correct except for the other example. I've checked the T values using a calculator and also Mathematica, and it turns out that the percentages are wrong: 80% should be 90%, 90% should be 95% etc. Count Iblis (talk) 17:59, 29 November 2007 (UTC)
Aha.. Thank you, that answers another question I was rather confused about (namely, if it is a one-tailed probability table, then shouldn't the 50% numbers, by definition, all be 0?) That table makes more sense now.. -- Foogod 01:41, 4 December 2007 (UTC)

The table is right and the example you quote is done wrong. You did indeed misunderstand. Since it's a 90% confidence interval, you've got 10% in the tails, hence 5% in the upper tail, so you need to look for 95% in the table, not 90%. Michael Hardy (talk) 18:45, 16 September 2008 (UTC)
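To make the one-sided versus two-sided distinction in this thread concrete, here is a worked version of the example with $n = 11$, $\overline{X}_n = 10$ and $S_n = \sqrt{2}$. The critical values $t_{0.90,10} \approx 1.372$ and $t_{0.95,10} \approx 1.812$ are the standard one-sided table entries for 10 degrees of freedom; the arithmetic is a sketch for checking, not a quotation from the article.

A one-sided 90% upper bound puts all 10% in the upper tail, so it uses the 90% column:

$10 + 1.372\,\frac{\sqrt{2}}{\sqrt{11}} \approx 10 + 1.372 \times 0.4264 \approx 10.585$

A two-sided 90% interval puts 5% in each tail, so it uses the 95% column:

$10 \pm 1.812\,\frac{\sqrt{2}}{\sqrt{11}} \approx 10 \pm 0.773$

So 1.372 is the right value for "the true mean lies below" at 90%, and 1.812 is the right value for a symmetric 90% interval.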

## Corrected error in table.

The listed probabilities (first row of table) were wrong; I've corrected them. I suggest that we all check that the row is correct and also check every single critical T value that is listed to make sure they are all correct. Count Iblis (talk) 19:00, 29 November 2007 (UTC)

This may partly be my responsibility - many apologies. I was trying to patch up some IP edits made on 14th November. When restoring the column headings I looked at (contrary to the article statements and original version prior to the IP edit) a two-sided distribution reference. Apologies again - second math error I've made in the past few weeks, on a small number of edits so my percentage rate is quite appalling. Asperal 20:11, 29 November 2007 (UTC)

## please explain the formulas

{{technical}} The introduction and the section "Why use the student's t-distribution?" are excellent but the rest of the article is difficult to understand. 69.140.159.215 (talk) 12:57, 12 January 2008 (UTC)

That's not really a compelling reason to tag this article. --C S (talk) 06:29, 14 August 2008 (UTC)

## Table of Student t's

I just want to comment here that there is a mismatch between student t values and the confidence interval values. For sure the values under 97.5 % are the 95 % confidence interval student t's. These table headings should be corrected. Thanks. —Preceding unsigned comment added by Julescarlson (talkcontribs) 05:03, 29 February 2008 (UTC)

Oops... I think I see what you are indicating, that these are one sided student t's and I'm talking about 2-sided student t's. Maybe you could just make sure this is clear (or maybe it's just me). —Preceding unsigned comment added by Julescarlson (talkcontribs) 05:03, 29 February 2008 (UTC)

## K or nu?

Is k in the leading diagrams actually $\nu$ that appears in the info box? It would be helpful if the notation matched. --Michael C. Price talk 12:12, 21 January 2009 (UTC)

## Misleading statement deleted

I deleted this statement:

For $\nu$ = 30 the t-distribution is almost the same as the normal distribution.

I have a suspicion as to the origin of this statement: A rule of thumb sometimes says sample sizes of 30 or more are enough to justify using a normal approximation for the distribution of a sum or an average, based on the central limit theorem. So extremely confused students try to apply that to this situation—an altogether different thing. It is profoundly appalling that someone could be that confused, but it really can happen—I've seen it.

Software in front of me is giving the 90th percentile of Student's distribution with 30 degrees of freedom as 1.3104, and that of the normal distribution as 1.2816. Michael Hardy (talk) 13:00, 13 March 2009 (UTC)
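The two percentiles quoted above can be reproduced without special software. The sketch below uses only the Python standard library: the t density is integrated with Simpson's rule and the quantile is found by bisection. The helper names, the bracket [-100, 100] and the step counts are assumptions of this sketch, chosen for simplicity rather than speed.

```python
import math
from statistics import NormalDist

def t_pdf(t, nu):
    """Density of Student's t with nu degrees of freedom."""
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c - (nu + 1) / 2 * math.log1p(t * t / nu))

def t_cdf(t, nu, n=4000):
    """P(T <= t), using symmetry about 0 and Simpson's rule on [0, |t|]."""
    a = abs(t)
    h = a / n
    total = t_pdf(0.0, nu) + t_pdf(a, nu)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    half = total * h / 3
    return 0.5 + half if t >= 0 else 0.5 - half

def t_quantile(p, nu):
    """Value q with P(T <= q) = p, found by bisection on [-100, 100]."""
    lo, hi = -100.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, nu) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

t90 = t_quantile(0.90, 30)          # 90th percentile, 30 degrees of freedom
z90 = NormalDist().inv_cdf(0.90)    # 90th percentile, standard normal
print(round(t90, 4), round(z90, 4))
```

The same `t_quantile` helper gives the one-sided table entries discussed elsewhere on this page, for example the 90% and 95% values for 10 degrees of freedom.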

## More general version

I am curious as to why there is no mention of the more general t distribution. I understand that when doing t-tests everything is rescaled N(0,1) but shouldn't the more general distribution be mentioned here anyway, where Z~N(mu,sigma^2). MATThematical (talk) 21:08, 7 February 2010 (UTC)

This point is partly met by the article Noncentral t-distribution, which is mentioned in the "characterization" section. Melcombe (talk) 12:20, 8 February 2010 (UTC)
I added it a few months back; it's the "three-parameter version". Benwing (talk) 16:37, 3 November 2010 (UTC)

## Article hard to understand?

Not to seem frustrated, but good God Almighty, could we make this article just a smidgen more accessible to the average user? I'm a college senior, I'm majoring in biology, I've used this test many, many times before and was simply looking for a light refresher on its nuances. Even so, I'm having serious trouble following what's going on in this article. Perhaps you should split it into several sections, i.e. an Overview (for regular people) and a Detailed (for statisticians and mathematicians) section. Otherwise, I suspect that it'll remain totally useless to the vast majority of users.

Antagonistrex 14:37, 26 February 2007 (UTC)

You're mistaken: this page is not supposed to be about any particular statistical test. There are various statistical tests that rely on Student's t-distribution, and there are separate articles about those. Michael Hardy 20:01, 26 February 2007 (UTC)
Try Student's t-test. Michael Hardy 20:02, 26 February 2007 (UTC)

... also, could you be SPECIFIC about which parts you're having trouble with? I've just looked at the parts on "occurence and specification" and "confidence intervals". The parts about the density function are asserted without any explanation of where they came from, but otherwise the two sections looked as if you don't need to know much to read them except things that are naturally prerequisites to this topic. (I still wonder if your problem is that you were looking for something on how to use this distribution in statistical tests, and failing to realize that's found in separate articles.) Michael Hardy 20:16, 26 February 2007 (UTC)

From Student's t-test: "To determine or calculate significance, see Student's t-distribution." This in contradiction to your assertion that the use of this distribution in statistical tests is explained in the articles for those tests. rah 15:18, 18 June 2007 (UTC)
I concur with Antagonistrex that this -- and indeed few of the pages on distributions -- is not written at the high school level. I know it's not easy to do with mathematical topics, but really, this is frustrating. Here are my suggestions:
* Include an example in the introduction, and then a more detailed one later.
* Keep in mind that the average high school or college student doesn't know what degrees of freedom are.
Many thanks! --aciel (talk) 21:46, 10 May 2010 (UTC)
I think I concur, too. Yeah, give the article some arc. Maybe start at the 10th grade level (age 15 and 16), then some college examples, and later on some more grad school examples. Then at the very end, maybe some cutting edge stuff. That way the article will be useful to a wider range of people. People can scan through the stuff they already know and start reading slower and more carefully when they get to their level and maybe even a little bit past it.
I mean, is student's t-distribution just a more tightly compressed bell-shaped distribution? That would be a fine start. FriendlyRiverOtter (talk) 20:12, 29 October 2010 (UTC)
OK, I usually complain that Wikipedia math articles are too technical, but in this case it does say, right in the lead in the second para,
The t-distribution is symmetric and bell-shaped, like the normal distribution, but has heavier tails, meaning that it is more prone to producing values that fall far from its mean. This makes it useful for understanding the statistical behavior of certain types of ratios of random quantities, in which variation in the denominator is amplified and may produce outlying values when the denominator of the ratio falls close to zero. The Student's t-distribution is a special case of the generalised hyperbolic distribution.
Now the second and third sentences may be confusing, but the first one seems pretty clear to me. What would help you understand it more? Benwing (talk) 04:46, 30 October 2010 (UTC)
BTW awhile ago I wrote, I think in Wikipedia:WikiProject Statistics, that I think this article is an example of what a well-written lead should look like. In this case, it states clearly that it's used as part of the Student's t-test, which is a very common statistical test, and furthermore the most common uses of the test are mentioned. If you look on the page for that test, you'll see examples of how the test is used. It's not clear to me what sorts of examples you could put on this page on the distribution, however -- the most common use of the distribution is as part of the t test. Benwing (talk) 04:52, 30 October 2010 (UTC)
But it looks like it's a tighter bell shape. That is, it looks like it would have lighter tails. For example see this article, http://controls.engin.umich.edu/wiki/index.php/Comparisons_of_two_means , 6th diagram down. And "generalised hyperbolic" seems to mean student's t-distribution.
No, it has heavier tails. You can see that clearly by looking at the graphs that compare the student t for various degrees of freedom with the standard normal distribution. The tails are above the tails of the normal dist, meaning there is more mass in them, meaning they are heavier. Benwing (talk) 16:34, 3 November 2010 (UTC)
What about the sixth diagram down?
http://controls.engin.umich.edu/wiki/index.php/Comparisons_of_two_means
Those tails are way below, almost flat. And, just looking at it, student's t seems higher, more abrupt. That is, more definite slope. (I am not afraid of asking obvious questions and appearing foolish) FriendlyRiverOtter (talk) 21:20, 4 November 2010 (UTC)
And what I want is the real deal. I mean, maybe the first (short) paragraph of the lead can be written at the 10th grade level, the next (short) paragraph of the lead at the 12th grade level. And maybe the final (short) paragraph of the lead at the freshman college level. And that's fine for the lead. Later on, deeper in the article, we can take it further, grad school level, etc. What I don't want is for the article to lay there flat and "perfect." I am not a big fan of "perfect" and I hope you aren't either  :) FriendlyRiverOtter (talk) 03:42, 2 November 2010 (UTC)
I don't have any problem with your sentiments and you're in general agreement with what it says in WP:TECHNICAL. But alas I'm not a high school teacher so I have no idea what "10th grade level" and "12th grade level" is. Could you be specific as to which sentences in the lead you find confusing, and what sort of info could be added to make them less confusing? Otherwise there's no way I can help. Benwing (talk) 16:34, 3 November 2010 (UTC)
Maybe examples as we go along, getting more complex the deeper we get into the article? But we don't talk down to our reader. We don't 'baby' the beginning. We just keep it straightforward.
For example, if we measured the heights of 20 random people on a busy city street. Well, 20 people are going to give us some variation, but probably not as much variation as that which exists in the entire city. Is that part of the point? FriendlyRiverOtter (talk) 21:11, 4 November 2010 (UTC)
Well, there are examples on the Student's t test page. Is that what you're looking for? I'm not sure what you're getting at by your example -- it's true that you'll get more variation with a city worth of people than 20 random people, but that doesn't have anything specifically to do with the Student's t distribution. Benwing (talk) 08:00, 5 November 2010 (UTC)
Please take a look at the 6th diagram on this page. FriendlyRiverOtter (talk) 02:22, 8 November 2010 (UTC)
http://controls.engin.umich.edu/wiki/index.php/Comparisons_of_two_means

## typo in formula?

There appears to be an error in the formula for finding P value from the T value, section: probability density. Specifically in the "for even degrees of freedom" part.

For $\nu$ even,

$\frac{\Gamma(\frac{\nu+1}{2})} {\sqrt{\nu\pi}\,\Gamma(\frac{\nu}{2})} = \frac{(\nu -1)(\nu -3)\cdots 5 \cdot 3} {2\sqrt{\nu}(\nu -2)(\nu -4)\cdots 4 \cdot 2\,}.$

The pi mysteriously disappears in the right hand side of the formula. Compare with the 'for odd' formula immediately below. I didn't feel confident enough editing it myself, since the syntax for writing formulas looked complex. Jktejik (talk) 04:04, 3 January 2011 (UTC)

I can't see anything wrong here. The Gamma function for half-integer argument involves a factor of √π. See the formulae about half way through the Gamma function#General section. Qwfp (talk) 10:02, 3 January 2011 (UTC)
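A quick numerical check of the even-ν identity may help here. For ν = 2k, the numerator is Γ(k + 1/2), which equals (2k-1)(2k-3)···3·1 · √π / 2^k, so its √π cancels the √π in the denominator. The sketch below (function names are my own) compares both sides for even ν from 2 to 20.

```python
import math

def gamma_ratio(nu):
    """Left-hand side: the normalising constant of the t density."""
    return math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))

def even_product_form(nu):
    """Right-hand side for even nu; an empty product counts as 1."""
    numerator = math.prod(range(3, nu, 2))        # 3 * 5 * ... * (nu - 1)
    denominator = math.prod(range(2, nu - 1, 2))  # 2 * 4 * ... * (nu - 2)
    return numerator / (2 * math.sqrt(nu) * denominator)

for nu in range(2, 21, 2):
    # The two columns should agree to floating-point accuracy.
    print(nu, gamma_ratio(nu), even_product_form(nu))
```

For ν = 4, for instance, both sides come to 3/8, with no π on the right.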

## Student's T as a maximum entropy distribution

I ran across the statement "Even the Cauchy distribution is a maximum entropy distribution over all distributions satisfying $E(\ln(1+X^2))=\alpha$" in Entropy and Conditional Probability by Campenhout et al.

From Maximum entropy probability distribution#Continuous_version, the maximum entropy distribution for constraint E(f(x))=a is

$p(x)=\frac{e^{\lambda f(x)}}{\int_{-\infty}^\infty e^{\lambda f(x)}\,dx}$

For the constraint given in Campenhout et al., this yields

$p(x)=\frac{\left(x^2+1\right)^{\lambda } \Gamma (-\lambda )}{\sqrt{\pi } \Gamma \left(-\lambda -\frac{1}{2}\right)}$

with $\lambda < -1/2$ (the normalizing integral diverges at $\lambda = -1/2$), which is just the Student's T distribution when $\lambda=-\frac{\nu+1}{2}$, and which reduces to the standard Cauchy distribution for $\lambda=-1$.

I would like to state in the article that, contrary to Campenhout et al., the Student's T distribution is the maximum entropy distribution for a fixed $E(\ln(1+X^2))$, but I can find no reference for this. Does anyone know of such a reference?

I can find nothing in standard book sources. But you might want to see this journal paper which seems to be agreeing with Campenhout, while giving another (more general) constraint that supposedly yields the more general Student's t. Also, this book states (with refs) a similar result for the multivariate case. Melcombe (talk) 09:14, 2 June 2011 (UTC)
Excellent, thank you. The journal article by Park & Bera actually agrees with what I wrote rather than Campenhout. If you take the Student's T expression from the table on page 221 and make the substitution $y=x/r$, you get the constraint $E(\ln(1+y^2))$ as I wrote above. This will give $E(\ln(1+y^2))=\psi((r+1)/2)-\psi(r/2)$ where $\psi(x)=\Gamma'/\Gamma$ is the digamma function. Ignoring y=x/r goes bad at r=0, this gives $E(\ln(1+y^2))=2\ln(2)$ for the Cauchy distribution, just as shown in the table. The expression for Student's T then, is just slightly more general, enough to avoid the r=0 problem, but the Cauchy is just what I would expect, and is contrary to Campenhout's assertion that $E(\ln(1+y^2))$ can be anything for Cauchy, rather than 2 ln(2) only. This will do it. Thanks again. PAR (talk) 15:59, 2 June 2011 (UTC)
On second thought, there is no r=0 problem since $r^2=\nu > 0$, so I can see no reason why the Student's T distribution cannot be described as a maximum entropy distribution for which $E(\ln(1+X^2))$ is fixed. I entered Park & Bera's statement, with reference, and stated that it is equivalent to $E(\ln(1+X^2))$ fixed, with citation needed. PAR (talk) 10:12, 3 June 2011 (UTC)
I am not clear on what is going on here, but I would be wary of a formulation in which the same parameter appears both in the conditioning function and in the constant that the expectation is set to. However, your argument about changing scale is also worrying, as it seems to imply that one of two formulations will lead to a scaled version of the Student's t being the maximum entropy distribution, not the Student's t itself. Melcombe (talk) 13:35, 3 June 2011 (UTC)
Without the extra constant, saying E(ln(1+Y^2))=constant, yes, you come up with a probability density which is a function of the Lagrange multiplier λ which, in order to obtain the Student's T distribution in y, you must make the identification $\lambda=-(\nu+1)/2$ and rescale to $y=x/\sqrt{\nu}$. With Park & Bera's E(ln(r^2+X^2))=constant this is not necessary. One of the things Park & Bera introduce in the article is the idea of not only having Lagrange multipliers in the constraint, but constants in the constraint which ultimately become parameters in the distribution. I do not understand the process by which the extra constants are determined, but I assumed this is what they are doing with the r^2 in the constraint. Apparently this is necessary to give Student's T without rescaling. PAR (talk) 18:31, 3 June 2011 (UTC)
If you are still looking at this, it may be worth also looking at Tsallis distribution, where Student's t is said to be maximum entropy for a different type of entropy, the Tsallis entropy. Melcombe (talk) 09:43, 24 June 2011 (UTC)
This is now in q-Gaussian. Melcombe (talk) 09:47, 27 June 2011 (UTC)
Ok, good. I will look at it, tx. PAR (talk) 14:36, 27 June 2011 (UTC)
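For readers following the algebra in this thread, the steps can be summarised as below. This is a sketch of the calculation discussed above, using standard Gamma and digamma identities; it does not settle the sourcing question.

The normalizing integral, which converges only for $\lambda < -1/2$:

$\int_{-\infty}^{\infty} (1+x^2)^{\lambda}\,dx = \frac{\sqrt{\pi}\,\Gamma(-\lambda-\frac{1}{2})}{\Gamma(-\lambda)}$

Setting $\lambda = -\frac{\nu+1}{2}$ and rescaling $x = t/\sqrt{\nu}$ (so $dx = dt/\sqrt{\nu}$) gives the usual density:

$f(t) = \frac{\Gamma((\nu+1)/2)}{\sqrt{\nu\pi}\,\Gamma(\nu/2)} (1+t^2/\nu)^{-(\nu+1)/2}$

For the Cauchy case, $\nu = 1$, the constraint value is fixed:

$E(\ln(1+X^2)) = \psi(1) - \psi(1/2) = -\gamma - (-\gamma - 2\ln 2) = 2\ln 2$

where $\gamma$ is the Euler-Mascheroni constant.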

## An apparent mistake in "Related distributions"

One thing to point out is that there are two headlines "related distributions", one under the subject "Properties" and the other one below. I only suggest that the latter headline be changed to "Alternative parameterizations" or something similar.

A serious problem, however, I see in the following:

### Related distributions

• $X \sim \mathrm{t}(\nu)$ has a t-distribution if $\sigma^2 \sim \mbox{Inv-}\chi^2(\nu,1)\!$ has a scaled inverse-χ2 distribution and $X \sim \mathrm{N}(0,\sigma^2)\!$ has a normal distribution.

There is an apparent contradiction and something is missing and/or mistakenly additional. It may have some connection to previous statements but as such this (mis)information is useless.

M- — Preceding unsigned comment added by 195.98.16.86 (talkcontribs) 18:57, 4 July 2011

Yes. I think I sort of know roughly what it was supposed to mean, but not precisely enough to be able to correct it, so I have removed it. I would have asked the person who put it in to clarify it, but it was inserted in October 2006 by an editor who has not edited since August 2009, so there is probably little chance of getting a response. JamesBWatson (talk) 20:33, 4 July 2011 (UTC)
It means, I think, that if $pr(X|\mu=0,\sigma) \sim \mathrm{N}(0,\sigma^2)\!$ and $\sigma^2 \sim \mbox{Inv-}\chi^2(\nu,1)\!$, then the marginalised distribution
$pr(X|\mu=0) = \int pr(X|\mu=0,\sigma) pr(\sigma) d\sigma \sim t(\nu)$
Jheald (talk) 11:53, 31 October 2012 (UTC)
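The marginalisation described above can be illustrated by simulation. The sketch below assumes that Inv-χ²(ν, 1) means σ² = ν/Q with Q chi-squared on ν degrees of freedom (the usual scaled inverse chi-squared convention), draws X given σ² from N(0, σ²), and compares two summaries with their Student's t values. The seed, sample size and ν = 10 are arbitrary choices, and the names are my own.

```python
import math
import random

def sample_mixture(nu, rng):
    """One draw of X: sigma^2 ~ scaled Inv-chi^2(nu, 1), then X | sigma^2 ~ N(0, sigma^2)."""
    chi2 = rng.gammavariate(nu / 2, 2.0)  # chi-squared with nu degrees of freedom
    sigma2 = nu / chi2                    # scaled inverse chi-squared with scale 1
    return rng.gauss(0.0, math.sqrt(sigma2))

rng = random.Random(12345)  # fixed seed so the run is repeatable
nu = 10
draws = [sample_mixture(nu, rng) for _ in range(200_000)]

sample_var = sum(x * x for x in draws) / len(draws)        # mean is 0 by symmetry
upper_tail = sum(x > 1.812 for x in draws) / len(draws)    # 1.812 is the 95% point of t(10)

print(sample_var, nu / (nu - 2))  # should be close to each other
print(upper_tail, 0.05)           # should be close to each other
```

A simulation like this only suggests the result; the integral in the comment above is what actually establishes it.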

## math in subtitle

I think it depends on what "My preferences>Appearance>Math" you have set for wikipedia. In mine, I have "HTML if very simple or else PNG"

• <math>\nu</math> = $\nu$ looks very good to me in a subtitle
• the character "ν" renders as nearly a bold "v" to me, which is why I changed it to <math>\nu</math>
• A third option is & nu ; = ν which also renders as nearly a bold "v" to me

I forgot all of this and just assumed that the original editor had made a less-than-optimum choice. We should figure out a way that is preference independent. PAR (talk) 12:17, 26 July 2011 (UTC)

This seems partly solved, but the point was that maths in section titles does not work well in terms of the appearance in the contents list, which here just appeared at a section number followed by blank. I don't think the options selected matter for this. The present version looks OK, but it may be worth thinking of making this material into a table, or possibly just joining the lines together to take up less page space. For others, this discussion relates to the subsection headed "Special cases". Melcombe (talk) 08:56, 27 July 2011 (UTC)

## An error in a formula relating t-student to incomplete beta

In section "Integral of Student's probability density function and p-value" the last formula is wrong. The same error can be found in a number of papers on the Web, but this does not make the formula any more true. I have verified it against the R package, with the point 7.5 and v=3 for example (the difference is in the third place after the decimal point: the true value is .997..., the formula yields .994...). The correct one is that on Wolfram's page [4]

Please change it accordingly. 217.67.210.18 (talk) 13:27, 24 August 2011 (UTC)

Formula now changed, with citation. Melcombe (talk) 16:44, 25 August 2011 (UTC)
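
For anyone wanting to repeat the check at t = 7.5, ν = 3 without R, here is a minimal sketch. It assumes the corrected formula is the commonly cited identity F(t) = 1 − ½ I_x(ν/2, ½) with x = ν/(t² + ν) for t > 0 (the thread does not quote the formula itself), and it compares that with the closed-form CDF for ν = 3. The function names are hypothetical and the integration is plain Simpson's rule, valid here for ν ≥ 1.

```python
import math

def t_cdf_incomplete_beta(t, nu, steps=2000):
    """CDF of Student's t for t > 0 via 1 - I_x(nu/2, 1/2) / 2 with x = nu / (t^2 + nu)."""
    if t <= 0:
        raise ValueError("this sketch only covers t > 0")
    x = nu / (t * t + nu)
    theta_max = math.asin(math.sqrt(x))
    # Substituting u = sin^2(theta) turns the incomplete beta integral
    #   int_0^x u^(nu/2 - 1) (1 - u)^(-1/2) du
    # into 2 * int_0^theta_max sin(theta)^(nu - 1) d(theta), which is smooth.
    h = theta_max / steps  # steps must be even for Simpson's rule
    total = 0.0
    for i in range(steps + 1):
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * math.sin(i * h) ** (nu - 1)
    integral = 2.0 * total * h / 3.0
    beta = math.gamma(nu / 2.0) * math.gamma(0.5) / math.gamma((nu + 1) / 2.0)
    return 1.0 - 0.5 * integral / beta

def t3_cdf(t):
    """Closed-form CDF of Student's t with 3 degrees of freedom."""
    x = t / math.sqrt(3.0)
    return 0.5 + (x / (1.0 + x * x) + math.atan(x)) / math.pi

if __name__ == "__main__":
    print(t_cdf_incomplete_beta(7.5, 3), t3_cdf(7.5))
```

Both routes should give a value starting .997, in line with the R result reported above.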

### Please can anyone provide a source for the equations for moments when df is small?

For example variance ∞ for 1 < ν ≤ 2, otherwise undefined

Do these apply for real degrees of freedom as well as integer values?

Paul A Bristow (talk) 17:07, 31 July 2012 (UTC)

Moments of the T distribution exist up to order int(ν). Expectations of order ν and higher do not exist. 71.210.197.82 (talk) 04:36, 17 November 2013 (UTC) Dennis L. Clason, NMSU Department of Econ, Applied Statistics and Int'l Business

## Offensive so-called "definition"

Someone created a section labeled "Definition", right after the lead section, that said this:

#### begin excerpt

The textbook definition of Student's t-distribution can be understood as follows:

"The sampling distribution of t is a probability distribution of the t values that would occur if all possible different samples of a fixed size N were drawn from the null-hypothesis population. It gives (1) all the possible different values for samples of size N and (2) the probability of getting each value if sampling is random from the null-hypothesis population." [1]

— Robert R. Pagano, Understanding Statistics in Behavioural Sciences, pp 291 Wadsworth, 2001

1. ^ Pagano, Robert R., Understanding Statistics in Behavioural Sciences, pp 291 Wadsworth, 2001

#### end excerpt

How can any sensible person not be offended? No attempt was made to say what "t values" are, nor to say that this is intended to apply to i.i.d. samples from a normally distributed population. One should presume the author of the book is innocent and got quoted out of context, but no one could read the quote above without that context and then somehow know what it means. This article is what should provide that context; only those who already know what the t-distribution is would know that already. Michael Hardy (talk) 14:00, 13 November 2011 (UTC)

## "Consequently"

I've put a {{clarify}} tag on the assertion in the Characterization - Derivation section that "Consequently... [T] ... has a Student's t-distribution as defined above."

There seem to be several lines missing, unless there is something obvious that I'm not seeing, if we're supposed to be able to see from this why

$T \equiv \frac{Z}{\sqrt{V/\nu}} = \left(\overline{X}_n-\mu\right)\frac{\sqrt{n}}{S_n},$

should have the distribution

$f(t) = \frac{\Gamma(\frac{\nu+1}{2})} {\sqrt{\nu\pi}\,\Gamma(\frac{\nu}{2})} \left(1+\frac{t^2}{\nu} \right)^{-\frac{\nu+1}{2}}\!$

Could somebody fill in the missing maths, either here, or perhaps at Ratio distribution as an archetypal example?

Cheers, Jheald (talk) 23:18, 12 November 2012 (UTC)

Actually, it's pretty straightforward. The Chi-square and normal densities are well known, and the sample mean and variance of Normal random variables are independent. Therefore, the joint density is the product of the marginal densities. Now, define a transformation from (Z, S) to (T, x), where x is a function chosen to be convenient (S is a convenient choice, as it happens).
Now, find the differential element, and complete the transformation. Finally, integrate out S to get the marginal density of T, which is as given. — Preceding unsigned comment added by 71.210.197.82 (talk) 04:43, 17 November 2013 (UTC)
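
The steps sketched above can be written out as follows. This is one possible version, not necessarily the one the article intends: it takes V itself (renamed W) as the auxiliary variable instead of S, which is my choice, and it assumes Z ~ N(0,1) and V ~ χ²(ν) are independent.

```latex
\begin{align*}
f_{Z,V}(z,v) &= \frac{1}{\sqrt{2\pi}}\,e^{-z^2/2}\cdot
  \frac{v^{\nu/2-1}e^{-v/2}}{2^{\nu/2}\,\Gamma(\frac{\nu}{2})},
  \qquad z\in\mathbb{R},\ v>0 \\
% change of variables z = t\sqrt{w/\nu}, v = w, with Jacobian \sqrt{w/\nu}
f_{T,W}(t,w) &= f_{Z,V}\!\left(t\sqrt{w/\nu},\,w\right)\sqrt{\frac{w}{\nu}}
  = \frac{w^{\frac{\nu+1}{2}-1}\exp\!\left[-\frac{w}{2}\left(1+\frac{t^2}{\nu}\right)\right]}
         {\sqrt{2\pi\nu}\;2^{\nu/2}\,\Gamma(\frac{\nu}{2})} \\
% integrate out w using \int_0^\infty w^{a-1}e^{-bw}\,dw = \Gamma(a)/b^a
f(t) &= \int_0^\infty f_{T,W}(t,w)\,dw
  = \frac{\Gamma(\frac{\nu+1}{2})\left[\frac{1}{2}\left(1+\frac{t^2}{\nu}\right)\right]^{-\frac{\nu+1}{2}}}
         {\sqrt{2\pi\nu}\;2^{\nu/2}\,\Gamma(\frac{\nu}{2})}
  = \frac{\Gamma(\frac{\nu+1}{2})}{\sqrt{\nu\pi}\,\Gamma(\frac{\nu}{2})}
    \left(1+\frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}}
\end{align*}
```

The powers of 2 cancel in the last step, since $2^{(\nu+1)/2} / (\sqrt{2}\cdot 2^{\nu/2}) = 1$, which leaves the density quoted in the question.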

## Question about entropy expression

In the expression for entropy the beta term is shown as B(nu/2, 1/2). But in Kotz et al., Multivariate t Distributions and Their Applications, the term in the multivariate case is shown as B(p/2, nu/2) (so B(1/2, nu/2) in the single-variate case). Is it a mistake in the article here? — Preceding unsigned comment added by Analyticposterior (talkcontribs) 09:57, 11 April 2013 (UTC)

B is symmetric in its arguments. Melcombe (talk) 14:06, 11 April 2013 (UTC)
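
A quick way to see the symmetry is through the identity B(a, b) = Γ(a)Γ(b)/Γ(a + b), where swapping a and b changes nothing. A minimal sketch, with ν = 5 chosen arbitrarily for illustration:

```python
import math

def beta(a, b):
    """Beta function via B(a, b) = Gamma(a) * Gamma(b) / Gamma(a + b)."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

nu = 5.0  # arbitrary illustrative value, not from the article
print(beta(nu / 2, 0.5), beta(0.5, nu / 2))
```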

## Distribution of the True Mean?

In the introduction, the article states now:

"... then the t-distribution (for n-1) can be defined as the distribution of the location of the true mean, relative to the sample mean and divided by the sample standard deviation... In this way the t-distribution can be used to estimate how likely it is that the true mean lies in any given range."

Is this right? It seems like nonsense to me. Isn't it backwards? I would say it gives the distribution of the location of the sample mean, relative to the true mean, etc. Steven J Haker (talk) 20:48, 19 October 2013 (UTC)

(true mean - sample mean)/(sample standard deviation) is the negative of (sample mean - true mean)/(sample standard deviation), and the t distribution is symmetrical, so taken literally, the statements are equivalent. You're right that the statement in the article could be misleading in that it suggests that the true mean varies relative to the sample mean when the random process by itself is the other way around. On the other hand, it's effective in suggesting to a beginning reader how the t distribution might be relevant to its purpose of making inferences about the true mean, even though neither frequentist nor Bayesian inference actually works in this exact way. Hashproduct (talk) 02:45, 2 October 2014 (UTC)

## Undefined expectation versus infinite expectation versus non-existent expectations

In the section on moments, it is claimed that the even moments of order greater than ν are infinite and only the odd moments do not exist. This is incorrect. Every reference in mathematical statistics I've ever used (including Billingsley, and Rao's Linear Statistical Inference) has the same condition on expectations:

E{g(X)} exists if and only if E{|g(X)|} < ∞.

The expectation in this case is not finite, and therefore does not exist. It is not a matter of 0/0 vs 1/0: claiming it is in the article simply obfuscates the matter.

A central T random variable has moments of all order up to int(ν-1). Higher order moments do not exist. 128.123.198.241 (talk) 23:38, 15 November 2013 (UTC) Dennis L. Clason, New Mexico State University Department of Economics, Applied Statistics and International Business.
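
The existence condition can be illustrated numerically. Under E{|g(X)|} < ∞, the absolute moment of order k is finite exactly when k < ν, so a truncated version of E|T|^k should settle down as the cutoff grows when k < ν and keep growing otherwise. The sketch below is only an illustration with hypothetical function names: it integrates |t|^k f(t) over [−c, c] with Simpson's rule after the substitution t = e^u − 1.

```python
import math

def t_pdf(t, nu):
    """Density of Student's t with nu degrees of freedom."""
    c = math.gamma((nu + 1) / 2.0) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2.0))
    return c * (1.0 + t * t / nu) ** (-(nu + 1) / 2.0)

def truncated_abs_moment(k, nu, cutoff, steps=20000):
    """Approximate the integral of |t|^k f(t) over [-cutoff, cutoff]."""
    u_max = math.log1p(cutoff)
    h = u_max / steps  # steps must be even for Simpson's rule
    total = 0.0
    for i in range(steps + 1):
        u = i * h
        t = math.expm1(u)  # t = exp(u) - 1, so dt = exp(u) du
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * (t ** k) * t_pdf(t, nu) * math.exp(u)
    return 2.0 * total * h / 3.0  # factor 2 from the symmetry of the density

if __name__ == "__main__":
    for nu in (2, 3):
        # second moment, truncated at increasingly large cutoffs
        print(nu, [round(truncated_abs_moment(2, nu, c), 3) for c in (1e2, 1e4, 1e6)])
```

For ν = 3 the truncated second moment should approach ν/(ν − 2) = 3, while for ν = 2 it should keep increasing with the cutoff.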

## Minor comment: legend of CDF graph

The legend colors in the example CDF do not match the colors of the curves.

(It seems that the colors of the curves are the same as those in the PDF, but the legend colors are different).

The CDF graph should be reproduced with the correct colors.

07:39, 23 February 2014 (UTC) — Preceding unsigned comment added by 134.191.232.70 (talk)

## Subject-verb agreement

Looking at the first 7 pages given by Google for "student distribution" (skipping Wikipedia and YouTube) I observe that 6 of them do not speak explicitly about a family (of distributions), but just of a distribution depending on parameter(s). Only one of the 7 ("mathworks") says "The Student's t distribution is a family of curves depending on a single parameter ν (the degrees of freedom)." Maybe it is better not to mention the "family" in the lead (but do mention it later). For now I see in our article "Student's t-distribution (or simply the t-distribution) is the continuous probability distributions that arise..."; quite terrible. Boris Tsirelson (talk) 15:31, 22 November 2014 (UTC)

Indeed, we do not say "Quadratic polynomial is the family of functions of the form ax2+bx+c"; why go this way for distributions? Boris Tsirelson (talk) 15:41, 22 November 2014 (UTC)

OK. Let's try this: "... Student's t-distribution (or simply the t-distribution) is the name for the continuous probability distribution that arises for a particular sample size when estimating the mean of a normally distributed population in situations where the sample size is small and population standard deviation is unknown." Singular all the way, since there's a single t-distribution for each sample size. -- The Anome (talk) 23:54, 22 November 2014 (UTC)

Or we could say: "Student's t-distribution (or simply the t-distribution) is the name for any of a number of continuous probability distributions..." -- The Anome (talk) 00:00, 23 November 2014 (UTC)

## Probability Density Function

In the section on the probability density function, there are two equations. One shows a calculation when v is even, and the other shows a calculation when v is odd. Each shows a numerator series and a denominator series. At the right end of each series is a multiplication. The multiplications need to be further explained, as they are not self-evident in the context of the series. Since those are described as implementations of the Beta function, there is a disconnect between the Beta function formulae as described on this page, and the Beta function formulae described on the Beta function page that is linked here. By that I mean, I am not seeing these two formulas on the Beta function page.

If we substitute either the fourth or fifth formula into the first formula, then the returned value will become zero once the numerator, v-n, becomes zero. — Preceding unsigned comment added by Statguy1 (talkcontribs) 03:12, 17 February 2015 (UTC)

On further examination, I find that the Beta function call in the second formula under Probability Density Function does not fit either of the presentations of Beta functions that follow. For example, the function call refers to B(1/2, v/2) and has two parameters. The two presentations of the Beta function that follow do not have any reference to two parameters: only a variable, v. The relationship between the Probability Density Function and the Beta function is important for a user to understand and use the formulae on this page. The fact that one does not match the other is a major problem in need of repair. — Preceding unsigned comment added by Statguy1 (talkcontribs) 02:49, 17 February 2015 (UTC)
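
One way to see how the two presentations relate is to compute both. The sketch below rests on my reading of the article, which is an assumption: for even v the constant is (v−1)(v−3)···5·3 / (2√v (v−2)(v−4)···4·2), and for odd v it is (v−1)(v−3)···4·2 / (π√v (v−2)(v−4)···5·3). It compares these with 1/(√v B(1/2, v/2)), using B(a, b) = Γ(a)Γ(b)/Γ(a + b). The function names are hypothetical.

```python
import math

def stepped_product(start):
    """Product start * (start - 2) * (start - 4) * ... down to 2 or 1; an empty product is 1."""
    result = 1
    while start > 1:
        result *= start
        start -= 2
    return result

def coefficient_from_products(nu):
    """Normalising constant of the t density for integer nu, from the even/odd product forms."""
    leading = 2.0 if nu % 2 == 0 else math.pi
    return stepped_product(nu - 1) / (leading * math.sqrt(nu) * stepped_product(nu - 2))

def coefficient_from_beta(nu):
    """The same constant written as 1 / (sqrt(nu) * B(1/2, nu/2))."""
    beta = math.gamma(0.5) * math.gamma(nu / 2.0) / math.gamma((nu + 1) / 2.0)
    return 1.0 / (math.sqrt(nu) * beta)

if __name__ == "__main__":
    for nu in range(1, 9):
        print(nu, coefficient_from_products(nu), coefficient_from_beta(nu))
```

Under this reading the second argument of B is absorbed into the products, which is why only v appears in them, and each product stops at 2 or 1, so no factor reaches zero.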