Welch's t-test
In statistics, Welch's t-test (or unequal variances t-test) is a two-sample location test, and is used to test the hypothesis that two populations have equal means. Welch's t-test is an adaptation of Student's t-test, and is more reliable when the two samples have unequal variances and unequal sample sizes. These tests are often referred to as "unpaired" or "independent samples" t-tests, as they are typically applied when the statistical units underlying the two samples being compared are non-overlapping. Given that Welch's t-test has been less popular than Student's t-test and may be less familiar to readers, a more informative name is "Welch's unequal variances t-test" or "unequal variances t-test" for brevity.
Student's t-test assumes that the two populations are normally distributed with equal variances. Welch's t-test is designed for unequal variances, but the assumption of normality is maintained. Welch's t-test is an approximate solution to the Behrens–Fisher problem.
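Welch's t-test defines the statistic t by the following formula:

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2}{N_1} + \dfrac{s_2^2}{N_2}}}$$

where $\bar{X}_i$, $s_i^2$ and $N_i$ are the $i$-th sample mean, sample variance and sample size, respectively. Unlike Student's t-test, the denominator is not based on a pooled variance estimate.

The degrees of freedom $\nu$ associated with this variance estimate is approximated using the Welch–Satterthwaite equation:

$$\nu \approx \frac{\left(\dfrac{s_1^2}{N_1} + \dfrac{s_2^2}{N_2}\right)^2}{\dfrac{s_1^4}{N_1^2 \nu_1} + \dfrac{s_2^4}{N_2^2 \nu_2}}$$

Here $\nu_i = N_i - 1$, the degrees of freedom associated with the $i$-th variance estimate.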
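As a minimal sketch of the calculation, the statistic and the approximate degrees of freedom can be computed in Python using only the standard library. The function name `welch_t` and the two data lists are illustrative assumptions; the data are made up and are not the samples used in the examples of this article.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample1, sample2):
    """Return (t, nu): Welch's t statistic and the approximate degrees of freedom."""
    n1, n2 = len(sample1), len(sample2)
    # statistics.variance is the unbiased sample variance (divides by N - 1).
    v1, v2 = variance(sample1), variance(sample2)
    # Contribution of each sample to the squared standard error: s_i^2 / N_i.
    se1, se2 = v1 / n1, v2 / n2
    t = (mean(sample1) - mean(sample2)) / sqrt(se1 + se2)
    # Welch-Satterthwaite approximation with nu_i = N_i - 1.
    nu = (se1 + se2) ** 2 / (se1 ** 2 / (n1 - 1) + se2 ** 2 / (n2 - 1))
    return t, nu


# Made-up illustrative data (assumption, not taken from the article).
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
t, nu = welch_t(a, b)
print(f"t = {t:.4f}, nu = {nu:.4f}")
```

Note that the resulting degrees of freedom are generally not an integer.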
Welch's t-test can also be calculated for ranked data and might then be named Welch's U-test.
Once t and the approximate degrees of freedom $\nu$ have been computed, these statistics can be used with the t-distribution to test one of two null hypotheses: that the two population means are equal (using a two-tailed test), or that one of the population means is greater than or equal to the other (using a one-tailed test). The approximate degrees of freedom is rounded down to the nearest integer.
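The step from the statistic to a P-value can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: the tail probability of the t-distribution is obtained by Simpson's rule applied to the density, which works for non-integer degrees of freedom but loses accuracy for extremely small P-values. The function names are hypothetical; statistical packages use more accurate routines.

```python
from math import exp, lgamma, log, log1p, pi


def t_pdf(x, nu):
    """Density of Student's t-distribution; nu may be non-integer."""
    log_c = lgamma((nu + 1) / 2) - lgamma(nu / 2) - 0.5 * log(nu * pi)
    return exp(log_c - (nu + 1) / 2 * log1p(x * x / nu))


def central_prob(t, nu, steps=2000):
    """P(|T| <= |t|): Simpson's rule on [0, |t|], doubled by symmetry."""
    b = abs(t)
    if b == 0:
        return 0.0
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, nu) + t_pdf(b, nu)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, nu)
    return 2 * total * h / 3


def p_two_tailed(t, nu):
    """Two-tailed P-value for the null hypothesis of equal means."""
    return max(0.0, 1.0 - central_prob(t, nu))


def p_upper_tail(t, nu):
    """P(T >= t): one-tailed P-value when large positive t counts against the null."""
    half = central_prob(t, nu) / 2
    return 0.5 - half if t >= 0 else 0.5 + half


# Hypothetical values of t and nu, chosen only for illustration.
print(p_two_tailed(-2.1, 12.7))
print(p_upper_tail(-2.1, 12.7))
```

To follow the rounding convention stated above, pass `math.floor(nu)` instead of `nu`.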
Advantages and limitations
Welch's t-test is more robust than Student's t-test and maintains type I error rates close to nominal for unequal variances and for unequal sample sizes. Furthermore, the power of Welch's t-test comes close to that of Student’s t-test, even when the population variances are equal and sample sizes are balanced.
It is not recommended to pre-test for equal variances and then choose between Student's t-test and Welch's t-test. Rather, Welch's t-test can be applied directly, with no substantial disadvantage relative to Student's t-test, as noted above. Welch's t-test remains robust for skewed distributions when sample sizes are large. Reliability decreases for skewed distributions with smaller samples, where one could possibly perform Welch's t-test on ranked data.
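One way to read "Welch's t-test on ranked data" is sketched below: the two samples are pooled, each observation is replaced by its rank in the pooled data (tied values share the average rank), and the Welch statistic is computed on the ranks. This reading is an assumption, as the text does not spell out the ranking procedure; the helper names and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def midranks(values):
    """Ranks 1..n; tied values receive the average of the ranks they span."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def welch_t(s1, s2):
    """Welch's t statistic and approximate degrees of freedom."""
    se1, se2 = variance(s1) / len(s1), variance(s2) / len(s2)
    t = (mean(s1) - mean(s2)) / sqrt(se1 + se2)
    nu = (se1 + se2) ** 2 / (se1 ** 2 / (len(s1) - 1) + se2 ** 2 / (len(s2) - 1))
    return t, nu


def welch_on_ranks(sample1, sample2):
    """Apply Welch's t-test to the ranks of the pooled observations."""
    pooled = list(sample1) + list(sample2)
    r = midranks(pooled)
    return welch_t(r[:len(sample1)], r[len(sample1):])


# Made-up skewed data (assumption): one extreme value in the second sample.
x = [1.2, 3.4, 2.2, 5.0]
y = [2.9, 8.1, 7.7, 6.5, 30.0]
print(welch_on_ranks(x, y))
```

Because only the ordering of the observations enters the calculation, the result is unchanged by any strictly increasing transformation of the data, which limits the influence of extreme values.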
The following three examples compare Welch's t-test and Student's t-test. Samples are from random normal distributions using the R programming language.
For all three examples, the population means were $\mu_1 = 20$ and $\mu_2 = 22$.
The first example is for equal variances ($\sigma_1^2 = \sigma_2^2 = 4$) and equal sample sizes ($N_1 = N_2 = 15$). Let A1 and A2 denote the two random samples.
The second example is for unequal variances ($\sigma_1^2 = 16$, $\sigma_2^2 = 1$) and unequal sample sizes ($N_1 = 10$, $N_2 = 20$). The smaller sample has the larger variance.
The third example is for unequal variances ($\sigma_1^2 = 1$, $\sigma_2^2 = 16$) and unequal sample sizes ($N_1 = 10$, $N_2 = 20$). The larger sample has the larger variance.
Reference P-values were obtained by simulating the distributions of the t statistics for the null hypothesis of equal population means ($\mu_1 - \mu_2 = 0$). Results are summarised in the table below, with two-tailed P-values:
|Sample A1||Sample A2||Student's t-test||Welch's t-test|
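A reference P-value of this kind could be obtained along the following lines. The exact simulation procedure is not given in the text, so this is a sketch under assumptions: both populations are normal with a common mean (taken as 0, since only the difference matters), the population standard deviations are known, and the P-value is the fraction of simulated statistics at least as extreme as the observed one. The function names, the number of repetitions, the seed and the observed statistic are all hypothetical.

```python
import random
from math import sqrt
from statistics import mean, variance


def welch_stat(s1, s2):
    """Welch's t statistic for two samples."""
    return (mean(s1) - mean(s2)) / sqrt(
        variance(s1) / len(s1) + variance(s2) / len(s2)
    )


def simulated_p_value(t_obs, n1, sd1, n2, sd2, reps=2000, seed=1):
    """Two-tailed P-value from the simulated null distribution of the statistic."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    hits = 0
    for _ in range(reps):
        # Under the null hypothesis both populations share the same mean.
        s1 = [rng.gauss(0.0, sd1) for _ in range(n1)]
        s2 = [rng.gauss(0.0, sd2) for _ in range(n2)]
        if abs(welch_stat(s1, s2)) >= abs(t_obs):
            hits += 1
    return hits / reps


# Setting of the second example (variances 16 and 1, sample sizes 10 and 20);
# the observed statistic -2.0 is a made-up value for illustration.
print(simulated_p_value(-2.0, n1=10, sd1=4.0, n2=20, sd2=1.0))
```

The same function can simulate the null distribution of any other statistic by replacing `welch_stat`; the precision of the result grows with `reps`.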
Welch's t-test and Student's t-test gave practically identical results for the two samples with equal variances and equal sample sizes (Example 1). For unequal variances, Student's t-test gave a low P-value when the smaller sample had a larger variance (Example 2) and a high P-value when the larger sample had a larger variance (Example 3). For unequal variances, Welch's t-test gave P-values close to simulated P-values.
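The pattern described here can be reproduced at the level of the test statistics with a short sketch. The data below are made up and are not the article's samples; `student_t` uses the usual pooled-variance formula, and both function names are illustrative assumptions. With equal sample sizes the two statistics coincide exactly and only the degrees of freedom differ; with unequal sizes the pooled estimate is dominated by the larger sample.

```python
from math import sqrt
from statistics import mean, variance


def student_t(s1, s2):
    """Student's t statistic with pooled variance, and its degrees of freedom."""
    n1, n2 = len(s1), len(s2)
    pooled = ((n1 - 1) * variance(s1) + (n2 - 1) * variance(s2)) / (n1 + n2 - 2)
    t = (mean(s1) - mean(s2)) / sqrt(pooled * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


def welch_t(s1, s2):
    """Welch's t statistic and approximate degrees of freedom."""
    se1, se2 = variance(s1) / len(s1), variance(s2) / len(s2)
    t = (mean(s1) - mean(s2)) / sqrt(se1 + se2)
    nu = (se1 + se2) ** 2 / (se1 ** 2 / (len(s1) - 1) + se2 ** 2 / (len(s2) - 1))
    return t, nu


# Made-up data (assumption). Case A: the smaller sample has the larger variance.
small_wide = [12.0, 28.0, 17.0, 25.0, 18.0]
large_tight = [21.0, 22.0, 23.0, 22.0, 21.5, 22.5, 23.0, 21.0, 22.0, 22.0]
# Case B: the larger sample has the larger variance.
small_tight = [19.5, 20.0, 20.5, 20.0, 20.0]
large_wide = [12.0, 32.0, 22.0, 27.0, 17.0, 22.0, 14.0, 30.0, 20.0, 24.0]

for label, s1, s2 in [("A", small_wide, large_tight), ("B", small_tight, large_wide)]:
    ts, _ = student_t(s1, s2)
    tw, nu = welch_t(s1, s2)
    print(f"case {label}: Student t = {ts:.3f}, Welch t = {tw:.3f}, nu = {nu:.2f}")
```

In case A Student's statistic is larger in magnitude than Welch's, which goes with a lower P-value; in case B it is smaller, in line with the behaviour described for Examples 2 and 3.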
|Microsoft Excel pre 2010||
|Microsoft Excel 2010 and later||
- Welch, B. L. (1947). "The generalization of "Student's" problem when several different population variances are involved". Biometrika 34 (1–2): 28–35. doi:10.1093/biomet/34.1-2.28. MR 19277.
- Ruxton, G. D. (2006). "The unequal variance t-test is an underused alternative to Student's t-test and the Mann–Whitney U test". Behavioral Ecology 17: 688–690. doi:10.1093/beheco/ark016.
- Fagerland, M. W.; Sandvik, L. (2009). "Performance of five two-sample location tests for skewed distributions with unequal variances". Contemporary Clinical Trials 30: 490–496. doi:10.1016/j.cct.2009.06.007.
- Zimmerman, D. W. (2004). "A note on preliminary tests of equality of variances". British Journal of Mathematical and Statistical Psychology 57: 173–181. doi:10.1348/000711004849222.
- Fagerland, M. W. (2012). "t-tests, non-parametric tests, and large studies—a paradox of statistical practice?". BioMed Central Medical Research Methodology 12: 78. doi:10.1186/1471-2288-12-78.