Two-sample hypothesis testing
In statistical hypothesis testing, a two-sample test is a test performed on the data of two random samples, each independently obtained from a different given population. The purpose of the test is to determine whether the difference observed between the two samples is statistically significant, that is, whether it is larger than would plausibly arise by chance if the two populations did not differ.
A large number of statistical tests can be used as two-sample tests. Which of them is appropriate depends on a variety of factors, such as:
- Which assumptions (if any) may be made a priori about the distributions from which the data have been sampled? For example, in many situations it may be assumed that the underlying distributions are normal distributions. In other cases the data are categorical, coming from a discrete distribution over a nominal scale, such as which entry was selected from a menu.
- Does the hypothesis being tested apply to the distributions as a whole, or just some population parameter, for example the mean or the variance?
- Is the hypothesis being tested merely that there is a difference in the relevant population characteristics (in which case a two-sided test may be indicated), or does it involve a specific direction ("A is better than B"), so that a one-sided test can be used?
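The distinction between a two-sided and a one-sided test can be illustrated with a minimal sketch. The example below uses an exact permutation test on the difference in means, chosen here only because it needs nothing beyond the Python standard library; it is not one of the tests named in this article. The function name `permutation_test`, its `alternative` parameter, and the sample values are hypothetical and serve only as illustration.

```python
from itertools import combinations
from statistics import mean


def permutation_test(a, b, alternative="two-sided"):
    """Exact permutation test for a difference in means.

    Enumerates every way of relabelling the pooled data, so it is
    only practical for small samples (illustrative sketch).
    """
    pooled = list(a) + list(b)
    indices = range(len(pooled))
    observed = mean(a) - mean(b)
    tol = 1e-12  # guards against floating-point ties

    extreme = 0
    total = 0
    for chosen in combinations(indices, len(a)):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        diff = mean(group_a) - mean(group_b)
        if alternative == "two-sided":
            # "A differs from B": deviations in either direction count.
            hit = abs(diff) >= abs(observed) - tol
        elif alternative == "greater":
            # "A is larger than B": only deviations in that direction count.
            hit = diff >= observed - tol
        else:
            raise ValueError("alternative must be 'two-sided' or 'greater'")
        extreme += hit
        total += 1
    return extreme / total


# Hypothetical measurements from two populations.
sample_a = [12.1, 11.8, 12.6, 12.9, 12.4]
sample_b = [11.2, 11.9, 11.5, 11.0, 11.7]

p_two_sided = permutation_test(sample_a, sample_b, "two-sided")
p_one_sided = permutation_test(sample_a, sample_b, "greater")
print(p_two_sided, p_one_sided)
```

When the observed difference lies in the hypothesised direction, the one-sided p-value is never larger than the two-sided one, which is why the choice of alternative should be fixed before the data are examined.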
Statistical tests that may apply for two-sample testing include:
- Hotelling's T-squared test (two-sample statistic)
- Kernel two-sample test (based on kernel embedding of distributions)
- Kolmogorov–Smirnov test
- Kuiper's test
- Median test
- Pearson's chi-squared test
- Student's t-test
- Tukey–Duckworth test
- Welch's t-test
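As a rough illustration of how two of the listed tests differ, the following sketch computes the test statistic and degrees of freedom for Student's t-test (which pools the two sample variances, assuming equal population variances) and for Welch's t-test (which does not). It uses only the Python standard library and stops short of computing p-values, which would require the t distribution. The function names `student_t` and `welch_t` and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def student_t(a, b):
    """Student's two-sample t statistic with pooled variance."""
    n1, n2 = len(a), len(b)
    s1, s2 = variance(a), variance(b)  # unbiased sample variances
    df = n1 + n2 - 2
    pooled = ((n1 - 1) * s1 + (n2 - 1) * s2) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / n1 + 1 / n2))
    return t, df


def welch_t(a, b):
    """Welch's t statistic with Welch-Satterthwaite degrees of freedom."""
    n1, n2 = len(a), len(b)
    v1, v2 = variance(a) / n1, variance(b) / n2
    t = (mean(a) - mean(b)) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical samples with unequal sizes and visibly different spread.
sample_a = [5.1, 4.9, 5.0, 5.2, 4.8, 5.0]
sample_b = [4.0, 6.5, 3.2, 7.1]

print(student_t(sample_a, sample_b))
print(welch_t(sample_a, sample_b))
```

For equal sample sizes the two statistics coincide and only the degrees of freedom can differ; with unequal sizes and unequal variances, as in the hypothetical data above, the statistics themselves diverge.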