Samuelson's inequality

In statistics, Samuelson's inequality, named after the economist Paul Samuelson,[1] and also called the Laguerre–Samuelson inequality[2] after the mathematician Edmond Laguerre, states that every member of any collection of real numbers x1, ..., xn lies within √(n − 1) uncorrected sample standard deviations of the collection's sample mean.

Definition

If we let

$\overline{x} = \frac{x_1+\cdots+x_n}{n}$

be the sample mean and

$s = \sqrt{\frac{1}{n} \sum_{i=1}^n (x_i - \overline{x})^2 }$

be the uncorrected (divisor n) standard deviation of the sample, then

$\overline{x} - s\sqrt{n-1} \le x_i \le \overline{x} + s\sqrt{n-1}\qquad \text{for }i = 1,\dots,n.$[3]

Equality holds on the left for some x_i if and only if the n − 1 largest of the n numbers are equal to each other, and on the right if and only if the n − 1 smallest are equal to each other.
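The inequality can be checked numerically. The following is a minimal sketch, not taken from the cited sources; the function name `samuelson_bounds` and the sample data are illustrative assumptions. It computes the two bounds using the divisor-n standard deviation defined above and confirms that every value lies between them.

```python
import math

def samuelson_bounds(xs):
    """Return (lower, upper) = mean -/+ s*sqrt(n-1), with s the divisor-n standard deviation."""
    n = len(xs)
    mean = sum(xs) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    half_width = s * math.sqrt(n - 1)
    return mean - half_width, mean + half_width

# Arbitrary illustrative data: every value must fall inside the bounds.
data = [2.0, 3.5, 1.0, 7.0, 4.5]
lower, upper = samuelson_bounds(data)
assert all(lower <= x <= upper for x in data)

# Equality case: in [0, 1, 1] the value 0 sits exactly on the lower bound.
lower_eq, upper_eq = samuelson_bounds([0.0, 1.0, 1.0])
assert math.isclose(lower_eq, 0.0, abs_tol=1e-12)
```

For the three values 0, 1, 1 the mean is 2/3 and s√(n − 1) is also 2/3, so the lower bound is exactly 0 and the upper bound is 4/3.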

Samuelson's inequality may be considered a reason why studentization of residuals should be done externally.

Relationship to polynomials

Samuelson was not the first to describe this relationship: it was probably first discovered by Laguerre in 1880, while he was investigating the roots (zeros) of polynomials.[4][5]

Consider a polynomial with all roots real:

$a_0x^n + a_1x^{n-1} + \cdots + a_{n-1}x + a_n = 0$

Without loss of generality let $a_0 = 1$, denote the roots by $x_1, \dots, x_n$, and let

$t_1 = \sum x_i$ and $t_2 = \sum x_i^2$

Then

$a_1 = - \sum x_i = -t_1$

and

$a_2 = \sum x_ix_j = \frac{t_1^2 - t_2}{2} \qquad \text{ where } i < j$

In terms of the coefficients

$t_2 = a_1^2 - 2a_2$

Laguerre showed that the roots of this polynomial were bounded by

$-a_1 / n \pm b \sqrt{n - 1}$

where

$b = \frac{\sqrt{nt_2 - t_1^2}}{n} = \frac{\sqrt{na_1^2 - a_1^2 - 2na_2}}{n}$

Inspection shows that $-\tfrac{a_1}{n}$ is the mean of the roots and that b is the standard deviation of the roots.
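The bounds can be computed directly from the two leading coefficients. The sketch below is illustrative only: the function name `laguerre_bounds` and the example cubic are assumptions, not part of Laguerre's paper. It takes $b^2 = (nt_2 - t_1^2)/n^2$, the variance of the roots, with $t_1 = -a_1$ and $t_2 = a_1^2 - 2a_2$.

```python
import math

def laguerre_bounds(a1, a2, n):
    """Bounds -a1/n -/+ b*sqrt(n-1) for a monic degree-n polynomial with all roots real.

    Uses b = sqrt(n*t2 - t1**2) / n, where t1 = -a1 and t2 = a1**2 - 2*a2.
    math.sqrt raises ValueError if n*t2 - t1**2 is negative, which cannot
    happen when all roots are real.
    """
    t1 = -a1
    t2 = a1 ** 2 - 2 * a2
    b = math.sqrt(n * t2 - t1 ** 2) / n
    centre = -a1 / n              # mean of the roots
    half_width = b * math.sqrt(n - 1)
    return centre - half_width, centre + half_width

# (x - 1)(x - 2)(x - 3) = x^3 - 6x^2 + 11x - 6, so a1 = -6 and a2 = 11.
lower, upper = laguerre_bounds(-6, 11, 3)
assert all(lower <= r <= upper for r in (1, 2, 3))
```

For this cubic the mean of the roots is 2 and the half-width is 2√3/3, so the interval runs from roughly 0.845 to 3.155 and contains the roots 1, 2 and 3.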

Laguerre failed to notice this relationship with the mean and standard deviation of the roots, being more interested in the bounds themselves. The relationship permits a rapid estimate of the bounds of the roots and may be of use in locating them.

Note

When the coefficients $a_1$ and $a_2$ are both zero, no information can be obtained about the location of the roots: in that case $t_2 = a_1^2 - 2a_2 = 0$, so not all roots are real (as can also be seen from Descartes' rule of signs) unless every root is zero.
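As a concrete illustration of this note, consider $x^3 - 1 = 0$, for which $a_1 = a_2 = 0$. The following sketch (an illustrative example, not drawn from the cited sources) builds the three cube roots of unity and checks that their sum and sum of squares both vanish even though two of the roots are non-real.

```python
import cmath

# Roots of x^3 - 1 = 0: the three cube roots of unity.
roots = [cmath.exp(2j * cmath.pi * k / 3) for k in range(3)]

t1 = sum(roots)                  # equals -a1
t2 = sum(r ** 2 for r in roots)  # equals a1^2 - 2*a2

# Both power sums vanish, matching a1 = a2 = 0.
assert abs(t1) < 1e-12 and abs(t2) < 1e-12

# Two of the three roots have a non-zero imaginary part.
non_real = [r for r in roots if abs(r.imag) > 1e-12]
assert len(non_real) == 2
```

Here the bounds would collapse to the single point 0, which is not a root at all, so the real-root hypothesis behind the bounds is not satisfied.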

References

1. ^ Paul Samuelson, "How Deviant Can You Be?", Journal of the American Statistical Association, volume 63, number 324 (December, 1968), pp. 1522–1525. JSTOR 2285901.
2. ^ Jensen, Shane Tyler (1999) The Laguerre–Samuelson Inequality with Extensions and Applications in Statistics and Matrix Theory. MSc Thesis. Department of Mathematics and Statistics, McGill University.
3. ^ Advances in Inequalities from Probability Theory and Statistics, by Neil S. Barnett and Sever Silvestru Dragomir, Nova Publishers, 2008, page 164.
4. ^ Jensen, Shane Tyler (1999) The Laguerre–Samuelson Inequality with Extensions and Applications in Statistics and Matrix Theory. MSc Thesis. Department of Mathematics and Statistics, McGill University.
5. ^ Laguerre E. (1880) Mémoire pour obtenir par approximation les racines d'une équation algébrique qui a toutes les racines réelles. Nouv Ann Math 2e série, 19, 161–172, 193–202.