# Talk:Chebyshev's inequality

WikiProject Statistics (Rated B-class, High-importance)

This article is within the scope of the WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion.

WikiProject Mathematics (Rated B-class, High-importance)
This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of Mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
Field: Probability and statistics

## Name

I wonder if we should rename this to "Chebyshev's inequality". First, Chebyshev seems to be the more common spelling of his name, and "inequality" seems to be the more common label for this result; his theorem being in the theory of approximations. AxelBoldt 00:29 Dec 11, 2002 (UTC)

Certainly fine with me. I'm hardly an expert in statistics, or in transliteration. --Ryguasu 00:51 Dec 11, 2002 (UTC)

I'm for that: Axel's right about the spelling and the name. --User:LenBudney 01:58 Dec 11, 2002 (UTC)

Tchebysheff is more like a German spelling. As Axel said, the correct and most common English spelling of this great Russian mathematician's name is Pafnuty Lvovich Chebyshev. But "theorem" is quite okay with me. Best regards. --XJamRastafire 01:23 Dec 17, 2002 (UTC)

I really do not want to be a quibbler, but Chebyshev's name in his article is also incorrect. Writing Pafnuti instead of Pafnuty is the same kind of mistake in English as writing Yzaak Nevtonn or Jull Bryner :-). You can check the most common English spelling of Chebyshev's name at: Pafnuty Lvovich Chebyshev. The original name is "Pafnutij", but I guess "ij" in transcription becomes "y" and not "i". I should also comment on the very common practice here of leaving the "otchestvo" (patronymic) out of Russian names, which is also incorrect. As far as I know, Russians use full names, especially for their famous people. But I am repeating myself over and over again. Finally, as I have already said somewhere around here, Donald Knuth maintains a list of all the Russian names he has used as links and references in his books over the years. I believe we should follow his practice wherever possible. --XJamRastafire 13:25 Dec 17, 2002 (UTC)

## Other inequalities

Let Χ be the mathematician in question until you sort out his name =p Then there are at least two other inequality-type Χ's theorems.

One is aka Bertrand's postulate.

Another was a step toward prime number theorem: if π(n) is the number of primes not exceeding n, then $0.92\frac n{\ln n}<\pi(n)<1.11\frac n{\ln n}$ [Yaglom and Yaglom problem 170; vaguely mentioned at Mathworld]. 142.177.19.171 19:12, 13 Sep 2004 (UTC)
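For anyone who wants to see these constants in action, here is a minimal sketch that counts primes with a sieve and compares π(n) with n/ln n. The function names are made up for this illustration, and it assumes the bounds are meant for sufficiently large n: at small n such as 10^3 or 10^4 the ratio is still above 1.11.

```python
import math

def prime_count_table(limit):
    """Return a list pi where pi[n] is the number of primes <= n (requires limit >= 1)."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            # Cross out every multiple of p starting at p*p.
            for m in range(p * p, limit + 1, p):
                is_prime[m] = False
    pi = [0] * (limit + 1)
    count = 0
    for n in range(limit + 1):
        count += is_prime[n]
        pi[n] = count
    return pi

def chebyshev_ratio(pi, n):
    """Ratio pi(n) / (n / ln n), to be compared with the constants 0.92 and 1.11."""
    return pi[n] / (n / math.log(n))

PI = prime_count_table(10**6)
for n in (10**3, 10**4, 10**5, 10**6):
    r = chebyshev_ratio(PI, n)
    print(n, PI[n], round(r, 4), 0.92 < r < 1.11)
```

The last column shows whether the ratio falls strictly between the two constants for that n.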

## Introductory Paragraph

I felt that the introductory paragraph was excessively dense and technical:

Chebyshev's inequality (also known as Tchebysheff's inequality, Chebyshev's theorem, or the Bienaymé-Chebyshev inequality) is a theorem of probability theory named in honor of Pafnuty Chebyshev. Chebyshev's inequality gives a lower bound for the probability that a value of a random variable of any distribution with finite variance lies within a certain distance from the variable's mean; equivalently, the theorem provides an upper bound for the probability that values lie outside the same distance from the mean.

It's quite correct, but I was afraid it would be difficult for a general audience to understand. This seemed a shame, since the importance of the theorem can be readily understood by a general audience. So I inserted a nontechnical paraphrase:

Chebyshev's inequality (also known as Tchebysheff's inequality, Chebyshev's theorem, or the Bienaymé-Chebyshev inequality) is a theorem of probability theory. Simply put, it states that in any data sample, nearly all the values are close to the mean value, and provides a quantitative description of "nearly all" and "close to".
The theorem gives a lower bound for the probability that a value of a random variable of any distribution with finite variance lies within a certain distance from the variable's mean; equivalently, the theorem provides an upper bound for the probability that values lie outside the same distance from the mean.

I thought that people would be more likely to appreciate the meaning of the second paragraph if the idea was set up in less technical language first. -- Dominus 14:41, 11 August 2005 (UTC)

I found the current introductory paragraph very confusing and had to use a book to find out what this inequality really states. Clearly, there are cases where nearly all of the probability is concentrated far away from the mean, but these distributions have a high variance. I suggest that the previous introduction (see above) be restored.
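To make "nearly all" and "close to" concrete, here is a small sketch that measures, for a few non-normal data sets, the fraction of values at least k standard deviations from the mean and compares it with the bound 1/k^2. The helper name `tail_fraction`, the seed, and the choice of test distributions are assumptions made for this illustration; the data set's own mean and population standard deviation are used, so the bound applies to the sample treated as a distribution in its own right.

```python
import random
import statistics

def tail_fraction(data, k):
    """Fraction of values lying at least k standard deviations from the mean."""
    mu = statistics.fmean(data)
    sigma = statistics.pstdev(data)  # population standard deviation of the data set
    return sum(abs(x - mu) >= k * sigma for x in data) / len(data)

rng = random.Random(0)
samples = {
    "exponential": [rng.expovariate(1.0) for _ in range(10_000)],
    "uniform": [rng.uniform(0.0, 1.0) for _ in range(10_000)],
    # All mass sits about one standard deviation away from the mean.
    "two-point": [rng.choice((-1.0, 1.0)) for _ in range(10_000)],
}

for name, data in samples.items():
    for k in (1.5, 2, 3):
        frac = tail_fraction(data, k)
        print(f"{name:12s} k={k}: tail fraction {frac:.4f} <= bound {1 / k**2:.4f}")
```

The two-point sample is the kind of case mentioned above: every value is far from the mean in absolute terms, but the standard deviation is correspondingly large, so the bound still holds.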

## measure theoretic statement

The second statement is not true for arbitrary nondecreasing nonnegative extended-valued measurable g. This is because g composed with f need not be measurable even if g and f are both measurable, and therefore the integral on the right does not make sense. If g is continuous the theorem holds, since the inverse image of an open set under g is open, and the inverse image of that open set under f is measurable by definition. — Preceding unsigned comment added by 73.36.169.23 (talk) 00:58, 24 June 2014 (UTC)

From the point of view introduced in the measure-theoretic section, the most natural Chebyshev inequality is the one obtained for $g(t)=t$. This simply tells you that a (say positive) random variable with a finite expected value cannot be large with high probability: quite understandable by non-math people (at least, more so than the second moment). So it would be nice to mention this basic Chebyshev inequality, wouldn't it? gala.martin December 9.

To Gala Martin: the statement for general continuous g is very important in proving weak-type estimates, which motivate the Lebesgue differentiation theorem. In particular, if g(t)=t^p, then you get a weak p-p estimate.

Except that that is called the Markov inequality. The Chebyshev inequality is about the second moment about the mean. There is a Gauss inequality for the second moment about the mode. —The preceding unsigned comment was added by 81.151.197.86 (talk) 11:37, 14 April 2007 (UTC).
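For reference, a short derivation showing how the two statements in this thread are usually related, under the naming used in the last comment (Markov for the first-moment bound, Chebyshev for the second moment about the mean). The symbols Y, a, X, \mu, \sigma and k are notation introduced here for the sketch, with \mu = E[X] and 0 < \sigma^2 = \operatorname{Var}(X) < \infty.

```latex
% Markov's inequality (the case g(t) = t), for a nonnegative random variable Y and a > 0:
\[
  P(Y \ge a) \le \frac{E[Y]}{a}.
\]
% Apply it to Y = (X - \mu)^2 with a = k^2 \sigma^2, for any k > 0:
\[
  P\bigl(|X - \mu| \ge k\sigma\bigr)
  = P\bigl((X - \mu)^2 \ge k^2 \sigma^2\bigr)
  \le \frac{E\bigl[(X - \mu)^2\bigr]}{k^2 \sigma^2}
  = \frac{1}{k^2}.
\]
```

Seen this way, the second-moment statement is the first-moment statement applied to the squared deviation, which may explain why the names are used interchangeably in some texts.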

## whats the difference between chebyshev's inequality and chebyshev sum inequality

eh? —Preceding unsigned comment added by 68.161.204.136 (talk) 18:04, 17 December 2007 (UTC)

## Who first proved it?

I removed the statement that Chebyshev was the first person to prove the inequality. Bienayme proved it (as I recall) some 20 years earlier. For those who care, here is a little more history about the inequality, and about who the inequality "should" be named after. Markov, who was Chebyshev's student, wrote a letter saying that Chebyshev deserved the credit because he understood its purpose, which was to lay the groundwork for stronger probabilistic inequalities (that culminated in the central limit theorem).

Markov also wrote that Bienayme had just proved the bound to refute something or other that Cauchy had claimed. --AS314 (talk) 15:46, 22 February 2008 (UTC)

See Stigler's law of eponymy. EEng (talk) 15:36, 30 December 2009 (UTC)

## Merger proposal

The page An inequality on location and scale parameters refers to a simple corollary of the one-sided version of Chebyshev's inequality and has a misleading (or at least non-standard) name. Searching around I can't find a "real" name for the theorem proven on that page; rather, everyone simply proves it as a natural result of Chebyshev's inequality. Romanempire (talk) 12:09, 11 June 2008 (UTC)

Keep as separate article. The inequality is about different quantities from those involved in Chebyshev's inequality ... the median as opposed to probabilities of typical values. By "everyone simply proves it as a natural result of Chebyshev's inequality", I guess you mean everyone except whoever put the article up. It is good to have a direct proof. Are there any elaborations that would allow tighter bounds to be formulated? Melcombe (talk) 13:24, 11 June 2008 (UTC)
...and there isn't a proof given of the "One-sided Chebyshev inequality". Melcombe (talk) 13:37, 11 June 2008 (UTC)
[1] shows that tighter bounds are impossible, i.e. $\frac{|\mu - M|}{\sigma}$ can be made as close as possible. That reference and [2] seem to call the result something like "a median-mean inequality." Romanempire (talk) 11:34, 12 June 2008 (UTC)
The problem here is that the other page seems to be violating WP:NOR. Wikipedia is not a probability textbook. We can't just have pages called "a theorem on ..." with non-standard results and original proofs. Romanempire (talk) 11:18, 12 June 2008 (UTC)
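For readers following this thread, a minimal numerical sketch of the median-mean inequality being discussed, taken here in the form |μ − M| ≤ σ (mean μ, median M, standard deviation σ). The helper name `mean_median_gap`, the seed, and the example data sets are assumptions chosen for illustration; the last data set is a deliberately lopsided two-valued sample where the ratio comes close to 1.

```python
import random
import statistics

def mean_median_gap(data):
    """Return |mean - median| / (population standard deviation) for a data set."""
    mu = statistics.fmean(data)
    med = statistics.median(data)
    sigma = statistics.pstdev(data)
    return abs(mu - med) / sigma

rng = random.Random(1)
datasets = {
    "exponential": [rng.expovariate(1.0) for _ in range(10_000)],
    "lognormal": [rng.lognormvariate(0.0, 1.0) for _ in range(10_000)],
    # 51 zeros and 49 ones: median 0, mean 0.49, standard deviation about 0.5.
    "lopsided two-point": [0.0] * 51 + [1.0] * 49,
}

for name, data in datasets.items():
    print(f"{name:20s} |mean - median| / sd = {mean_median_gap(data):.4f}")
```

Each ratio stays at or below 1, and the two-point example shows how close to 1 it can get.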

## Chebyshev's inequality in mathematics.

I noticed this is not the form of Chebyshev's inequality I am familiar with. The inequality that occurs most often in analysis is what Wikipedia refers to as Markov's inequality. The terminology seems to vary a bit in probability as well (see, for example, Krylov's book on Introduction to Diffusion processes for a different usage than here). Should we make a comment about the different usages of the phrase?

It might also make a lot of sense to merge the articles since the two concepts are so closely related. Thenub314 (talk) 11:51, 4 November 2008 (UTC)

## Question

In probability theory, Chebyshev's inequality (also known as Tchebysheff's inequality, Chebyshev's theorem, or the Bienaymé-Chebyshev inequality) states that in any data sample or probability distribution, nearly all the values are close to the mean value, and provides a quantitative description of "nearly all" and "close to".

Shouldn't this say "in any normally distributed data sample..."? Or is this result valid for other distributions? 67.124.147.141 (talk) 05:37, 3 December 2008 (UTC)

It is valid for all distributions. --Zvika (talk) 07:28, 7 December 2008 (UTC)
The history of the inequality given in the article seems questionable. Reference (5) was published (in Russian) in the "Matematicheskii Sbornik" journal under the (translated) title claimed in the main text, and not in a French journal. However, a similar article with the title "sur les valeurs limites integrales" appeared in 1874 in the journal mentioned in reference (5). So I suppose this is a French translation, although I am not sure as I do not speak Russian.

## Certain examples

I was privately (and rightly) scolded by Zvika for

## g(t) decreasing on some intervals?

In the measure-theoretic versions, the choice of g(t) under which the extended version implies the first version might need a little tweaking.

$g(t)=\begin{cases}t^2&\text{if }t\geq0\\ 0&\text{otherwise,}\end{cases}$

is decreasing on the range (0,1), which violates the precondition that g(t) be nondecreasing on the range of easily chosen f, such as f(t)=t. Did I miss something?
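A quick numerical check of the function as written above, offered only as a sketch: it assumes "nondecreasing" means g(a) ≤ g(b) whenever a ≤ b, and the helper name `is_nondecreasing_on_grid` and the grid sizes are invented for this illustration. On (0,1) the values satisfy t^2 < t, but that is a different property from g decreasing as t grows.

```python
def g(t):
    """The function from the comment above: t^2 for t >= 0, and 0 otherwise."""
    return t * t if t >= 0 else 0.0

def is_nondecreasing_on_grid(func, lo, hi, steps):
    """Check func(a) <= func(b) for each pair of consecutive grid points in [lo, hi]."""
    points = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    values = [func(t) for t in points]
    return all(a <= b for a, b in zip(values, values[1:]))

print(is_nondecreasing_on_grid(g, -2.0, 2.0, 4000))  # whole grid, both branches
print(is_nondecreasing_on_grid(g, 0.0, 1.0, 1000))   # the interval (0, 1) in question
print(g(0.5) < 0.5, g(0.25) < g(0.5))                # below the diagonal, yet increasing
```

A grid check is not a proof, but it does suggest where the confusion might come from.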

## Error in measure-theoretic formulation

If (X, Σ, μ) is an arbitrary probability space, then there is no guarantee that the event { x ∈ X : |f(x)| ≥ t } is even measurable. I think we need Σ to be Borel.  // stpasha »  01:53, 5 November 2010 (UTC)

## Probabilistic statement interpretation wrong

The following quote is not correct: "the proportion which are between 600 and 1400 words (...) must be more than 3/4". One can say that on average over many samples, or for an infinitely high pile of articles (!) the proportion between 600 and 1400 words must be more than 3/4, but for any particular finite pile Chebyshev's inequality cannot guarantee anything about the contents. The inequality is a statement about probability distributions, not about finite statistical samples. —Preceding unsigned comment added by 62.141.176.1 (talk) 16:44, 4 February 2011 (UTC)

I agree. I rephrased it in terms of probabilities, which makes the example a bit less useful. The other option is to claim that we have an infinite length pile. 199.46.198.233 (talk) 00:06, 14 October 2011 (UTC)
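The distinction drawn in this thread can be illustrated with a small simulation. This is only a sketch under stated assumptions: article lengths are taken to have mean 1000 and standard deviation 200 (so that 600 to 1400 is two standard deviations either side), and a hypothetical three-valued distribution is used for which the probability of lying strictly between 600 and 1400 is exactly 3/4. The pile size, number of piles, and seed are arbitrary choices.

```python
import random

# Hypothetical distribution: 600, 1000, 1400 with probabilities 1/8, 3/4, 1/8.
# Its mean is 1000 and its standard deviation is 200.
VALUES = (600, 1000, 1400)
WEIGHTS = (1, 6, 1)

def within_fraction(sample):
    """Fraction of a finite pile whose lengths lie strictly between 600 and 1400."""
    return sum(600 < x < 1400 for x in sample) / len(sample)

rng = random.Random(2011)
pile_size, n_piles = 20, 1000
fractions = [
    within_fraction(rng.choices(VALUES, weights=WEIGHTS, k=pile_size))
    for _ in range(n_piles)
]

below = sum(f < 0.75 for f in fractions)
print(f"piles with fewer than 3/4 of articles inside (600, 1400): {below} of {n_piles}")
print(f"average fraction inside over all piles: {sum(fractions) / n_piles:.4f}")
```

Individual piles fall below 3/4 fairly often, while the average over many piles stays near the probability given by the distribution, which is the point made above about finite samples.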

## Exact inequality for Probabilistic statement

A sharper inequality for normally distributed random variables (mean $m$ and variance $s^2$):

$P(|X-m| > ks) < \sqrt{\frac{2}{\pi}}\,\frac{e^{-k^2/2}}{k}$ — Preceding unsigned comment added by 81.162.76.85 (talk) 11:54, 21 June 2012 (UTC)
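To compare the sizes involved, here is a minimal sketch that tabulates the exact two-sided normal tail, the bound above, and Chebyshev's 1/k^2 for a few values of k. The function names are chosen for this illustration; the exact tail uses the standard identity P(|Z| > k) = erfc(k/√2) for a standard normal Z.

```python
import math

def normal_two_sided_tail(k):
    """Exact P(|X - m| > k s) for a normal variable, via the complementary error function."""
    return math.erfc(k / math.sqrt(2))

def gaussian_bound(k):
    """The bound from the comment above: sqrt(2/pi) * exp(-k^2 / 2) / k."""
    return math.sqrt(2 / math.pi) * math.exp(-k * k / 2) / k

def chebyshev_bound(k):
    """Chebyshev's distribution-free bound 1 / k^2."""
    return 1 / (k * k)

for k in (0.5, 1, 2, 3, 5):
    print(
        f"k={k}: exact {normal_two_sided_tail(k):.3e}"
        f"  normal bound {gaussian_bound(k):.3e}"
        f"  Chebyshev {chebyshev_bound(k):.3e}"
    )
```

The normal-specific bound decays exponentially in k, whereas Chebyshev's bound decays only like 1/k^2 because it has to cover every distribution with finite variance.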

## Error in Bhattacharyya's inequality?

Comparing the Wiki page with reference [30], there seem to be several errors.

- Bhattacharyya uses normed central moments, not normed raw moments.
- The second term in the denominator is squared.

— Preceding unsigned comment added by 203.110.235.129 (talk) 05:43, 7 February 2013 (UTC)

I think Wiki has the first correct: the inequality as stated requires the mean to be zero. DrMicro (talk) 11:37, 18 March 2013 (UTC)

## Mitzenmacher and Upfal's inequality

The inequality in this section is the same as given higher up under the heading "higher moments". As such it is rather trivial and very old and I don't think it deserves its own section and modern attribution. Anyone disagree? McKay (talk) 07:18, 8 March 2013 (UTC)

Pardon me if I have missed this, but I don't think this inequality is given under the higher moments section. The inequality is somewhat tighter than the one given under higher moments. DrMicro (talk) 11:26, 18 March 2013 (UTC)