Talk:Cumulative distribution function

WikiProject Mathematics (Rated C-class, High-importance)
This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of Mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
 Field: Probability and statistics
One of the 500 most frequently viewed mathematics articles.
WikiProject Statistics (Rated C-class, High-importance)
This article is within the scope of the WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion.


cadlag

Why is the distribution required to be càdlàg? A discussion section on this would be valuable. Jackzhp (talk) 18:46, 15 August 2009 (UTC)

Moreover, there is a tradition here (I suppose because of Kolmogorov's original notation, but I'm not sure) that the CDF should be left-continuous... Drkazmer Just tell me... 23:01, 2 January 2012 (UTC)

EDIT: My sincere apologies, but I don't know where else to report this. Unlike other Wikipedia pages, when this page is googled, its title shows up with the first letter of the first word uncapitalized. Try it. Cheers. -- Anonymous user

Complementary cumulative distribution function

I assume there is an error after "Proof: Assuming X has density function f, we have for any c > 0", regarding the integration limits for E(X)? — Preceding unsigned comment added by Amir bike (talk • contribs) 05:54, 19 May 2011 (UTC)

The article says that Markov's inequality states that \bar F(x) \leq \frac{\mathbb E(X)}{x}. However, this is the full statement only in the continuous case, since in the discrete case P(X \geq x) = \bar F(x) + P(X=x). Although the inequality still holds, the current version is weaker than the proper Markov's inequality. — Preceding unsigned comment added by Colinfang (talk • contribs) 18:59, 4 March 2012 (UTC)

The current version is the standard statement of Markov's inequality found in reference books. If there is a stronger result, it could be stated with a citation. If the stronger result is still generally known as Markov's inequality, then the Markov's inequality article could be updated as well. But the version in the article (now) states valid conditions under which the results hold. Melcombe (talk) 16:58, 15 April 2012 (UTC)
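The difference between the two tail probabilities discussed above can be checked numerically. The following is a minimal sketch, assuming a fair six-sided die as the example distribution (the die and the function names are illustrative choices, not taken from the article); it uses exact rational arithmetic so the comparisons are not affected by rounding.

```python
from fractions import Fraction

# Hypothetical example distribution: a fair six-sided die.
pmf = {k: Fraction(1, 6) for k in range(1, 7)}

def mean(pmf):
    """Expected value E(X) of a discrete distribution given as {value: probability}."""
    return sum(k * p for k, p in pmf.items())

def ccdf(pmf, x):
    """Complementary CDF: P(X > x) = 1 - F(x)."""
    return sum((p for k, p in pmf.items() if k > x), Fraction(0))

def tail_inclusive(pmf, x):
    """P(X >= x), the quantity bounded in the usual statement of Markov's inequality."""
    return sum((p for k, p in pmf.items() if k >= x), Fraction(0))

x = 4
bound = mean(pmf) / x  # Markov bound E(X)/x
print("P(X > x)  =", ccdf(pmf, x))
print("P(X >= x) =", tail_inclusive(pmf, x))
print("E(X)/x    =", bound)
```

At an atom of the distribution the two tail probabilities differ by exactly P(X = x), so a bound stated for P(X > x) is implied by, and is weaker than, the same bound stated for P(X >= x).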

Utility

It would be helpful if the entry included a discussion of the utility of performing a CDF plot. This would cover when to perform one and what information is learned from it. What real-world applications would this include? Maybe an example would be helpful. — Preceding unsigned comment added by 209.252.149.162 (talk) 14:10, 1 August 2011 (UTC)

Table of cdfs

I have moved the recently added table of cdfs to here for discussion/revision. The version added was ...

Distribution | Cumulative Density function F_X(x)
Binomial B(n, p) | I_{1-p}(n - k, 1 + k)
Negative binomial NB(r, p) | 1 - I_{p}(k+1, r)
Poisson Pois(\lambda) | e^{-\lambda} \sum_{i=0}^{k} \frac{\lambda^i}{i!}
Uniform U(a, b) | \frac{x-a}{b-a} for x \in (a,b)
Normal N(µ, \sigma^2) | \frac12\left[1 + \operatorname{erf}\left( \frac{x-\mu}{\sqrt{2\sigma^2}}\right)\right]
Chi-squared \Chi_k^2 | \frac{1}{\Gamma(k/2)}\;\gamma(k/2,\,x/2)
Cauchy Cauchy(µ, \theta) | \frac{1}{\pi} \arctan\left(\frac{x-x_0}{\gamma}\right)+\frac{1}{2}
Gamma G(k, \theta) | \frac{\gamma(k, x/\theta)}{\Gamma(k)}
Exponential Exp(\lambda) | 1 - e^{-\lambda x}

There are several problems here, particularly with inconsistent notations. But there are structural problems in defining the cdfs of the discrete distributions, as the formulae given are only valid at the integer points (within the range of the distribution) and would give incorrect values of the cdf at non-integer values. Also several of the functions involved require definitions/wikilinks. So, if the table is to be included, thought needs to be given to possibly dividing it into discrete/continuous tables and/or adding extra columns. Melcombe (talk) 09:25, 21 October 2011 (UTC)
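The point about non-integer arguments can be illustrated with the Poisson formula from the table. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, and extending the integer formula by taking floor(x) (with the value 0 for x < 0) is the usual way to obtain a CDF on the whole real line, not something the table itself specifies.

```python
import math

def poisson_cdf_integer_formula(k, lam):
    """Table formula e^{-lam} * sum_{i=0}^{k} lam^i / i!, meaningful only for integer k >= 0."""
    return math.exp(-lam) * sum(lam**i / math.factorial(i) for i in range(k + 1))

def poisson_cdf(x, lam):
    """CDF for any real x: 0 below the support, otherwise the integer formula at floor(x)."""
    if x < 0:
        return 0.0
    return poisson_cdf_integer_formula(math.floor(x), lam)

lam = 3.0  # hypothetical rate parameter
for x in (-1, 0, 2, 2.5, 3):
    print(x, poisson_cdf(x, lam))
```

With this extension the CDF is constant between consecutive integers, so the value at 2.5 coincides with the value at 2, which is the behaviour a step-function CDF of a discrete distribution should have.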

citation needed.... really?

I think it is a little bit ridiculous to expect a citation that a CDF is càdlàg. It is an almost trivial observation that follows directly from the probability space axioms and the definition of a càdlàg function. Surely this is a routine calculation. --217.84.60.220 (talk) 11:32, 3 November 2012 (UTC)

Please, for didactic purposes, show the area relation

EXAMPLE

http://beyondbitsandatomsblog.stanford.edu/spring2010/files/2010/04/CdfAndPdf.gif — Preceding unsigned comment added by 187.66.187.183 (talk) 07:25, 3 February 2013 (UTC)
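Assuming the "area relation" requested here is the identity F(b) - F(a) = \int_a^b f(x)\,dx, it can be checked numerically. The sketch below uses an exponential distribution with a hypothetical rate of 1.5 and a simple midpoint rule; the function names and the choice of distribution are illustrative assumptions only.

```python
import math

lam = 1.5  # hypothetical rate parameter of an exponential distribution

def pdf(x):
    """Density f(x) = lam * exp(-lam * x) for x >= 0, and 0 otherwise."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

def cdf(x):
    """CDF F(x) = 1 - exp(-lam * x) for x >= 0, and 0 otherwise."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0

def area_under_pdf(a, b, n=100_000):
    """Midpoint-rule approximation of the integral of the density over [a, b]."""
    h = (b - a) / n
    return h * sum(pdf(a + (i + 0.5) * h) for i in range(n))

a, b = 0.5, 2.0
print("area under pdf:", area_under_pdf(a, b))
print("F(b) - F(a):   ", cdf(b) - cdf(a))
```

The two printed numbers should agree up to the discretisation error of the midpoint rule, which is the sense in which the CDF accumulates the area under the density.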

CDF is definitely LEFT-continuous.

CDF must be left-continuous, not right as stated on the wiki page. Source: current ongoing University studies, 3 separate professors, books from 4 different authors. — Preceding unsigned comment added by 213.181.200.159 (talk) 07:46, 6 March 2013 (UTC)

This article follows the convention reached via the link right-continuous. This is "continuous from the right". Perhaps you are thinking of "continuous to the left". 81.98.35.149 (talk) 11:34, 6 March 2013 (UTC)
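Both conventions appear in textbooks, and which one-sided continuity holds depends only on whether the CDF is defined with "≤" or with "<". A minimal sketch, assuming a variable that takes the values 0 and 1 with probability 1/2 each (an illustrative choice, as are the function names):

```python
from fractions import Fraction

# Hypothetical example: X takes the values 0 and 1 with probability 1/2 each.
pmf = {0: Fraction(1, 2), 1: Fraction(1, 2)}

def cdf_le(x):
    """F(x) = P(X <= x), the convention used in the article."""
    return sum((p for k, p in pmf.items() if k <= x), Fraction(0))

def cdf_lt(x):
    """G(x) = P(X < x), the convention used in some textbooks."""
    return sum((p for k, p in pmf.items() if k < x), Fraction(0))

# Both functions are constant between atoms, so evaluating at b - eps and b + eps
# for any small eps gives the one-sided limits at b exactly.
eps = Fraction(1, 10**9)
b = 1  # a jump point of the distribution

print("P(X <= x) just left, at, just right of b:", cdf_le(b - eps), cdf_le(b), cdf_le(b + eps))
print("P(X <  x) just left, at, just right of b:", cdf_lt(b - eps), cdf_lt(b), cdf_lt(b + eps))
```

At the jump, P(X <= x) agrees with its limit from the right, while P(X < x) agrees with its limit from the left; the two functions differ only at the atoms of the distribution.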

Not that redundant?

The passage that I deleted but which was restored said

Point probability
The "point probability" that X is exactly b can be found as
\operatorname{P}(X=b) = F(b) - \lim_{x \to b^{-}} F(x).
This equals zero if F is continuous at x.

However, at the end of the section "Definition" it says

In the case of a random variable X which has distribution having a discrete component at a value x_0,
 \operatorname{P}(X=x_0) =F(x_0)-F(x_0-) ,
where F(x_0-) denotes the limit from the left of F at x_0: i.e. lim F(y) as y increases towards x_0.

What I deleted looks identical to that, except that it doesn't include the sentence "This equals zero if F is continuous at x."

I propose that we re-delete it but put the last-mentioned sentence into the existing section.

Okay, no problem. Nijdam (talk) 07:08, 19 April 2013 (UTC)
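For reference, the identity quoted in both passages follows from the definition F(x) = P(X ≤ x) and continuity of probability along an increasing sequence of events. A short derivation (the sequence b - 1/n is just one convenient choice; the second equality in the middle line uses the fact that F is non-decreasing):

```latex
\begin{align*}
\{X < b\} &= \bigcup_{n \ge 1} \left\{X \le b - \tfrac{1}{n}\right\}, \\
\operatorname{P}(X < b) &= \lim_{n \to \infty} F\!\left(b - \tfrac{1}{n}\right) = \lim_{x \to b^{-}} F(x), \\
\operatorname{P}(X = b) &= \operatorname{P}(X \le b) - \operatorname{P}(X < b) = F(b) - \lim_{x \to b^{-}} F(x).
\end{align*}
```

If F is continuous at b, the left limit equals F(b) and the difference is zero.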

cdf notation

I went through and changed the notation F_X(x) to F(x) everywhere in the definition section to try to obtain notational consistency through the article, but the change was reverted by Nijdam with edit summary "Difference between cdf of X and just a cdf". But that conflicts with much notation in the article that uses F(x) for the cdf of X. In the Properties section:

the CDF of X will be discontinuous at the points x_i and constant in between:
F(x) = \operatorname{P}(X\leq x) = \sum_{x_i \leq x} \operatorname{P}(X = x_i) = \sum_{x_i \leq x} p(x_i).
If the CDF F of X is continuous, then X is a continuous random variable; if furthermore F is absolutely continuous, then there exists a Lebesgue-integrable function f(x) such that
F(b)-F(a) = \operatorname{P}(a< X\leq b) = \int_a^b f(x)\,dx
for all real numbers a and b. The function f is equal to the derivative of F almost everywhere, and it is called the probability density function of the distribution of X.

In the Examples section:

As an example, suppose X is uniformly distributed on the unit interval [0, 1]. Then the CDF of X is given by
F(x) = \begin{cases}
0 &:\ x < 0\\
x &:\ 0 \le x < 1\\
1 &:\ 1 \le x.
\end{cases}

In the Derived functions section:

\bar F(x) = \operatorname{P}(X > x) = 1 - F(x).

In the multivariate case section:

for a pair of random variables X,Y, the joint CDF F is given by
F(x,y) = \operatorname{P}(X\leq x,Y\leq y),

So we need to establish consistency of notation -- either use F_X every time we mention a cdf "of X", or else never. Your thoughts? Duoduoduo (talk) 15:08, 17 May 2013 (UTC)

In the literature both F_X and F are used, the latter for ease of notation, and only if there is no confusion about the random variable. In the examples you mention there is not always an inconsistency. In the first one you're right, but if it reads: If the CDF F of X ..., it merely states F_X=F, where F is some specified function. Nijdam (talk) 06:11, 18 May 2013 (UTC)
But there's no confusion anywhere in the article regardless of which is used. So why not use the same one everywhere? Duoduoduo (talk) 12:30, 18 May 2013 (UTC)