Conditional probability distribution
Given two jointly distributed random variables X and Y, the conditional probability distribution of Y given X is the probability distribution of Y when X is known to take a particular value. If the conditional distribution of Y given X is a continuous distribution, then its probability density function is known as the conditional density function.
The properties of a conditional distribution, such as its moments, are often referred to by corresponding names such as the conditional mean and conditional variance.
Discrete distributions
For discrete random variables, the conditional probability mass function of Y given that X takes the value x can be written, using the definition of conditional probability, as:
    pY(y | X = x) = P(Y = y | X = x) = P(X = x, Y = y) / P(X = x).
Because this definition divides by P(X = x), it requires P(X = x) > 0.
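As an illustration, the definition can be applied to a small joint table. The following is a minimal sketch: the joint probabilities and the helper names `marginal_x` and `conditional_y_given_x` are assumptions made for this example, not part of any library.

```python
from fractions import Fraction

# Hypothetical joint pmf P(X = x, Y = y), keyed by (x, y); values sum to 1.
joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
    (1, 0): Fraction(2, 8), (1, 1): Fraction(2, 8),
}

def marginal_x(joint, x):
    """P(X = x), obtained by summing the joint pmf over all y."""
    return sum(p for (xi, _), p in joint.items() if xi == x)

def conditional_y_given_x(joint, x):
    """Return {y: P(Y = y | X = x)}; only defined when P(X = x) > 0."""
    px = marginal_x(joint, x)
    if px == 0:
        raise ValueError("P(X = x) must be positive")
    return {y: p / px for (xi, y), p in joint.items() if xi == x}

cond = conditional_y_given_x(joint, 0)
print(cond)
```

Exact fractions are used here so that the resulting probabilities can be compared without rounding error. Conditioning on a value of x that never occurs raises an error, which mirrors the requirement P(X = x) > 0.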
The relation with the probability distribution of X given Y is:
    P(Y = y | X = x) P(X = x) = P(X = x, Y = y) = P(X = x | Y = y) P(Y = y).
Continuous distributions
Similarly, for continuous random variables, the conditional probability density function of Y given that X takes the value x can be written as
    fY(y | X = x) = fX,Y(x, y) / fX(x),
where fX,Y(x, y) gives the joint density of X and Y, while fX(x) gives the marginal density for X. Also in this case it is necessary that fX(x) > 0.
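A short worked example may make the continuous case concrete. The joint density below is a hypothetical choice made only for illustration; dividing it by the marginal density of X gives the conditional density of Y, which integrates to 1 over y.

```latex
\begin{align*}
% Hypothetical joint density on the unit square (an assumption for illustration)
f_{X,Y}(x, y) &= x + y, \qquad 0 \le x \le 1,\ 0 \le y \le 1 \\
% Marginal density of X, strictly positive on [0, 1]
f_X(x) &= \int_0^1 (x + y)\,dy = x + \tfrac{1}{2} \\
% Conditional density of Y given X = x
f_Y(y \mid X = x) &= \frac{f_{X,Y}(x, y)}{f_X(x)} = \frac{x + y}{x + \tfrac{1}{2}} \\
% Normalization check: the conditional density integrates to 1 over y
\int_0^1 f_Y(y \mid X = x)\,dy &= \frac{x + \tfrac{1}{2}}{x + \tfrac{1}{2}} = 1
\end{align*}
```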
The relation with the probability distribution of X given Y is given by:
    fY(y | X = x) fX(x) = fX,Y(x, y) = fX(x | Y = y) fY(y).
The concept of the conditional distribution of a continuous random variable is not as intuitive as it might seem: Borel's paradox shows that conditional probability density functions need not be invariant under coordinate transformations.
Relation to independence
Random variables X, Y are independent if and only if the conditional distribution of Y given X is equal to the unconditional distribution of Y. For discrete random variables: P(Y = y | X = x) = P(Y = y) for all relevant x and y. For continuous random variables having a joint density: fY(y | X=x) = fY(y) for all relevant x and y.
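For discrete variables, the criterion can be checked directly on a joint table. This is a sketch under stated assumptions: the function name `is_independent` and both example tables are hypothetical, chosen only to show one independent and one dependent case.

```python
from fractions import Fraction
from itertools import product

def is_independent(joint):
    """Check P(Y = y | X = x) == P(Y = y) for every x with P(X = x) > 0."""
    xs = {x for x, _ in joint}
    ys = {y for _, y in joint}
    # Marginal pmfs of X and Y, obtained by summing out the other variable.
    px = {x: sum(joint.get((x, y), 0) for y in ys) for x in xs}
    py = {y: sum(joint.get((x, y), 0) for x in xs) for y in ys}
    return all(
        joint.get((x, y), 0) / px[x] == py[y]
        for x, y in product(xs, ys)
        if px[x] > 0
    )

# Hypothetical table built as a product of marginals (1/4, 3/4) and (1/3, 2/3).
independent = {
    (0, 0): Fraction(1, 12), (0, 1): Fraction(2, 12),
    (1, 0): Fraction(3, 12), (1, 1): Fraction(6, 12),
}
# Hypothetical table in which the distribution of Y changes with x.
dependent = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
    (1, 0): Fraction(2, 8), (1, 1): Fraction(2, 8),
}

print(is_independent(independent), is_independent(dependent))
```

With floating-point probabilities the exact equality test would need to be replaced by a comparison within a tolerance.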
Properties
Seen as a function of y for given x, P(Y = y | X = x) is a probability mass function, so the sum over all y (or the integral over y, if it is a conditional probability density) is 1. Seen as a function of x for given y, it is a likelihood function, so the sum over all x need not be 1.
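The asymmetry between the two sums can be seen numerically. The sketch below uses a hypothetical joint table and a hypothetical helper `cond_prob`; neither comes from the article.

```python
from fractions import Fraction

# Hypothetical joint pmf P(X = x, Y = y), keyed by (x, y).
joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
    (1, 0): Fraction(2, 8), (1, 1): Fraction(2, 8),
}

def cond_prob(joint, y, x):
    """P(Y = y | X = x), computed as joint probability over marginal of X."""
    px = sum(p for (xi, _), p in joint.items() if xi == x)
    return joint[(x, y)] / px

# Fix x = 0 and vary y: a probability mass function in y.
sum_over_y = sum(cond_prob(joint, y, 0) for y in (0, 1))
# Fix y = 0 and vary x: a likelihood function in x.
sum_over_x = sum(cond_prob(joint, 0, x) for x in (0, 1))

print(sum_over_y, sum_over_x)
```

For this table the sum over y is exactly 1, while the sum over x is 1/4 + 1/2 = 3/4.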



