
Talk:Independence (probability theory)

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by God made the integers (talk | contribs) at 20:18, 14 January 2017 (→‎Pairwise vs mutual example: new section). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

WikiProject Statistics (C-class, High-importance). This article is within the scope of WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks. This article has been rated as C-class on Wikipedia's content assessment scale and as High-importance on the importance scale.

WikiProject Mathematics (C-class, High-priority). This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks. This article has been rated as C-class on Wikipedia's content assessment scale and as High-priority on the project's priority scale.

What does this notation mean?

In the definition of independent rv's the notation [X ≤ a] and [Y ≤ b] is used. What does that mean? I don't think I've seen it. - Taxman 14:18, Apr 28, 2005 (UTC)

It refers to the event of the random variable X taking on a value less than a. --MarkSweep 16:39, 28 Apr 2005 (UTC)
I am puzzled. If you didn't understand this notation, what did you think was the definition of independence of random variables? Michael Hardy 19:20, 28 Apr 2005 (UTC)
Well actually that meant I didn't understand it. I had learned it in the past and was familiar with the general concept only. I think we always used P(X ≤ a) for consistency, but I could be wrong. In any case the switch from the P notation in the section before to this notation could use some explanation for the village idiots like me. :) I'll see what I can do without adding clutter. - Taxman 20:26, Apr 28, 2005 (UTC)
It's not a switch in notation at all: it's not two different notations for the same thing; it's different notations for different things. The notation [X ≤ a] does not mean the probability that X ≤ a; it just means the event that X ≤ a. The probability that X ≤ a would still be denoted P(X ≤ a) or P[X ≤ a] or Pr(X ≤ a) or the like. Michael Hardy 20:51, 28 Apr 2005 (UTC)

I just wrote an edit summary that says:

No wonder User:taxman was confused: the statement here was simply incorrect. I've fixed it by changing the word "probability" to "event". Michael Hardy 20:58, 28 Apr 2005 (UTC)

Then I looked at the edit history and saw that Taxman was the one who inserted the incorrect language. So that really doesn't explain it. Michael Hardy 20:56, 28 Apr 2005 (UTC)

... and it does not make sense to speak of two probabilities being independent of each other. Events can be independent of each other, and random variables can be independent of each other, but probabilities cannot. Michael Hardy 20:58, 28 Apr 2005 (UTC)

On the role of the probability measure in the independence of events

The definition of independence of events is introduced into probability theory only for a fixed probability measure P. This means that two events can be independent with respect to one probability measure and not independent with respect to another. We illustrate this possibility with a simple example based on the Bernoulli scheme, i.e. repeated independent trials, each of which has two outcomes ('success' and 'failure'). Assume that P(success) = p and P(failure) = q, where q = 1 − p. Carry out three independent trials and consider the events

A = {no more than one success},   B = {all three outcomes identical}.

It is obvious that

P(A) = q³ + 3pq²,   P(B) = p³ + q³,   P(A ∩ B) = q³.

Hence it is easy to see that the equality

P(A ∩ B) = P(A) P(B)   (1)

holds in the trivial cases p = 0 and p = 1, and also at p = q = 1/2, i.e. in the symmetric case.

Attention, the conclusion: the "same" events A and B are independent only at p = 0, 1/2, 1; for other values of the probability of success they are not independent.
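For what it's worth, the computation above can be verified by brute force over the eight outcomes of three trials. A minimal Python sketch (the function name and the sample values of p are my own choices):

```python
from itertools import product

def check_independence(p):
    """For three Bernoulli(p) trials, compare P(A and B) with P(A)P(B),
    where A = 'no more than one success' and B = 'all outcomes identical'."""
    q = 1 - p
    pa = pb = pab = 0.0
    for outcome in product([0, 1], repeat=3):  # 1 = success, 0 = failure
        k = sum(outcome)                       # number of successes
        weight = p ** k * q ** (3 - k)
        a = k <= 1                             # no more than one success
        b = len(set(outcome)) == 1             # all outcomes identical
        pa += weight * a
        pb += weight * b
        pab += weight * (a and b)
    return pa, pb, pab

for p in [0.5, 0.3]:
    pa, pb, pab = check_independence(p)
    print(p, abs(pab - pa * pb) < 1e-12)  # 0.5 -> True, 0.3 -> False
```

At p = 1/2 the product P(A)P(B) = (1/2)(1/4) matches P(A ∩ B) = 1/8 exactly; at p = 0.3 it does not.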


Examples of the delusion

Misunderstanding of this fact generates the delusion that the probability-theoretic concept of independence of events contains something more than the trivial relation (1). One need not go far for examples of this delusion. It is enough to visit the page Statistical independence and look at the preamble:

"In probability theory, to say that two events are independent intuitively means that knowing whether one of them occurs makes it neither more probable nor less probable that the other occurs. For example, the event of getting a "1" when a die is thrown and the event of getting a "1" the second time it is thrown are independent. Similarly, when we assert that two random variables are independent, we intuitively mean that knowing something about the value of one of them does not yield any information about the value of the other. For example, the number appearing on the upward face of a die the first time it is thrown and that appearing the second time are independent."

From the point of view of common sense this all seems correct, but not from the point of view of probability theory, where nothing is understood by independence of two events except relation (1). Probably this is the original of the Russian version. The original differs favourably from the Russian calque in one detail: "intuitively". But this pleasant detail cannot save the position: in probability theory nothing more than relation (1) is put into the concept of independence of events, even at an intuitive level.

The same story holds for almost all the other-language mirrors of this English page. The Russian page ru:Независимость (теория вероятностей) is a calque of the English original. Here is the preamble of the Russian page (translated):

"In probability theory, random events are called independent if knowing whether one of them occurred carries no additional information about the other. Similarly, random variables are called independent if knowing the value taken by one of them gives no additional information about the possible values of the other."

Of the eight "mirrors" in different languages, only two have avoided this delusion: the Italian and the Polish, and only because they wrote very short articles containing just relation (1), deftly and successfully avoiding commentary.

References

Stoyanov, Jordan (1999). Counterexamples in Probability Theory. Publishing house "Factorial". 288 pp. ISBN 5-88688-041-0 (pp. 29–30).

- Helgus 23:58, 25 May 2006 (UTC)[reply]

The only thing that confuses me in the text of the definition is one detail – the use of the terms "knowledge", "knowing" and "information". These terms do not belong to the concepts of probability theory; they are not defined in this theory. In my humble opinion independence of events is better defined without these terms, like this: "two events are independent intuitively means that the occurrence of one of them [say A] makes it neither more probable nor less probable that the other [say B] occurs". And independence of random variables is better defined like this: "two random variables are independent intuitively means that the value of one of them [say ξ] makes no value of the other [say ξ′] either more probable or less probable". - Helgus 23:49, 26 May 2006 (UTC)[reply]

Correction?

Reading the article I feel that the paragraphs for the continuous and discrete cases are mixed up in the section "Conditionally independent random variables". The formula with inequalities pertains to the continuous case, while the formula with equality pertains to the discrete case. 134.157.16.38 08:03, 19 June 2007 (UTC) Francis Dalaudier.[reply]

independent vs. uncorrelated

We had better explain the relationship between these two concepts here; I often find them confusing. Jackzhp (talk) 14:17, 17 April 2008 (UTC)[reply]

I remember I wrote something like the following on wiki, but I can not find them:

  • In some cases no correlation implies independence, but in general no correlation does not imply independence; independence, however, always implies no correlation.
  • The cases are the binomial (Bernoulli) and the Gaussian.

Jackzhp (talk) 16:17, 16 April 2010 (UTC)[reply]

Yes, for two two-valued random variables no correlation implies independence. (For three such variables it implies pairwise independence). Also for jointly Gaussian. (But "jointly" is essential.) --Boris Tsirelson (talk) 18:43, 2 April 2011 (UTC)[reply]
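The standard counterexample for "uncorrelated but dependent" can be made concrete in a few lines. This is the textbook construction X uniform on {−1, 0, 1} with Y = X², not something taken from this thread:

```python
from fractions import Fraction

# X uniform on {-1, 0, 1}, Y = X**2: uncorrelated but clearly dependent.
support = [-1, 0, 1]
p = Fraction(1, 3)

e_x  = sum(p * x for x in support)          # E[X] = 0
e_y  = sum(p * x ** 2 for x in support)     # E[Y] = 2/3
e_xy = sum(p * x ** 3 for x in support)     # E[XY] = E[X**3] = 0
cov = e_xy - e_x * e_y
print(cov == 0)                             # True: Cov(X, Y) = 0

# Yet {X = 0} and {Y = 0} are the same event, so P(X=0, Y=0) = 1/3 != 1/9:
p_x0 = p_y0 = p_joint = Fraction(1, 3)
print(p_joint == p_x0 * p_y0)               # False: not independent
```

Exact rational arithmetic via Fraction avoids any floating-point quibbles in the comparison.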

The recently added random.org link seems not very pertinent to independence. The link is already present in the randomness article, where it really belongs. I think it may be deleted from this article (on reflection, it may even create confusion about what independence really is). ptrf (talk) 09:19, 27 February 2009 (UTC)[reply]

Independent σ-algebras

Near the end of the Independent σ-algebras section there was the following text:

"Using this definition, it is easy to show that if X and Y are random variables and Y is Pr-almost surely constant, then X and Y are independent, since the σ-algebra generated by an almost surely constant random variable is the trivial σ-algebra {∅, Ω}."

That proof is correct if Y is always constant, but not in the stated case of Y being Pr-almost surely constant. The difficulty is that the σ-algebra generated by a random variable does not exclude probability-zero events (and cannot, since it is defined without reference to the probability measure).

I did a quick fix, but my fix could use some polishing.

--67.87.221.55 (talk) 16:11, 2 April 2011 (UTC)[reply]

Thank you for the fix. --Boris Tsirelson (talk) 18:37, 2 April 2011 (UTC)[reply]

X ≤ a

The article speaks of the probability of event X ≤ a. But wouldn't this mean that P(X≤a) = a/∞, as there isn't stated a maximum value b? Also, this does not take care of the fact that what is less than a, might also be a negative number (since 0 is not denoted as a minimum value), stretching to infinity that way too. In that case, it seems to me that P(X≤a) = (∞+a)/∞. This is confusing, and impossible, as 1) it is a fraction with two infinities (not sure if that's the correct plural), 2) it's larger than one and 3) what is infinity plus a? I might be very wrong now, but this is what it looks like to me. --80.212.16.200 (talk) 15:24, 26 December 2011 (UTC)[reply]

Yes, you are very wrong now;  :-) see Probability distribution. Boris Tsirelson (talk) 16:23, 26 December 2011 (UTC)[reply]

Definition of random variable

Before changing I would like to get someone's opinion on this. I think the sentence "Two random variables X and Y are independent iff the π-system generated by them are independent" is lacking in an important way: it silently assumes that X and Y are defined on the same probability space. I suggest simply adding this ("Two random variables X and Y defined on the same probability space are independent ..."). Not only would this make mathematical sense, it could also help (in a distant way) with understanding that one cannot speak about dependence or independence of, say, the numbers shown by two rolled dice without constructing a joint probability space. — Preceding unsigned comment added by 193.219.42.50 (talk) 15:41, 4 April 2013 (UTC)[reply]

First, yes, they are supposed to be defined on the same probability space (which is usually assumed by default in probability theory). However, why start emphasizing it only in that place? The same already holds for two events.
Second, the phrase you quote is quite bad for another reason. There is a notion of the "σ-algebra generated by a random variable", but "π-system generated by a random variable" is a bad neologism; the system of events {X ≤ a} is just one out of many π-systems.
Boris Tsirelson (talk) 18:39, 4 April 2013 (UTC)[reply]
Good point. Should I take your answer as a vote for adding this assumption for events as well as random variables, or do you think that neither is necessary? I also agree with your remark on σ-algebra, we can change it unless someone wants to defend that choice. — Preceding unsigned comment added by 193.219.42.50 (talk) 08:37, 5 April 2013 (UTC)[reply]
Yes, early in the article note that everything happens on the same prob space (unless indicated otherwise), I think so. Boris Tsirelson (talk) 09:15, 5 April 2013 (UTC)[reply]

if and only if

Any thoughts on whether the use of "if and only if" is a bit misleading in the first definition section under "For events: Two events"?

Seems like iff may confuse people when they need to make the jump to more than two events. The distinct pairs requirement is elaborated in the "More than two events" section.

Is it necessary for all uses of "iff" to link to its page? It doesn't really affect much of anything, just a question. Thehockeydude44 (talk) 23:31, 15 September 2013 (UTC)[reply]

Crazy terminology

"independent (alternatively statistically independent, stochastically independent, marginally independent or absolutely independent" — really? What could all that mean? What for? Does it really happen "In probability theory", or maybe in other disciplines that use probabilistic notions in their different frames? "Marginally independent" in constrast to what? Can they be "non-marginally independent"? "Non-absolutely independent"? Boris Tsirelson (talk) 11:46, 9 September 2013 (UTC)[reply]

I agree. Stochastically independent and statistically independent are synonyms, not different categories; "statistical" may be used in some domains to emphasise the statistical nature. But "absolutely" and "marginally" are uncommon and should not be at the top of the page, perhaps somewhere else. Limit-theorem (talk) 00:31, 10 September 2013 (UTC)[reply]
Actually marginal independence has a different (related) meaning, see here. The redundant word "absolutely" is sometimes added to emphasise the absence of the word "conditionally"; I don't know if there is another meaning. I agree these shouldn't be in the lead, though mentioning them somewhere later would be ok. McKay (talk) 05:33, 10 September 2013 (UTC)[reply]
To emphasise the absence of the word "conditionally" I'd say "(unconditionally) independent". Boris Tsirelson (talk) 05:42, 10 September 2013 (UTC)[reply]

Maybe a section dedicated to implications of independence?

Independence of random variables and events has several basic consequences, and I was thinking this article could benefit from stating a few of them. Obviously there are more than the following.

One example: if events are mutually independent, then their complements are mutually independent. By the definition of mutual independence, any pair is also independent. Consider P(A₁ᶜ ∩ A₂).

Using the addition rule,

P(A₁ᶜ ∩ A₂) = P(A₂) − P(A₁ ∩ A₂).

Using the property of independence,

P(A₁ᶜ ∩ A₂) = P(A₂) − P(A₁)P(A₂).

Factoring and taking the complement of the probability on the right side,

P(A₁ᶜ ∩ A₂) = (1 − P(A₁))P(A₂) = P(A₁ᶜ)P(A₂).

So A₁ᶜ and A₂ are independent. We can use this in a recursive fashion to build the next solution, applying the same argument to the pair (A₂, A₁ᶜ), i.e. replacing A₂ by A₂ᶜ; the original proof remains valid because of mutual independence. This gives

P(A₁ᶜ ∩ A₂ᶜ) = P(A₁ᶜ)P(A₂ᶜ).

The above is easily extended to an inductive proof that the complements A₁ᶜ, …, Aₙᶜ of finitely many mutually independent events are mutually independent.

As a corollary of the above property we can derive an "OR" instead of "AND" for mutually independent events. Start with

P(A₁ᶜ ∩ ⋯ ∩ Aₙᶜ) = P(A₁ᶜ) ⋯ P(Aₙᶜ).

Replacing all Aᵢ with their complements and taking complements on both sides,

P(A₁ ∪ ⋯ ∪ Aₙ) = 1 − P(A₁ᶜ ∩ ⋯ ∩ Aₙᶜ) = 1 − (1 − P(A₁)) ⋯ (1 − P(Aₙ)).
Even though these properties I derive here are basic, I think it would be best to find an academic source to cite. A source would likely have many other properties. Otherwise this runs the risk of being declared original research.Mouse7mouse9 22:39, 1 May 2015 (UTC) — Preceding unsigned comment added by Mouse7mouse9 (talkcontribs)
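The "OR" identity for mutually independent events can be checked by enumerating the product space. A brute-force sketch, with marginal probabilities of my own choosing:

```python
from itertools import product
from math import prod, isclose

probs = [0.2, 0.5, 0.7]  # arbitrary marginals for independent events

# Enumerate which events occur; by independence, each occurrence pattern
# has probability equal to a product of marginals.
p_union = 0.0
for occurs in product([False, True], repeat=len(probs)):
    weight = prod(p if o else 1 - p for p, o in zip(probs, occurs))
    if any(occurs):
        p_union += weight

# Compare against P(A1 or ... or An) = 1 - prod(1 - P(Ai)).
print(isclose(p_union, 1 - prod(1 - p for p in probs)))  # True
```

Here both sides come out to 1 − (0.8)(0.5)(0.3) = 0.88.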

Yes... All that can be compressed to the joint distribution of (two or more) indicators (of the independent events). The indicators are independent random variables with two values 0, 1 ("binary"); their joint distribution is the product of the marginals; each marginal is just (p, 1−p) (but we have two or more numbers p). For n events we have 2^n points, with their (product) probabilities, and 2^(2^n) events; some of these are more notable than others, of course. Boris Tsirelson (talk) 09:10, 2 May 2015 (UTC)[reply]

Temporal independence

Is there a term to describe that some event's probability is independent of time? As an example, the probability of a radioactive atom decaying does not depend on how long it has been waiting. In the case of discrete events, such as dice throws or coin tosses, you say that the events are independent. Seems to me it needs a different term for the continuous case. Gah4 (talk) 21:42, 12 October 2016 (UTC)[reply]

"probability independent of time" may refer to "stationary process" or "memorylessness". Boris Tsirelson (talk) 06:46, 13 October 2016 (UTC)[reply]
That is interesting. Quantum mechanics has the stationary state, which is similar to your stationary process, and which I could have thought of. But the decaying atom is not in a stationary state, because it makes one transition, though no more. (In theory the transition is reversible, but the probability is pretty much zero.) I think your memorylessness is what I was asking about: some event happens with probability independent of how long you have waited. Thanks! Gah4 (talk) 07:59, 13 October 2016 (UTC)[reply]
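Memorylessness is exactly the property of the exponential distribution that governs radioactive decay: P(T > s + t | T > s) = P(T > t). A quick closed-form check (the rate and time values are arbitrary choices of mine):

```python
from math import exp, isclose

rate = 0.7  # decay constant; the value is arbitrary for this check

def surv(t):
    """Survival function P(T > t) for an exponential waiting time T."""
    return exp(-rate * t)

s, t = 1.3, 2.4
conditional = surv(s + t) / surv(s)   # P(T > s + t | T > s)
print(isclose(conditional, surv(t)))  # True: waiting s changes nothing
```

The algebra behind it is simply exp(−λ(s+t)) / exp(−λs) = exp(−λt).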

Pairwise vs mutual example

The XOR of two unbiased independent random bits is a third unbiased random bit, independent of either but not both of the other two. I'm not saying the (6,9,9,6,4,1,1,4) example is wrong, but it doesn't seem to add anything. --God made the integers (talk) 20:18, 14 January 2017 (UTC)[reply]
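For the record, the XOR claim is easy to verify by enumeration; the small helper below is my own sketch:

```python
from itertools import product

# Two fair independent bits X, Y and their XOR Z = X ^ Y.
outcomes = [(x, y, x ^ y) for x, y in product([0, 1], repeat=2)]
p_each = 1 / len(outcomes)  # each (x, y) pair has probability 1/4

def prob(event):
    """Probability that a given predicate holds for (X, Y, Z)."""
    return sum(p_each for o in outcomes if event(o))

# Pairwise independence holds for every pair of coordinates:
for i, j in [(0, 1), (0, 2), (1, 2)]:
    for a, b in product([0, 1], repeat=2):
        assert prob(lambda o: o[i] == a and o[j] == b) == \
               prob(lambda o: o[i] == a) * prob(lambda o: o[j] == b)

# But not mutual independence: P(X=0, Y=0, Z=0) is 1/4, not 1/8.
print(prob(lambda o: o == (0, 0, 0)))  # 0.25
```

Each pair of the three bits factors as a product of marginals, yet the triple does not, which is the pairwise-vs-mutual distinction in its smallest form.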