Talk:Asymptotic equipartition property

WikiProject Statistics (Rated B-class, Mid-importance)

This article is within the scope of the WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion.

WikiProject Mathematics (Rated B-class, Mid-importance; Field: Applied mathematics)

This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of Mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks. This article is also assessed within the mathematics field Probability and statistics.

Not a Paper

Shouldn't Wikipedia be an encyclopedia, not a draft for a publication in a mathematics journal? —Preceding unsigned comment added by 131.112.199.214 (talk) 07:36, 23 October 2007 (UTC)

Textual Characterization

In Cover's seminal work on IT, he uses the following textual characterization of the property (although he states it as a theorem): "Almost all events are almost equally surprising," which I found to be very helpful in interpreting the nature of the property. Could this characterization find a place in the article? If nobody objects, I will put it in, since it could provide a non-expert way of evaluating the content of the theorem and the nature of the typical set for a given source distribution. Eccomi 20:50, 30 June 2010 (UTC) —Preceding unsigned comment added by Eccomi (talkcontribs)

Well, you'd have to explain what surprising means. For me, it's almost as if it says "the whole damn thing is dominated by the mean value, which happens to be the completely flat tangent space to the center of the central limit theorem, and what's more, this flat space blows up to occupy the entire volume of the space as N \to \infty. That is, all the surprising bits don't matter at all, since they're vanishingly unlikely." But that's my layman's intuition. (The "surprising bits vanish" part is Cramér's large deviation theorem from large deviations theory.) This "blow up to fill all of space" seems to be a common/recurring phenomenon for infinite-dimensional spaces, and so I suspect that it's a so-called "deep result", a kind-of fundamental property of infinite-dimensional Cartesian products, i.e. it holds in some general kind of way, and not just for measure spaces; it is due to the product topology for infinite products. In particular, note that the Cantor set is the covering space for many/all of these situations; this follows from Maharam's theorem. (Why is this? Because the Cantor set is more or less identical/isomorphic to the product topology on the infinite Cartesian product, in a certain vague sense. I really should work out the details. Now, the classical construction of the Cantor set gives a measure of zero, but in fact, you can construct it with any measure at all. I'm wildly guessing that this measure should be interpreted as being exactly equal to the entropy; that's really what that limit is saying. Hmm...) But this 'intuitive explanation' may just be my over-active imagination. Anyway, as you can see, the proof, even for the simple cases, is pretty complicated, so exactly what the general, intuitive case would be is not exactly clear. linas (talk) 17:32, 19 July 2012 (UTC)
p.s. Terence Tao describes the "tensor power trick" as a generic way of taking tensor powers to wipe out unwanted bits in proofs of inequalities. It's essentially a general statement of what's happening here... linas (talk) 05:11, 20 July 2012 (UTC)
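
For anyone who wants to see the "almost equally surprising" reading numerically, here is a minimal sketch. It assumes an i.i.d. Bernoulli source with parameter q = 0.3 (an illustrative choice, not taken from the article), and the function names are hypothetical. It draws sequences of length n and compares the per-symbol surprisal -(1/n) log2 p(x_1, ..., x_n) with the entropy H.

```python
import math
import random

def bernoulli_entropy(q):
    """Shannon entropy, in bits, of a Bernoulli(q) source with 0 < q < 1."""
    return -(q * math.log2(q) + (1 - q) * math.log2(1 - q))

def surprisal_rate(sample, q):
    """Per-symbol surprisal -(1/n) log2 p(x_1..x_n) of an i.i.d. Bernoulli(q) sample."""
    ones = sum(sample)
    zeros = len(sample) - ones
    return -(ones * math.log2(q) + zeros * math.log2(1 - q)) / len(sample)

def simulate(q=0.3, n=10_000, trials=200, seed=0):
    """Draw `trials` independent sequences of length n and return their surprisal rates."""
    rng = random.Random(seed)
    rates = []
    for _ in range(trials):
        sample = [1 if rng.random() < q else 0 for _ in range(n)]
        rates.append(surprisal_rate(sample, q))
    return rates

if __name__ == "__main__":
    q = 0.3  # illustrative parameter, an assumption of this sketch
    h = bernoulli_entropy(q)
    for n in (10, 100, 10_000):
        rates = simulate(q=q, n=n)
        worst = max(abs(r - h) for r in rates)
        print(f"n={n:>6}  H={h:.4f}  max |rate - H| over trials = {worst:.4f}")
```

As n grows, the largest deviation from H over the sampled sequences should shrink, which is the sense in which nearly every observed sequence carries about the same surprise per symbol. This only illustrates the i.i.d. case; it says nothing about the stationary ergodic case.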

Hypothesis/definitions not stated properly

  • Process limited to duration \{1, \cdots, n\}??
  • p(X_1^n)??

Ideas are thrown out without any organization, motivation, or explanation... André Caldas (talk) 14:06, 4 January 2011 (UTC)

This notation is commonly used, and is implicitly understood to mean that n \to \infty. I added a note to state this explicitly. As to 'lack of organization, motivation, explanation': please be more specific. The article seems, to me, to be clear, direct, and forthright; it defines all terms, and proceeds from the simplest case to the more complicated cases. Yes, I suppose additional introductory/intuitive remarks could be added, as this property/theorem is not very intuitive at all (depending on your background)... linas (talk) 17:50, 19 July 2012 (UTC)

Indeed p(X_1^n) does not make sense here. p is a probability measure, but surely X_1^n is not an event, just part of the stochastic process. Compare, for example: Let Y be an exponentially distributed random variable. What is p(Y)? — Preceding unsigned comment added by 71.194.136.165 (talk) 11:49, 24 September 2013 (UTC)
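
A sketch of the usual textbook convention, offered on the assumption that the article follows it (the article text does not confirm this): p is treated as a function (the joint probability mass function, or a density in the continuous case), and p(X_1^n) means that function evaluated at the random sample. The result is a random variable, not the probability of an event. For the exponential example above, with an assumed rate parameter \lambda and density f:

```latex
\begin{align*}
f(y) &= \lambda e^{-\lambda y} \quad (y \ge 0), \\
f(Y) &= \lambda e^{-\lambda Y}, \\
-\log f(Y) &= \lambda Y - \log \lambda, \\
\mathbb{E}\left[-\log f(Y)\right] &= \lambda \, \mathbb{E}[Y] - \log \lambda = 1 - \log \lambda .
\end{align*}
```

The last line is the differential entropy of Y in nats, so under this reading "p(Y)" is shorthand for the density evaluated at the random point Y.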

AEP for discrete-time i.i.d. sources

Can the author of this part please explain the relationship between the two probability measures p and \Pr?
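
One possible reading, stated as an assumption about the article's intent rather than as what its author meant: p is the probability mass function of the source symbols (a deterministic function on the alphabet), while \Pr is the probability measure on the underlying sample space on which the X_i are defined. Under that reading the i.i.d. statement would look like this:

```latex
% Assumed reading: p is the pmf of one source symbol,
% \Pr is the probability measure on the underlying sample space.
\begin{align*}
\Pr\left[X_1 = x_1, \ldots, X_n = x_n\right] &= p(x_1, \ldots, x_n) = \prod_{i=1}^{n} p(x_i), \\
H(X) &= -\sum_{x} p(x) \log p(x), \\
\lim_{n \to \infty} \Pr\left[\, \left| -\frac{1}{n} \log p(X_1, \ldots, X_n) - H(X) \right| > \varepsilon \right] &= 0
\quad \text{for every } \varepsilon > 0 .
\end{align*}
```

Here p appears inside the brackets as a function applied to the random sample, and \Pr measures how likely the resulting deviation is.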

AEP for discrete-time finite-valued stationary ergodic sources

Here, some x is defined as follows:

  • Let x denote some measurable set x = X(A) for some A \in B

Please explain the notation X(A). What exactly is this set, and why is it measurable? — Preceding unsigned comment added by 71.194.136.165 (talk) 11:57, 24 September 2013 (UTC)

Category theory

It should be noted that the quantity Gromov defines is not a distance on the collection of all finite probability spaces. Nor is asymptotic equivalence an equivalence relation for arbitrary sequences of finite probability spaces. It might be an equivalence relation for sequences of powers, but there is no proof in Gromov. If it is, it seems to be a non-trivial statement.

Besides, there is an obvious typo in Gromov: the denominator of

\frac{\sup_{p\in P'}|\log p - \log \pi(p)|}{\log \min \left(|set(P)|,|set(Q)|\right)}

should be \log \min \left(|set(P')|,|set(Q')|\right). Otherwise one could make anything arbitrarily close to anything else by adding many zero-weighted points to both spaces, and this is not what one wants.

(Slava Matveev 194.95.184.62 (talk) 17:21, 27 March 2014 (UTC))