Talk:Information theory and measure theory

From Wikipedia, the free encyclopedia
Entropy as a Measure Theoretic measure

Where is this in Reza? I could not find it.

pp. 106-108 20:43, 25 August 2007 (UTC)

Possible misstatement

I am uncomfortable with the phrase 'we find that Shannon's "measure" of information content satisfies all the postulates and basic properties of a formal measure over sets.' This may not be quite correct, as it is a signed measure, as explained below in the article. How should it be better worded? -- 21:18, 20 June 2006 (UTC)

signed measure is still a measure, so if that's the only objection, it should be ok. on the other hand, the section title suggests entropy is a measure, that doesn't seem right. Mct mht 02:49, 21 June 2006 (UTC)
Not as defined in Measure (mathematics). There, a "measure" is clearly defined as non-negative. The trouble is that two rvs that are unconditionally independent can become conditionally dependent given a third rv. An example is given in the article. Maybe we should start calling it a signed measure right off the bat. A measure is normally assumed positive if not specified otherwise. -- 19:09, 21 June 2006 (UTC)
Measure (mathematics) does say signed measure is a measure, as it should. similarly, one can talk about complex measures or operator valued measures. but yeah, specifying that it is a signed measure is a good idea. Mct mht 19:44, 21 June 2006 (UTC)
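The point about conditional dependence can be checked numerically. The following is a minimal sketch, assuming the example meant is the usual one where X and Y are independent fair bits and Z = X XOR Y (the thread does not spell it out); the names `joint` and `entropy` are illustrative only.

```python
from itertools import product
from math import log2

# Assumed example: X, Y independent fair bits, Z = X XOR Y.
# Keys are outcomes (x, y, z); values are probabilities.
joint = {(x, y, x ^ y): 0.25 for x, y in product((0, 1), repeat=2)}

def entropy(indices):
    """Shannon entropy in bits of the marginal on the given coordinates."""
    marginal = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in indices)
        marginal[key] = marginal.get(key, 0.0) + p
    return -sum(p * log2(p) for p in marginal.values() if p > 0)

X, Y, Z = 0, 1, 2

# I(X;Y) = H(X) + H(Y) - H(X,Y)
mi_xy = entropy([X]) + entropy([Y]) - entropy([X, Y])

# I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)
cmi_xy_given_z = (entropy([X, Z]) + entropy([Y, Z])
                  - entropy([X, Y, Z]) - entropy([Z]))

# Value the set analogy assigns to the triple intersection.
triple = mi_xy - cmi_xy_given_z

print(mi_xy, cmi_xy_given_z, triple)
```

Under these assumptions X and Y share no information unconditionally but share one bit once Z is known, so the value assigned to the triple intersection is negative. That is why "signed measure" is the safer wording.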

in the same vein, i think the article confuses measure in the sense of information theory with measure in the sense of real analysis in a few places. Mct mht 03:02, 21 June 2006 (UTC)

There are two different senses of "measure" in the article. One is the abstract measure over sets which forms the analogy with joint entropy, conditional entropy, and mutual information. The other is the measures over which one integrates in the various formulas of information theory. Where is the confusion? -- 04:16, 21 June 2006 (UTC)

some language in the section is not clear:

If we associate the existence of sets $\tilde{X}$ and $\tilde{Y}$ with arbitrary discrete random variables X and Y, somehow representing the information borne by X and Y, respectively, such that: $\mu(\tilde{X} \cap \tilde{Y}) = 0$ whenever X and Y are independent, and...

Associate sets to random variables how? are they the supports of the random variables? what's the σ-algebra? what's meant by two random variables being independent? Mct mht 03:14, 21 June 2006 (UTC)

Just pretend that those sets exist. They are not the supports of the random variables. The sigma-algebra is the algebra generated by the operations of countable set union and intersection on those sets. See statistical independence. -- 03:34, 21 June 2006 (UTC)
I mean unconditionally independent. -- 03:49, 21 June 2006 (UTC)

so given a family of random variables, one associates, somehow, a family of sets. is the σ-algebra the one generated by this family (in the same way the open sets generate the Borel σ-algebra), or does one assume that, somehow, the family is already a σ-algebra? also the section seems to imply the Shannon entropy is a measure on the said σ-algebra, is that correct?
Yes. -- 04:32, 21 June 2006 (UTC)
then there seem to be at least two ways measure theory is applied in this context. first, in the sense that entropy is a measure on some, undefined, sets corresponding to random variables. second, one can talk about random variables on a fixed measure space, and define information theoretic objects in terms of the given measure. is that a fair statement? Mct mht 04:14, 21 June 2006 (UTC)
The σ-algebra is the one generated by the family of sets. (They are not already a σ-algebra.) And I believe that is a fairly reasonable statement if I understand it right. -- 04:29, 21 June 2006 (UTC)
There's still a lot of explaining to do; that's why the article has the expert tag. -- 04:32, 21 June 2006 (UTC)
thanks for the responses. Mct mht 04:35, 21 June 2006 (UTC)
We really do need expert help, though. -- 05:30, 21 June 2006 (UTC)

Kullback–Leibler divergence

Also, the Kullback–Leibler divergence should be explained here in a proper measure-theoretic framework. -- 21:27, 20 June 2006 (UTC)

Mis-statement of entropy.

I am deeply suspicious of the definitions given here; I think they're wrong. Normally, given a collection of measurable sets $\{X_i\}$, the entropy is defined by

$$S=-\sum_i \mu(X_i) \log \mu(X_i)$$

It's critical to have the logarithm in there, otherwise things like the partition function (statistical mechanics) etc. just fail to work correctly. See, for example, information entropy.

Also, signed measures are bizarre, and you should avoid using them if you cannot explain why they are needed in the theory.

Also, when this article said "H(X)", I assumed it meant "entropy", but on second reading, I see that it does not actually say what the symbols H(X) and I(X) are, or what they mean. Without definitions of these terms, the article is unreadable and open to misinterpretation, which is perhaps what I am doing here.

Finally, I'm unhappy that this article blatantly contradicts the article on random variable as to the definition of what a random variable is. This article states that a random variable is a "set", and that is most certainly not how I understand random variables.

linas 23:54, 22 June 2006 (UTC)
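For reference, the formula quoted in the comment above can be evaluated directly. This is only a sketch: it assumes the sets X_i form a finite partition of a probability space, so that the values μ(X_i) are non-negative and sum to 1, and it uses base-2 logarithms. The function name `partition_entropy` is made up for illustration.

```python
from math import log2

def partition_entropy(masses):
    """Return S = -sum_i mu(X_i) * log2(mu(X_i)) in bits.

    `masses` holds the measures mu(X_i) of the cells of a finite
    partition (an assumption of this sketch, not of the thread).
    """
    if any(m < 0 for m in masses):
        raise ValueError("measures must be non-negative")
    if abs(sum(masses) - 1.0) > 1e-9:
        raise ValueError("measures of the cells must sum to 1")
    # Cells of measure zero contribute nothing (0 * log 0 is taken as 0).
    return -sum(m * log2(m) for m in masses if m > 0)

uniform_four = partition_entropy([0.25, 0.25, 0.25, 0.25])
single_cell = partition_entropy([1.0])
print(uniform_four, single_cell)
```

A partition into four equally likely cells gives 2 bits, and the trivial partition gives 0.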

It's been over a year now and this article still needs a major rewrite to clear up the confusion. I need some help here. -- 19:37, 25 August 2007 (UTC)
Specifically, the "Other measures in information theory" section is actually more fundamental than the sections that come before it, so that could be one source of confusion. 20:40, 25 August 2007 (UTC)

Main ideas

Integration with respect to various measures is one of the main ideas of this article as it stands now. It ties together differential entropy, discrete entropy, and K–L divergence. The second main idea is from Reza pp. 106-108 where it is called a "set-theoretic" interpretation of mutual information, conditional entropy, joint entropy, and so forth. (But Reza very clearly discusses measure there in roughly the way discussed in that section of this article.) There might be yet more main ideas to discuss in this article as well as references to add and clarifications to make. 17:21, 7 September 2007 (UTC)

Misattribution of credit?

I strongly suggest that R. Yeung, and not Fazlollah M. Reza, be the primary reference cited. Specifically, I recommend the reference "A new outlook on Shannon's information measures", IEEE Transactions on Information Theory, 1991, vol. 37 (3), pp. 466-474.

First, in Yeung's paper pg 467, he says "The use of diagrams to represent the relation among Shannon's information measures has been suggested by Reza [2], Abramson [3], Dyckman [5], and Papoulis [15]." Yeung's paper constitutes a proof of (good) intuition by previous authors (including Reza).

Second, Reza pg 108, makes a serious misstatement that seems to indicate that the graphical interpretation of Shannon's measures was not completely thought out, and certainly not thought out in terms of a (signed) measure. "When two variables $X_k$ and $X_j$ are independent, their representative sets will be mutually exclusive." Yeung says on pg 469 "It was incorrectly pointed out in [2] [Reza] that when two random variables are independent, the corresponding set variables are disjoint."

On pg 470-471, Yeung cites a "classical example (see Gallager [6])" where three random variables, $X$, $Y$, $Z$, have vanishing mutual information between each pair, and so are pairwise independent, yet cannot be represented as a trio of non-intersecting set variables. This is one of several illustrations that Yeung has mastered this material.

Further, Yeung discusses 2 variables, 3 variables, and Markov chains; bounds on certain quantities for some cases; and the "non-intuitive" quantity I(X;Y;Z), all in a rigorous yet transparent style. -- Preceding unsigned comment added by Mohnjahoney (talk · contribs) 22:00, 18 June 2009 (UTC)

My apologies for not signing. Mohnjahoney (talk) 22:04, 18 June 2009 (UTC)
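A worked version of the three-variable example may help here. It assumes the classical example is the one with $X$ and $Y$ independent fair bits and $Z = X \oplus Y$ (the comment above does not give the construction), and it writes $\tilde{X}$, $\tilde{Y}$, $\tilde{Z}$ for the set variables and $\mu$ for the signed measure, which is notation chosen for this sketch.

```latex
\begin{align*}
\mu(\tilde{X} \cap \tilde{Y}) &= I(X;Y) = 0
  && \text{(likewise for the other two pairs)} \\
\mu(\tilde{X} \cap \tilde{Y} \setminus \tilde{Z}) &= I(X;Y \mid Z) = 1 \text{ bit} \\
\mu(\tilde{X} \cap \tilde{Y} \cap \tilde{Z})
  &= I(X;Y) - I(X;Y \mid Z) = 0 - 1 = -1 \text{ bit}
\end{align*}
```

Under these assumptions each pairwise intersection has total measure zero, but only because it splits into a piece of measure $+1$ and a piece of measure $-1$. The set variables therefore cannot be disjoint, and the measure has to be signed.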

measure versus content

The article claims Shannon entropy has the typical properties of a countably-additive measure. However, only the finite additivity property is ever touched upon. The difference between finite additivity (content) and countable additivity (measure) is decisive in measure theory: almost nothing works with contents. Some justification needs to be given for the word "measure", or the entire section should be deleted. -- (talk) 00:04, 13 May 2012 (UTC)
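For readers comparing the two notions, the standard definitions can be stated side by side. Here $\mu$ is a set function on a collection of sets closed under the relevant unions, and the symbols $A$, $B$, $A_i$ are generic and not taken from the article.

```latex
% Finite additivity (content): for disjoint A and B,
\mu(A \cup B) = \mu(A) + \mu(B)

% Countable additivity (measure): for pairwise disjoint A_1, A_2, \dots,
\mu\Bigl(\bigcup_{i=1}^{\infty} A_i\Bigr) = \sum_{i=1}^{\infty} \mu(A_i)
```

Countable additivity implies finite additivity, but not conversely. For a signed measure the series on the right is additionally required to converge absolutely.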