Sanov's theorem: Difference between revisions

From Wikipedia, the free encyclopedia
Revision as of 02:59, 25 April 2011

In [[information theory]], '''Sanov's theorem''' gives a bound on the probability of observing an atypical sequence of samples from a given probability distribution.

Let ''A'' be a set of probability distributions over an alphabet ''X'', and let ''q'' be an arbitrary distribution over ''X'' (where ''q'' may or may not be in ''A''). Suppose we draw ''n'' i.i.d. samples from ''q'', represented by the vector <math>x^n = x_1, x_2, \ldots, x_n</math>. Then

:<math>q^n\left(x^n : \hat{p}_{x^n} \in A\right) \le (n+1)^{|X|}\, 2^{-n D_{\mathrm{KL}}(p^*\|q)},</math>

where

* <math>q^n(x^n)</math> is shorthand for <math>q(x_1)q(x_2)\cdots q(x_n)</math> and <math>q^n(S)</math> is shorthand for <math>\sum_{x^n\in S}q^n(x^n)</math>,
* <math>\hat{p}_{x^n}</math> is the [[Empirical distribution function|empirical distribution]] of the sample <math>x^n</math>, and
* <math>p^*</math> is the [[information projection]] of ''q'' onto ''A''.
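
For concreteness, the information projection in the last item is the member of ''A'' closest to ''q'' in [[Kullback–Leibler divergence]] (this definitional formula is standard, though not spelled out in the text above):

:<math>p^* = \arg\min_{p \in A} D_{\mathrm{KL}}(p \,\|\, q).</math>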

Furthermore, if ''A'' is a [[closed set]],

:<math>\lim_{n\to\infty} \frac{1}{n} \log q^n\left(x^n : \hat{p}_{x^n} \in A\right) = -D_{\mathrm{KL}}(p^*\|q).</math>
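
The bound can be checked numerically in a small case. The sketch below is a minimal illustration, not from the article: it assumes a binary alphabet with ''q'' a fair coin and ''A'' = {''p'' : ''p''(heads) ≥ 0.7}, so that the information projection is the Bernoulli(0.7) distribution; all names and thresholds are this example's own choices.

```python
import math

# Illustrative check of Sanov's bound (hypothetical example, not from the
# article).  Alphabet X = {heads, tails}, so |X| = 2; q is a fair coin; and
# A = {p : p(heads) >= 0.7}.  The information projection of q onto A is then
# p* = Bernoulli(0.7), the KL-minimizer over A.
Q = 0.5
THRESHOLD = 0.7

def dkl(p, q):
    """KL divergence D_KL(Bernoulli(p) || Bernoulli(q)), in bits."""
    return p * math.log2(p / q) + (1 - p) * math.log2((1 - p) / (1 - q))

def exact_prob(n):
    """Exact probability that the empirical frequency of heads is >= THRESHOLD,
    i.e. q^n(x^n : p_hat in A), summed over the binomial tail."""
    k_min = math.ceil(THRESHOLD * n)
    return sum(math.comb(n, k) for k in range(k_min, n + 1)) / 2 ** n

def sanov_bound(n):
    """Sanov's upper bound (n+1)^{|X|} 2^{-n D_KL(p* || q)} with |X| = 2."""
    return (n + 1) ** 2 * 2 ** (-n * dkl(THRESHOLD, Q))

for n in (50, 200, 500):
    p, b = exact_prob(n), sanov_bound(n)
    assert p <= b  # the bound holds for every n
    # the empirical rate -(1/n) log2 p approaches D_KL(p* || q) as n grows
    print(n, p, b, -math.log2(p) / n)
```

Note that the polynomial prefactor <math>(n+1)^{|X|}</math> makes the bound loose (even larger than 1) for small ''n''; the content of the theorem is the exponential rate, which the closed-set limit above says is tight.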