Rademacher complexity

From Wikipedia, the free encyclopedia
Jump to: navigation, search

In computational learning theory (machine learning and theory of computation), Rademacher complexity, named after Hans Rademacher, measures richness of a class of real-valued functions with respect to a probability distribution.

Given a training sample S=(z_1, z_2, \dots, z_m) \in Z^n, and a hypotheses set \mathcal{H} (where \mathcal{H} is a class of real-valued functions defined on a domain space Z), the empirical Rademacher complexity of \mathcal{H} is defined as:


\widehat{\mathcal{R}}_S(\mathcal{H}) 
= 
\frac{2}{m}
   \mathbb{E} \left[
   \sup_{h \in \mathcal{H}}
   \left|
      \sum_{i=1}^m \sigma_i h(z_i)
   \right| \ \bigg| \ S
\right]

where \sigma_1, \sigma_2, \dots, \sigma_m are independent random variables such that \Pr(\sigma_i = +1) = \Pr(\sigma_i = -1) = 1/2 for i=1,2,\dots,m. The random variables \sigma_1, \sigma_2, \dots, \sigma_m are referred to as Rademacher variables.

Let P be a probability distribution over Z. The Rademacher complexity of the function class \mathcal{H} with respect to P for sample size m is:


\mathcal{R}_m(\mathcal{H}) 
= 
\mathbb{E} \left[ \widehat{\mathcal{R}}_S(\mathcal{H}) \right]

where the above expectation is taken over an identically independently distributed (i.i.d.) sample S=(z_1, z_2, \dots, z_m) generated according to P.

One can show, for example, that there exists a constant C, such that any class of \{0,1\}-indicator functions with Vapnik-Chervonenkis dimension d has Rademacher complexity upper-bounded by C\sqrt{\frac{d}{m}}.

Gaussian complexity[edit]

Gaussian complexity is a similar complexity with similar physical meanings, and can be obtained from the previous complexity using the random variables g_i instead of \sigma_i, where g_i are Gaussian i.i.d. random variables with zero-mean and variance 1, i.e. g_i \sim \mathcal{N} \left( 0, 1 \right).

References[edit]

  • Peter L. Bartlett, Shahar Mendelson (2002) Rademacher and Gaussian Complexities: Risk Bounds and Structural Results. Journal of Machine Learning Research 3 463-482
  • Giorgio Gnecco (2008) Approximation Error Bounds via Rademacher's Complexity. Applied Mathematical Sciences, Vol. 2, 2008, no. 4, 153 - 176