= Hájek projection =

In statistics, the Hájek projection of a random variable $T$ on a set of independent random vectors $X_1,\dots,X_n$ is a particular measurable function of $X_1,\dots,X_n$ that, loosely speaking, is the best mean-square approximation of $T$ by a sum of functions of the individual $X_i$. It is named after the Czech statistician Jaroslav Hájek.

== Definition ==
Given a random variable $T$ and a set of independent random vectors $X_1,\dots,X_n$, the Hájek projection $\hat{T}$ of $T$ onto $\{X_1,\dots,X_n\}$ is given by

 $\hat{T} = \operatorname{E}(T) + \sum_{i=1}^n \left[ \operatorname{E}(T\mid X_i) - \operatorname{E}(T)\right] = \sum_{i=1}^n \operatorname{E}(T\mid X_i) - (n-1)\operatorname{E}(T).$
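
As an illustrative worked example (the setting is an assumption chosen for simplicity, not part of the definition), suppose that $X_1,\dots,X_n$ are i.i.d. real-valued random variables with mean $\mu$ and finite variance, that $n \ge 2$, and that $T$ is the U-statistic with kernel $h(x,y)=xy$:

 $T = \binom{n}{2}^{-1} \sum_{i<j} X_i X_j, \qquad \operatorname{E}(T) = \mu^2.$

Of the $\binom{n}{2}$ pairs, $n-1$ contain the index $i$ and each contributes $\mu X_i$ to the conditional expectation, while the remaining $\binom{n-1}{2}$ pairs each contribute $\mu^2$, so

 $\operatorname{E}(T\mid X_i) = \frac{2\mu}{n} X_i + \frac{n-2}{n}\mu^2, \qquad \operatorname{E}(T\mid X_i) - \operatorname{E}(T) = \frac{2\mu}{n}(X_i - \mu).$

Substituting into the definition gives a projection that is linear in the sample mean $\bar{X} = \frac{1}{n}\sum_{i=1}^n X_i$:

 $\hat{T} = \mu^2 + \frac{2\mu}{n} \sum_{i=1}^n (X_i - \mu) = \mu^2 + 2\mu(\bar{X} - \mu).$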

== Properties ==

- The Hájek projection $\hat{T}$ is the $L^2$ projection of $T$ onto the linear subspace of all random variables of the form $\sum_{i=1}^n g_i(X_i)$, where each $X_i$ takes values in $\mathbb{R}^d$ and $g_i:\mathbb{R}^d \to \mathbb{R}$ are arbitrary measurable functions such that $\operatorname{E}(g_i^2(X_i))<\infty$ for all $i=1,\dots,n$.
- $\operatorname{E}(\hat{T}\mid X_i)=\operatorname{E}(T\mid X_i)$ for every $i=1,\dots,n$ and hence, taking expectations on both sides, $\operatorname{E}(\hat{T})=\operatorname{E}(T)$.
- Let $T_n=T_n(X_1,\dots,X_n)$ be a sequence of statistics and $\hat{T}_n = \hat{T}_n(X_1,\dots,X_n)$ the sequence of their Hájek projections. If $\operatorname{Var}(T_n)/\operatorname{Var}(\hat{T}_n) \to 1$ as $n \to \infty$, then $\frac{T_n-\operatorname{E}(T_n)}{\sqrt{\operatorname{Var}(T_n)}} - \frac{\hat{T}_n-\operatorname{E}(\hat{T}_n)}{\sqrt{\operatorname{Var}(\hat{T}_n)}}$ converges to zero in probability. Consequently, if either standardized sequence converges in distribution, the other converges to the same limit.
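
The first two properties can be checked exactly on a small discrete example. The following is a minimal sketch, assuming independent variables with finite supports and rational probabilities so that all expectations can be computed by enumeration. The helper names `joint` and `hajek_projection` and the toy choice $T=\max(X_1,X_2,X_3)$ are hypothetical and serve only as an illustration.

```python
from fractions import Fraction
from itertools import product


def joint(supports, probs):
    """Enumerate (outcome, probability) pairs for independent discrete X_1..X_n."""
    pairs = []
    for idx in product(*(range(len(s)) for s in supports)):
        xs = tuple(s[k] for s, k in zip(supports, idx))
        p = Fraction(1)
        for pr, k in zip(probs, idx):
            p *= pr[k]  # independence: the joint probability is a product
        pairs.append((xs, p))
    return pairs


def hajek_projection(T, supports, probs):
    """Return E(T), the tables E(T | X_i = x), and T_hat as a function of xs."""
    n = len(supports)
    pairs = joint(supports, probs)
    mean_T = sum(p * T(xs) for xs, p in pairs)

    cond = []
    for i in range(n):
        table = {}
        for k, x in enumerate(supports[i]):
            # E(T | X_i = x) = E(T * 1{X_i = x}) / P(X_i = x)
            table[x] = sum(p * T(xs) for xs, p in pairs if xs[i] == x) / probs[i][k]
        cond.append(table)

    def T_hat(xs):
        # Second form of the definition: sum_i E(T | X_i) - (n - 1) E(T)
        return sum(cond[i][xs[i]] for i in range(n)) - (n - 1) * mean_T

    return mean_T, cond, T_hat


# Hypothetical toy example: three i.i.d. variables on {0, 1, 2}, T = max.
supports = [[0, 1, 2]] * 3
probs = [[Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]] * 3
mean_T, cond, T_hat = hajek_projection(max, supports, probs)
pairs = joint(supports, probs)

mean_T_hat = sum(p * T_hat(xs) for xs, p in pairs)
print(mean_T, mean_T_hat)  # the two means agree exactly
```

Because the arithmetic uses `Fraction`, the identities hold exactly rather than up to rounding. The residual $T-\hat{T}$ is also orthogonal to any function of a single $X_i$, which is the defining feature of the $L^2$ projection onto sums $\sum_{i=1}^n g_i(X_i)$.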
