= Deficiency (statistics) =

In statistics, the deficiency is a measure for comparing one statistical model with another. The concept was introduced in the 1960s by the French mathematician Lucien Le Cam, who used it to prove an approximate version of the Blackwell–Sherman–Stein theorem. Closely related is the Le Cam distance, a pseudometric defined as the larger of the two deficiencies between two statistical models. If the deficiency of a model $\mathcal{E}$ in relation to $\mathcal{F}$ is zero, then one says that $\mathcal{E}$ is better, more informative, or stronger than $\mathcal{F}$.

== Introduction ==
Le Cam defined the statistical model more abstractly than as a probability space with a family of probability measures. He also did not use the term "statistical model" and instead used the term "experiment". In his 1964 publication he introduced the statistical experiment for a parameter set $\Theta$ as a triple $(X,E,(P_\theta)_{\theta\in\Theta})$ consisting of a set $X$, a vector lattice $E$ with unit $I$ and a family of normalized positive functionals $(P_\theta)_{\theta \in \Theta}$ on $E$. In his 1986 book he omitted $E$ and $X$.
This article follows his definition from 1986 and uses his terminology to emphasize the generalization.

== Formulation ==
=== Basic concepts ===
Let $\Theta$ be a parameter space. Let $(L,\|\cdot\|)$ be an abstract $L_1$-space (i.e. a Banach lattice such that $\|x+y\|=\|x\|+\|y\|$ holds for all elements $x,y\geq 0$) that contains a family of positive linear functionals $\{P_{\theta}:\theta\in\Theta\}$. An experiment $\mathcal{E}$ is a map $\mathcal{E}:\Theta \to L$ of the form $\theta \mapsto P_{\theta}$ such that $\|P_{\theta}\|=1$ for every $\theta\in\Theta$. Here $L$ is taken to be the band generated by $\{P_{\theta}:\theta\in\Theta\}$, which is why it is denoted $L(\mathcal{E})$. For $\mu\in L(\mathcal{E})$, write $\mu^{+}=\mu \vee 0=\max(\mu,0)$ for the positive part and $\mu^{-}=(-\mu)\vee 0$ for the negative part. The topological dual $M$ of an L-space, equipped with the dual norm $\|u\|_M=\sup\{|\langle u,\mu\rangle|; \|\mu\|_L\leq 1\}$, is called an abstract M-space. It is also a lattice, with unit $I$ defined through $I \mu=\|\mu^+\|_L-\|\mu^-\|_L$ for $\mu\in L$.

Let $L(A)$ and $L(B)$ be the L-spaces of two experiments $A$ and $B$. A positive linear map $T\colon L(A)\to L(B)$ that preserves the norm of positive elements, i.e. $\|T\mu^{+}\|=\|\mu^{+}\|$ for all $\mu\in L(A)$, is called a transition. The adjoint of a transition is a positive linear map from the dual space $M_B$ of $L(B)$ into the dual space $M_A$ of $L(A)$ such that the unit of $M_A$ is the image of the unit of $M_B$.
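In the special case of finite sample spaces, a transition can be pictured as a row-stochastic matrix (a Markov kernel) acting on weight vectors. The following is a minimal sketch under that assumption; the kernel `K`, the vector `mu_plus`, and the function names are hypothetical and chosen only for illustration.

```python
from fractions import Fraction as Fr

def apply_transition(mu, kernel):
    """Push a (signed) weight vector mu through a row-stochastic matrix:
    (T mu)(j) = sum_i mu(i) * kernel[i][j]."""
    cols = len(kernel[0])
    return [sum(mu[i] * kernel[i][j] for i in range(len(mu))) for j in range(cols)]

def is_markov_kernel(kernel):
    """All entries are non-negative and every row sums to one."""
    return all(all(k >= 0 for k in row) and sum(row) == 1 for row in kernel)

def l1_norm(mu):
    """L1 norm of a weight vector on a finite set."""
    return sum(abs(m) for m in mu)

# Hypothetical kernel from a 3-point sample space to a 2-point sample space.
K = [[Fr(1), Fr(0)],
     [Fr(1, 2), Fr(1, 2)],
     [Fr(0), Fr(1)]]

# A positive element (here a probability vector) and its image under K.
mu_plus = [Fr(1, 5), Fr(3, 10), Fr(1, 2)]
image = apply_transition(mu_plus, K)

print(image)                               # stays non-negative (positivity)
print(l1_norm(image), l1_norm(mu_plus))    # equal norms on positive elements
```

Exact fractions are used so that the norm comparison is not blurred by floating-point rounding.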

=== Deficiency ===
Let $\Theta$ be a parameter space and let $\mathcal{E}:\theta \mapsto P_\theta$ and $\mathcal{F}:\theta \mapsto Q_\theta$ be two experiments indexed by $\Theta$. Let $L(\mathcal{E})$ and $L(\mathcal{F})$ denote the corresponding L-spaces and let $\mathcal{T}$ be the set of all transitions from $L(\mathcal{E})$ to $L(\mathcal{F})$.

The deficiency $\delta(\mathcal{E},\mathcal{F})$ of $\mathcal{E}$ in relation to $\mathcal{F}$ is the number defined in terms of inf sup:
$\delta(\mathcal{E},\mathcal{F}):=\inf\limits_{T\in \mathcal{T}}\sup\limits_{\theta \in \Theta} \tfrac{1}{2}\|Q_{\theta}-TP_{\theta}\|_{\text{TV}},$
where $\|\cdot\|_{\text{TV}}$ denotes the total variation norm $\|\mu\|_{\text{TV}}=\|\mu^{+}\|_L+\|\mu^{-}\|_L$. The factor $\tfrac{1}{2}$ is a normalization chosen for computational convenience and is sometimes omitted.
==== Explanations ====
- $\delta(\mathcal{E},\mathcal{F})=0$ means that there exists a transition $T$ such that $TP_{\theta}=Q_{\theta}$ for all $\theta \in \Theta$.
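- The deficiency measures how well $Q_{\theta}$ can be approximated by $TP_{\theta}$, uniformly in $\theta$, in the sense of total variation.
- The deficiency is half the total variation norm of $Q_{\theta}-TP_{\theta}$, taken in the worst case over $\theta$ and then minimized over all transitions $T$.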
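For finite sample spaces the quantity inside the infimum can be evaluated directly for any fixed candidate kernel, which gives an upper bound on the deficiency. The sketch below assumes transitions are represented as row-stochastic matrices; the experiments `P` and `Q`, the kernel `noise`, and the function names are hypothetical. It does not compute the infimum itself, which in the finite case is a linear programming problem.

```python
from fractions import Fraction as Fr

def push(p, kernel):
    """Image of the weight vector p under a row-stochastic matrix."""
    return [sum(p[i] * kernel[i][j] for i in range(len(p)))
            for j in range(len(kernel[0]))]

def worst_case_half_tv(P, Q, kernel):
    """sup over theta of (1/2) * ||Q_theta - T P_theta||_1 for one fixed kernel T.
    This is an upper bound on delta(E, F), not the deficiency itself."""
    return max(
        sum(abs(q - tp) for q, tp in zip(Q[theta], push(P[theta], kernel))) / 2
        for theta in P
    )

# Hypothetical experiment E on a two-point sample space, indexed by theta in {"a", "b"}.
P = {"a": [Fr(9, 10), Fr(1, 10)], "b": [Fr(1, 5), Fr(4, 5)]}

# Symmetric noise that flips the observation with probability 1/4.
noise = [[Fr(3, 4), Fr(1, 4)],
         [Fr(1, 4), Fr(3, 4)]]

# F is E observed through the noise, so Q_theta = T P_theta by construction.
Q = {theta: push(p, noise) for theta, p in P.items()}

identity = [[Fr(1), Fr(0)],
            [Fr(0), Fr(1)]]

bound_noise = worst_case_half_tv(P, Q, noise)        # exact match, so delta(E, F) = 0
bound_identity = worst_case_half_tv(P, Q, identity)  # a cruder upper bound
print(bound_noise, bound_identity)
```

Because the noise kernel reproduces every $Q_{\theta}$ exactly, the bound is zero and $\mathcal{E}$ is at least as informative as $\mathcal{F}$ in this toy example; the identity kernel only yields a weaker bound.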

=== Le Cam distance ===
The Le Cam distance is the following pseudometric:
$\Delta(\mathcal{E},\mathcal{F}):= \operatorname{max}\left(\delta(\mathcal{E},\mathcal{F}),\delta(\mathcal{F},\mathcal{E})\right).$
This induces an equivalence relation: when $\Delta(\mathcal{E},\mathcal{F})=0$, one says that $\mathcal{E}$ and $\mathcal{F}$ are equivalent. The equivalence class $C_{\mathcal{E}}$ of $\mathcal{E}$ is also called the type of $\mathcal{E}$.

Often one is interested in sequences of experiments $(\mathcal{E}_n)_{n}$ with $\{P_{n,\theta}\colon \theta \in \Theta_{n}\}$ and $(\mathcal{F}_n)_{n}$ with $\{Q_{n,\theta}\colon \theta \in \Theta_{n}\}$. If $\Delta(\mathcal{E}_n,\mathcal{F}_n)\to 0$ as $n\to \infty$, then one says that $(\mathcal{E}_n)$ and $(\mathcal{F}_n)$ are asymptotically equivalent.

Let $\Theta$ be a parameter space and let $E(\Theta)$ be the set of all types that are induced by $\Theta$. Then $E(\Theta)$ is complete with respect to the Le Cam distance $\Delta$. The condition $\delta(\mathcal{E},\mathcal{F})=0$ induces a partial order on $E(\Theta)$; when it holds, one says that $\mathcal{E}$ is better, more informative, or stronger than $\mathcal{F}$.
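
As a worked illustration of these notions (a sketch; the experiment $\mathcal{N}$ and the symbols $Q$, $R$ and $T_0$ are introduced here for illustration only), let $\Theta=\{0,1\}$, let $\mathcal{E}:\theta\mapsto P_\theta$ be any experiment, and let $\mathcal{N}:\theta\mapsto Q$ be an experiment whose distribution does not depend on $\theta$. The map $T_0\mu=(I\mu)\,Q$ is linear, positive and preserves the norm of positive elements, and it satisfies $T_0P_\theta=Q$ for both values of $\theta$, so
$\delta(\mathcal{E},\mathcal{N})=0.$
In the other direction, any transition $T$ from $L(\mathcal{N})$ to $L(\mathcal{E})$ sends $Q$ to a single normalized positive element $R=TQ$, and every such $R$ arises in this way (take $T\mu=(I\mu)\,R$). The triangle inequality gives $\|P_0-P_1\|_{\text{TV}}\leq\|P_0-R\|_{\text{TV}}+\|R-P_1\|_{\text{TV}}\leq 2\max_{\theta}\|P_\theta-R\|_{\text{TV}}$, with equality for $R=\tfrac{1}{2}(P_0+P_1)$. Hence
$\delta(\mathcal{N},\mathcal{E})=\inf\limits_{R}\max\limits_{\theta\in\{0,1\}}\tfrac{1}{2}\|P_\theta-R\|_{\text{TV}}=\tfrac{1}{4}\|P_0-P_1\|_{\text{TV}},\qquad \Delta(\mathcal{E},\mathcal{N})=\tfrac{1}{4}\|P_0-P_1\|_{\text{TV}}.$
Under these assumptions every experiment is at least as informative as $\mathcal{N}$, and it is equivalent to $\mathcal{N}$ exactly when $P_0=P_1$.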

== Bibliography ==
- Le Cam, Lucien (1986). "Asymptotic Methods in Statistical Decision Theory"
- Le Cam, Lucien (1964). "Sufficiency and Approximate Sufficiency"
- Torgersen, Erik (1991). "Comparison of Statistical Experiments"
