# Neyman–Pearson lemma

In statistics, the Neyman–Pearson lemma was introduced by Jerzy Neyman and Egon Pearson in a paper published in 1933.[1] The lemma is part of the Neyman–Pearson theory of statistical testing, which introduced concepts such as errors of the second kind, the power function, and inductive behavior.[2][3][4] The earlier Fisherian theory of significance testing postulated only one hypothesis. By introducing a competing hypothesis, the Neyman–Pearsonian approach to statistical testing makes it possible to investigate the two types of errors. The trivial cases in which one always rejects or always accepts the null hypothesis are of little interest, but they do show that one must not relinquish control over one type of error while calibrating the other. Neyman and Pearson accordingly restricted their attention to the class of all ${\displaystyle \alpha }$-level tests and then minimized the type II error, traditionally denoted by ${\displaystyle \beta }$. Their seminal paper of 1933, including the Neyman–Pearson lemma, comes at the end of this endeavor: it not only shows the existence of tests with the most power that retain a prespecified level of type I error (${\displaystyle \alpha }$), but also provides a way to construct such tests. The Karlin–Rubin theorem extends the Neyman–Pearson lemma to settings involving composite hypotheses with monotone likelihood ratios.

## Proposition

Consider a test with hypotheses ${\displaystyle H_{0}:\theta =\theta _{0}}$ and ${\displaystyle H_{1}:\theta =\theta _{1}}$, where the probability density function (or probability mass function) is ${\displaystyle f({\boldsymbol {x}}\mid \theta _{i})}$ for ${\displaystyle i=0,1}$. Denoting the rejection region by ${\displaystyle R}$, the Neyman–Pearson lemma states that a most powerful (MP) test satisfies the following: for some ${\displaystyle \eta \geq 0}$,

• ${\displaystyle {\boldsymbol {x}}\in R}$ if ${\displaystyle f({\boldsymbol {x}}\mid \theta _{1})>\eta f({\boldsymbol {x}}\mid \theta _{0})}$,
• ${\displaystyle {\boldsymbol {x}}\in R^{c}}$ if ${\displaystyle f({\boldsymbol {x}}\mid \theta _{1})<\eta f({\boldsymbol {x}}\mid \theta _{0})}$,
• ${\displaystyle \mathbb {P} _{\theta _{0}}({\boldsymbol {X}}\in R)=\alpha }$ for a prefixed significance level ${\displaystyle \alpha }$.

Also, if there is at least one MP test that satisfies the two conditions, then the Neyman–Pearson lemma states that every ${\displaystyle \alpha }$-level MP test must obey the likelihood-ratio inequalities. Note that the most powerful test is not always unique, as can be inferred from the lemma; in fact, it may not exist at all.[5]

In practice, the likelihood ratio is often used directly to construct tests (see likelihood-ratio test). It can also be used to suggest particular test statistics of interest or to suggest simplified tests; for this, one manipulates the ratio algebraically to identify key statistics in it related to the size of the ratio (i.e. whether a large statistic corresponds to a small ratio or to a large one).
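
As a concrete illustration, the following sketch (not from the original article; all numeric values are illustrative assumptions) constructs the Neyman–Pearson test for two simple hypotheses about a normal mean with known unit variance. Here the likelihood ratio is increasing in the sample mean, so the threshold ${\displaystyle \eta }$ corresponds to a critical value on the sample mean chosen to give exact size ${\displaystyle \alpha }$:

```python
# Minimal sketch (illustrative, not from the article): the Neyman–Pearson test for
# two simple hypotheses about a normal mean with known unit variance,
#   H0: mu = mu0   vs   H1: mu = mu1,   X_1, ..., X_n i.i.d. N(mu, 1).
# The likelihood ratio f(x | mu1) / f(x | mu0) is increasing in the sample mean, so
# "reject when the ratio exceeds eta" is the same as "reject when x_bar exceeds c",
# and c can be chosen so that the test has exact size alpha.
import numpy as np
from scipy.stats import norm

mu0, mu1, n, alpha = 0.0, 1.0, 25, 0.05          # illustrative values

# Size-alpha critical value for the sample mean under H0 (x_bar ~ N(mu0, 1/n)).
c = mu0 + norm.ppf(1 - alpha) / np.sqrt(n)

# Equivalent log-threshold on the likelihood ratio, evaluated at the boundary x_bar = c.
log_eta = n * (mu1 - mu0) * c - n * (mu1**2 - mu0**2) / 2

# Power under H1: by the lemma, no size-alpha test can exceed this.
power = 1 - norm.cdf((c - mu1) * np.sqrt(n))

# Applying the test to one simulated data set drawn under H1:
x = np.random.default_rng(0).normal(mu1, 1.0, size=n)
log_lr = n * (mu1 - mu0) * x.mean() - n * (mu1**2 - mu0**2) / 2
print(f"c = {c:.3f}, log eta = {log_eta:.3f}, power = {power:.3f}, reject: {log_lr > log_eta}")
```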

## Proof

Define the rejection region of the null hypothesis for the Neyman–Pearson (NP) test as

${\displaystyle R_{\text{NP}}=\left\{x:{\frac {{\mathcal {L}}(\theta _{0}\mid x)}{{\mathcal {L}}(\theta _{1}\mid x)}}\leqslant \eta \right\}}$

where ${\displaystyle \eta }$ is chosen so that ${\displaystyle \operatorname {P} (R_{\text{NP}}\mid \theta _{0})=\alpha \,.}$

Any alternative test will have a different rejection region that we denote by ${\displaystyle R_{\text{A}}}$.

For a region ${\displaystyle R}$ (either ${\displaystyle R_{\text{NP}}}$ or ${\displaystyle R_{\text{A}}}$), the probability that the data fall in ${\displaystyle R}$ given parameter ${\displaystyle \theta }$ is

${\displaystyle \operatorname {P} (R\mid \theta )=\int _{R}{\mathcal {L}}(\theta \mid x)\,\operatorname {d} x\,.}$

For the test with critical region ${\displaystyle R_{\text{A}}}$ to have significance level ${\displaystyle \alpha }$, it must be true that ${\displaystyle \alpha \geqslant \operatorname {P} (R_{\text{A}}\mid \theta _{0})}$, hence

${\displaystyle \alpha =\operatorname {P} (R_{\text{NP}}\mid \theta _{0})\geqslant \operatorname {P} (R_{\text{A}}\mid \theta _{0})\,.}$

It will be useful to break these down into integrals over distinct regions:

${\displaystyle {\begin{aligned}\operatorname {P} (R_{\text{NP}}\mid \theta )&=\operatorname {P} (R_{\text{NP}}\cap R_{\text{A}}\mid \theta )+\operatorname {P} (R_{\text{NP}}\cap R_{\text{A}}^{c}\mid \theta )\\\operatorname {P} (R_{\text{A}}\mid \theta )&=\operatorname {P} (R_{\text{NP}}\cap R_{\text{A}}\mid \theta )+\operatorname {P} (R_{\text{NP}}^{c}\cap R_{\text{A}}\mid \theta )\end{aligned}}}$

where ${\displaystyle R^{c}\equiv \{x:x\notin R\}}$ is the complement of region R. Setting ${\displaystyle \theta =\theta _{0}}$ in these two expressions and applying the inequality above, the common term ${\displaystyle \operatorname {P} (R_{\text{NP}}\cap R_{\text{A}}\mid \theta _{0})}$ cancels and we obtain

${\displaystyle \operatorname {P} (R_{\text{NP}}\cap R_{\text{A}}^{c}\mid \theta _{0})\geqslant \operatorname {P} (R_{\text{NP}}^{c}\cap R_{\text{A}}\mid \theta _{0})\,.}$

The powers of the two tests are ${\displaystyle \operatorname {P} (R_{\text{NP}}\mid \theta _{1})}$ and ${\displaystyle \operatorname {P} (R_{\text{A}}\mid \theta _{1})}$, and we would like to prove that:

${\displaystyle \operatorname {P} (R_{\text{NP}}\mid \theta _{1})\geqslant \operatorname {P} (R_{\text{A}}\mid \theta _{1})}$

Applying the same decomposition with ${\displaystyle \theta =\theta _{1}}$ and cancelling the common term, this is equivalent to:

${\displaystyle \operatorname {P} (R_{\text{NP}}\cap R_{\text{A}}^{c}\mid \theta _{1})\geqslant \operatorname {P} (R_{\text{NP}}^{c}\cap R_{\text{A}}\mid \theta _{1})}$

In what follows, we show that this inequality indeed holds:

${\displaystyle {\begin{aligned}\operatorname {P} (R_{\text{NP}}\cap R_{\text{A}}^{c}\mid \theta _{1})&=\int _{R_{\text{NP}}\cap R_{\text{A}}^{c}}{\mathcal {L}}(\theta _{1}\mid x)\,\operatorname {d} x\\[4pt]&\geqslant {\frac {1}{\eta }}\int _{R_{\text{NP}}\cap R_{\text{A}}^{c}}{\mathcal {L}}(\theta _{0}\mid x)\,\operatorname {d} x&&{\text{since }}{\mathcal {L}}(\theta _{1}\mid x)\geqslant {\tfrac {1}{\eta }}{\mathcal {L}}(\theta _{0}\mid x){\text{ on }}R_{\text{NP}}{\text{, hence on this subset of it}}\\[4pt]&={\frac {1}{\eta }}\operatorname {P} (R_{\text{NP}}\cap R_{\text{A}}^{c}\mid \theta _{0})&&{\text{by definition of }}\operatorname {P} (R\mid \theta )\\[4pt]&\geqslant {\frac {1}{\eta }}\operatorname {P} (R_{\text{NP}}^{c}\cap R_{\text{A}}\mid \theta _{0})&&{\text{by the inequality derived above}}\\[4pt]&={\frac {1}{\eta }}\int _{R_{\text{NP}}^{c}\cap R_{\text{A}}}{\mathcal {L}}(\theta _{0}\mid x)\,\operatorname {d} x\\[4pt]&\geqslant \int _{R_{\text{NP}}^{c}\cap R_{\text{A}}}{\mathcal {L}}(\theta _{1}\mid x)\,\operatorname {d} x&&{\text{since }}{\mathcal {L}}(\theta _{1}\mid x)<{\tfrac {1}{\eta }}{\mathcal {L}}(\theta _{0}\mid x){\text{ on }}R_{\text{NP}}^{c}{\text{, hence on this subset of it}}\\[4pt]&=\operatorname {P} (R_{\text{NP}}^{c}\cap R_{\text{A}}\mid \theta _{1})\end{aligned}}}$
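
The inequality chain above can also be checked numerically. The following brute-force sketch (an illustration only, not part of the original proof; the pmfs and the threshold ${\displaystyle \eta }$ are arbitrary choices) enumerates every rejection region on a small discrete sample space and confirms that no region with null probability at most that of ${\displaystyle R_{\text{NP}}}$ attains higher power:

```python
# Brute-force sanity check (illustrative, not from the source): on a small discrete
# sample space, enumerate every possible rejection region R_A and confirm that none
# with P(R_A | theta_0) <= P(R_NP | theta_0) has higher power than the
# likelihood-ratio region R_NP. The pmfs f0, f1 are arbitrary illustrative choices.
from itertools import combinations
import numpy as np

rng = np.random.default_rng(0)
k = 10                                    # sample space {0, ..., 9}
f0 = rng.random(k); f0 /= f0.sum()        # pmf under H0 (illustrative)
f1 = rng.random(k); f1 /= f1.sum()        # pmf under H1 (illustrative)

eta = 1.0                                 # any fixed threshold works for this check
R_np = [x for x in range(k) if f0[x] <= eta * f1[x]]   # {x : f0(x)/f1(x) <= eta}
alpha = f0[R_np].sum()
power_np = f1[R_np].sum()

# Largest power advantage of any competing region of no larger size; should be <= 0.
best_violation = -np.inf
for r in range(k + 1):
    for region in combinations(range(k), r):
        region = list(region)
        if f0[region].sum() <= alpha + 1e-12:
            best_violation = max(best_violation, f1[region].sum() - power_np)

print(f"size of R_NP = {alpha:.3f}, power of R_NP = {power_np:.3f}")
print(f"max power gain by any region of no larger size = {best_violation:.3e}")  # ~0 up to rounding
```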

## Example

Let ${\displaystyle X_{1},\dots ,X_{n}}$ be a random sample from the ${\displaystyle {\mathcal {N}}(\mu ,\sigma ^{2})}$ distribution, where the mean ${\displaystyle \mu }$ is known, and suppose that we wish to test ${\displaystyle H_{0}:\sigma ^{2}=\sigma _{0}^{2}}$ against ${\displaystyle H_{1}:\sigma ^{2}=\sigma _{1}^{2}}$. The likelihood for this set of normally distributed data is

${\displaystyle {\mathcal {L}}\left(\sigma ^{2}\mid \mathbf {x} \right)\propto \left(\sigma ^{2}\right)^{-n/2}\exp \left\{-{\frac {\sum _{i=1}^{n}(x_{i}-\mu )^{2}}{2\sigma ^{2}}}\right\}.}$

We can compute the likelihood ratio to find the key statistic in this test and its effect on the test's outcome:

${\displaystyle \Lambda (\mathbf {x} )={\frac {{\mathcal {L}}\left({\sigma _{0}}^{2}\mid \mathbf {x} \right)}{{\mathcal {L}}\left({\sigma _{1}}^{2}\mid \mathbf {x} \right)}}=\left({\frac {\sigma _{0}^{2}}{\sigma _{1}^{2}}}\right)^{-n/2}\exp \left\{-{\frac {1}{2}}(\sigma _{0}^{-2}-\sigma _{1}^{-2})\sum _{i=1}^{n}(x_{i}-\mu )^{2}\right\}.}$

This ratio depends on the data only through ${\displaystyle \sum _{i=1}^{n}(x_{i}-\mu )^{2}}$. Therefore, by the Neyman–Pearson lemma, the most powerful test of these hypotheses for this data will depend only on ${\displaystyle \sum _{i=1}^{n}(x_{i}-\mu )^{2}}$. Also, by inspection, we can see that if ${\displaystyle \sigma _{1}^{2}>\sigma _{0}^{2}}$, then ${\displaystyle \Lambda (\mathbf {x} )}$ is a decreasing function of ${\displaystyle \sum _{i=1}^{n}(x_{i}-\mu )^{2}}$. So we should reject ${\displaystyle H_{0}}$ if ${\displaystyle \sum _{i=1}^{n}(x_{i}-\mu )^{2}}$ is sufficiently large. The rejection threshold depends on the size of the test. In this example, the test statistic is a scaled chi-squared random variable: under ${\displaystyle H_{0}}$, ${\displaystyle \sum _{i=1}^{n}(x_{i}-\mu )^{2}/\sigma _{0}^{2}}$ follows a chi-squared distribution with ${\displaystyle n}$ degrees of freedom, so an exact critical value can be obtained.
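
A short numerical illustration of this example follows (a sketch under assumed values of ${\displaystyle \mu }$, ${\displaystyle \sigma _{0}^{2}}$, ${\displaystyle \sigma _{1}^{2}}$, ${\displaystyle n}$ and ${\displaystyle \alpha }$, not taken from the article): the exact critical value comes from the chi-squared distribution, the power follows from the distribution of the statistic under ${\displaystyle H_{1}}$, and both can be checked by simulation.

```python
# Illustration of the normal-variance example (assumed values, not from the article):
# with sigma1^2 > sigma0^2 the most powerful test rejects when T = sum_i (x_i - mu)^2
# is large; under H0, T / sigma0^2 ~ chi^2 with n degrees of freedom.
import numpy as np
from scipy.stats import chi2

mu, sigma0_sq, sigma1_sq, n, alpha = 0.0, 1.0, 2.0, 20, 0.05   # illustrative values

# Critical value: reject H0 when T > t_crit, where P(T > t_crit | H0) = alpha.
t_crit = sigma0_sq * chi2.ppf(1 - alpha, df=n)

# Power of this most powerful test: under H1, T / sigma1^2 ~ chi^2_n.
power = chi2.sf(t_crit / sigma1_sq, df=n)
print(f"reject H0 when sum((x_i - mu)^2) > {t_crit:.3f}; power = {power:.3f}")

# Quick Monte Carlo check of the size and power.
rng = np.random.default_rng(1)
x0 = rng.normal(mu, np.sqrt(sigma0_sq), size=(100_000, n))
x1 = rng.normal(mu, np.sqrt(sigma1_sq), size=(100_000, n))
print("empirical size :", np.mean(((x0 - mu) ** 2).sum(axis=1) > t_crit))
print("empirical power:", np.mean(((x1 - mu) ** 2).sum(axis=1) > t_crit))
```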

## Application in economics

A variant of the Neyman–Pearson lemma has found an application in the seemingly unrelated domain of the economics of land value. One of the fundamental problems in consumer theory is calculating the demand function of the consumer given the prices. In particular, given a heterogeneous land estate, a price measure over the land, and a subjective utility measure over the land, the consumer's problem is to find the best land parcel they can buy, i.e. the parcel with the largest utility whose price is at most their budget. It turns out that this problem is very similar to the problem of finding the most powerful statistical test, and so the Neyman–Pearson lemma can be used.[6]

## Uses in electrical engineering

The Neyman–Pearson lemma is quite useful in electrical engineering, namely in the design and use of radar systems, digital communication systems, and signal processing systems. In radar systems, the Neyman–Pearson lemma is used by first fixing the rate of false alarms at a desired (low) level and then minimizing the rate of missed detections (equivalently, maximizing the probability of detection), or vice versa. The false-alarm and missed-detection rates cannot both be made arbitrarily low: reducing one necessarily increases the other. The same considerations apply to many systems in signal processing.
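
As a hedged sketch of this use (the signal level, noise variance, sample count and false-alarm rate below are illustrative assumptions, not from the article), consider detecting a known constant signal in white Gaussian noise: fixing the false-alarm probability determines the detection threshold, and the detection probability then follows.

```python
# Sketch of a Neyman–Pearson detector (assumed values): detect a known constant
# signal A in Gaussian noise from n samples.
#   H0: x[k] = w[k],   H1: x[k] = A + w[k],   w[k] ~ N(0, sigma^2) i.i.d.
# The likelihood ratio is monotone in the sample mean, so reject H0 when mean(x) > gamma,
# with gamma chosen to meet the prescribed false-alarm probability P_FA.
import numpy as np
from scipy.stats import norm

A, sigma, n, p_fa = 0.5, 1.0, 16, 1e-3                   # illustrative values

gamma = sigma / np.sqrt(n) * norm.ppf(1 - p_fa)          # threshold for the chosen P_FA
p_d = 1 - norm.cdf((gamma - A) * np.sqrt(n) / sigma)     # resulting detection probability
print(f"threshold = {gamma:.4f}, P_D = {p_d:.4f} at P_FA = {p_fa}")
```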

## Uses in particle physics

The Neyman–Pearson lemma is applied to the construction of analysis-specific likelihood ratios, which are used, for example, to test for signatures of new physics against the nominal Standard Model prediction in proton–proton collision datasets collected at the LHC.