# Min-entropy

The min-entropy, in information theory, is the smallest of the Rényi family of entropies, corresponding to the most conservative way of measuring the unpredictability of a set of outcomes: the negative logarithm of the probability of the most likely outcome. The various Rényi entropies are all equal for a uniform distribution, but measure the unpredictability of a nonuniform distribution in different ways. The min-entropy is never greater than the ordinary or Shannon entropy (which measures the average unpredictability of the outcomes), and that in turn is never greater than the Hartley or max-entropy, defined as the logarithm of the number of outcomes with nonzero probability.

As with the classical Shannon entropy and its quantum generalization, the von Neumann entropy, one can define a conditional version of min-entropy. The conditional quantum min-entropy is a one-shot, or conservative, analog of conditional quantum entropy.

To interpret a conditional information measure, suppose Alice and Bob were to share a bipartite quantum state ${\displaystyle \rho _{AB}}$. Alice has access to system ${\displaystyle A}$ and Bob to system ${\displaystyle B}$. The conditional entropy measures the average uncertainty Bob has about Alice's state upon sampling from his own system. The min-entropy can be interpreted as the distance of a state from a maximally entangled state.

This concept is useful in quantum cryptography, in the context of privacy amplification (see, for example, [1]).

## Definition for classical distributions

If ${\displaystyle P=(p_{1},...,p_{n})}$ is a classical finite probability distribution, its min-entropy can be defined as[2]

${\displaystyle H_{\rm {min}}({\boldsymbol {P}})=\log {\frac {1}{P_{\rm {max}}}},\qquad P_{\rm {max}}\equiv \max _{i}p_{i}.}$
One way to justify the name of the quantity is to compare it with the more standard definition of entropy, which reads ${\displaystyle H({\boldsymbol {P}})=\sum _{i}p_{i}\log(1/p_{i})}$, and can thus be written concisely as the expectation value of ${\displaystyle \log(1/p_{i})}$ over the distribution. If instead of taking the expectation value of this quantity we take its minimum value, we get precisely the above definition of ${\displaystyle H_{\rm {min}}({\boldsymbol {P}})}$.
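The comparison above is easy to check numerically. A minimal sketch (function names are mine; it assumes NumPy is available) computing both quantities for a small distribution:

```python
import numpy as np

def shannon_entropy(p):
    """Shannon entropy H(P) = sum_i p_i * log2(1/p_i), in bits."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]  # terms with p_i = 0 contribute nothing
    return float(-np.sum(p * np.log2(p)))

def min_entropy(p):
    """Min-entropy H_min(P) = log2(1 / max_i p_i), in bits."""
    return float(-np.log2(np.max(p)))

# A biased distribution: the min-entropy depends only on the largest mass.
p = [0.5, 0.25, 0.125, 0.125]
# Here H_min(P) = -log2(0.5) = 1 bit, while H(P) = 1.75 bits.
```

For a uniform distribution on $n$ outcomes, both functions return $\log_2 n$, in line with all Rényi entropies coinciding there.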

## Definition for quantum states

A natural way to define a "min-entropy" for quantum states is to leverage the simple observation that quantum states result in probability distributions when measured in some basis. There is however the added difficulty that a single quantum state can result in infinitely many possible probability distributions, depending on how it is measured. A natural path is then, given a quantum state ${\displaystyle \rho }$, to still define ${\displaystyle H_{\rm {min}}(\rho )}$ as ${\displaystyle \log(1/P_{\rm {max}})}$, but this time defining ${\displaystyle P_{\rm {max}}}$ as the maximum possible probability that can be obtained measuring ${\displaystyle \rho }$, maximizing over all possible projective measurements.

Formally, this would provide the definition

${\displaystyle H_{\rm {min}}(\rho )=\max _{\Pi }\log {\frac {1}{\max _{i}\operatorname {tr} (\Pi _{i}\rho )}}=-\max _{\Pi }\log \max _{i}\operatorname {tr} (\Pi _{i}\rho ),}$
where the maximization is over the set of all projective measurements ${\displaystyle \Pi =(\Pi _{i})_{i}}$, the operators ${\displaystyle \Pi _{i}}$ represent the measurement outcomes in the POVM formalism, and ${\displaystyle \operatorname {tr} (\Pi _{i}\rho )}$ is therefore the probability of observing the ${\displaystyle i}$-th outcome when the measurement is ${\displaystyle \Pi }$.

A more concise method to write the double maximization is to observe that any element of any POVM is a Hermitian operator such that ${\displaystyle 0\leq \Pi \leq I}$, and thus we can equivalently directly maximize over these to get

${\displaystyle H_{\rm {min}}(\rho )=-\max _{0\leq \Pi \leq I}\log \operatorname {tr} (\Pi \rho ).}$
In fact, this maximization can be performed explicitly, and the maximum is obtained when ${\displaystyle \Pi }$ is the projection onto (any of) the eigenvector(s) of ${\displaystyle \rho }$ with the largest eigenvalue. We thus get yet another expression for the min-entropy as:
${\displaystyle H_{\rm {min}}(\rho )=-\log \|\rho \|_{\rm {op}},}$
remembering that the operator norm of a Hermitian positive semidefinite operator equals its largest eigenvalue.
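The spectral formula above makes the quantum min-entropy straightforward to evaluate. A minimal sketch, assuming NumPy and a density matrix given as a Hermitian array (the function name is mine):

```python
import numpy as np

def quantum_min_entropy(rho):
    """H_min(rho) = -log2 ||rho||_op, i.e. minus the log of the largest
    eigenvalue of the density matrix rho, in bits."""
    eigvals = np.linalg.eigvalsh(rho)  # Hermitian eigenvalues, ascending order
    return float(-np.log2(eigvals[-1]))

# A qubit state diag(3/4, 1/4): H_min = -log2(3/4) ≈ 0.415 bits.
rho = np.diag([0.75, 0.25])
```

For the maximally mixed qubit state ${\displaystyle I/2}$ the largest eigenvalue is ${\displaystyle 1/2}$, so the min-entropy is 1 bit, matching the classical uniform case.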

## Conditional entropies

Let ${\displaystyle \rho _{AB}}$ be a bipartite density operator on the space ${\displaystyle {\mathcal {H}}_{A}\otimes {\mathcal {H}}_{B}}$. The min-entropy of ${\displaystyle A}$ conditioned on ${\displaystyle B}$ is defined to be

${\displaystyle H_{\min }(A|B)_{\rho }\equiv -\inf _{\sigma _{B}}D_{\max }(\rho _{AB}\|I_{A}\otimes \sigma _{B})}$

where the infimum ranges over all density operators ${\displaystyle \sigma _{B}}$ on the space ${\displaystyle {\mathcal {H}}_{B}}$. The measure ${\displaystyle D_{\max }}$ is the maximum relative entropy defined as

${\displaystyle D_{\max }(\rho \|\sigma )=\inf _{\lambda }\{\lambda :\rho \leq 2^{\lambda }\sigma \}}$
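When ${\displaystyle \sigma }$ has full rank, the smallest feasible ${\displaystyle 2^{\lambda }}$ in the definition above is the largest eigenvalue of ${\displaystyle \sigma ^{-1/2}\rho \sigma ^{-1/2}}$, which gives a direct way to compute ${\displaystyle D_{\max }}$. A sketch under that full-rank assumption (function name mine, using NumPy):

```python
import numpy as np

def d_max(rho, sigma):
    """Max-relative entropy D_max(rho || sigma), base 2, assuming sigma is
    full rank: the smallest lambda with rho <= 2^lambda * sigma equals
    log2 of the largest eigenvalue of sigma^{-1/2} rho sigma^{-1/2}."""
    vals, vecs = np.linalg.eigh(sigma)
    sigma_inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.conj().T
    m = sigma_inv_sqrt @ rho @ sigma_inv_sqrt
    return float(np.log2(np.max(np.linalg.eigvalsh(m))))
```

For example, ${\displaystyle D_{\max }(\rho \|\rho )=0}$ for any full-rank state, and for ${\displaystyle \rho =\operatorname {diag} (3/4,1/4)}$ against the maximally mixed qubit one gets ${\displaystyle \log _{2}(3/2)}$.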

The smooth min-entropy is defined in terms of the min-entropy.

${\displaystyle H_{\min }^{\epsilon }(A|B)_{\rho }=\sup _{\rho '}H_{\min }(A|B)_{\rho '}}$

where the supremum ranges over density operators ${\displaystyle \rho '_{AB}}$ which are ${\displaystyle \epsilon }$-close to ${\displaystyle \rho _{AB}}$. This notion of ${\displaystyle \epsilon }$-closeness is defined in terms of the purified distance

${\displaystyle P(\rho ,\sigma )={\sqrt {1-F(\rho ,\sigma )^{2}}}}$

where ${\displaystyle F(\rho ,\sigma )}$ is the fidelity measure.
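Both quantities can be computed from spectral decompositions, using the standard expression of the Uhlmann fidelity as the trace norm of ${\displaystyle {\sqrt {\rho }}{\sqrt {\sigma }}}$. A sketch assuming NumPy (function names are mine):

```python
import numpy as np

def _sqrtm_psd(a):
    """Square root of a Hermitian positive semidefinite matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(a)
    vals = np.clip(vals, 0.0, None)  # guard against tiny negative eigenvalues
    return vecs @ np.diag(np.sqrt(vals)) @ vecs.conj().T

def fidelity(rho, sigma):
    """Uhlmann fidelity F(rho, sigma) = || sqrt(rho) sqrt(sigma) ||_1."""
    s = np.linalg.svd(_sqrtm_psd(rho) @ _sqrtm_psd(sigma), compute_uv=False)
    return float(np.sum(s))  # trace norm = sum of singular values

def purified_distance(rho, sigma):
    """P(rho, sigma) = sqrt(1 - F(rho, sigma)^2)."""
    f = min(fidelity(rho, sigma), 1.0)  # clamp roundoff slightly above 1
    return float(np.sqrt(1.0 - f ** 2))
```

Identical states have purified distance 0, while orthogonal pure states have fidelity 0 and hence purified distance 1.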

These quantities can be seen as generalizations of the von Neumann entropy. Indeed, the von Neumann entropy can be expressed as

${\displaystyle S(A|B)_{\rho }=\lim _{\epsilon \rightarrow 0}\lim _{n\rightarrow \infty }{\frac {1}{n}}H_{\min }^{\epsilon }(A^{n}|B^{n})_{\rho ^{\otimes n}}~.}$

This is called the fully quantum asymptotic equipartition theorem.[3] The smoothed entropies share many interesting properties with the von Neumann entropy. For example, the smooth min-entropy satisfies a data-processing inequality:[4]

${\displaystyle H_{\min }^{\epsilon }(A|B)_{\rho }\geq H_{\min }^{\epsilon }(A|BC)_{\rho }~.}$

## Operational interpretation of smoothed min-entropy

Henceforth, we shall drop the subscript ${\displaystyle \rho }$ from the min-entropy when the state on which it is evaluated is clear from context.

### Min-entropy as uncertainty about classical information

Suppose an agent has access to a quantum system ${\displaystyle B}$ whose state ${\displaystyle \rho _{B}^{x}}$ depends on some classical variable ${\displaystyle X}$. Suppose further that ${\displaystyle X}$ takes the value ${\displaystyle x}$ with probability ${\displaystyle P_{X}(x)}$. This situation can be described by the following state on the joint system ${\displaystyle XB}$:

${\displaystyle \rho _{XB}=\sum _{x}P_{X}(x)|x\rangle \langle x|\otimes \rho _{B}^{x},}$

where ${\displaystyle \{|x\rangle \}}$ form an orthonormal basis. We would like to know what the agent can learn about the classical variable ${\displaystyle x}$. Let ${\displaystyle p_{g}(X|B)}$ be the probability that the agent guesses ${\displaystyle X}$ when using an optimal measurement strategy

${\displaystyle p_{g}(X|B)=\sum _{x}P_{X}(x)\operatorname {tr} (E_{x}\rho _{B}^{x}),}$

where ${\displaystyle (E_{x})_{x}}$ is the POVM that maximizes this expression. It can be shown[citation needed] that this optimum can be expressed in terms of the min-entropy as

${\displaystyle p_{g}(X|B)=2^{-H_{\min }(X|B)}~.}$

If the state ${\displaystyle \rho _{XB}}$ is a product state i.e. ${\displaystyle \rho _{XB}=\sigma _{X}\otimes \tau _{B}}$ for some density operators ${\displaystyle \sigma _{X}}$ and ${\displaystyle \tau _{B}}$, then there is no correlation between the systems ${\displaystyle X}$ and ${\displaystyle B}$. In this case, it turns out that ${\displaystyle 2^{-H_{\min }(X|B)}=\max _{x}P_{X}(x)~.}$
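For a binary ${\displaystyle X}$ the optimal measurement is known in closed form (the Helstrom bound for two-state discrimination), which makes the identity ${\displaystyle p_{g}(X|B)=2^{-H_{\min }(X|B)}}$ easy to check numerically in the uncorrelated case. A sketch assuming NumPy (the function name is mine):

```python
import numpy as np

def helstrom_pguess(p0, rho0, p1, rho1):
    """Optimal probability of guessing a binary X from system B
    (Helstrom bound): p_g = (1 + || p0*rho0 - p1*rho1 ||_1) / 2."""
    delta = p0 * rho0 - p1 * rho1
    trace_norm = float(np.sum(np.abs(np.linalg.eigvalsh(delta))))
    return 0.5 * (1.0 + trace_norm)

# Product (uncorrelated) case: both values of X come with the same state tau,
# so B is useless and p_g = max_x P_X(x), i.e. H_min(X|B) = -log2 max_x P_X(x).
tau = np.eye(2) / 2
p_g = helstrom_pguess(0.7, tau, 0.3, tau)
h_min = -np.log2(p_g)
```

With perfectly distinguishable (orthogonal) states ${\displaystyle \rho _{B}^{0},\rho _{B}^{1}}$, the same formula gives ${\displaystyle p_{g}=1}$ and hence ${\displaystyle H_{\min }(X|B)=0}$.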

#### Min-entropy as overlap with the maximally entangled state

The maximally entangled state ${\displaystyle |\phi ^{+}\rangle }$ on a bipartite system ${\displaystyle {\mathcal {H}}_{A}\otimes {\mathcal {H}}_{B}}$ is defined as

${\displaystyle |\phi ^{+}\rangle _{AB}={\frac {1}{\sqrt {d}}}\sum _{x_{A},x_{B}}|x_{A}\rangle |x_{B}\rangle }$

where ${\displaystyle \{|x_{A}\rangle \}}$ and ${\displaystyle \{|x_{B}\rangle \}}$ form an orthonormal basis for the spaces ${\displaystyle A}$ and ${\displaystyle B}$ respectively. For a bipartite quantum state ${\displaystyle \rho _{AB}}$, we define the maximum overlap with the maximally entangled state as

${\displaystyle q_{c}(A|B)=d_{A}\max _{\mathcal {E}}F\left((I_{A}\otimes {\mathcal {E}})\rho _{AB},|\phi ^{+}\rangle \langle \phi ^{+}|\right)^{2}}$

where the maximum is over all CPTP operations ${\displaystyle {\mathcal {E}}}$ and ${\displaystyle d_{A}}$ is the dimension of subsystem ${\displaystyle A}$. This is a measure of how correlated the state ${\displaystyle \rho _{AB}}$ is. It can be shown that ${\displaystyle q_{c}(A|B)=2^{-H_{\min }(A|B)}}$. If the information contained in ${\displaystyle A}$ is classical, this reduces to the expression above for the guessing probability.

### Proof of operational characterization of min-entropy

The proof is from a paper by König, Schaffner, Renner in 2008.[5] It involves the machinery of semidefinite programs.[6] Suppose we are given some bipartite density operator ${\displaystyle \rho _{AB}}$. From the definition of the min-entropy, we have

${\displaystyle H_{\min }(A|B)=-\inf _{\sigma _{B}}\inf _{\lambda }\{\lambda |\rho _{AB}\leq 2^{\lambda }(I_{A}\otimes \sigma _{B})\}~.}$

This can be re-written as

${\displaystyle -\log \inf _{\sigma _{B}}\operatorname {Tr} (\sigma _{B})}$

subject to the conditions

${\displaystyle \sigma _{B}\geq 0}$
${\displaystyle I_{A}\otimes \sigma _{B}\geq \rho _{AB}~.}$

We notice that the infimum is taken over compact sets and hence can be replaced by a minimum. This can then be expressed succinctly as a semidefinite program. Consider the primal problem

${\displaystyle {\text{min:}}\operatorname {Tr} (\sigma _{B})}$
${\displaystyle {\text{subject to: }}I_{A}\otimes \sigma _{B}\geq \rho _{AB}}$
${\displaystyle \sigma _{B}\geq 0~.}$
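When ${\displaystyle \rho _{AB}}$ is fully classical (diagonal in a fixed product basis), the constraint ${\displaystyle I_{A}\otimes \sigma _{B}\geq \rho _{AB}}$ decouples per basis element, and this primal problem has a closed-form solution: the optimal ${\displaystyle \sigma _{B}}$ takes the value ${\displaystyle \max _{x}P(x,y)}$ on each ${\displaystyle y}$. A sketch of that special case, assuming NumPy (function name mine):

```python
import numpy as np

def classical_conditional_min_entropy(P_xy):
    """Primal SDP for a fully classical (diagonal) state rho_XY:
    the constraint I_X ⊗ sigma_Y >= rho_XY reduces to
    sigma_Y(y) >= P(x, y) for every x, so the optimum is
    Tr sigma_Y = sum_y max_x P(x, y) = 2^{-H_min(X|Y)}."""
    P_xy = np.asarray(P_xy, dtype=float)
    p_guess = float(np.sum(np.max(P_xy, axis=0)))  # optimal guessing probability
    return -np.log2(p_guess)

# Joint distribution P(x, y); rows index x, columns index y.
P = np.array([[0.4, 0.1],
              [0.1, 0.4]])
# sum_y max_x P(x, y) = 0.4 + 0.4 = 0.8, so H_min(X|Y) = -log2(0.8).
```

This reproduces the classical guessing-probability interpretation: for each observed ${\displaystyle y}$, the best strategy is to guess the most likely ${\displaystyle x}$.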

This primal problem can also be fully specified by the matrices ${\displaystyle (\rho _{AB},I_{B},\operatorname {Tr} ^{*})}$ where ${\displaystyle \operatorname {Tr} ^{*}}$ is the adjoint of the partial trace over ${\displaystyle A}$. The action of ${\displaystyle \operatorname {Tr} ^{*}}$ on operators on ${\displaystyle B}$ can be written as

${\displaystyle \operatorname {Tr} ^{*}(X)=I_{A}\otimes X~.}$

We can express the dual problem as a maximization over operators ${\displaystyle E_{AB}}$ on the space ${\displaystyle AB}$ as

${\displaystyle {\text{max:}}\operatorname {Tr} (\rho _{AB}E_{AB})}$
${\displaystyle {\text{subject to: }}\operatorname {Tr} _{A}(E_{AB})=I_{B}}$
${\displaystyle E_{AB}\geq 0~.}$

Using the Choi–Jamiołkowski isomorphism, we can define the channel ${\displaystyle {\mathcal {E}}}$ such that

${\displaystyle d_{A}I_{A}\otimes {\mathcal {E}}^{\dagger }(|\phi ^{+}\rangle \langle \phi ^{+}|)=E_{AB}}$

where the Bell state is defined over the space ${\displaystyle AA'}$. This means that we can express the objective function of the dual problem as

${\displaystyle \langle \rho _{AB},E_{AB}\rangle =d_{A}\langle \rho _{AB},I_{A}\otimes {\mathcal {E}}^{\dagger }(|\phi ^{+}\rangle \langle \phi ^{+}|)\rangle }$
${\displaystyle =d_{A}\langle I_{A}\otimes {\mathcal {E}}(\rho _{AB}),|\phi ^{+}\rangle \langle \phi ^{+}|\rangle }$

as desired.

Notice that in the event that the system ${\displaystyle A}$ is partly classical, as in the state above, the quantity that we are after reduces to

${\displaystyle \sum _{x}P_{X}(x)\langle x|{\mathcal {E}}(\rho _{B}^{x})|x\rangle ~.}$

We can interpret ${\displaystyle {\mathcal {E}}}$ as a guessing strategy and this then reduces to the interpretation given above where an adversary wants to find the string ${\displaystyle x}$ given access to quantum information via system ${\displaystyle B}$.