= Trace distance =

In quantum mechanics, and especially quantum information and the study of open quantum systems, the trace distance is a metric on the space of density matrices and gives a measure of the distinguishability between two states. It is the quantum generalization of the Kolmogorov distance for classical probability distributions.

== Definition ==
The trace distance is defined as half of the trace norm of the difference of the matrices:
$T(\rho,\sigma) := \frac{1}{2}\|\rho - \sigma\|_{1} = \frac{1}{2} \mathrm{Tr} \left[ \sqrt{(\rho-\sigma)^\dagger (\rho-\sigma)} \right],$
where $\|A\|_1\equiv \operatorname{Tr}[\sqrt{A^\dagger A}]$ is the trace norm of $A$, and $\sqrt A$ is the unique positive semidefinite $B$ such that $B^2=A$ (which is always defined for positive semidefinite $A$). Equivalently, $\sqrt A$ is the matrix obtained from $A$ by replacing each of its eigenvalues with its non-negative square root while keeping the eigenvectors fixed. For the trace distance, the relevant expression is $|C|\equiv \sqrt{C^\dagger C}=\sqrt{C^2}$, where $C=\rho-\sigma$ is Hermitian. The trace of $|C|$ equals the sum of the singular values of $C$, which, because $C$ is Hermitian, equals the sum of the absolute values of its eigenvalues. More explicitly,
$T(\rho,\sigma) = \frac12 \operatorname{Tr}|\rho-\sigma| = \frac12\sum_{i=1}^{r}|\lambda_i|,$
where $\lambda_i\in\mathbb R$ is the $i$-th nonzero eigenvalue of $\rho-\sigma$, and $r$ is the rank of $\rho-\sigma$.

The factor of $\frac12$ ensures that the trace distance between normalized density matrices takes values in the range $[0,1]$.
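The eigenvalue expression above translates directly into a few lines of numerical code. The following is a minimal sketch using NumPy; the function name `trace_distance` and the example states are illustrative choices, not part of any standard library.

```python
import numpy as np

def trace_distance(rho, sigma):
    """Half the sum of the absolute eigenvalues of the Hermitian matrix rho - sigma."""
    eigenvalues = np.linalg.eigvalsh(rho - sigma)  # real, since rho - sigma is Hermitian
    return 0.5 * np.sum(np.abs(eigenvalues))

# Illustrative single-qubit states: |0><0| and the maximally mixed state I/2.
rho = np.array([[1.0, 0.0], [0.0, 0.0]])
sigma = np.eye(2) / 2
distance = trace_distance(rho, sigma)  # rho - sigma = diag(1/2, -1/2), so T = 1/2

# Orthogonal pure states |0><0| and |1><1| reach the maximal value 1.
orthogonal_distance = trace_distance(rho, np.array([[0.0, 0.0], [0.0, 1.0]]))
```

Using `eigvalsh` rather than a general eigenvalue solver relies on the difference of two density matrices being Hermitian.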

== Connection with the total variation distance ==
The trace distance serves as a direct quantum generalization of the total variation distance between probability distributions. Given two probability distributions $P$ and $Q$, their total variation distance is defined as
$\delta(P,Q) = \frac12\|P-Q\|_1 = \frac12 \sum_k |P_k-Q_k|.$
When extending this concept to quantum states, one must account for the fact that different measurements performed on the same pair of states can produce different pairs of outcome distributions. A natural approach is to consider the (classical) total variation distance between the measurement outcomes produced by two states for a fixed choice of measurement, and then maximize over all possible measurements. This procedure leads precisely to the trace distance between the quantum states. More explicitly, this is the quantity
$\max_\Pi \frac12\sum_i |\operatorname{Tr}(\Pi_i \rho) - \operatorname{Tr}(\Pi_i\sigma)|,$
with the maximization performed with respect to all possible POVMs $\{\Pi_i\}_i$.

To understand why this maximum equals the trace distance between the states, note that there is a unique decomposition $\rho-\sigma=P-Q$ with $P,Q \ge 0$ positive semidefinite matrices with orthogonal support (here $P$ and $Q$ denote operators, not the probability distributions above). With these operators we can write concisely $|\rho-\sigma|=P+Q$. Furthermore, $\operatorname{Tr}(\Pi_i P),\operatorname{Tr}(\Pi_i Q)\ge0$ for every POVM element $\Pi_i$, and thus $|\operatorname{Tr}(\Pi_iP)-\operatorname{Tr}(\Pi_i Q)| \le \operatorname{Tr}(\Pi_iP)+\operatorname{Tr}(\Pi_i Q)$. Using $\sum_i \Pi_i = \mathbb{I}$, we thus have
$\sum_i |\operatorname{Tr}(\Pi_i (\rho-\sigma))| =\sum_i |\operatorname{Tr}(\Pi_i (P-Q))| \le \sum_i \operatorname{Tr}(\Pi_i(P+Q)) = \operatorname{Tr}(P+Q) = \operatorname{Tr}|\rho-\sigma|.$
Multiplying by $\frac12$, this shows that
$\max_\Pi \delta(P_{\Pi,\rho},P_{\Pi,\sigma}) \le T(\rho,\sigma),$
where $P_{\Pi,\rho}$ denotes the classical probability distribution resulting from measuring $\rho$ with the POVM $\Pi$, $(P_{\Pi,\rho})_i \equiv \operatorname{Tr}(\Pi_i \rho)$, and the maximum is taken over all POVMs $\Pi\equiv\{\Pi_i\}_i$.

To conclude that the inequality is saturated by some POVM, we need only consider the projective measurement whose elements $\Pi_i$ are the projectors onto an orthonormal eigenbasis of $\rho-\sigma$, so that $\operatorname{Tr}(\Pi_i(\rho-\sigma))=\lambda_i$. With this choice,
$\delta(P_{\Pi,\rho},P_{\Pi,\sigma}) = \frac12\sum_i |\operatorname{Tr}(\Pi_i(\rho-\sigma))| = \frac12 \sum_i |\lambda_i| = T(\rho,\sigma),$
where $\lambda_i$ are the eigenvalues of $\rho-\sigma$.
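The saturation argument can be checked numerically. The sketch below, assuming NumPy and two arbitrarily chosen single-qubit states, compares the total variation distance obtained from a measurement in the eigenbasis of $\rho-\sigma$ with the one obtained from a measurement in the computational basis. The helper names `total_variation` and `outcome_distribution` are hypothetical.

```python
import numpy as np

def total_variation(p, q):
    """Classical total variation distance between two probability vectors."""
    return 0.5 * np.sum(np.abs(p - q))

def outcome_distribution(projectors, state):
    """Outcome probabilities Tr(Pi_i state) for a list of measurement operators."""
    return np.array([np.trace(pi @ state).real for pi in projectors])

# Illustrative states (both have unit trace and non-negative eigenvalues).
rho = np.array([[0.75, 0.25], [0.25, 0.25]])
sigma = np.eye(2) / 2

eigenvalues, eigenvectors = np.linalg.eigh(rho - sigma)
trace_dist = 0.5 * np.sum(np.abs(eigenvalues))

# Projective measurement onto the eigenvectors of rho - sigma (columns of eigenvectors).
eigen_projectors = [np.outer(v, v.conj()) for v in eigenvectors.T]
tv_eigenbasis = total_variation(outcome_distribution(eigen_projectors, rho),
                                outcome_distribution(eigen_projectors, sigma))

# Measurement in the computational basis, which is not optimal for this pair.
basis_projectors = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]
tv_computational = total_variation(outcome_distribution(basis_projectors, rho),
                                   outcome_distribution(basis_projectors, sigma))
```

For these states the eigenbasis measurement reproduces the trace distance, while the computational-basis measurement gives a strictly smaller total variation distance, in line with the upper bound derived above.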

== Physical interpretation ==
By using the Hölder duality for Schatten norms, the trace distance can be written in variational form as
$T(\rho,\sigma) = \frac{1}{2}\sup_{-\mathbb{I}\leq U \leq \mathbb{I}} \mathrm{Tr}[U(\rho-\sigma)]
=\sup_{0\leq P \leq \mathbb{I}} \mathrm{Tr}[P(\rho-\sigma)].$

As with its classical counterpart, the trace distance can be related to the maximum probability of distinguishing between two quantum states.

For example, suppose Alice prepares a system in either the state $\rho$ or $\sigma$, each with probability $\frac 12$, and sends it to Bob, who has to discriminate between the two states using a binary measurement.
Let Bob assign the measurement outcome $0$ with POVM element $P_0$ to the state $\rho$, and the outcome $1$ with POVM element $P_1=\mathbb{I}-P_0$ to the state $\sigma$. His expected probability of correctly identifying the incoming state is then given by
$p_{\text{guess}} = \frac 12 p(0|\rho) + \frac 12 p(1|\sigma) =
\frac 12 \mathrm{Tr}(P_0\rho)+ \frac 12 \mathrm{Tr}(P_1\sigma)=\frac 12 \left(1+ \mathrm{Tr}\left(P_0(\rho-\sigma)\right)\right).$

Therefore, when applying an optimal measurement, Bob has the maximal probability

$p^{\text{max}}_{\text{guess}} = \sup_{P_0} \frac 12 \left(1+ \mathrm{Tr}\left(P_0(\rho-\sigma)\right)\right)
=\frac 12 (1 + T(\rho,\sigma))$
of correctly identifying in which state Alice prepared the system.
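As a numerical illustration of this discrimination task, the sketch below assumes NumPy and takes $P_0$ to be the projector onto the eigenvectors of $\rho-\sigma$ with positive eigenvalue, which is one choice that attains the supremum. The function names and example states are hypothetical.

```python
import numpy as np

def guessing_probability(p0, rho, sigma):
    """Success probability (1 + Tr[P0 (rho - sigma)]) / 2 for equal priors."""
    return 0.5 * (1.0 + np.trace(p0 @ (rho - sigma)).real)

def optimal_povm_element(rho, sigma):
    """Projector onto the positive-eigenvalue eigenspace of rho - sigma."""
    eigenvalues, eigenvectors = np.linalg.eigh(rho - sigma)
    positive = eigenvectors[:, eigenvalues > 0]
    return positive @ positive.conj().T

# Illustrative single-qubit states.
rho = np.array([[0.75, 0.25], [0.25, 0.25]])
sigma = np.eye(2) / 2

trace_dist = 0.5 * np.sum(np.abs(np.linalg.eigvalsh(rho - sigma)))
p_optimal = guessing_probability(optimal_povm_element(rho, sigma), rho, sigma)

# A non-optimal choice for comparison: P0 = |0><0|.
p_naive = guessing_probability(np.diag([1.0, 0.0]), rho, sigma)
```

Because $\rho-\sigma$ is traceless, the sum of its positive eigenvalues equals half the sum of the absolute values of all its eigenvalues, which is why this projector yields $\frac12(1+T(\rho,\sigma))$.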

== Properties ==
The trace distance has the following properties:
- It is a metric on the space of density matrices, i.e. it is non-negative, symmetric, and satisfies the triangle inequality, and $T(\rho,\sigma) = 0 \Leftrightarrow \rho=\sigma$
- $0 \leq T(\rho,\sigma) \leq 1$ and $T(\rho,\sigma)=1$ if and only if $\rho$ and $\sigma$ have orthogonal supports
- It is preserved under unitary transformations: $T(U\rho U^\dagger,U\sigma U^\dagger) = T(\rho,\sigma)$
- It is contractive under trace-preserving CP maps, i.e. if $\Phi$ is a CPTP map, then $T(\Phi(\rho),\Phi(\sigma))\leq T(\rho,\sigma)$
- It is convex in each of its inputs. E.g. $T(\sum_i p_i \rho_i,\sigma) \leq \sum_i p_i T(\rho_i,\sigma)$
- On pure states, it can be expressed uniquely in terms of the inner product of the states: $T(|\psi\rangle\langle\psi|,|\phi\rangle\langle\phi|) = \sqrt{1-|\langle\psi | \phi\rangle|^2}$

For qubits, the trace distance is equal to half the Euclidean distance in the Bloch representation.
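Several of these properties can be spot-checked on a concrete pair of states. The following sketch, assuming NumPy and the illustrative qubit states $|0\rangle$ and $|+\rangle$, compares the trace distance with the pure-state formula and with half the Euclidean distance between Bloch vectors, and checks invariance under one unitary (the Hadamard gate). All helper names are hypothetical.

```python
import numpy as np

def trace_distance(rho, sigma):
    return 0.5 * np.sum(np.abs(np.linalg.eigvalsh(rho - sigma)))

def projector(psi):
    """Density matrix |psi><psi| of a (normalized) state vector."""
    psi = np.asarray(psi, dtype=complex)
    psi = psi / np.linalg.norm(psi)
    return np.outer(psi, psi.conj())

PAULIS = [
    np.array([[0, 1], [1, 0]], dtype=complex),     # X
    np.array([[0, -1j], [1j, 0]], dtype=complex),  # Y
    np.array([[1, 0], [0, -1]], dtype=complex),    # Z
]

def bloch_vector(rho):
    """Components Tr(rho X), Tr(rho Y), Tr(rho Z) of a qubit state."""
    return np.array([np.trace(rho @ s).real for s in PAULIS])

zero = np.array([1, 0], dtype=complex)
plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
rho, sigma = projector(zero), projector(plus)

distance = trace_distance(rho, sigma)
pure_state_formula = np.sqrt(1 - abs(np.vdot(zero, plus)) ** 2)
half_bloch_distance = 0.5 * np.linalg.norm(bloch_vector(rho) - bloch_vector(sigma))

# Unitary invariance, checked for the Hadamard gate.
hadamard = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
rotated_distance = trace_distance(hadamard @ rho @ hadamard.conj().T,
                                  hadamard @ sigma @ hadamard.conj().T)
```

A check on one pair of states is only a sanity test of the formulas, not a proof of the general properties.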

=== Relationship to other distance measures ===
==== Fidelity ====
The fidelity of two quantum states $F(\rho,\sigma)$ is related to the trace distance $T(\rho,\sigma)$ by the inequalities

$1-\sqrt{F(\rho,\sigma)} \le T(\rho,\sigma) \le\sqrt{1-F(\rho,\sigma)} \, .$

The upper bound becomes an equality when $\rho$ and $\sigma$ are pure states. Note that the fidelity used here, $F(\rho,\sigma) = \left(\operatorname{Tr}\sqrt{\sqrt\rho\,\sigma\sqrt\rho}\right)^2$, is the square of the definition used in Nielsen and Chuang.
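The two bounds can be evaluated numerically for a given pair of states. The sketch below assumes NumPy and the squared-fidelity convention $F(\rho,\sigma) = \left(\operatorname{Tr}\sqrt{\sqrt\rho\,\sigma\sqrt\rho}\right)^2$. The helpers `psd_sqrt` and `fidelity` are hypothetical names, and the matrix square root is computed through an eigendecomposition rather than a library routine.

```python
import numpy as np

def trace_distance(rho, sigma):
    return 0.5 * np.sum(np.abs(np.linalg.eigvalsh(rho - sigma)))

def psd_sqrt(a):
    """Square root of a positive semidefinite matrix via its eigendecomposition."""
    vals, vecs = np.linalg.eigh(a)
    vals = np.clip(vals, 0.0, None)  # remove tiny negative values from rounding
    return (vecs * np.sqrt(vals)) @ vecs.conj().T

def fidelity(rho, sigma):
    """Squared fidelity (Tr sqrt(sqrt(rho) sigma sqrt(rho)))^2."""
    root = psd_sqrt(rho)
    inner = root @ sigma @ root
    inner = 0.5 * (inner + inner.conj().T)  # enforce Hermiticity numerically
    vals = np.clip(np.linalg.eigvalsh(inner), 0.0, None)
    return np.sum(np.sqrt(vals)) ** 2

# Mixed-state example: both inequalities are strict here.
rho = np.array([[0.75, 0.25], [0.25, 0.25]])
sigma = np.eye(2) / 2
T, F = trace_distance(rho, sigma), fidelity(rho, sigma)
lower, upper = 1 - np.sqrt(F), np.sqrt(1 - F)

# Pure-state example (|0> and |+>): the upper bound is attained.
rho_pure = np.array([[1.0, 0.0], [0.0, 0.0]])
sigma_pure = np.array([[0.5, 0.5], [0.5, 0.5]])
T_pure, F_pure = trace_distance(rho_pure, sigma_pure), fidelity(rho_pure, sigma_pure)
upper_pure = np.sqrt(1 - F_pure)
```

Square roots of near-zero eigenvalues amplify rounding errors, so comparisons against these bounds should use a tolerance rather than exact equality.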

==== Total variation distance ====

The trace distance is a generalization of the total variation distance, and for two commuting density matrices, has the same value as the total variation distance of the two corresponding probability distributions.
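For commuting states the reduction to the classical case can be illustrated as follows. This sketch assumes NumPy; the two density matrices are built to share an eigenbasis, obtained here from the QR decomposition of a seeded random matrix, and the probability vectors `p` and `q` are arbitrary illustrative choices.

```python
import numpy as np

def trace_distance(rho, sigma):
    return 0.5 * np.sum(np.abs(np.linalg.eigvalsh(rho - sigma)))

def total_variation(p, q):
    return 0.5 * np.sum(np.abs(p - q))

# Eigenvalue distributions of the two states.
p = np.array([0.5, 0.3, 0.2])
q = np.array([0.2, 0.3, 0.5])

# A common orthonormal eigenbasis (columns of an orthogonal matrix).
rng = np.random.default_rng(0)
basis, _ = np.linalg.qr(rng.normal(size=(3, 3)))

rho = basis @ np.diag(p) @ basis.T
sigma = basis @ np.diag(q) @ basis.T

commutator_norm = np.linalg.norm(rho @ sigma - sigma @ rho)
quantum_distance = trace_distance(rho, sigma)
classical_distance = total_variation(p, q)  # 0.5 * (0.3 + 0.0 + 0.3) = 0.3
```

Since both matrices are diagonal in the same basis, $\rho-\sigma$ has eigenvalues $p_k-q_k$, and the two distances agree up to rounding.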
