# Bhattacharyya distance

In statistics, the Bhattacharyya distance measures the similarity of two discrete or continuous probability distributions. It is closely related to the Bhattacharyya coefficient, which measures the amount of overlap between two statistical samples or populations. Both measures are named after Anil Kumar Bhattacharyya, a statistician who worked in the 1930s at the Indian Statistical Institute.[1] The coefficient can be used to determine the relative closeness of the two samples being considered. The distance is used to measure the separability of classes in classification, and it is considered more reliable than the Mahalanobis distance, because the Mahalanobis distance is a particular case of the Bhattacharyya distance in which the standard deviations of the two classes are the same. Consequently, when two classes have similar means but different standard deviations, the Mahalanobis distance tends to zero, whereas the Bhattacharyya distance grows with the difference between the standard deviations.

## Definition

For discrete probability distributions p and q over the same domain X, the Bhattacharyya distance is defined as:

$D_B(p,q) = -\ln \left( BC(p,q) \right)$

where:

$BC(p,q) = \sum_{x\in X} \sqrt{p(x) q(x)}$

is the Bhattacharyya coefficient.

For continuous probability distributions, the Bhattacharyya coefficient is defined as:

$BC(p,q) = \int \sqrt{p(x) q(x)}\, dx$

In either case, $0 \le BC \le 1$ and $0 \le D_B \le \infty$. $D_B$ does not obey the triangle inequality, but the Hellinger distance $\sqrt{1-BC}$ does obey the triangle inequality.
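The discrete definitions above can be sketched in a few lines of Python. This is a minimal illustration, assuming each distribution is given as a list of probabilities over the same ordered domain; the function names are chosen here for illustration and do not come from any particular library.

```python
import math

def bhattacharyya_coefficient(p, q):
    """BC(p, q) = sum over x of sqrt(p(x) * q(x))."""
    if len(p) != len(q):
        raise ValueError("p and q must be defined over the same domain")
    return sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))

def bhattacharyya_distance(p, q):
    """D_B(p, q) = -ln(BC(p, q)); infinite when the supports are disjoint."""
    bc = bhattacharyya_coefficient(p, q)
    return math.inf if bc == 0.0 else -math.log(bc)

def hellinger_distance(p, q):
    """sqrt(1 - BC(p, q)); max() guards against tiny negative rounding errors."""
    return math.sqrt(max(0.0, 1.0 - bhattacharyya_coefficient(p, q)))

# Two distributions that overlap only on the middle outcome.
p = [0.5, 0.5, 0.0]
q = [0.0, 0.5, 0.5]
bc = bhattacharyya_coefficient(p, q)   # sqrt(0.5 * 0.5) = 0.5
db = bhattacharyya_distance(p, q)      # -ln(0.5) = ln(2)
h = hellinger_distance(p, q)           # sqrt(0.5)
```

Distributions with no common support give $BC = 0$ and therefore $D_B = \infty$, which the sketch returns as `math.inf` rather than raising an error.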

In its simplest formulation, the Bhattacharyya distance between two classes under the normal distribution can be calculated[2] from the means and variances of the two distributions or classes:

$D_B(p,q) = \frac{1}{4} \ln \left ( \frac 1 4 \left( \frac{\sigma_p^2}{\sigma_q^2}+\frac{\sigma_q^2}{\sigma_p^2}+2\right ) \right ) +\frac{1}{4} \left ( \frac{(\mu_p-\mu_q)^{2}}{\sigma_p^2+\sigma_q^2}\right )$

where:

$D_{B}(p,q)$ is the Bhattacharyya distance between the distributions or classes p and q, $\sigma_p^2$ is the variance of the p-th distribution, $\mu_p$ is the mean of the p-th distribution, and $p,q$ are two different distributions.

The Mahalanobis distance used in Fisher's linear discriminant analysis is a particular case of the Bhattacharyya distance. When the variances of the two distributions are the same, the first term of the distance is zero, as this term depends solely on the variances of the distributions (left case of the figure). The first term grows as the variances differ (right case of the figure). The second term, on the other hand, is zero if the means are equal; otherwise it is proportional to the squared difference of the means and inversely proportional to the sum of the variances.
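The behaviour of the two terms can be checked numerically. The following sketch, which assumes the univariate normal formula given above and uses a hypothetical helper name, returns the variance term and the mean term separately so that each can be inspected on its own.

```python
import math

def bhattacharyya_normal_terms(mu_p, var_p, mu_q, var_q):
    """Return (variance_term, mean_term) of D_B for two univariate normals."""
    if var_p <= 0 or var_q <= 0:
        raise ValueError("variances must be positive")
    variance_term = 0.25 * math.log(0.25 * (var_p / var_q + var_q / var_p + 2.0))
    mean_term = 0.25 * (mu_p - mu_q) ** 2 / (var_p + var_q)
    return variance_term, mean_term

# Equal variances, different means: only the mean term contributes.
same_var = bhattacharyya_normal_terms(0.0, 1.0, 2.0, 1.0)   # (0.0, 0.5)

# Equal means, different variances: only the variance term contributes.
same_mean = bhattacharyya_normal_terms(0.0, 1.0, 0.0, 4.0)

# The distance itself is the sum of the two terms.
d_b = sum(bhattacharyya_normal_terms(0.0, 1.0, 2.0, 4.0))
```

In the second case the means coincide, yet the distance is still positive, which is the situation described in the introduction where a purely mean-based measure would report no separation.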

For multivariate normal distributions $p_i=\mathcal{N}(\boldsymbol\mu_i,\,\boldsymbol\Sigma_i)$,

$D_B={1\over 8}(\boldsymbol\mu_1-\boldsymbol\mu_2)^T \boldsymbol\Sigma^{-1}(\boldsymbol\mu_1-\boldsymbol\mu_2)+{1\over 2}\ln \,\left({\det \boldsymbol\Sigma \over \sqrt{\det \boldsymbol\Sigma_1 \, \det \boldsymbol\Sigma_2} }\right),$

where $\boldsymbol\mu_i$ and $\boldsymbol\Sigma_i$ are the means and covariances of the distributions, and

$\boldsymbol\Sigma={\boldsymbol\Sigma_1+\boldsymbol\Sigma_2 \over 2}.$

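Note that, in this case, the first term in the Bhattacharyya distance is related to the Mahalanobis distance: when $\boldsymbol\Sigma_1=\boldsymbol\Sigma_2=\boldsymbol\Sigma$, the second term vanishes and the first term equals one eighth of the squared Mahalanobis distance between the means.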
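The multivariate formula can be evaluated along the following lines. This is a sketch that assumes NumPy is available and that both covariance matrices are symmetric positive definite; the function name is hypothetical. It uses `numpy.linalg.solve` instead of forming an explicit inverse and `numpy.linalg.slogdet` to obtain log-determinants, both of which tend to be numerically safer than direct inversion and `det`.

```python
import numpy as np

def bhattacharyya_mvn(mu1, cov1, mu2, cov2):
    """D_B between N(mu1, cov1) and N(mu2, cov2)."""
    mu1 = np.atleast_1d(np.asarray(mu1, dtype=float))
    mu2 = np.atleast_1d(np.asarray(mu2, dtype=float))
    cov1 = np.atleast_2d(np.asarray(cov1, dtype=float))
    cov2 = np.atleast_2d(np.asarray(cov2, dtype=float))

    cov = 0.5 * (cov1 + cov2)          # Sigma = (Sigma_1 + Sigma_2) / 2
    diff = mu1 - mu2

    # First term: (1/8) * diff^T Sigma^{-1} diff, via a linear solve.
    mean_term = 0.125 * (diff @ np.linalg.solve(cov, diff))

    # Second term: (1/2) * ln(det Sigma / sqrt(det Sigma_1 * det Sigma_2)),
    # computed from log-determinants.
    _, logdet = np.linalg.slogdet(cov)
    _, logdet1 = np.linalg.slogdet(cov1)
    _, logdet2 = np.linalg.slogdet(cov2)
    cov_term = 0.5 * (logdet - 0.5 * (logdet1 + logdet2))

    return float(mean_term + cov_term)

# Equal identity covariances: the second term vanishes and the result is
# one eighth of the squared Mahalanobis distance between the means (4 / 8).
d_equal = bhattacharyya_mvn([0.0, 0.0], np.eye(2), [2.0, 0.0], np.eye(2))

# One-dimensional inputs reduce to the univariate normal formula.
d_1d = bhattacharyya_mvn([0.0], [[1.0]], [2.0], [[4.0]])
```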

## Bhattacharyya coefficient

The Bhattacharyya coefficient is an approximate measurement of the amount of overlap between two statistical samples. The coefficient can be used to determine the relative closeness of the two samples being considered.

Calculating the Bhattacharyya coefficient involves a rudimentary form of integration of the overlap of the two samples. The interval of the values of the two samples is split into a chosen number of partitions, and the number of members of each sample in each partition is used in the following formula,

$BC(\mathbf{p},\mathbf{q}) = \sum_{i=1}^n \sqrt{p_i q_i},$[3]

where, considering the samples p and q, n is the number of partitions, and $p_i$, $q_i$ are the proportions of the members of samples p and q that fall in the i-th partition (the counts divided by the respective sample sizes, so that $0 \le BC \le 1$).

The value of this formula is therefore larger with each partition that has members from both samples, and larger with each partition that has a large overlap of the two samples' members within it. The choice of the number of partitions depends on the number of members in each sample; too few partitions will lose accuracy by overestimating the overlap region, and too many partitions will lose accuracy by creating individual partitions with no members despite being in a densely populated sample space.

The Bhattacharyya coefficient will be 0 if there is no overlap at all due to the multiplication by zero in every partition. This means the distance between fully separated samples will not be exposed by this coefficient alone.
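The partition-based estimate can be sketched as follows. This illustration assumes equal-width partitions spanning the combined range of both samples, and it uses the fraction of each sample in each partition rather than raw counts so that the result stays between 0 and 1; the function name and the binning choices are assumptions made here, not part of the cited method.

```python
import math

def histogram_bc(sample_p, sample_q, n_partitions):
    """Estimate BC from two samples using n equal-width partitions."""
    lo = min(min(sample_p), min(sample_q))
    hi = max(max(sample_p), max(sample_q))
    width = (hi - lo) / n_partitions
    if width == 0:
        width = 1.0  # all values identical: everything falls in partition 0

    counts_p = [0] * n_partitions
    counts_q = [0] * n_partitions
    for sample, counts in ((sample_p, counts_p), (sample_q, counts_q)):
        for x in sample:
            # Clamp so that the maximum value lands in the last partition.
            i = min(int((x - lo) / width), n_partitions - 1)
            counts[i] += 1

    n_p, n_q = len(sample_p), len(sample_q)
    return sum(math.sqrt((cp / n_p) * (cq / n_q))
               for cp, cq in zip(counts_p, counts_q))

sample_p = [0, 1, 2, 3]
sample_q = [6, 7, 8, 9]

bc_two = histogram_bc(sample_p, sample_q, 2)  # samples fall in different partitions
bc_one = histogram_bc(sample_p, sample_q, 1)  # a single partition merges them
```

With two partitions the samples never share a partition and the estimate is 0; with a single partition the same data yield an estimate of 1, which illustrates how too few partitions overestimate the overlap.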

## Applications

The Bhattacharyya distance is widely used in research on feature extraction and selection,[4] image processing,[5] speaker recognition,[6] and phone clustering.[7]

A "Bhattacharyya space" has been proposed as a feature selection technique that can be applied to texture segmentation.[8]

A "Bhattacharyya coefficient" has also been proposed as a feature selection technique that can be used to estimate the given distance between the Bhattacharyya number and any given Nesli coordinate.[8]