= Ball divergence =

Ball Divergence (BD) is a nonparametric two-sample statistic that quantifies the discrepancy between two probability measures $\mu$ and $\nu$ on a metric space $(V,\rho)$. It is defined by integrating the squared difference of the measures over all closed balls in $V$. Let $\overline B(u,r) = \{w\in V\mid \rho(u,w)\le r\}$ be the closed ball of radius $r\ge0$ centered at $u\in V$. Equivalently, one may set $r = \rho(u,v)$ and write $\overline B(u,\rho(u,v))$ for the closed ball centered at $u$ whose radius is the distance from $u$ to $v$. The Ball Divergence is then defined by
$BD(\mu,\nu)=
\iint_{V\times V}
\bigl[\mu(\overline B(u,\rho(u,v))) - \nu(\overline B(u,\rho(u,v)))\bigr]^{2}
\; \bigl[\mu(du)\,\mu(dv) + \nu(du)\,\nu(dv)\bigr].$
This measure can be viewed as an integral of Cramér's distance over all possible pairs of points. By accumulating squared differences of $\mu$ and $\nu$ over balls of all scales, BD captures both global and local discrepancies between distributions, yielding a robust, scale-sensitive comparison. Because the integrand is a squared difference of measures, BD is always non-negative; moreover, $BD(\mu,\nu)=0$ if and only if $\mu=\nu$.

==Testing for equal distributions==

To construct a sample version of Ball Divergence, it is convenient to decompose it into two parts:
$A=\iint_{V \times V}[\mu-\nu]^2(\bar{B}(u, \rho(u, v))) \mu(d u) \mu(d v),$
and
$C=\iint_{V \times V}[\mu-\nu]^2(\bar{B}(u, \rho(u, v))) \nu(d u) \nu(d v) .$
Thus $BD(\mu, \nu)=A+C .$

Let $\delta(x, y, z)=I(z \in \bar{B}(x, \rho(x, y)))$ indicate whether the point $z$ lies in the ball $\bar{B}(x, \rho(x, y))$. Given two independent samples $\{ X_1,\ldots, X_n \}$ from $\mu$ and $\{ Y_1,\ldots, Y_m \}$ from $\nu$, define

$\begin{align}
A_{i j}^X &= \frac{1}{n} \sum_{u=1}^n \delta{\left(X_i, X_j, X_u\right)}, & A_{i j}^Y &= \frac{1}{m} \sum_{v=1}^m \delta{\left(X_i, X_j, Y_v\right)}, \\
C_{k l}^X &= \frac{1}{n} \sum_{u=1}^n \delta{\left(Y_k, Y_l, X_u\right)}, & C_{k l}^Y &= \frac{1}{m} \sum_{v=1}^m \delta{\left(Y_k, Y_l, Y_v\right)},
\end{align}$
where $A_{i j}^X$ is the proportion of the sample from $\mu$ that lies in the ball $\bar{B}\left(X_i, \rho\left(X_i, X_j\right)\right)$, and $A_{i j}^Y$ is the proportion of the sample from $\nu$ that lies in the same ball. Likewise, $C_{k l}^X$ and $C_{k l}^Y$ are the proportions of the samples from $\mu$ and $\nu$, respectively, that lie in the ball $\bar{B}\left(Y_k, \rho\left(Y_k, Y_l\right)\right)$. The sample versions of $A$ and $C$ are

$A_{n, m}=\frac{1}{n^2} \sum_{i, j=1}^n\left(A_{i j}^X-A_{i j}^Y\right)^2, \qquad C_{n, m}=\frac{1}{m^2} \sum_{k, l=1}^m\left(C_{k l}^X-C_{k l}^Y\right)^2.$

The sample Ball Divergence is then

$BD_{n, m} = A_{n, m}+C_{n, m}.$
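The construction of $BD_{n,m}$ can be sketched in a few lines of Python. This is a minimal, unoptimized illustration that follows the formulas for $A_{n,m}$ and $C_{n,m}$ literally; the function names (`ball_fraction`, `half_divergence`, `sample_ball_divergence`) and the example metric `abs_dist` are illustrative choices, not part of any published implementation.

```python
def abs_dist(a, b):
    """Example metric on the real line (an assumption for illustration)."""
    return abs(a - b)

def ball_fraction(center, radius, sample, dist):
    """Proportion of `sample` lying in the closed ball B(center, radius)."""
    return sum(dist(center, z) <= radius for z in sample) / len(sample)

def half_divergence(xs, ys, dist):
    """A_{n,m}: balls are centered at X_i with radius rho(X_i, X_j).

    Swapping the arguments gives C_{n,m}, because the difference is squared.
    """
    total = 0.0
    for xi in xs:
        for xj in xs:
            r = dist(xi, xj)
            diff = ball_fraction(xi, r, xs, dist) - ball_fraction(xi, r, ys, dist)
            total += diff * diff
    return total / len(xs) ** 2

def sample_ball_divergence(xs, ys, dist):
    """BD_{n,m} = A_{n,m} + C_{n,m}."""
    return half_divergence(xs, ys, dist) + half_divergence(ys, xs, dist)

# Two well-separated samples give a clearly positive value,
# identical samples give exactly zero.
bd_far = sample_ball_divergence([0.0, 1.0], [10.0, 11.0], abs_dist)
bd_same = sample_ball_divergence([0.0, 1.0], [0.0, 1.0], abs_dist)
```

The direct triple loop costs on the order of $(n^2+m^2)(n+m)$ distance evaluations, which is adequate for small samples; faster rank-based algorithms are outside the scope of this sketch.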

It can be proved that $BD_{n,m}$ is a consistent estimator of BD. Moreover, if $\tfrac{n}{n+m}\to\tau$ for some $\tau\in[0,1]$, then under the null hypothesis $\mu=\nu$ the suitably normalized statistic $BD_{n,m}$ converges in distribution to a mixture of chi-squared distributions, whereas under the alternative hypothesis the suitably centered and normalized statistic converges to a normal distribution.
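Because the null limit is a mixture of chi-squared distributions whose weights are not available in closed form here, a common practical substitute is a permutation test. The sketch below uses that swapped-in technique rather than the asymptotic distribution described above; the function names, the default of 199 permutations, and the seed are assumptions made for illustration.

```python
import random

def abs_dist(a, b):
    """Example metric on the real line (an assumption for illustration)."""
    return abs(a - b)

def ball_divergence(xs, ys, dist):
    """Sample Ball Divergence BD_{n,m} = A_{n,m} + C_{n,m} (brute force)."""
    def half(a, b):
        total = 0.0
        for u in a:
            for v in a:
                r = dist(u, v)
                pa = sum(dist(u, z) <= r for z in a) / len(a)
                pb = sum(dist(u, z) <= r for z in b) / len(b)
                total += (pa - pb) ** 2
        return total / len(a) ** 2
    return half(xs, ys) + half(ys, xs)

def permutation_p_value(xs, ys, dist, n_perm=199, seed=0):
    """Permutation p-value: reshuffle the pooled sample, recompute BD_{n,m}."""
    rng = random.Random(seed)
    observed = ball_divergence(xs, ys, dist)
    pooled = list(xs) + list(ys)
    n = len(xs)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if ball_divergence(pooled[:n], pooled[n:], dist) >= observed:
            exceed += 1
    # The +1 terms count the observed split as one of the permutations.
    return (exceed + 1) / (n_perm + 1)

xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [100.0, 101.0, 102.0, 103.0, 104.0, 105.0]
p_value = permutation_p_value(xs, ys, abs_dist)
```

A small p-value indicates that the observed divergence is unusually large compared with divergences obtained after randomly relabeling the pooled observations.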
==Properties==

1. The square root of Ball Divergence is a symmetric divergence but not a metric, because it does not satisfy the triangle inequality.
2. It can be shown that Ball Divergence, the energy distance, and the maximum mean discrepancy (MMD) are unified within the variogram framework; for details see Remark 2.4 in.

==Homogeneity Test==
Ball Divergence admits a straightforward extension to the $K$-sample setting. Suppose $\mu_1, \dots, \mu_K$ are $K\ge2$ probability measures on a Banach space $(V,\|\cdot\|)$, equipped with the induced metric $\rho(u,v)=\|u-v\|$. Define the $K$-sample BD by

$D(\mu_1,\dots,\mu_K)
=\sum_{1\le k<l\le K}
\iint_{V\times V}
\bigl[\mu_k\bigl(\overline B(u,\rho(u,v))\bigr)
-\mu_l\bigl(\overline B(u,\rho(u,v))\bigr)\bigr]^{2}
\;\bigl[\mu_k(du)\,\mu_k(dv)+\mu_l(du)\,\mu_l(dv)\bigr].$

It then follows from Theorems 1 and 2 that $D(\mu_1,\dots,\mu_K)=0$ if and only if $\mu_1=\mu_2=\cdots=\mu_K.$
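A natural plug-in estimate of $D(\mu_1,\dots,\mu_K)$ replaces each pairwise term by the two-sample statistic $BD_{n,m}$ and sums over all pairs $k<l$. The sketch below assumes that plug-in construction; the names `ball_divergence`, `k_sample_ball_divergence`, and `abs_dist` are hypothetical.

```python
from itertools import combinations

def abs_dist(a, b):
    """Example metric on the real line (an assumption for illustration)."""
    return abs(a - b)

def ball_divergence(xs, ys, dist):
    """Two-sample statistic BD_{n,m} = A_{n,m} + C_{n,m} (brute force)."""
    def half(a, b):
        total = 0.0
        for u in a:
            for v in a:
                r = dist(u, v)
                pa = sum(dist(u, z) <= r for z in a) / len(a)
                pb = sum(dist(u, z) <= r for z in b) / len(b)
                total += (pa - pb) ** 2
        return total / len(a) ** 2
    return half(xs, ys) + half(ys, xs)

def k_sample_ball_divergence(samples, dist):
    """Sum of pairwise sample Ball Divergences over all pairs k < l."""
    return sum(
        ball_divergence(a, b, dist) for a, b in combinations(samples, 2)
    )

# Two identical groups and one shifted group: only the two pairs that
# involve the shifted group contribute.
groups = [[0.0, 1.0], [0.0, 1.0], [10.0, 11.0]]
d_hat = k_sample_ball_divergence(groups, abs_dist)
```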

By employing closed balls to define a metric distribution function, one obtains an alternative homogeneity measure.

Given a probability measure $\tilde\mu$ on a metric space $(V,\rho)$, its metric distribution function is defined by

$F^{M}_{\tilde\mu}(u,v)
=\tilde\mu\bigl(\overline B(u,\rho(u,v))\bigr)=\mathbb E\bigl[\delta(u,v,X)\bigr],
\quad u,v\in V,$

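where $\overline B(u,r) = \{w\in V:\rho(u,w)\le r\}$ is the closed ball of radius $r\ge0$ centered at $u$. When $V=V_1\times\cdots\times V_K$ is a product space with component metrics $\rho_k$ and component balls $\overline B_k$, the indicator takes the form $\delta(u,v,X)
=\prod_{k=1}^K\mathbf 1\{X^{(k)}\in \overline B_k(u_k,\rho_k(u_k,v_k))\}.$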

If $X_1,\dots,X_N$ are i.i.d. draws from $\tilde\mu$, the empirical version is

$F^{M}_{\tilde\mu,N}(u,v)
=\frac1N\sum_{i=1}^N\delta(u,v,X_i).$
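The empirical metric distribution function is simply the fraction of sample points that fall in the ball centered at $u$ with radius $\rho(u,v)$. The following sketch shows both the single-space case and the product-space indicator; the function names and the example metric `abs_dist` are assumptions made for illustration.

```python
def abs_dist(a, b):
    """Example metric on the real line (an assumption for illustration)."""
    return abs(a - b)

def empirical_mdf(u, v, sample, dist):
    """Fraction of `sample` inside the closed ball B(u, dist(u, v))."""
    radius = dist(u, v)
    return sum(dist(u, x) <= radius for x in sample) / len(sample)

def empirical_mdf_product(u, v, sample, dists):
    """Product-space version: every component must fall in its own ball.

    `u`, `v`, and each sample point are tuples; `dists` holds one metric
    per component.
    """
    radii = [d(a, b) for d, a, b in zip(dists, u, v)]
    hits = sum(
        all(d(a, xk) <= r for d, a, xk, r in zip(dists, u, x, radii))
        for x in sample
    )
    return hits / len(sample)

# Three of the four points lie within distance 2 of the center 0.
f_single = empirical_mdf(0.0, 2.0, [0.0, 1.0, 2.0, 3.0], abs_dist)

# Only (0, 0) and (1, 0) lie within distance 1 of (0, 0) in both components.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
f_product = empirical_mdf_product((0.0, 0.0), (1.0, 1.0), points, [abs_dist, abs_dist])
```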

The resulting homogeneity measure based on the metric distribution function (MDF), also called the metric Cramér-von Mises (MCVM) statistic, is
$\mathrm{MCVM}\bigl(\mu_{k}\parallel\mu\bigr)
=\int_{V\times V}
p_{k}^{2}\,w(u,v)\,
\bigl[F^{M}_{\mu_{k}}(u,v)
- F^{M}_{\mu}(u,v)\bigr]^{2}
\,d\mu_{k}(u)\,d\mu_{k}(v),$

where $\mu=\sum_{k=1}^{K} p_{k}\,\mu_{k}$ is the mixture of $\mu_1,\dots,\mu_K$ with weights $p_{1},\dots,p_{K}$, and $w(u,v)=\exp\left(-\tfrac{\rho(u,v)^{2}}{2\sigma^{2}}\right)$ is a Gaussian weight function with bandwidth $\sigma$.
The overall MCVM is then

$\mathrm{MCVM}(\mu_{1},\dots,\mu_{K})
=\sum_{k=1}^{K}p_{k}^{2}\,\mathrm{MCVM}\bigl(\mu_{k}\parallel\mu\bigr).$

The empirical MCVM is given by

$\widehat{\mathrm{MCVM}}\bigl(\mu_{k}\parallel\mu\bigr)
=\frac{1}{n_{k}^{2}}
\sum_{X^{(k)}_{i},X^{(k)}_{j}\in\mathcal X_{k}}
w\bigl(X^{(k)}_{i},X^{(k)}_{j}\bigr)\,
\left[
F^{M}_{\mu_{k},n_{k}}\bigl(X^{(k)}_{i},X^{(k)}_{j}\bigr)
- F^{M}_{\mu,n}\bigl(X^{(k)}_{i},X^{(k)}_{j}\bigr)
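\right]^2,$

where $\mathcal X_{k}=\{X^{(k)}_{1},\dots,X^{(k)}_{n_{k}}\}$ is an i.i.d. sample from $\mu_{k}$, $F^{M}_{\mu,n}$ is the empirical metric distribution function of the pooled sample of size $n=\sum_{\ell=1}^{K}n_{\ell}$, and the weights $p_k$ are estimated by $\hat p_{k}=\frac{n_{k}}{\sum_{\ell=1}^{K}n_{\ell}}.$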
A practical choice for $\sigma^{2}$ is the median of the squared pairwise distances in the pooled sample,
$\left\{\rho(X,X')^{2}:X,X'\in\bigcup_{k=1}^{K}\mathcal X_{k}\right\}.$
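The empirical MCVM and the median heuristic can be combined as in the following sketch. It reads the displayed formulas literally: each $\widehat{\mathrm{MCVM}}(\mu_k\parallel\mu)$ is computed from the group sample and the pooled sample, and the groups are combined with the plug-in weights $\hat p_k^{2}$. Taking the median over distinct pairs of pooled points is an assumption, as are all function names.

```python
import math
from itertools import combinations
from statistics import median

def abs_dist(a, b):
    """Example metric on the real line (an assumption for illustration)."""
    return abs(a - b)

def empirical_mdf(u, v, sample, dist):
    """Fraction of `sample` inside the closed ball B(u, dist(u, v))."""
    radius = dist(u, v)
    return sum(dist(u, x) <= radius for x in sample) / len(sample)

def median_sigma2(pooled, dist):
    """Median of squared distances over distinct pairs (assumed convention)."""
    return median(dist(a, b) ** 2 for a, b in combinations(pooled, 2))

def mcvm_group(group, pooled, sigma2, dist):
    """Empirical MCVM of one group against the pooled sample."""
    total = 0.0
    for u in group:
        for v in group:
            weight = math.exp(-dist(u, v) ** 2 / (2.0 * sigma2))
            diff = empirical_mdf(u, v, group, dist) - empirical_mdf(u, v, pooled, dist)
            total += weight * diff * diff
    return total / len(group) ** 2

def mcvm(groups, dist):
    """Overall statistic: sum of p_hat_k^2 times the group-wise MCVM."""
    pooled = [x for g in groups for x in g]
    sigma2 = median_sigma2(pooled, dist)  # must be positive for the weight
    n = len(pooled)
    return sum(
        (len(g) / n) ** 2 * mcvm_group(g, pooled, sigma2, dist) for g in groups
    )

stat_same = mcvm([[0.0, 1.0, 2.0], [0.0, 1.0, 2.0]], abs_dist)
stat_far = mcvm([[0.0, 1.0], [10.0, 11.0]], abs_dist)
```

When all groups contain the same points, each group MDF coincides with the pooled MDF and the statistic is exactly zero; separated groups give a positive value. The sketch does not guard against a zero median, which can occur when most pooled points coincide.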
