= Quasi-arithmetic mean =

In mathematics and statistics, the quasi-arithmetic mean, also called the generalised f-mean or Kolmogorov-Nagumo-de Finetti mean, is a generalisation of the more familiar means, such as the arithmetic mean and the geometric mean, obtained by using a function $f$. It is also called the Kolmogorov mean after the Soviet mathematician Andrey Kolmogorov. It is a broader generalisation than the regular generalized mean.

==Definition==

Let $f$ be a function that maps an interval $I$ of the real line onto a subset $J \equiv f(I)$ of the real numbers, and suppose that $f$ is both continuous and injective (one-to-one).
 (We require $f$ to be injective on $I$ in order for an inverse function $f^{-1}$ to exist. We require $I$ and $J$ to both be intervals in order to ensure that an average of any finite collection of values within $J$ will always correspond to a value in $I$.)
Subject to those requirements, the f-mean of $n$ numbers $x_1, \ldots, x_n \in I$ is defined to be
 $\ M_f(x_1, \dots, x_n)\; \equiv\; f^{-1}\!\left(\ \frac{1}{n}\Bigl(\ f(x_1) + \cdots + f(x_n)\ \Bigr)\ \right)\ ,$
or equivalently
 $\ M_f(\vec x)\; =\; f^{-1}\!\!\left(\ \frac{1}{n} \sum_{k=1}^{n}f(x_k)\ \right) ~.$

Because $f$ maps the interval $I$ onto another interval $J$, the average $\frac{1}{n} \left( f(x_1) + \cdots + f(x_n) \right)$ of values in $J$ must also lie within $J$. And because $J$ is the domain of $f^{-1}$, the function $f^{-1}$ in turn produces a value inside $I$, the same interval the values originally came from.

Because $f$ is injective and continuous, it necessarily follows that $f$ is a strictly monotonic function, and therefore that the f-mean is neither larger than the largest number of the tuple $X \equiv (x_1, \ldots, x_n)$ nor smaller than the smallest number contained in $X$; hence it lies between the extreme values of the original sample.
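The definition can be sketched directly in code. The following is a minimal illustration, not a reference implementation: the name `quasi_arithmetic_mean`, its argument order, and the sample values are assumptions made for this example, and the caller is trusted to pass a continuous injective `f` together with its true inverse `f_inv`.

```python
import math
from typing import Callable, Sequence


def quasi_arithmetic_mean(
    f: Callable[[float], float],
    f_inv: Callable[[float], float],
    xs: Sequence[float],
) -> float:
    """Return f_inv applied to the plain average of f(x) over xs.

    Hypothetical helper: f must be continuous and injective on an
    interval containing xs, and f_inv must be its inverse.
    """
    if len(xs) == 0:
        raise ValueError("at least one value is required")
    return f_inv(sum(f(x) for x in xs) / len(xs))


# Illustrative sample (an assumption, not taken from the text).
sample = [1.0, 4.0, 16.0]

# f = log on the strictly positive reals, with inverse exp.
m = quasi_arithmetic_mean(math.log, math.exp, sample)

# Strict monotonicity of f keeps the result between the extremes.
assert min(sample) <= m <= max(sample)
```

Passing the inverse explicitly, rather than inverting `f` numerically, keeps the sketch short and mirrors the formula term by term.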

== Examples ==

- If $I = \mathbb{R}$, the real line, and $f(x) = x$ (or indeed any linear function $x \mapsto a \cdot x + b$ with $a \ne 0$ and any $b$), then the f-mean corresponds to the arithmetic mean.

- If $I = \mathbb{R}^+$, the strictly positive real numbers, and $f(x) = \log(x)$, then the f-mean corresponds to the geometric mean. (The result is the same for any logarithm; it does not depend on the base of the logarithm, as long as that base is strictly positive but not 1.)

- If $I = \mathbb{R}^+$ and $f(x) = \frac{1}{x}$, then the f-mean corresponds to the harmonic mean.

- If $I = \mathbb{R}^+$ and $f(x) = x^p$ with $p \ne 0$, then the f-mean corresponds to the power mean with exponent $p$; e.g., for $p = 2$ one gets the root mean square.

- If $I = \mathbb{R}$ and $f(x) = \exp(x)$, then the f-mean is the mean in the log semiring, which is a constant-shifted version of the LogSumExp (LSE) function (which is the logarithmic sum), $M_f(x_1, \ldots, x_n) = \operatorname{LSE}(x_1, \ldots, x_n) - \log(n)$. (The $-\log(n)$ in the expression corresponds to dividing by $n$, since logarithmic division is linear subtraction.) The LogSumExp function is a smooth maximum: it is a smooth approximation to the maximum function.
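The choices of $f$ listed above can be compared side by side with a short script. This is a sketch only; the helper name `f_mean` and the sample values are assumptions made for the illustration.

```python
import math


def f_mean(f, f_inv, xs):
    """Hypothetical helper: f_inv of the average of f over xs."""
    return f_inv(sum(f(x) for x in xs) / len(xs))


# Illustrative sample of strictly positive values (an assumption).
xs = [1.0, 2.0, 4.0]

arithmetic = f_mean(lambda x: x, lambda y: y, xs)              # f(x) = x
geometric = f_mean(math.log, math.exp, xs)                     # f(x) = log(x)
harmonic = f_mean(lambda x: 1.0 / x, lambda y: 1.0 / y, xs)    # f(x) = 1/x
root_mean_square = f_mean(lambda x: x ** 2, math.sqrt, xs)     # f(x) = x^2
log_semiring = f_mean(math.exp, math.log, xs)                  # f(x) = exp(x)

# LogSumExp computed directly, to compare with the log-semiring mean.
lse = math.log(sum(math.exp(x) for x in xs))

print(arithmetic, geometric, harmonic, root_mean_square, log_semiring)
print(math.isclose(log_semiring, lse - math.log(len(xs))))
```

Only the pair `(f, f_inv)` changes between the five means; the averaging step is the same in every case.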

== Properties ==
The following properties hold for $M_f$ for any single function $f$:

Symmetry: The value of $M_f$ is unchanged if its arguments are permuted.

Idempotency: for all $x$, the repeated average $M_f(x, \dots, x) = x$.

Monotonicity: $M_f$ is monotonic in each of its arguments (since $f$ is monotonic).

Continuity: $M_f$ is continuous in each of its arguments (since $f$ is continuous).

Replacement: Subsets of elements can be averaged a priori, without altering the mean, given that the multiplicity of elements is maintained. With $m \equiv M_f\!\left(x_1, \ldots, x_k\right)$ it holds:
 $M_f\!\left(x_1, \dots, x_k, x_{k+1}, \ldots, x_n\right) = M_f\!\left(\underbrace{m, \ldots, m}_{k \text{ times}},\ x_{k+1}, \ldots, x_n\right).$

Partitioning: The computation of the mean can be split into computations of equal sized sub-blocks:
 $M_f\!\left(\ x_1,\ \dots,\ x_{n\cdot k}\ \right)\; =\;
  M_f\!\Bigl(\; M_f\left(\ x_1,\ \ldots\ ,\ x_{k}\ \right),\;
      M_f\!\left(\ x_{k+1},\ \ldots\ ,\ x_{2\cdot k}\ \right),\;
      \dots,\;
      M_f\!\left(\ x_{(n-1)\cdot k + 1},\ \ldots\ ,\ x_{n\cdot k}\ \right)\; \Bigr) ~.$
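The replacement and partitioning properties can be checked numerically. The sketch below uses the harmonic mean ($f(x) = 1/x$) as an arbitrary example; the helper names, the sample, and the block sizes are assumptions chosen for the illustration.

```python
import math


def f_mean(f, f_inv, xs):
    """Hypothetical helper: f_inv of the average of f over xs."""
    return f_inv(sum(f(x) for x in xs) / len(xs))


def harmonic(xs):
    # f(x) = 1/x is its own inverse on the positive reals.
    return f_mean(lambda x: 1.0 / x, lambda y: 1.0 / y, xs)


xs = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
full = harmonic(xs)

# Replacement: swap the first k values for k copies of their own mean.
k = 3
m = harmonic(xs[:k])
replaced = harmonic([m] * k + xs[k:])

# Partitioning: average equal-sized blocks first, then average the block means.
block = 2
block_means = [harmonic(xs[i:i + block]) for i in range(0, len(xs), block)]
partitioned = harmonic(block_means)

print(math.isclose(full, replaced), math.isclose(full, partitioned))
```

Both identities hold because, after applying $f$, every step is an ordinary arithmetic average, and arithmetic averages of equal-sized groups can be nested freely.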

Self-distributivity: For any quasi-arithmetic (q.a.) mean $M_\mathsf{qa}$ of two variables:
 $M_\mathsf{qa}\Bigl(x,\ M_\mathsf{qa}(y, z)\Bigr) = M_\mathsf{qa}\Bigl(M_\mathsf{qa}(x, y),\ M_\mathsf{qa}(x, z)\Bigr).$

Mediality: For any quasi-arithmetic mean $M_\mathsf{qa}$ of two variables:
 $M_\mathsf{qa}\Bigl(M_\mathsf{qa}(x, y),\ M_\mathsf{qa}(z, w)\Bigr) = M_\mathsf{qa}\Bigl(M_\mathsf{qa}(x, z),\ M_\mathsf{qa}(y, w)\Bigr).$

Balancing: For any quasi-arithmetic mean $M_\mathsf{qa}$ of two variables:
 $M_\mathsf{qa}\biggl(M_\mathsf{qa}\Bigl(x,\ M_\mathsf{qa}(x, y)\Bigr),\ M_\mathsf{qa}\Bigl(y,\ M_\mathsf{qa}(x, y)\Bigr)\biggr) = M_\mathsf{qa}(x, y).$
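The three two-variable identities above can be spot-checked numerically. This sketch takes the two-variable harmonic mean as the example mean; the function name `qa` and the test values are assumptions made for the illustration, and a numerical check at a few points is of course not a proof.

```python
import math


def qa(x, y):
    """Two-variable harmonic mean, i.e. the f-mean with f(t) = 1/t."""
    return 2.0 / (1.0 / x + 1.0 / y)


# Arbitrary strictly positive test values (an assumption).
x, y, z, w = 1.5, 2.0, 7.0, 0.25

self_distributive = math.isclose(qa(x, qa(y, z)), qa(qa(x, y), qa(x, z)))
medial = math.isclose(qa(qa(x, y), qa(z, w)), qa(qa(x, z), qa(y, w)))
balanced = math.isclose(qa(qa(x, qa(x, y)), qa(y, qa(x, y))), qa(x, y))

print(self_distributive, medial, balanced)
```

Writing $a = f(x)$, $b = f(y)$ and so on, each identity reduces to an equality between arithmetic averages; for instance, both sides of the balancing identity become $(a + b)/2$.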

Scale-invariance: The quasi-arithmetic mean is invariant with respect to offsets and non-trivial scaling of the function $f$: for any $p(t) \equiv a + b \cdot q(t)$, with $a$ and $b \ne 0$ constants, and $q$ a quasi-arithmetic function, $M_p(x)$ and $M_q(x)$ are always the same. In mathematical notation:
 Given $q$ quasi-arithmetic and constants $a$ and $b \ne 0$: $\ p(t) = a + b \cdot q(t)\ \ \forall t \quad \Rightarrow \quad M_p(x) = M_q(x)\ \ \forall x.$
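The invariance follows because an affine map commutes with averaging: the average of $a + b \cdot q(x_k)$ equals $a + b$ times the average of $q(x_k)$, and $p^{-1}(y) = q^{-1}\bigl((y - a)/b\bigr)$ undoes exactly that offset and scale. A small numerical check, with $q = \log$ and the constants below chosen arbitrarily for the illustration (the helper names are assumptions as well):

```python
import math


def f_mean(f, f_inv, xs):
    """Hypothetical helper: f_inv of the average of f over xs."""
    return f_inv(sum(f(x) for x in xs) / len(xs))


# Arbitrary constants with b != 0 (a negative b reverses monotonicity).
a, b = 3.0, -2.0

q, q_inv = math.log, math.exp


def p(t):
    return a + b * q(t)


def p_inv(y):
    # Inverse of p: undo the offset and the scaling, then apply q_inv.
    return q_inv((y - a) / b)


xs = [0.5, 2.0, 9.0]
m_q = f_mean(q, q_inv, xs)
m_p = f_mean(p, p_inv, xs)

print(math.isclose(m_q, m_p))
```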

Central limit theorem: Under certain regularity conditions, and for a sufficiently large sample,
 $z \equiv \sqrt{n}\ \biggl[ M_f(X_1, \ldots, X_n) - \operatorname{E}_X\!\Bigl( M_f(X_1, \ldots, X_n) \Bigr) \biggr]$
is approximately normally distributed. A similar result is available for Bajraktarević means and deviation means, which are generalizations of quasi-arithmetic means.

== Characterization ==
There are several different sets of properties that characterize the quasi-arithmetic mean (i.e., each function that satisfies these properties is an f-mean for some function f).

- Mediality is essentially sufficient to characterize quasi-arithmetic means.
- Self-distributivity is essentially sufficient to characterize quasi-arithmetic means.
- Replacement: Kolmogorov proved that the five properties of symmetry, fixed-point (idempotency), monotonicity, continuity, and replacement fully characterize the quasi-arithmetic means.
- Continuity is superfluous in the characterization of two-variable quasi-arithmetic means. See [10] for the details.
- Balancing: An interesting problem is whether this condition (together with the symmetry, fixed-point, monotonicity and continuity properties) implies that the mean is quasi-arithmetic. Georg Aumann showed in the 1930s that the answer is no in general, but that if one additionally assumes $M$ to be an analytic function then the answer is positive.

== Homogeneity ==

Means are usually homogeneous, but for most functions $f$, the f-mean is not.
Indeed, the only homogeneous quasi-arithmetic means are the power means (including the geometric mean); see Hardy-Littlewood-Pólya, page 68.

The homogeneity property can be achieved by normalizing the input values by some (homogeneous) mean $C$.
$M_{f,C} x = C x \cdot f^{-1}\left( \frac{f\left(\frac{x_1}{C x}\right) + \cdots + f\left(\frac{x_n}{C x}\right)}{n} \right)$
However this modification may violate monotonicity and the partitioning property of the mean.
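A numerical sketch can make the homogeneity discussion concrete. It compares a power mean (homogeneous), the f-mean with $f = \exp$ (not homogeneous), and the normalized variant with $C$ taken to be the arithmetic mean. The exponent, sample, scale factor, helper names, and choice of $C$ are all assumptions made for this illustration.

```python
import math


def f_mean(f, f_inv, xs):
    """Hypothetical helper: f_inv of the average of f over xs."""
    return f_inv(sum(f(x) for x in xs) / len(xs))


def power_mean(xs, p=3.0):
    return f_mean(lambda x: x ** p, lambda y: y ** (1.0 / p), xs)


def exp_mean(xs):
    return f_mean(math.exp, math.log, xs)


def arithmetic(xs):
    return sum(xs) / len(xs)


def normalized_exp_mean(xs):
    # M_{f,C}: divide the inputs by C(x), take the f-mean, multiply back by C(x).
    c = arithmetic(xs)  # assumed nonzero for this sample
    return c * exp_mean([x / c for x in xs])


xs = [1.0, 2.0, 5.0]
t = 3.0
scaled = [t * x for x in xs]

# Homogeneity means M(t * x) == t * M(x) for t > 0.
power_ok = math.isclose(power_mean(scaled), t * power_mean(xs))
exp_ok = math.isclose(exp_mean(scaled), t * exp_mean(xs))
normalized_ok = math.isclose(normalized_exp_mean(scaled), t * normalized_exp_mean(xs))

print(power_ok, exp_ok, normalized_ok)
```

The normalized variant is homogeneous because the ratios $x_k / C x$ are unchanged when every input is scaled by the same positive factor, so only the leading factor $C x$ scales.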

== Generalizations ==

Consider a Legendre-type strictly convex function $F$. Then the gradient map $\nabla F$ is globally invertible and the weighted multivariate quasi-arithmetic mean is defined by
$M_{\nabla F}(\theta_1,\ldots,\theta_n;w) = {\nabla F}^{-1}\left(\sum_{i=1}^n w_i \nabla F(\theta_i)\right)$, where $w$ is a normalized weight vector ($w_i=\frac{1}{n}$ by default for a balanced average). From the convex duality, we get a dual quasi-arithmetic mean $M_{\nabla F^*}$ associated to the quasi-arithmetic mean $M_{\nabla F}$.
For example, take $F(X)=-\log\det(X)$ for $X$ a symmetric positive-definite matrix.
The pair of matrix quasi-arithmetic means yields the matrix harmonic mean:
$M_{\nabla F}(\theta_1,\theta_2)=2(\theta_1^{-1}+\theta_2^{-1})^{-1}.$
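For $F(X) = -\log\det(X)$ the gradient is $\nabla F(X) = -X^{-1}$, whose inverse map is $Y \mapsto -Y^{-1}$, so the balanced two-point mean reduces to the closed form above. The sketch below checks this for $2 \times 2$ matrices using plain nested lists; the helper names and the sample matrices are assumptions made for the illustration, and no input validation is attempted.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def scale(m, s):
    return [[s * v for v in row] for row in m]


def add(m1, m2):
    return [[u + v for u, v in zip(r1, r2)] for r1, r2 in zip(m1, m2)]


def grad_F(x):
    # Gradient of F(X) = -log det(X) at a symmetric positive-definite X.
    return scale(inv2(x), -1.0)


def grad_F_inv(y):
    # Inverse of X -> -X^{-1} is Y -> -Y^{-1}.
    return scale(inv2(y), -1.0)


def matrix_qa_mean(t1, t2):
    """Balanced (w = 1/2, 1/2) quasi-arithmetic mean induced by grad_F."""
    averaged = scale(add(grad_F(t1), grad_F(t2)), 0.5)
    return grad_F_inv(averaged)


# Sample symmetric positive-definite matrices (assumptions).
theta1 = [[2.0, 1.0], [1.0, 2.0]]
theta2 = [[3.0, 0.0], [0.0, 1.0]]

mean = matrix_qa_mean(theta1, theta2)

# Closed form: 2 * (theta1^{-1} + theta2^{-1})^{-1}.
closed_form = scale(inv2(add(inv2(theta1), inv2(theta2))), 2.0)
```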

== See also ==
- Generalized mean
- Jensen's inequality
