# Smooth maximum


In mathematics, a smooth maximum of an indexed family $x_{1},\ldots ,x_{n}$ of numbers is a smooth approximation to the maximum function $\max(x_{1},\ldots ,x_{n})$, meaning a parametric family of functions $m_{\alpha }(x_{1},\ldots ,x_{n})$ such that for every α, the function $m_{\alpha }$ is smooth, and the family converges to the maximum function, $m_{\alpha }\to \max$, as $\alpha \to \infty$. The concept of smooth minimum is similarly defined. In many cases, a single family approximates both: maximum as the parameter goes to positive infinity, minimum as the parameter goes to negative infinity; in symbols, $m_{\alpha }\to \max$ as $\alpha \to \infty$ and $m_{\alpha }\to \min$ as $\alpha \to -\infty$. The term can also be used loosely for a specific smooth function that behaves similarly to a maximum, without necessarily being part of a parametrized family.

## Examples

*Figure: smoothmax applied to $-x$ and $x$ with various coefficients — very smooth for $\alpha =0.5$, sharper for $\alpha =8$.*

For large positive values of the parameter $\alpha >0$ , the following formulation is a smooth, differentiable approximation of the maximum function. For negative values of the parameter that are large in absolute value, it approximates the minimum.

${\mathcal {S}}_{\alpha }(x_{1},\ldots ,x_{n})={\frac {\sum _{i=1}^{n}x_{i}e^{\alpha x_{i}}}{\sum _{i=1}^{n}e^{\alpha x_{i}}}}$

${\mathcal {S}}_{\alpha }$ has the following properties:

1. ${\mathcal {S}}_{\alpha }\to \max$ as $\alpha \to \infty$
2. ${\mathcal {S}}_{0}$ is the arithmetic mean of its inputs
3. ${\mathcal {S}}_{\alpha }\to \min$ as $\alpha \to -\infty$

The gradient of ${\mathcal {S}}_{\alpha }$ is closely related to softmax and is given by

$\nabla _{x_{i}}{\mathcal {S}}_{\alpha }(x_{1},\ldots ,x_{n})={\frac {e^{\alpha x_{i}}}{\sum _{j=1}^{n}e^{\alpha x_{j}}}}[1+\alpha (x_{i}-{\mathcal {S}}_{\alpha }(x_{1},\ldots ,x_{n}))].$

Because this gradient is smooth and inexpensive to evaluate, the smooth maximum is useful for optimization techniques that use gradient descent.
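The formulas above can be sketched in a few lines of NumPy. This is an illustrative implementation, not a reference one; the function names `smooth_max` and `smooth_max_grad` are chosen here for clarity, and the inputs are shifted by their maximum before exponentiation, a standard trick to avoid overflow that leaves the weights unchanged.

```python
import numpy as np

def smooth_max(x, alpha):
    """Boltzmann operator S_alpha: softmax-weighted average of the inputs."""
    x = np.asarray(x, dtype=float)
    # Shift alpha*x by its max for numerical stability; the ratio is unchanged.
    w = np.exp(alpha * x - np.max(alpha * x))
    return np.sum(x * w) / np.sum(w)

def smooth_max_grad(x, alpha):
    """Gradient of S_alpha: softmax weights times [1 + alpha*(x_i - S_alpha)]."""
    x = np.asarray(x, dtype=float)
    s = smooth_max(x, alpha)
    p = np.exp(alpha * x - np.max(alpha * x))
    p = p / np.sum(p)  # softmax weights
    return p * (1.0 + alpha * (x - s))
```

For example, `smooth_max([1, 2, 3], 0)` returns the arithmetic mean `2.0`, while large positive or negative `alpha` recovers the maximum `3` or minimum `1` respectively. A quick consistency check: the gradient components always sum to 1, since $\sum_i p_i[1+\alpha(x_i-S_\alpha)]=1+\alpha(S_\alpha-S_\alpha)=1$.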

### LogSumExp

Another smooth maximum is LogSumExp:

$\mathrm {LSE} (x_{1},\ldots ,x_{n})=\log(\exp(x_{1})+\ldots +\exp(x_{n}))$

This can also be normalized if the $x_{i}$ are all non-negative, yielding a function with domain $[0,\infty )^{n}$ and range $[0,\infty )$:
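A direct translation of the LSE formula overflows for large inputs, so implementations conventionally factor out the maximum first; here is a minimal sketch of that standard stabilization (the identity $\mathrm{LSE}(x) = m + \log\sum_i \exp(x_i - m)$ with $m = \max_i x_i$):

```python
import numpy as np

def logsumexp(x):
    """Numerically stable LogSumExp: max(x) <= logsumexp(x) <= max(x) + log(n)."""
    x = np.asarray(x, dtype=float)
    m = np.max(x)
    # Subtracting m keeps every exponent <= 0, so np.exp cannot overflow.
    return m + np.log(np.sum(np.exp(x - m)))
```

For instance, `logsumexp([1000.0, 1000.0])` evaluates correctly to about `1000.693`, whereas the naive `log(exp(1000) + exp(1000))` overflows to infinity in double precision.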

$g(x_{1},\ldots ,x_{n})=\log(\exp(x_{1})+\ldots +\exp(x_{n})-(n-1))$

The $(n-1)$ term cancels all but one of the zero exponentials, correcting for the fact that $\exp(0)=1$; since $\log 1=0$, this ensures $g(0,\ldots ,0)=0$.
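The normalization can be checked directly. The sketch below implements $g$ naively (no overflow protection, since the point here is only the $(n-1)$ correction); the name `normalized_lse` is chosen for this example.

```python
import math

def normalized_lse(xs):
    """Normalized LogSumExp for non-negative inputs: g(0, ..., 0) = 0."""
    n = len(xs)
    # Subtracting (n - 1) cancels all but one of the exp(0) = 1 terms.
    return math.log(sum(math.exp(x) for x in xs) - (n - 1))
```

Note that for a single input the correction vanishes ($n-1=0$), so `normalized_lse([x])` returns exactly `x`, and for the all-zero vector the argument of the logarithm is $n-(n-1)=1$, giving 0 as intended.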