# Smooth maximum

In mathematics, a smooth maximum of an indexed family ${\displaystyle x_{1},\ldots ,x_{n}}$ of numbers is a smooth approximation to the maximum function ${\displaystyle \max(x_{1},\ldots ,x_{n}),}$ meaning a parametric family of functions ${\displaystyle m_{\alpha }(x_{1},\ldots ,x_{n})}$ such that for every α, the function ${\displaystyle m_{\alpha }}$ is smooth, and the family converges to the maximum function: ${\displaystyle m_{\alpha }\to \max }$ as ${\displaystyle \alpha \to \infty }$. The concept of smooth minimum is similarly defined. In many cases a single family approximates both: the maximum as the parameter goes to positive infinity and the minimum as it goes to negative infinity; in symbols, ${\displaystyle m_{\alpha }\to \max }$ as ${\displaystyle \alpha \to \infty }$ and ${\displaystyle m_{\alpha }\to \min }$ as ${\displaystyle \alpha \to -\infty }$. The term can also be used loosely for a specific smooth function that behaves similarly to a maximum, without necessarily being part of a parametrized family.

## Examples

Figure: smoothmax applied to ${\displaystyle -x}$ and ${\displaystyle x}$ for various coefficients: very smooth for ${\displaystyle \alpha =0.5}$, and sharper for ${\displaystyle \alpha =8}$.

For large positive values of the parameter ${\displaystyle \alpha }$, the following formulation is a smooth, differentiable approximation of the maximum function; for negative values of ${\displaystyle \alpha }$ that are large in absolute value, it approximates the minimum.

${\displaystyle {\mathcal {S}}_{\alpha }(x_{1},\ldots ,x_{n})={\frac {\sum _{i=1}^{n}x_{i}e^{\alpha x_{i}}}{\sum _{i=1}^{n}e^{\alpha x_{i}}}}}$

${\displaystyle {\mathcal {S}}_{\alpha }}$ has the following properties:

1. ${\displaystyle {\mathcal {S}}_{\alpha }\to \max }$ as ${\displaystyle \alpha \to \infty }$
2. ${\displaystyle {\mathcal {S}}_{0}}$ is the arithmetic mean of its inputs
3. ${\displaystyle {\mathcal {S}}_{\alpha }\to \min }$ as ${\displaystyle \alpha \to -\infty }$

The gradient of ${\displaystyle {\mathcal {S}}_{\alpha }}$ is closely related to softmax and is given by

${\displaystyle \nabla _{x_{i}}{\mathcal {S}}_{\alpha }(x_{1},\ldots ,x_{n})={\frac {e^{\alpha x_{i}}}{\sum _{j=1}^{n}e^{\alpha x_{j}}}}[1+\alpha (x_{i}-{\mathcal {S}}_{\alpha }(x_{1},\ldots ,x_{n}))].}$

This makes the smooth maximum useful for optimization techniques that use gradient descent.
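The gradient formula can be verified numerically against finite differences; a sketch (helper names are illustrative):

```python
import numpy as np

def smoothmax(x, alpha):
    # S_alpha: exponentially weighted average of the x_i.
    x = np.asarray(x, dtype=float)
    w = np.exp(alpha * x - np.max(alpha * x))
    return np.sum(x * w) / np.sum(w)

def smoothmax_grad(x, alpha):
    # Analytic gradient: softmax weight p_i times [1 + alpha*(x_i - S_alpha)].
    x = np.asarray(x, dtype=float)
    w = np.exp(alpha * x - np.max(alpha * x))
    p = w / np.sum(w)          # softmax(alpha * x)
    s = np.sum(x * p)          # S_alpha(x)
    return p * (1.0 + alpha * (x - s))

# Central-difference check of the analytic gradient.
x = np.array([0.3, -1.2, 2.0])
alpha, h = 1.5, 1e-6
numeric = np.array([
    (smoothmax(x + h * e, alpha) - smoothmax(x - h * e, alpha)) / (2 * h)
    for e in np.eye(len(x))
])
assert np.allclose(numeric, smoothmax_grad(x, alpha), atol=1e-5)
```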

### LogSumExp

Another smooth maximum is LogSumExp:

${\displaystyle \mathrm {LSE} (x_{1},\ldots ,x_{n})=\log(\exp(x_{1})+\ldots +\exp(x_{n}))}$
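A direct implementation overflows once any ${\displaystyle x_{i}}$ is moderately large, so LSE is usually computed with the standard max-shift identity ${\displaystyle \mathrm {LSE} (x)=m+\log \sum _{i}\exp(x_{i}-m)}$ where ${\displaystyle m=\max _{i}x_{i}}$. A minimal sketch:

```python
import numpy as np

def logsumexp(x):
    """LSE(x) = log(exp(x_1) + ... + exp(x_n)), computed stably."""
    x = np.asarray(x, dtype=float)
    m = np.max(x)
    # Factor out exp(m): the remaining exponentials are all <= 1,
    # so nothing overflows.
    return m + np.log(np.sum(np.exp(x - m)))

print(logsumexp([0.0]))                    # 0.0
# Naive log(sum(exp(x))) would overflow here; the shifted form does not.
print(logsumexp([1000.0, 1001.0, 1002.0]))
```

LSE always lies between the true maximum and the maximum plus ${\displaystyle \log n}$, which is one way to see the convergence to ${\displaystyle \max }$.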

This can also be normalized if the ${\displaystyle x_{i}}$ are all non-negative, yielding a function with domain ${\displaystyle [0,\infty )^{n}}$ and range ${\displaystyle [0,\infty )}$:

${\displaystyle g(x_{1},\ldots ,x_{n})=\log(\exp(x_{1})+\ldots +\exp(x_{n})-(n-1))}$

The ${\displaystyle (n-1)}$ term corrects for the fact that ${\displaystyle \exp(0)=1}$: it cancels all but one of the zero exponentials, so that ${\displaystyle g(0,\ldots ,0)=\log 1=0}$.
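A short sketch of the normalized variant (the name `lse_normalized` is illustrative); note that it is exact whenever all but one input is zero:

```python
import math

def lse_normalized(xs):
    """Normalized LogSumExp g for non-negative inputs.

    Subtracting (n - 1) cancels all but one exp(0) = 1 term,
    so g(0, ..., 0) = log 1 = 0 and g(x, 0, ..., 0) = x exactly.
    """
    n = len(xs)
    return math.log(sum(math.exp(x) for x in xs) - (n - 1))

print(lse_normalized([0.0, 0.0, 0.0]))  # 0.0
print(lse_normalized([3.0, 0.0]))       # exactly 3.0: log(e^3 + 1 - 1)
```

This version is not shift-stabilized, so it is only safe for inputs small enough that `math.exp` does not overflow.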