= Moreau envelope =

The Moreau envelope (or the Moreau–Yosida regularization) $M_f$ of a proper lower semi-continuous convex function $f$ is a smoothed version of $f$. It was proposed by Jean-Jacques Moreau in 1965.

The Moreau envelope has important applications in mathematical optimization: minimizing $M_f$ and minimizing $f$ are equivalent problems in the sense that $f$ and $M_f$ have the same set of minimizers. However, first-order optimization algorithms can be applied directly to $M_f$, because $M_f$ is always continuously differentiable even when $f$ is not differentiable. Indeed, many proximal gradient methods can be interpreted as gradient descent on $M_f$.

== Definition ==
The Moreau envelope of a proper lower semi-continuous convex function $f$ from a Hilbert space $\mathcal{X}$ to $(-\infty,+\infty]$ is defined as

$M_{f}(v) = \inf_{x\in\mathcal{X}} \left(f(x) + \frac{1}{2} \|x - v\|_2^2\right).$

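Given a parameter $\lambda > 0$, the Moreau envelope of $f$ with parameter $\lambda$ is $M_{\lambda f}(v) = \inf_{x\in\mathcal{X}} \left(f(x) + \frac{1}{2\lambda} \|x - v\|_2^2\right)$. It equals $1/\lambda$ times the envelope of the scaled function $\lambda f$ under the definition above, and it reduces to $M_f$ for $\lambda = 1$.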
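As a concrete illustration, the envelope of the absolute value $f(x) = |x|$ is the Huber function. The following Python sketch is a minimal check under stated assumptions: it works in one dimension, uses the parameterized form $\inf_x \left(f(x) + (x - v)^2 / (2\lambda)\right)$, and approximates the infimum by brute force on a finite grid. The names `moreau_envelope_grid` and `huber` are illustrative and do not come from any library.

```python
import numpy as np

def moreau_envelope_grid(f, v, lam=1.0, grid=None):
    """Approximate inf_x f(x) + (x - v)^2 / (2*lam) by brute force on a 1-D grid.

    Assumption: the minimizer lies inside the grid range [-10, 10].
    """
    if grid is None:
        grid = np.linspace(-10.0, 10.0, 200001)
    return np.min(f(grid) + (grid - v) ** 2 / (2.0 * lam))

def huber(v, lam=1.0):
    """Closed-form envelope of |x| with parameter lam (the Huber function)."""
    return v * v / (2.0 * lam) if abs(v) <= lam else abs(v) - lam / 2.0

# Compare the grid approximation with the closed form at a few points.
for v in (-3.0, -0.5, 0.0, 0.25, 2.0):
    approx = moreau_envelope_grid(np.abs, v, lam=1.0)
    print(f"v={v:+.2f}  grid={approx:.6f}  huber={huber(v):.6f}")
```

The two columns should agree up to the grid resolution, which shows how the envelope replaces the kink of $|x|$ at the origin by a quadratic piece while leaving the location of the minimizer unchanged.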

== Properties ==
- The Moreau envelope can also be seen as the infimal convolution between $f$ and $(1/2)\| \cdot \|^2_2$.
- The proximal operator of a function is related to the gradient of the Moreau envelope by the following identity:
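$\nabla M_{\lambda f}(x) = \frac{1}{\lambda} (x - \mathrm{prox}_{\lambda f}(x))$. Rearranging the identity gives $\mathrm{prox}_{\lambda f}(x) = x - \lambda \nabla M_{\lambda f}(x)$, so the sequence $x_{k+1} = \mathrm{prox}_{\lambda f}(x_k)$ satisfies $x_{k+1} = x_k - \lambda \nabla M_{\lambda f}(x_k)$. Iterating the proximal operator can therefore be interpreted as gradient descent on the Moreau envelope with step size $\lambda$.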
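The gradient identity can be checked numerically. The sketch below assumes the one-dimensional example $f(x) = |x|$, whose proximal operator is soft-thresholding, and reads $M_{\lambda f}$ as $\min_x \left(|x| + (x - v)^2 / (2\lambda)\right)$. The function names are illustrative only.

```python
import numpy as np

def prox_abs(v, lam):
    """Proximal operator of lam*|x| (soft-thresholding)."""
    return np.sign(v) * max(abs(v) - lam, 0.0)

def envelope_abs(v, lam):
    """Envelope of |x| with parameter lam, evaluated at the proximal point."""
    x = prox_abs(v, lam)
    return abs(x) + (x - v) ** 2 / (2.0 * lam)

def grad_envelope_abs(v, lam):
    """Gradient of the envelope via the identity (v - prox(v)) / lam."""
    return (v - prox_abs(v, lam)) / lam

lam, x = 0.5, 2.2
for k in range(6):
    prox_step = prox_abs(x, lam)                     # proximal iteration
    grad_step = x - lam * grad_envelope_abs(x, lam)  # gradient step, step size lam
    assert np.isclose(prox_step, grad_step)          # the two updates coincide
    x = prox_step
    print(k, x)
```

In this example the iterates decrease by $\lambda$ per step until they reach the minimizer $0$ of $|x|$, where they stay.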

- Using Fenchel's duality theorem, one can derive the following dual formulation of the Moreau envelope:
$M_{\lambda f}(v) = \max_{p \in \mathcal X} \left( \langle p, v \rangle - \frac{\lambda}{2} \| p \|^2 - f^*(p)\right),$ where $f^*$ denotes the convex conjugate of $f$.
Since the subdifferential of a proper, convex, lower semicontinuous function on a Hilbert space is inverse to the subdifferential of its convex conjugate, we can conclude that if $p_0 \in \mathcal X$ is the maximizer of the above expression, then $x_0 := v - \lambda p_0$ is the minimizer in the primal formulation and vice versa.
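The primal and dual formulations can be compared on a small example. The sketch below assumes $f(x) = |x|$ on the real line, whose convex conjugate is the indicator function of $[-1, 1]$, so the dual problem becomes a concave quadratic maximized over $|p| \le 1$. It also assumes the primal objective $|x| + (x - v)^2 / (2\lambda)$. The function names are illustrative.

```python
import numpy as np

def primal_value(v, lam):
    """Minimize |x| + (x - v)^2 / (2*lam); the minimizer is soft-thresholding."""
    x0 = np.sign(v) * max(abs(v) - lam, 0.0)
    return abs(x0) + (x0 - v) ** 2 / (2.0 * lam), x0

def dual_value(v, lam):
    """Maximize p*v - (lam/2)*p^2 over |p| <= 1 (f* is the indicator of [-1, 1])."""
    p0 = float(np.clip(v / lam, -1.0, 1.0))
    return p0 * v - 0.5 * lam * p0 ** 2, p0

lam = 0.5
for v in (-3.0, -0.2, 0.0, 0.2, 2.0):
    m_primal, x0 = primal_value(v, lam)
    m_dual, p0 = dual_value(v, lam)
    # Optimal values agree, and the primal minimizer is v - lam * p0.
    print(f"v={v:+.1f}  primal={m_primal:.4f}  dual={m_dual:.4f}  "
          f"x0={x0:+.2f}  v-lam*p0={v - lam * p0:+.2f}")
```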

- By the Hopf–Lax formula, the Moreau envelope is a viscosity solution to a Hamilton–Jacobi equation. Stanley Osher and co-authors used this property and the Cole–Hopf transformation to derive an algorithm to compute approximations to the proximal operator of a function.
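The Hamilton–Jacobi connection can be illustrated on the absolute value. Reading the parameter as time, $u(x, t) = \inf_y \left(|y| + (x - y)^2 / (2t)\right)$ is the Huber function, and away from the kinks $|x| = t$ it should satisfy $\partial_t u + \tfrac{1}{2} (\partial_x u)^2 = 0$. The sketch below checks this with central finite differences. It assumes this particular normalization of the equation, and it is not the algorithm of Osher and co-authors; the function names are illustrative.

```python
def u(x, t):
    """Envelope of |.| with parameter t (Huber function), read as u(x, t)."""
    return x * x / (2.0 * t) if abs(x) <= t else abs(x) - t / 2.0

def hj_residual(x, t, h=1e-5):
    """Finite-difference estimate of u_t + 0.5 * u_x^2 at (x, t)."""
    u_t = (u(x, t + h) - u(x, t - h)) / (2.0 * h)
    u_x = (u(x + h, t) - u(x - h, t)) / (2.0 * h)
    return u_t + 0.5 * u_x ** 2

# Sample points away from |x| = t, where u is smooth.
for x, t in [(0.3, 1.0), (-0.4, 0.8), (2.0, 1.0), (-3.0, 0.5)]:
    print(x, t, hj_residual(x, t))
```

The printed residuals should be close to zero, up to finite-difference and rounding error.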

== See also ==
- Proximal operator
- Proximal gradient method
