Laplace's method: Difference between revisions


Revision as of 00:19, 17 January 2014

In mathematics, Laplace's method, named after Pierre-Simon Laplace, is a technique used to approximate integrals of the form

\int _{a}^{b}\!e^{Mf(x)}\,dx

where ƒ(x) is some twice-differentiable function, M is a large number, and the integral endpoints a and b could possibly be infinite. This technique was originally presented in Laplace (1774, pp. 366–367).

The idea of Laplace's method

Assume that the function ƒ(x) has a unique global maximum at x₀. Then the value ƒ(x₀) is larger than every other value ƒ(x). If we multiply this function by a large number M, the ratio between Mƒ(x₀) and Mƒ(x) stays the same (since Mƒ(x₀)/Mƒ(x) = ƒ(x₀)/ƒ(x)), but the ratio $e^{Mf(x_{0})}/e^{Mf(x)}=e^{M(f(x_{0})-f(x))}$ grows exponentially with M for the function (see figure)

e^{Mf(x)}.\,

Thus, significant contributions to the integral of this function will come only from points x in a neighborhood of x₀, which can then be estimated.

General theory of Laplace's method

To state and motivate the method, we need several assumptions. We will assume that x₀ is not an endpoint of the interval of integration, that the values ƒ(x) cannot be very close to ƒ(x₀) unless x is close to x₀, and that the second derivative $f''(x_{0})<0$ .

We can expand ƒ(x) around x₀ by Taylor's theorem,

f(x)=f(x_{0})+f'(x_{0})(x-x_{0})+{\frac {1}{2}}f''(x_{0})(x-x_{0})^{2}+R

where

R=O\left((x-x_{0})^{3}\right).

Since ƒ has a global maximum at x₀, and since x₀ is not an endpoint, it is a stationary point, so the derivative of ƒ vanishes at x₀. Therefore, the function ƒ(x) may be approximated to quadratic order

f(x)\approx f(x_{0})-{\frac {1}{2}}|f''(x_{0})|(x-x_{0})^{2}

for x close to x₀ (recall that the second derivative is negative at the global maximum x₀). The assumptions made ensure the accuracy of the approximation

\int _{a}^{b}\!e^{Mf(x)}\,dx\approx e^{Mf(x_{0})}\int _{a}^{b}e^{-M|f''(x_{0})|(x-x_{0})^{2}/2}\,dx

(see the picture on the right). This latter integral is a Gaussian integral if the limits of integration go from −∞ to +∞ (which can be assumed because the exponential decays very fast away from x₀), and thus it can be calculated. We find

\int _{a}^{b}\!e^{Mf(x)}\,dx\approx {\sqrt {\frac {2\pi }{M|f''(x_{0})|}}}e^{Mf(x_{0})}{\text{ as }}M\to \infty .\,

A generalization of this method and extension to arbitrary precision is provided by Fog (2008).
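As a rough numerical sanity check of the leading-order formula above, the following sketch compares a Simpson's-rule value of the integral with the Laplace estimate. The test function ƒ(x) = sin x on [0, π] (so x₀ = π/2, ƒ(x₀) = 1, ƒ''(x₀) = −1) and the helper names `simpson`, `laplace_estimate` and `ratio` are illustrative assumptions, not part of the original presentation.

```python
import math

def simpson(func, a, b, n=20000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def laplace_estimate(M, f_x0, f2_x0):
    """Leading-order estimate sqrt(2*pi / (M*|f''(x0)|)) * exp(M*f(x0))."""
    return math.sqrt(2 * math.pi / (M * abs(f2_x0))) * math.exp(M * f_x0)

def ratio(M):
    """Numerical integral of exp(M*sin(x)) over [0, pi] divided by the estimate."""
    # Assumed example: f(x) = sin(x), x0 = pi/2, f(x0) = 1, f''(x0) = -1.
    numeric = simpson(lambda x: math.exp(M * math.sin(x)), 0.0, math.pi)
    return numeric / laplace_estimate(M, 1.0, -1.0)

for M in (5, 20, 80, 320):
    print(M, ratio(M))
```

Under these assumptions the printed ratio should move toward 1 as M grows, which is what the asymptotic statement predicts.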

Formal statement and proof:

Assume that $f(x)$ is a twice continuously differentiable function on $[a,b]$ with $x_{0}\in [a,b]$ the unique point such that $f(x_{0})=\max _{[a,b]}f(x)$ . Assume additionally that $f''(x_{0})<0$ .

Then,

\lim _{n\to +\infty }\left({\frac {\int _{a}^{b}e^{nf(x)}\,dx}{\left(e^{nf(x_{0})}{\sqrt {\frac {2\pi }{n(-f''(x_{0}))}}}\right)}}\right)=1

Proof:

Lower bound:

Let $\varepsilon >0$ . Then by the continuity of $f''$ there exists $\delta >0$ such that if $|x_{0}-c|<\delta$ then $f''(c)\geq f''(x_{0})-\varepsilon$ . By Taylor's Theorem, for any $x\in (x_{0}-\delta ,x_{0}+\delta )$ , $f(x)\geq f(x_{0})+{\frac {1}{2}}(f''(x_{0})-\varepsilon )(x-x_{0})^{2}$ .

Then we have the following lower bound:

\int _{a}^{b}e^{nf(x)}\,dx\geq \int _{x_{0}-\delta }^{x_{0}+\delta }e^{nf(x)}\,dx\geq e^{nf(x_{0})}\int _{x_{0}-\delta }^{x_{0}+\delta }e^{{\frac {n}{2}}(f''(x_{0})-\varepsilon )(x-x_{0})^{2}}\,dx=e^{nf(x_{0})}{\sqrt {\frac {1}{n(-f''(x_{0})+\varepsilon )}}}\int _{-\delta {\sqrt {n(-f''(x_{0})+\varepsilon )}}}^{\delta {\sqrt {n(-f''(x_{0})+\varepsilon )}}}e^{-{\frac {1}{2}}y^{2}}\,dy

where the last equality was obtained by the change of variables $y={\sqrt {n(-f''(x_{0})+\varepsilon )}}(x-x_{0})$ . Since $f''(x_{0})<0$ , the quantity $-f''(x_{0})+\varepsilon$ is positive, so its square root is well defined.

If we divide both sides of the above inequality by $e^{nf(x_{0})}{\sqrt {\frac {2\pi }{n(-f''(x_{0}))}}}$ and take the limit we get:

\lim _{n\to +\infty }\left({\frac {\int _{a}^{b}e^{nf(x)}\,dx}{\left(e^{nf(x_{0})}{\sqrt {\frac {2\pi }{n(-f''(x_{0}))}}}\right)}}\right)\geq \lim _{n\to +\infty }{\frac {1}{\sqrt {2\pi }}}\int _{-\delta {\sqrt {n(-f''(x_{0})+\varepsilon )}}}^{\delta {\sqrt {n(-f''(x_{0})+\varepsilon )}}}e^{-{\frac {1}{2}}y^{2}}\,dy{\sqrt {\frac {-f''(x_{0})}{-f''(x_{0})+\varepsilon }}}={\sqrt {\frac {-f''(x_{0})}{-f''(x_{0})+\varepsilon }}}

Since this is true for arbitrary $\varepsilon$ we get the lower bound:

\lim _{n\to +\infty }\left({\frac {\int _{a}^{b}e^{nf(x)}\,dx}{\left(e^{nf(x_{0})}{\sqrt {\frac {2\pi }{n(-f''(x_{0}))}}}\right)}}\right)\geq 1

Note that this proof works also when $a=-\infty$ or $b=\infty$ (or both).

Upper bound:

The proof of the upper bound is similar to the proof of the lower bound but there are a few inconveniences. Again we start by picking an $\varepsilon >0$ but in order for the proof to work we need $\varepsilon$ small enough so that $f''(x_{0})+\varepsilon <0$ . Then, as above, by continuity of $f''$ and Taylor's Theorem we can find $\delta >0$ so that if $|x-x_{0}|<\delta$ , then $f(x)\leq f(x_{0})+{\frac {1}{2}}(f''(x_{0})+\varepsilon )(x-x_{0})^{2}$ . Lastly, by our assumptions (assuming $a,b$ are finite) there exists an $\eta >0$ such that if $|x-x_{0}|\geq \delta$ , then $f(x)\leq f(x_{0})-\eta$ .

Then we can calculate the following upper bound:

\int _{a}^{b}e^{nf(x)}\,dx\leq \int _{a}^{x_{0}-\delta }e^{nf(x)}\,dx+\int _{x_{0}-\delta }^{x_{0}+\delta }e^{nf(x)}\,dx+\int _{x_{0}+\delta }^{b}e^{nf(x)}\,dx\leq (b-a)e^{n(f(x_{0})-\eta )}+\int _{x_{0}-\delta }^{x_{0}+\delta }e^{nf(x)}\,dx

\leq (b-a)e^{n(f(x_{0})-\eta )}+e^{nf(x_{0})}\int _{x_{0}-\delta }^{x_{0}+\delta }e^{{\frac {n}{2}}(f''(x_{0})+\varepsilon )(x-x_{0})^{2}}\,dx\leq (b-a)e^{n(f(x_{0})-\eta )}+e^{nf(x_{0})}\int _{-\infty }^{+\infty }e^{{\frac {n}{2}}(f''(x_{0})+\varepsilon )(x-x_{0})^{2}}\,dx

\leq (b-a)e^{n(f(x_{0})-\eta )}+e^{nf(x_{0})}{\sqrt {\frac {2\pi }{n(-f''(x_{0})-\varepsilon )}}}

If we divide both sides of the above inequality by $e^{nf(x_{0})}{\sqrt {\frac {2\pi }{n(-f''(x_{0}))}}}$ and take the limit we get:

\lim _{n\to +\infty }\left({\frac {\int _{a}^{b}e^{nf(x)}\,dx}{\left(e^{nf(x_{0})}{\sqrt {\frac {2\pi }{n(-f''(x_{0}))}}}\right)}}\right)\leq \lim _{n\to +\infty }\left((b-a)e^{-\eta n}{\sqrt {\frac {n(-f''(x_{0}))}{2\pi }}}+{\sqrt {\frac {-f''(x_{0})}{-f''(x_{0})-\varepsilon }}}\right)={\sqrt {\frac {-f''(x_{0})}{-f''(x_{0})-\varepsilon }}}

Since $\varepsilon$ is arbitrary we get the upper bound:

\lim _{n\to +\infty }\left({\frac {\int _{a}^{b}e^{nf(x)}\,dx}{\left(e^{nf(x_{0})}{\sqrt {\frac {2\pi }{n(-f''(x_{0}))}}}\right)}}\right)\leq 1

And combining this with the lower bound gives the result.

Note that the above proof obviously fails when $a=-\infty$ or $b=\infty$ (or both). To deal with these cases, we need some extra assumptions. A sufficient (not necessary) assumption is that for $n=1$ , the integral $\int _{a}^{b}e^{nf(x)}\,dx$ is finite, and that the number $\eta$ as above exists (note that this must be an assumption in the case when the interval $[a,b]$ is infinite). The proof proceeds otherwise as above, but the integrals

\int _{a}^{x_{0}-\delta }e^{nf(x)}\,dx+\int _{x_{0}+\delta }^{b}e^{nf(x)}\,dx

must be approximated by

\int _{a}^{x_{0}-\delta }e^{nf(x)}\,dx+\int _{x_{0}+\delta }^{b}e^{nf(x)}\,dx\leq \int _{a}^{b}e^{f(x)}e^{(n-1)(f(x_{0})-\eta )}\,dx=e^{(n-1)(f(x_{0})-\eta )}\int _{a}^{b}e^{f(x)}\,dx

instead of $(b-a)e^{n(f(x_{0})-\eta )}$ as above, so that when we divide by $e^{nf(x_{0})}{\sqrt {\frac {2\pi }{n(-f''(x_{0}))}}}$ , we get for this term

{\frac {e^{(n-1)(f(x_{0})-\eta )}\int _{a}^{b}e^{f(x)}\,dx}{e^{nf(x_{0})}{\sqrt {\frac {2\pi }{n(-f''(x_{0}))}}}}}=e^{-(n-1)\eta }{\sqrt {n}}e^{-f(x_{0})}\int _{a}^{b}e^{f(x)}\,dx{\sqrt {\frac {-f''(x_{0})}{2\pi }}}

whose limit as $n\rightarrow \infty$ is $0$ . The rest of the proof (the analysis of the interesting term) proceeds as above.

The given condition in the infinite interval case is, as said above, sufficient but not necessary. However, the condition is fulfilled in many, if not most, applications: it simply says that the integral we are studying must be well-defined (not infinite) and that the maximum of the function at $x_{0}$ must be a "true" maximum (the number $\eta >0$ must exist). Moreover, the integral need not be finite for $n=1$ ; it is enough that it is finite for some $n=N$ .
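The tail estimate used in the infinite-interval case can be illustrated numerically. The sketch below assumes the example ƒ(x) = 1 − cosh x on the whole real line (x₀ = 0, ƒ(x₀) = 0, ƒ''(x₀) = −1) with δ = 1, so that η = cosh(1) − 1. It compares the normalised contribution from |x| ≥ δ with the bound $e^{-(n-1)\eta }{\sqrt {n}}e^{-f(x_{0})}\int e^{f(x)}\,dx{\sqrt {-f''(x_{0})/(2\pi )}}$ . The function names, the cutoff and the quadrature rule are illustrative choices only.

```python
import math

def midpoint(func, a, b, n=40000):
    """Composite midpoint rule on [a, b] with n subintervals."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def f(x):
    # Assumed example: unique maximum at x0 = 0 with f(x0) = 0, f''(x0) = -1.
    return 1.0 - math.cosh(x)

F_X0, F2_X0 = 0.0, -1.0
DELTA = 1.0
ETA = math.cosh(DELTA) - 1.0  # f(x) <= f(x0) - ETA whenever |x| >= DELTA
CUTOFF = 12.0                 # exp(f(x)) underflows to 0.0 well before |x| = 12

# Integral of exp(f(x)) over the real line (the n = 1 case), truncated at CUTOFF.
INTEGRAL_N1 = midpoint(lambda x: math.exp(f(x)), -CUTOFF, CUTOFF)

def normaliser(n):
    """exp(n f(x0)) * sqrt(2 pi / (n (-f''(x0))))."""
    return math.exp(n * F_X0) * math.sqrt(2 * math.pi / (n * -F2_X0))

def tail_ratio(n):
    """Contribution of |x| >= DELTA divided by the normaliser (f is even)."""
    tail = 2 * midpoint(lambda x: math.exp(n * f(x)), DELTA, CUTOFF)
    return tail / normaliser(n)

def tail_bound(n):
    """Bound on the normalised tail term from the proof."""
    return (math.exp(-(n - 1) * ETA) * math.sqrt(n) * math.exp(-F_X0)
            * INTEGRAL_N1 * math.sqrt(-F2_X0 / (2 * math.pi)))

for n in (1, 5, 20, 50):
    print(n, tail_ratio(n), tail_bound(n))
```

In this example both columns should shrink rapidly with n, with the actual tail staying below the bound.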

Other formulations

Laplace's approximation is sometimes written as

\int _{a}^{b}\!h(x)e^{Mg(x)}\,dx\approx {\sqrt {\frac {2\pi }{M|g''(x_{0})|}}}h(x_{0})e^{Mg(x_{0})}{\text{ as }}M\to \infty \,

where $h$ is positive.

Importantly, the accuracy of the approximation depends on the variable of integration, that is, on what stays in $g(x)$ and what goes into $h(x)$ .^[1]
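To make the remark about the split between $g$ and $h$ concrete, here is a small sketch using $\Gamma (N)=\int _{0}^{\infty }x^{N-1}e^{-x}\,dx$ as an assumed example (it is not taken from the cited reference). Substituting x = Nz gives $h(z)=1/z$ , $g(z)=\ln z-z$ with M = N, while substituting x = (N−1)z gives $h(z)=1$ with the same $g$ and M = N−1. Both have their maximum at z₀ = 1 with g''(z₀) = −1. The function names are hypothetical.

```python
import math

def estimate_split_a(N):
    """x = N z: Gamma(N) = N^N * integral of (1/z) exp(N (ln z - z)) dz, M = N."""
    h_z0 = 1.0  # h(z) = 1/z evaluated at z0 = 1
    return N ** N * math.sqrt(2 * math.pi / N) * h_z0 * math.exp(-N)

def estimate_split_b(N):
    """x = (N-1) z: Gamma(N) = (N-1)^N * integral of exp((N-1)(ln z - z)) dz, M = N-1."""
    M = N - 1
    return M ** N * math.sqrt(2 * math.pi / M) * math.exp(-M)

def relative_error(estimate, N):
    exact = math.gamma(N)
    return abs(estimate - exact) / exact

for N in (5, 10, 50):
    print(N,
          relative_error(estimate_split_a(N), N),
          relative_error(estimate_split_b(N), N))
```

The two splits approximate the same integral but give different relative errors. For this particular example the first split comes out slightly more accurate.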

In the multivariate case where $\mathbf {x}$ is a $d$ -dimensional vector and $f(\mathbf {x} )$ is a scalar function of $\mathbf {x}$ , Laplace's approximation is usually written as:

\int e^{Mf(\mathbf {x} )}\,d\mathbf {x} \approx \left({\frac {2\pi }{M}}\right)^{d/2}|H(f)(\mathbf {x} _{0})|^{-1/2}e^{Mf(\mathbf {x} _{0})}{\text{ as }}M\to \infty \,

where $H(f)(\mathbf {x} _{0})$ is the Hessian matrix of $f$ evaluated at $\mathbf {x} _{0}$ and $|\cdot |$ denotes the absolute value of its determinant.
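A two-dimensional check of the multivariate formula can be sketched as follows. The function $f(x,y)=-(x^{2}+xy+y^{2})-x^{4}$ is an assumed example with its unique maximum at the origin, where the Hessian is [[−2, −1], [−1, −2]] with determinant 3. The midpoint-rule grid and the box size are illustrative choices.

```python
import math

def f(x, y):
    # Assumed example with a unique global maximum at the origin.
    return -(x * x + x * y + y * y) - x ** 4

# Hessian of f at (0, 0) is [[-2, -1], [-1, -2]]; its determinant is 3.
HESSIAN_DET = (-2.0) * (-2.0) - (-1.0) * (-1.0)
D = 2  # dimension

def laplace_estimate(M):
    """(2 pi / M)^(d/2) * |det H|^(-1/2) * exp(M f(x0))."""
    return ((2 * math.pi / M) ** (D / 2) * abs(HESSIAN_DET) ** -0.5
            * math.exp(M * f(0.0, 0.0)))

def midpoint_2d(M, n=200):
    """Midpoint rule on a square box scaled to the width of the peak."""
    half_width = 6.0 / math.sqrt(M)
    h = 2 * half_width / n
    pts = [-half_width + (i + 0.5) * h for i in range(n)]
    return h * h * sum(math.exp(M * f(x, y)) for x in pts for y in pts)

def ratio(M):
    return midpoint_2d(M) / laplace_estimate(M)

for M in (25, 100, 400):
    print(M, ratio(M))
```

Under these assumptions the ratio should approach 1 as M increases, with the deviation at moderate M coming from the quartic term that the quadratic approximation ignores.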

Laplace's method extension: Steepest descent

In extensions of Laplace's method, complex analysis, and in particular Cauchy's integral formula, is used to find a contour of steepest descent for an (asymptotically with large M) equivalent integral, expressed as a line integral. In particular, if no point x₀ where the derivative of ƒ vanishes exists on the real line, it may be necessary to deform the integration contour to an optimal one, where the above analysis will be possible. Again the main idea is to reduce, at least asymptotically, the calculation of the given integral to that of a simpler integral that can be explicitly evaluated. See the book of Erdelyi (1956) for a simple discussion (where the method is termed steepest descents).

The appropriate formulation for the complex z-plane is

\int _{a}^{b}\!e^{Mf(z)}\,dz\approx {\sqrt {\frac {2\pi }{-Mf''(z_{0})}}}e^{Mf(z_{0})}{\text{ as }}M\to \infty .\,

for a path passing through the saddle point at z₀. Note the explicit appearance of a minus sign to indicate the direction of the second derivative: one must not take the modulus. Also note that if the integrand is meromorphic, one may have to add residues corresponding to poles traversed while deforming the contour (see for example section 3 of Okounkov's paper Symmetric functions and random partitions).
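A minimal numerical illustration of the complex formula, assuming the example $f(z)=iz-z^{2}/2$ : its derivative $i-z$ never vanishes on the real line, and the only saddle point is z₀ = i, where ƒ(z₀) = −1/2 and ƒ''(z₀) = −1. The sketch compares a midpoint-rule value of the real-line integral with the saddle-point expression. The names and quadrature parameters are illustrative.

```python
import cmath
import math

def f(z):
    # Assumed example: f'(z) = i - z vanishes only at z0 = i, off the real line.
    return 1j * z - z * z / 2

Z0 = 1j
F2_Z0 = -1.0  # f''(z) = -1 everywhere for this example

def saddle_estimate(M):
    """sqrt(2 pi / (-M f''(z0))) * exp(M f(z0)), with the explicit minus sign."""
    return cmath.sqrt(2 * math.pi / (-M * F2_Z0)) * cmath.exp(M * f(Z0))

def real_line_integral(M, half_width=10.0, n=20000):
    """Midpoint rule for the integral of exp(M f(x)) over a truncated real axis."""
    h = 2 * half_width / n
    return h * sum(cmath.exp(M * f(-half_width + (k + 0.5) * h))
                   for k in range(n))

for M in (2, 5, 10):
    print(M, real_line_integral(M), saddle_estimate(M))
```

Because this ƒ is exactly quadratic, the leading-order expression agrees with the integral up to quadrature error. For a general ƒ the agreement would only be asymptotic in M.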

Further generalizations

An extension of the steepest descent method is the so-called nonlinear stationary phase/steepest descent method. Here, instead of integrals, one needs to evaluate asymptotically solutions of Riemann–Hilbert factorization problems.

Given a contour C in the complex sphere, a function ƒ defined on that contour and a special point, say infinity, one seeks a function M holomorphic away from the contour C, with prescribed jump across C, and with a given normalization at infinity. If ƒ and hence M are matrices rather than scalars this is a problem that in general does not admit an explicit solution.

An asymptotic evaluation is then possible along the lines of the linear stationary phase/steepest descent method. The idea is to reduce asymptotically the solution of the given Riemann–Hilbert problem to that of a simpler, explicitly solvable, Riemann–Hilbert problem. Cauchy's theorem is used to justify deformations of the jump contour.

The nonlinear stationary phase was introduced by Deift and Zhou in 1993, based on earlier work of Its. A (properly speaking) nonlinear steepest descent method was introduced by Kamvissis, K. McLaughlin and P. Miller in 2003, based on previous work of Lax, Levermore, Deift, Venakides and Zhou.

The nonlinear stationary phase/steepest descent method has applications to the theory of soliton equations and integrable models, random matrices and combinatorics.

Complex integrals

For complex integrals in the form:

{\frac {1}{2\pi i}}\int _{c-i\infty }^{c+i\infty }g(s)e^{st}\,ds

with t >> 1, we make the substitution t = iu and the change of variable s = c + ix to get the bilateral Laplace transform:

{\frac {1}{2\pi }}\int _{-\infty }^{\infty }g(c+ix)e^{-ux}e^{icu}\,dx.

We then split g(c+ix) into its real and imaginary parts, after which we recover u = t / i. This is useful for inverse Laplace transforms, the Perron formula and complex integration.

Example 1: Stirling's approximation

Laplace's method can be used to derive Stirling's approximation

N!\approx {\sqrt {2\pi N}}N^{N}e^{-N}\,

for a large integer N.

From the definition of the Gamma function, we have

N!=\Gamma (N+1)=\int _{0}^{\infty }e^{-x}x^{N}\,dx.

Now we change variables, letting

x=Nz\,

so that

dx=N\,dz.

Plug these values back in to obtain

{\begin{aligned}N!&=\int _{0}^{\infty }e^{-Nz}\left(Nz\right)^{N}N\,dz\\&=N^{N+1}\int _{0}^{\infty }e^{-Nz}z^{N}\,dz\\&=N^{N+1}\int _{0}^{\infty }e^{-Nz}e^{N\ln z}\,dz\\&=N^{N+1}\int _{0}^{\infty }e^{N(\ln z-z)}\,dz.\end{aligned}}

This integral has the form necessary for Laplace's method with

f\left(z\right)=\ln {z}-z

which is twice-differentiable:

f'(z)={\frac {1}{z}}-1,\,

f''(z)=-{\frac {1}{z^{2}}}.\,

The maximum of ƒ(z) lies at z₀ = 1, and the second derivative of ƒ(z) has the value −1 at this point. Therefore, we obtain

N!\approx N^{N+1}{\sqrt {\frac {2\pi }{N}}}e^{-N}={\sqrt {2\pi N}}N^{N}e^{-N}.\,
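The quality of this approximation is easy to inspect numerically. The sketch below (the function name `stirling_ratio` is an illustrative choice) evaluates the ratio of Stirling's formula to N! in logarithmic form, using `math.lgamma` so that large N does not overflow.

```python
import math

def stirling_ratio(N):
    """sqrt(2 pi N) * N^N * e^(-N) divided by N!, computed via logarithms."""
    log_stirling = 0.5 * math.log(2 * math.pi * N) + N * math.log(N) - N
    log_factorial = math.lgamma(N + 1)  # log(N!)
    return math.exp(log_stirling - log_factorial)

for N in (1, 5, 10, 100, 1000):
    print(N, stirling_ratio(N))
```

The ratio stays slightly below 1 and approaches 1 as N grows.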

Example 2: parameter estimation and probabilistic inference

Azevedo-Filho & Shachter (1994) review Laplace's method results (univariate and multivariate) and present a detailed example showing the method used in parameter estimation and probabilistic inference under a Bayesian perspective. Laplace's method is applied to a meta-analysis problem from the medical domain, involving experimental data, and compared to other techniques.

References

[1] Butler, Ronald W (2007). Saddlepoint approximations and applications. Cambridge University Press. ISBN 978-0-521-87250-8.

Azevedo-Filho, A.; Shachter, R. (1994), "Laplace's Method Approximations for Probabilistic Inference in Belief Networks with Continuous Variables", in Mantaras, R.; Poole, D. (eds.), Uncertainty in Artificial Intelligence, San Francisco, CA: Morgan Kaufmann, CiteSeerX 10.1.1.91.2064.
Deift, P.; Zhou, X. (1993), "A steepest descent method for oscillatory Riemann–Hilbert problems. Asymptotics for the MKdV equation", Ann. of Math., vol. 137, no. 2, pp. 295–368, doi:10.2307/2946540.
Erdelyi, A. (1956), Asymptotic Expansions, Dover.
Fog, A. (2008), "Calculation Methods for Wallenius' Noncentral Hypergeometric Distribution", Communications in Statistics, Simulation and Computation, vol. 37, no. 2, pp. 258–273, doi:10.1080/03610910701790269.
Kamvissis, S.; McLaughlin, K. T.-R.; Miller, P. (2003), "Semiclassical Soliton Ensembles for the Focusing Nonlinear Schrödinger Equation", Annals of Mathematics Studies, vol. 154, Princeton University Press.
Laplace, P. S. (1774). Memoir on the probability of causes of events. Mémoires de Mathématique et de Physique, Tome Sixième. (English translation by S. M. Stigler 1986. Statist. Sci., 1(19):364–378).

This article incorporates material from saddle point approximation on PlanetMath, which is licensed under the Creative Commons Attribution/Share-Alike License.


@@ Line 115: / Line 115: @@
 And combining this with the lower bound gives the result.
-Note that the above proof obviously fails when <math> a = -\infty </math> or <math> b = \infty </math> (or both). To deal with these cases, we need some extra assumptions. A sufficient (not necessary) assumption is that for <math> n = 1 </math>, the integral <math> \int_a^b e^{nf(x)} \, dx </math> is finite, and that the number <math> \eta </math> as above exist (note that this must be an assumption in the case when the interval <math> [a,b] </math> is infinite). The proof proceeds otherwise as above, but the integrals
+Note that the above proof obviously fails when <math> a = -\infty </math> or <math> b = \infty </math> (or both). To deal with these cases, we need some extra assumptions. A sufficient (not necessary) assumption is that for <math> n = 1 </math>, the integral <math> \int_a^b e^{nf(x)} \, dx </math> is finite, and that the number <math> \eta </math> as above exists (note that this must be an assumption in the case when the interval <math> [a,b] </math> is infinite). The proof proceeds otherwise as above, but the integrals
 : <math>
@@ Line 131: / Line 131: @@
 : <math>
-\frac{e^{(n-1)(f(x_0) - \eta)} \int_a^b e^{f(x)} \, dx }{e^{nf(x_0)}\sqrt{\frac{2 \pi}{n (-f''(x_0))}}} = e^{-(n-1)\eta)} \sqrt{n} e^{-f(x_0)} \int_a^b e^{f(x)} \, dx \sqrt{\frac{ -f''(x_0)}{ 2 \pi}}
+\frac{e^{(n-1)(f(x_0) - \eta)} \int_a^b e^{f(x)} \, dx }{e^{nf(x_0)}\sqrt{\frac{2 \pi}{n (-f''(x_0))}}} = e^{-(n-1)\eta} \sqrt{n} e^{-f(x_0)} \int_a^b e^{f(x)} \, dx \sqrt{\frac{ -f''(x_0)}{ 2 \pi}}
 </math>