# Finite difference

A finite difference is a mathematical expression of the form f(x + b) − f(x + a). If a finite difference is divided by b − a, one gets a difference quotient. The approximation of derivatives by finite differences plays a central role in finite difference methods for the numerical solution of differential equations, especially boundary value problems.

Recurrence relations can be written as difference equations by replacing iteration notation with finite differences.

## Forward, backward, and central differences

Three forms are commonly considered: forward, backward, and central differences.

A forward difference is an expression of the form

$\Delta_h[f](x) = f(x + h) - f(x).$

Depending on the application, the spacing h may be variable or constant. When omitted, h is taken to be 1: $\Delta[f](x) = \Delta_1[f](x)$.

A backward difference uses the function values at x and x − h, instead of the values at x + h and x:

$\nabla_h[f](x) = f(x) - f(x-h).$

Finally, the central difference is given by

$\delta_h[f](x) = f(x+\tfrac12h)-f(x-\tfrac12h).$
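The three definitions can be written as a minimal Python sketch. The function names here are illustrative choices, not part of any library, and the default step h = 1 mirrors the convention stated above.

```python
def forward_difference(f, x, h=1.0):
    """Forward difference: f(x + h) - f(x)."""
    return f(x + h) - f(x)


def backward_difference(f, x, h=1.0):
    """Backward difference: f(x) - f(x - h)."""
    return f(x) - f(x - h)


def central_difference(f, x, h=1.0):
    """Central difference: f(x + h/2) - f(x - h/2)."""
    return f(x + h / 2) - f(x - h / 2)


def square(x):
    return x * x


# For f(x) = x^2 at x = 3 with h = 1:
print(forward_difference(square, 3.0))   # 16 - 9
print(backward_difference(square, 3.0))  # 9 - 4
print(central_difference(square, 3.0))   # 12.25 - 6.25
```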

## Relation with derivatives

The derivative of a function f at a point x is defined by the limit

$f'(x) = \lim_{h\to0} \frac{f(x+h) - f(x)}{h}.$

If h has a fixed (non-zero) value instead of approaching zero, then the right-hand side of the above equation would be written

$\frac{f(x + h) - f(x)}{h} = \frac{\Delta_h[f](x)}{h}.$

Hence, the forward difference divided by h approximates the derivative when h is small. The error in this approximation can be derived from Taylor's theorem. Assuming that f is differentiable, we have

$\frac{\Delta_h[f](x)}{h} - f'(x) \to 0 \quad \text{as }(h \to 0).$

The same formula holds for the backward difference:

$\frac{\nabla_h[f](x)}{h} - f'(x) \to 0 \quad \text{as }(h \to 0).$

However, the central difference yields a more accurate approximation. If f is twice differentiable,

$\frac{\delta_h[f](x)}{h} - f'(x) = o(h) . \!$
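The difference in accuracy can be observed numerically. The following sketch assumes the smooth test function sin(x) at x = 1 (an arbitrary choice) and prints the errors of the forward and central difference quotients as h is halved. For a function this smooth, one would expect the forward error to roughly halve and the central error to shrink by roughly a factor of four at each step.

```python
import math


def forward_quotient(f, x, h):
    """Forward difference divided by h."""
    return (f(x + h) - f(x)) / h


def central_quotient(f, x, h):
    """Central difference (half steps on either side) divided by h."""
    return (f(x + h / 2) - f(x - h / 2)) / h


x = 1.0
exact = math.cos(x)  # derivative of sin at x
for h in (0.1, 0.05, 0.025):
    err_forward = abs(forward_quotient(math.sin, x, h) - exact)
    err_central = abs(central_quotient(math.sin, x, h) - exact)
    print(f"h={h:<6} forward error={err_forward:.2e}  central error={err_central:.2e}")
```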

The main problem with the central difference method, however, is that oscillating functions can yield a zero derivative. If f(nh) = 1 for n odd and f(nh) = 2 for n even, then f′(nh) = 0 when it is calculated with the central difference scheme on the grid points, that is, as (f((n+1)h) − f((n−1)h))/(2h). This is particularly troublesome if the domain of f is discrete.
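A short sketch of this effect, assuming the grid function is sampled only at integer multiples of h, so that the central scheme spans two grid steps. The grid spacing 0.1 is an arbitrary choice.

```python
def f(n):
    """Grid function taking the value 1 at odd n and 2 at even n."""
    return 1 if n % 2 else 2


h = 0.1
# Central scheme on grid points: neighbours n - 1 and n + 1 always have equal values.
central = [(f(n + 1) - f(n - 1)) / (2 * h) for n in range(1, 6)]
# Forward scheme on the same grid sees the oscillation.
forward = [(f(n + 1) - f(n)) / h for n in range(1, 6)]

print(central)  # every entry is zero
print(forward)  # entries alternate in sign
```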

## Higher-order differences

In an analogous way, one can obtain finite difference approximations to higher-order derivatives and differential operators. For example, by using the above central difference formula for f′(x + h/2) and f′(x − h/2) and applying a central difference formula for the derivative of f′ at x, we obtain the central difference approximation of the second derivative of f:

2nd order central

$f''(x) \approx \frac{\delta_h^2[f](x)}{h^2} = \frac{f(x+h) - 2 f(x) + f(x-h)}{h^{2}} .$

Similarly we can apply other differencing formulas in a recursive manner.

2nd order forward

$f''(x) \approx \frac{\Delta_h^2[f](x)}{h^2} = \frac{f(x+2h) - 2 f(x+h) + f(x)}{h^{2}} .$

More generally, the n-th order forward, backward, and central differences are given by, respectively,

Forward

$\Delta^n_h[f](x) = \sum_{i = 0}^{n} (-1)^i \binom{n}{i} f(x + (n - i) h),$

or for h=1,

$\Delta^n [f](x)= \sum_{k=0}^n\binom nk(-1)^{n-k}f(x + k)$

Backward

$\nabla^n_h[f](x) = \sum_{i = 0}^{n} (-1)^i \binom{n}{i} f(x - ih),$

Central

$\delta^n_h[f](x) = \sum_{i = 0}^{n} (-1)^i \binom{n}{i} f\left(x + \left(\frac{n}{2} - i\right) h\right).$

These formulas use the binomial coefficients $\binom{n}{i}$ that appear after the summation sign. Row n of Pascal's triangle provides the coefficients for i = 0, ..., n.

Note that the central difference will, for odd n, have h multiplied by non-integers. This is often a problem because it amounts to changing the interval of discretization. The problem may be remedied by taking the average of $\delta^n[f](x - h/2)$ and $\delta^n[f](x + h/2)$.
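The three sums translate directly into code. This is a minimal sketch using `math.comb` from the Python standard library for the binomial coefficients. The function names are illustrative.

```python
from math import comb


def forward_diff_n(f, x, n, h=1.0):
    """n-th forward difference: sum of (-1)^i C(n, i) f(x + (n - i) h)."""
    return sum((-1) ** i * comb(n, i) * f(x + (n - i) * h) for i in range(n + 1))


def backward_diff_n(f, x, n, h=1.0):
    """n-th backward difference: sum of (-1)^i C(n, i) f(x - i h)."""
    return sum((-1) ** i * comb(n, i) * f(x - i * h) for i in range(n + 1))


def central_diff_n(f, x, n, h=1.0):
    """n-th central difference: sum of (-1)^i C(n, i) f(x + (n/2 - i) h)."""
    return sum((-1) ** i * comb(n, i) * f(x + (n / 2 - i) * h) for i in range(n + 1))


def cube(x):
    return x ** 3


# The third difference of a cubic with h = 1 is the constant 3! = 6.
print(forward_diff_n(cube, 2.0, 3))
print(backward_diff_n(cube, 2.0, 3))
# Second central difference of x^3 at x = 2: f(3) - 2 f(2) + f(1).
print(central_diff_n(cube, 2.0, 2))
```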

Forward differences applied to a sequence are sometimes called the binomial transform of the sequence, and have a number of interesting combinatorial properties. Forward differences may be evaluated using the Nörlund–Rice integral. The integral representation for these types of series is interesting, because the integral can often be evaluated using asymptotic expansion or saddle-point techniques; by contrast, the forward difference series can be extremely hard to evaluate numerically, because the binomial coefficients grow rapidly for large n.

The relationship of these higher-order differences with the respective derivatives is straightforward,

$\frac{d^n f}{d x^n}(x) = \frac{\Delta_h^n[f](x)}{h^n}+O(h) = \frac{\nabla_h^n[f](x)}{h^n}+O(h) = \frac{\delta_h^n[f](x)}{h^n} + O(h^2).$

Higher-order differences can also be used to construct better approximations. As mentioned above, the first-order difference approximates the first-order derivative up to a term of order h. However, the combination

$\frac{\Delta_h[f](x) - \frac12 \Delta_h^2[f](x)}{h} = - \frac{f(x+2h)-4f(x+h)+3f(x)}{2h}$

approximates f'(x) up to a term of order $h^2$. This can be proven by expanding the above expression in Taylor series, or by using the calculus of finite differences, explained below.
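A numerical sketch of the improved one-sided formula, assuming the test function exp(x) at x = 0.5 (an arbitrary choice). Halving h should roughly halve the error of the plain forward quotient and roughly quarter the error of the combined formula.

```python
import math


def forward_first_order(f, x, h):
    """Plain forward difference quotient."""
    return (f(x + h) - f(x)) / h


def forward_second_order(f, x, h):
    """(Delta_h f - Delta_h^2 f / 2) / h, written out in function values."""
    return -(f(x + 2 * h) - 4 * f(x + h) + 3 * f(x)) / (2 * h)


x = 0.5
exact = math.exp(x)  # exp is its own derivative
for h in (0.1, 0.05):
    e1 = abs(forward_first_order(math.exp, x, h) - exact)
    e2 = abs(forward_second_order(math.exp, x, h) - exact)
    print(f"h={h:<5} first-order error={e1:.2e}  second-order error={e2:.2e}")
```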

If necessary, the finite difference can be centered about any point by mixing forward, backward, and central differences.

### Arbitrarily sized kernels

Using a little linear algebra, one can fairly easily construct approximations, which sample an arbitrary number of points to the left and a (possibly different) number of points to the right of the center point, for any order of derivative. This involves solving a linear system such that the Taylor expansion of the sum of those points, around the center point, well approximates the Taylor expansion of the desired derivative.

This is useful for differentiating a function on a grid, where, as one approaches the edge of the grid, one must sample fewer and fewer points on one side.

The details are outlined in these notes.
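One possible way to set up such a linear system is sketched below. It is an illustration under stated assumptions, not necessarily the construction used in the notes referred to above. The sample points are taken at integer offsets s_j from the center, and the weights w_j are required to satisfy sum_j w_j s_j^m = d! for m = d and 0 for the other m < N, which matches the Taylor coefficients of the d-th derivative. The helper name `fd_weights` is hypothetical. Exact rational arithmetic is used so that the weights come out as fractions.

```python
from fractions import Fraction
from math import factorial


def fd_weights(offsets, deriv):
    """Weights w_j such that sum_j w_j f(x + s_j h) / h**deriv approximates f^(deriv)(x)."""
    n = len(offsets)
    if deriv >= n:
        raise ValueError("need more sample points than the derivative order")
    # Augmented matrix of: sum_j w_j * s_j**m = deriv! if m == deriv else 0
    rows = [
        [Fraction(s) ** m for s in offsets]
        + [Fraction(factorial(deriv) if m == deriv else 0)]
        for m in range(n)
    ]
    # Gauss-Jordan elimination; distinct offsets guarantee a pivot exists.
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]


# Symmetric stencil for the first and second derivative
print(fd_weights([-1, 0, 1], 1))
print(fd_weights([-1, 0, 1], 2))
# One-sided stencil, as needed at the left edge of a grid
print(fd_weights([0, 1, 2], 1))
```

The one-sided stencil reproduces the weights of the second-order forward formula given earlier, which is the kind of formula needed at the boundary of a grid.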

### Properties

• For all positive k and n:
$\Delta^n_{kh} (f, x) = \sum\limits_{i_1=0}^{k-1} \sum\limits_{i_2=0}^{k-1} \cdots \sum\limits_{i_n=0}^{k-1} \Delta^n_h (f, x+i_1h+i_2h+\cdots+i_nh).$
• Leibniz rule:
$\Delta^n_h (fg, x) = \sum\limits_{k=0}^n \binom{n}{k} \Delta^k_h (f, x) \Delta^{n-k}_h(g, x+kh).$
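Both identities can be spot-checked with integer arithmetic. The sketch below uses two arbitrary polynomials and small values of n and k. It is a check on one example, not a proof.

```python
from itertools import product
from math import comb


def delta(f, x, n, h):
    """n-th forward difference of f at x with step h."""
    return sum((-1) ** i * comb(n, i) * f(x + (n - i) * h) for i in range(n + 1))


def f(x):
    return x ** 3 - x


def g(x):
    return 2 * x + 5


x, h, n, k = 2, 1, 2, 3

# Step-scaling property: the difference with step k*h as a multiple sum of step-h differences
scaled = delta(f, x, n, k * h)
summed = sum(delta(f, x + sum(idx) * h, n, h) for idx in product(range(k), repeat=n))

# Leibniz rule for the n-th difference of a product
lhs = delta(lambda t: f(t) * g(t), x, n, h)
rhs = sum(comb(n, j) * delta(f, x, j, h) * delta(g, x + j * h, n - j, h) for j in range(n + 1))

print(scaled == summed, lhs == rhs)
```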

## Finite difference methods

An important application of finite differences is in numerical analysis, especially in numerical differential equations, which aims at the numerical solution of ordinary and partial differential equations. The idea is to replace the derivatives appearing in the differential equation by finite differences that approximate them. The resulting methods are called finite difference methods.

Common applications of the finite difference method are in computational science and engineering disciplines, such as thermal engineering, fluid mechanics, etc.
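As a small illustration of the idea, the following sketch solves the boundary value problem −u″(x) = r(x) on (0, 1) with u(0) = u(1) = 0. The model problem, the function name `solve_poisson_1d`, and the use of the Thomas algorithm for the tridiagonal system are all choices made for this example. The second derivative is replaced by the central difference (u(x+h) − 2u(x) + u(x−h))/h².

```python
def solve_poisson_1d(rhs, n):
    """Solve -u''(x) = rhs(x) on (0, 1), u(0) = u(1) = 0, on n interior grid points."""
    h = 1.0 / (n + 1)
    xs = [(i + 1) * h for i in range(n)]
    # Discrete equations: -u[i-1] + 2 u[i] - u[i+1] = h^2 rhs(x_i)
    diag = [2.0] * n
    d = [h * h * rhs(x) for x in xs]
    # Thomas algorithm, forward elimination (both off-diagonals are -1)
    for i in range(1, n):
        m = -1.0 / diag[i - 1]
        diag[i] -= m * (-1.0)
        d[i] -= m * d[i - 1]
    # Back substitution
    u = [0.0] * n
    u[-1] = d[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        u[i] = (d[i] + u[i + 1]) / diag[i]
    return xs, u


# With rhs = 2 the exact solution is u(x) = x (1 - x).
xs, u = solve_poisson_1d(lambda x: 2.0, 9)
max_error = max(abs(ui - xi * (1 - xi)) for xi, ui in zip(xs, u))
print(max_error)
```

Because the central second difference is exact for quadratics, the error in this particular example is only rounding error. For a general right-hand side the error would be expected to decrease as the grid is refined.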

## Newton's series

The Newton series consists of the terms of the Newton forward difference equation, named after Isaac Newton; in essence, it is the Newton interpolation formula, first published in his Principia Mathematica in 1687,[1] namely the discrete analog of the continuum Taylor expansion,

 $f(x)=\sum_{k=0}^\infty\frac{\Delta^k [f](a)}{k!} ~(x-a)_k = \sum_{k=0}^\infty {x-a \choose k}~ \Delta^k [f](a) ~,$

which holds for any polynomial function f and for most (but not all) analytic functions. Here, the expression

${x \choose k} = \frac{(x)_k}{k!}$

is the binomial coefficient, and

$(x)_k=x(x-1)(x-2)\cdots(x-k+1)$

is the "falling factorial" or "lower factorial", while the empty product $(x)_0$ is defined to be 1. In this particular case, unit steps are assumed for the changes in the values of x, that is, h = 1 in the generalization below.

Note also the formal correspondence of this result to Taylor's theorem. Historically, this, as well as the Chu–Vandermonde identity,

$(x+y)_n=\sum_{k=0}^n {n \choose k} (x)_{n-k} ~(y)_k ~,$

(following from it, and corresponding to the binomial theorem), are included in the observations which matured to the system of the umbral calculus.

To illustrate how one may use Newton's formula in actual practice, consider the first few terms of doubling the Fibonacci sequence f = 2, 2, 4, ... One can find a polynomial that reproduces these values by first computing a difference table, and then substituting the differences that correspond to $x_0$ (underlined) into the formula as follows,

$\begin{matrix} \begin{array}{|c||c|c|c|} \hline x & f=\Delta^0 & \Delta^1 & \Delta^2 \\ \hline 1&\underline{2}& & \\ & &\underline{0}& \\ 2&2& &\underline{2} \\ & &2& \\ 3&4& & \\ \hline \end{array} & \quad \begin{aligned} f(x) & =\Delta^0 \cdot 1 +\Delta^1 \cdot \dfrac{(x-x_0)_1}{1!} + \Delta^2 \cdot \dfrac{(x-x_0)_2}{2!} \quad (x_0=1)\\ \\ & =2 \cdot 1 + 0 \cdot \dfrac{x-1}{1} + 2 \cdot \dfrac{(x-1)(x-2)}{2} \\ \\ & =2 + (x-1)(x-2) \\ \end{aligned} \end{matrix}$
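The same computation can be sketched in Python. The helper names are illustrative. The nodes are assumed to be unit-spaced starting at x0, and `Fraction` is used so that the binomial coefficients are evaluated exactly.

```python
from fractions import Fraction


def leading_differences(values):
    """Return [Delta^0 f, Delta^1 f, ...] at the first node (the underlined entries)."""
    row, lead = list(values), []
    while row:
        lead.append(row[0])
        row = [b - a for a, b in zip(row, row[1:])]
    return lead


def newton_forward(values, x0, x):
    """Evaluate the Newton polynomial through unit-spaced nodes x0, x0 + 1, ..."""
    total = Fraction(0)
    binom = Fraction(1)  # holds the binomial coefficient C(x - x0, k)
    for k, d in enumerate(leading_differences(values)):
        total += binom * d
        binom = binom * (Fraction(x) - x0 - k) / (k + 1)
    return total


values = [2, 2, 4]
print(leading_differences(values))                       # differences at x0 = 1
print([newton_forward(values, 1, x) for x in (1, 2, 3)])  # reproduces the data
print(newton_forward(values, 1, 4))                       # 2 + (4 - 1)(4 - 2)
```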

For the case of nonuniform steps in the values of x, Newton computes the divided differences,

$\Delta _{j,0}=y_j,\quad \quad \Delta _{j,k}=\frac{\Delta _{j+1,k-1}-\Delta _{j,k-1}}{x_{j+k}-x_j}\quad \ni \quad \left\{ k>0,\ \ j\le \max \left( j \right)-k \right\},\quad \quad \Delta 0_k=\Delta _{0,k}$

the series of products,

${P_0}=1,\quad \quad P_{k+1}=P_k\cdot \left( \xi -x_k \right) ~,$

and the resulting polynomial is the scalar product, $f(\xi ) = \Delta 0 \cdot P\left( \xi \right)$ .[2]
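A minimal sketch of the divided-difference construction for nonuniform nodes follows. The function names and the sample data are chosen for illustration. `divided_differences` returns the top edge of the table, the coefficients Δ_{0,k}, and `newton_eval` accumulates the products P_k while forming the scalar product.

```python
def divided_differences(xs, ys):
    """Return the coefficients Delta_{0,k} for k = 0, ..., len(xs) - 1."""
    col = list(ys)  # column k of the table, starting with k = 0
    coeffs = [col[0]]
    for k in range(1, len(xs)):
        col = [(col[j + 1] - col[j]) / (xs[j + k] - xs[j]) for j in range(len(col) - 1)]
        coeffs.append(col[0])
    return coeffs


def newton_eval(xs, coeffs, xi):
    """Evaluate sum_k Delta_{0,k} * P_k(xi) with P_0 = 1, P_{k+1} = P_k * (xi - x_k)."""
    total, prod = 0.0, 1.0
    for k, c in enumerate(coeffs):
        total += c * prod
        prod *= xi - xs[k]
    return total


# Samples of x^2 + 1 at unevenly spaced nodes
xs = [0, 1, 3]
ys = [1, 2, 10]
coeffs = divided_differences(xs, ys)
print(coeffs)
print(newton_eval(xs, coeffs, 2))  # value of the interpolating polynomial at x = 2
```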

In analysis with p-adic numbers, Mahler's theorem states that the assumption that f is a polynomial function can be weakened all the way to the assumption that f is merely continuous.

Carlson's theorem provides necessary and sufficient conditions for a Newton series to be unique, if it exists. However, a Newton series will not, in general, exist.

The Newton series, together with the Stirling series and the Selberg series, is a special case of the general difference series, all of which are defined in terms of suitably scaled forward differences.

In a compressed and slightly more general form, with equidistant nodes, the formula reads

$f(x)=\sum_{k=0}^\infty{\frac{x-a}h \choose k} \sum_{j=0}^k (-1)^{k-j}{k\choose j}f(a+j h).$

## Calculus of finite differences

The forward difference can be considered as a difference operator,[3][4] which maps the function f to $\Delta_h[f]$. This operator amounts to

$\Delta_h = T_h-I, \,$

where $T_h$ is the shift operator with step h, defined by $T_h[f](x) = f(x+h)$, and I is the identity operator.

The finite difference of higher orders can be defined in a recursive manner as $\Delta_h^n \equiv \Delta_h(\Delta_h^{n-1})$. Another equivalent definition is $\Delta_h^n = [T_h - I]^n$.

The difference operator $\Delta_h$ is a linear operator, and it satisfies a special Leibniz rule indicated above, $\Delta_h(f(x)g(x)) = (\Delta_h f(x))\, g(x+h) + f(x)\, (\Delta_h g(x))$. Similar statements hold for the backward and central differences.

Formally applying the Taylor series with respect to h yields the formula

$\Delta_h = hD + \frac{1}{2} h^2D^2 + \frac{1}{3!} h^3D^3 + \cdots = \mathrm{e}^{hD} - I ~,$

where D denotes the continuum derivative operator, mapping f to its derivative f'. The expansion is valid when both sides act on analytic functions, for sufficiently small h. Thus, $T_h = e^{hD}$, and formally inverting the exponential yields

$hD = \log(1+\Delta_h) = \Delta_h - \tfrac{1}{2} \Delta_h^2 + \tfrac{1}{3} \Delta_h^3 + \cdots. \,$

This formula holds in the sense that both operators give the same result when applied to a polynomial.

Even for analytic functions, the series on the right is not guaranteed to converge; it may be an asymptotic series. However, it can be used to obtain more accurate approximations for the derivative. For instance, retaining the first two terms of the series yields the second-order approximation to f’(x) mentioned at the end of the section Higher-order differences.
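For a polynomial the logarithmic series terminates, so the identity can be checked exactly. The sketch below uses the cubic x³ with h = 1 as an arbitrary example and truncates the series hD = Δ − Δ²/2 + Δ³/3 − ... after a given number of terms. The function names are illustrative.

```python
from fractions import Fraction
from math import comb


def delta(f, x, n, h=1):
    """n-th forward difference of f at x with step h."""
    return sum((-1) ** i * comb(n, i) * f(x + (n - i) * h) for i in range(n + 1))


def derivative_from_log_series(f, x, terms, h=1):
    """Truncation of hD = Delta - Delta^2/2 + Delta^3/3 - ..., divided by h."""
    total = sum(Fraction((-1) ** (k + 1), k) * delta(f, x, k, h)
                for k in range(1, terms + 1))
    return total / h


def cube(x):
    return x ** 3


# The exact derivative of x^3 at x = 2 is 12.
for terms in (1, 2, 3, 4):
    print(terms, derivative_from_log_series(cube, 2, terms))
```

With three terms the result is exact for the cubic, and further terms contribute nothing because the fourth and higher differences of a cubic vanish.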

The analogous formulas for the backward and central difference operators are

$hD = -\log(1-\nabla_h) \quad\text{and}\quad hD = 2 \, \operatorname{arsinh}(\tfrac12\delta_h).$

The calculus of finite differences is related to the umbral calculus of combinatorics. This remarkably systematic correspondence is due to the identity of the commutators of the umbral quantities to their continuum analogs (h→0 limits),

 $\Bigl[ \frac{\Delta_h}{h} ~,~ x\, T^{-1}_h \Bigr] = [ D ~,~ x ] = I ~.$

A large number of formal differential relations of standard calculus involving functions f(x) thus map systematically to umbral finite-difference analogs involving $f(xT_h^{-1})$.

For instance, the umbral analog of a monomial $x^n$ is a generalization of the above falling factorial (Pochhammer k-symbol),

$~(x)_n\equiv (xT_h^{-1})^n=x (x-h) (x-2h) \cdots (x-(n-1)h)$ ,

so that

$\frac{\Delta_h}{h} ~(x)_n=n ~(x)_{n-1} ~,$

hence the above Newton interpolation formula (by matching coefficients in the expansion of an arbitrary function f(x) in such symbols), and so on.
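The power-rule analog for the generalized falling factorial can be verified on a concrete case. In this sketch the values x = 7, n = 4, and h = 2 are arbitrary, and the function name is illustrative.

```python
def falling_factorial(x, n, h=1):
    """Generalized falling factorial x (x - h) (x - 2h) ... (x - (n - 1) h)."""
    result = 1
    for j in range(n):
        result *= x - j * h
    return result


x, n, h = 7, 4, 2
# Left side: forward difference of (x)_n divided by h
lhs = (falling_factorial(x + h, n, h) - falling_factorial(x, n, h)) / h
# Right side: n (x)_{n-1}
rhs = n * falling_factorial(x, n - 1, h)
print(lhs, rhs)
```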

For example, the umbral sine is

$\sin (x\,T_h^{-1}) = x -\frac{(x)_3}{3!} + \frac{(x)_5}{5!} - \frac{(x)_7}{7!} + \cdots .$

As in the continuum limit, the eigenfunction of $\Delta_h/h$ also happens to be an exponential,

$\frac{\Delta_h}{h}~(1+\lambda h)^{x/h} =\frac{\Delta_h}{h} ~e^{\ln (1+\lambda h) ~x/h}= \lambda ~e^{\ln (1+\lambda h) ~x/h} ~,$

and hence Fourier sums of continuum functions are readily mapped to umbral Fourier sums faithfully, i.e., involving the same Fourier coefficients multiplying these umbral basis exponentials.[5] This umbral exponential thus amounts to the exponential generating function of the Pochhammer symbols.

Thus, for instance, the Dirac delta function maps to its umbral correspondent, the cardinal sine function,

$\delta (x) \mapsto \frac{\sin \bigl[ \frac{\pi}{2}(1+x/h) \bigr]}{ \pi (x+h) }~,$

and so forth.[6] Difference equations can often be solved with techniques very similar to those for solving differential equations.

The inverse operator of the forward difference operator, and hence the umbral integral, is the indefinite sum or antidifference operator.

## Rules for calculus of finite difference operators

Analogous to rules for finding the derivative, we have:

• Constant rule: If c is a constant, then
$\Delta c = 0$
• Linearity: If a and b are constants,
$\Delta (a f + b g) = a \,\Delta f + b \,\Delta g$

All of the above rules apply equally well to any difference operator, including $\nabla$ as well as $\Delta$.

• Product rule:
$\Delta (f g) = f \,\Delta g + g \,\Delta f + \Delta f \,\Delta g$
$\nabla (f g) = f \,\nabla g + g \,\nabla f - \nabla f \,\nabla g$
• Quotient rule:
$\nabla \left( \frac{f}{g} \right) = \frac{1}{g} \det \begin{bmatrix} \nabla f & \nabla g \\ f & g \end{bmatrix} \left( \det {\begin{bmatrix} g & \nabla g \\ 1 & 1 \end{bmatrix}}\right)^{-1}$
or
$\nabla\left( \frac{f}{g} \right)= \frac {g \,\nabla f - f \,\nabla g}{g \cdot (g - \nabla g)}$
$\Delta\left( \frac{f}{g} \right)= \frac {g \,\Delta f - f \,\Delta g}{g \cdot (g + \Delta g)}$
• Summation rules:
$\sum_{n=a}^{b} \Delta f(n) = f(b+1)-f(a)$
$\sum_{n=a}^{b} \nabla f(n) = f(b)-f(a-1)$
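The product, quotient, and summation rules for the forward difference with unit step can be spot-checked with exact rational arithmetic. The two sequences below are arbitrary examples and the helper name `fwd` is illustrative.

```python
from fractions import Fraction


def fwd(f):
    """Return the forward difference of f with unit step, as a function of n."""
    return lambda n: f(n + 1) - f(n)


def f(n):
    return Fraction(n * n + 1)


def g(n):
    return Fraction(2 * n + 3)


n = 4
product_lhs = fwd(lambda m: f(m) * g(m))(n)
product_rhs = f(n) * fwd(g)(n) + g(n) * fwd(f)(n) + fwd(f)(n) * fwd(g)(n)

quotient_lhs = fwd(lambda m: f(m) / g(m))(n)
quotient_rhs = (g(n) * fwd(f)(n) - f(n) * fwd(g)(n)) / (g(n) * (g(n) + fwd(g)(n)))

a, b = 2, 6
telescoped = sum(fwd(f)(m) for m in range(a, b + 1))

print(product_lhs == product_rhs)
print(quotient_lhs == quotient_rhs)
print(telescoped == f(b + 1) - f(a))
```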

## Generalizations

• A generalized finite difference is usually defined as
$\Delta_h^\mu[f](x) = \sum_{k=0}^N \mu_k f(x+kh),$

where $\mu = (\mu_0,\ldots,\mu_N)$ is its coefficient vector. An infinite difference is a further generalization, where the finite sum above is replaced by an infinite series. Another generalization makes the coefficients $\mu_k$ depend on the point $x$: $\mu_k=\mu_k(x)$, which gives a weighted finite difference. One may also make the step $h$ depend on the point $x$: $h=h(x)$. Such generalizations are useful for constructing different moduli of continuity.

• The generalized difference can be seen as an element of the polynomial ring $R[T_h]$. This leads to difference algebras.
• As a convolution operator: Via the formalism of incidence algebras, difference operators and other Möbius inversion can be represented by convolution with a function on the poset, called the Möbius function μ; for the difference operator, μ is the sequence (1, −1, 0, 0, 0, ...).

## Finite difference in several variables

Finite differences can be considered in more than one variable. They are analogous to partial derivatives in several variables.

Some partial derivative approximations are (using central step method):

$f_{x}(x,y) \approx \frac{f(x+h ,y) - f(x-h,y)}{2h}$
$f_{y}(x,y) \approx \frac{f(x,y+k ) - f(x,y-k)}{2k}$
$f_{xx}(x,y) \approx \frac{f(x+h ,y) - 2 f(x,y) + f(x-h,y)}{h^2}$
$f_{yy}(x,y) \approx \frac{f(x,y+k) - 2 f(x,y) + f(x,y-k)}{k^2}$
$f_{xy}(x,y) \approx \frac{f(x+h,y+k) - f(x+h,y-k) - f(x-h,y+k) + f(x-h,y-k)}{4hk} ~.$

Alternatively, for applications in which the computation of f is the most costly step, and both first and second derivatives must be computed, a more efficient formula for the last case is

$f_{xy}(x,y) \approx \frac{f(x+h, y+k) - f(x+h, y) - f(x, y+k) + 2 f(x,y) - f(x-h, y) - f(x, y-k) + f(x-h, y-k)}{2hk} ~,$

since the only values to be computed which are not already needed for the previous four equations are f(x+h, y+k) and f(x−h, y−k).
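The sample reuse can be made explicit in code. The sketch below evaluates f at seven points and forms all five approximations from them. The function name `partials` and the test polynomial are illustrative assumptions.

```python
def partials(f, x, y, h, k):
    """Central-difference estimates of f_x, f_y, f_xx, f_yy, f_xy from seven samples."""
    f00 = f(x, y)
    fp0, fm0 = f(x + h, y), f(x - h, y)
    f0p, f0m = f(x, y + k), f(x, y - k)
    fpp, fmm = f(x + h, y + k), f(x - h, y - k)  # the only two extra samples
    fx = (fp0 - fm0) / (2 * h)
    fy = (f0p - f0m) / (2 * k)
    fxx = (fp0 - 2 * f00 + fm0) / h ** 2
    fyy = (f0p - 2 * f00 + f0m) / k ** 2
    fxy = (fpp - fp0 - f0p + 2 * f00 - fm0 - f0m + fmm) / (2 * h * k)
    return fx, fy, fxx, fyy, fxy


def poly(x, y):
    return x * x * y + 3 * x * y + y * y


# Exact values at (1, 2): f_x = 10, f_y = 8, f_xx = 4, f_yy = 2, f_xy = 5
print(partials(poly, 1.0, 2.0, 0.1, 0.2))
```

For this cubic polynomial the truncation errors of all five formulas vanish, so the printed values agree with the exact derivatives up to rounding. For general functions the approximations are only accurate to second order in h and k.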