# Newton polynomial

In the mathematical field of numerical analysis, a Newton polynomial, named after its inventor Isaac Newton, is the interpolation polynomial for a given set of data points in the Newton form. The Newton polynomial is sometimes called Newton's divided differences interpolation polynomial because the coefficients of the polynomial are calculated using divided differences.

For any given finite set of data points, there is only one polynomial, of least possible degree, that passes through all of them. Thus, it is more appropriate to speak of "the Newton form of the interpolation polynomial" rather than of "the Newton interpolation polynomial". Like the Lagrange form, it is merely another way to write the same polynomial.

## Definition

Given a set of k + 1 data points

$(x_0, y_0),\ldots,(x_k, y_k)$

where no two $x_j$ are the same, the interpolation polynomial in the Newton form is a linear combination of Newton basis polynomials

$N(x) := \sum_{j=0}^{k} a_{j} n_{j}(x)$

with the Newton basis polynomials defined as

$n_j(x) := \prod_{i=0}^{j-1} (x - x_i)$

for j > 0 and $n_0(x) \equiv 1$.

The coefficients are defined as

$a_j := [y_0,\ldots,y_j]$

where

$[y_0,\ldots,y_j]$

is the notation for divided differences.

Thus the Newton polynomial can be written as

$N(x) = [y_0] + [y_0,y_1](x-x_0) + \cdots + [y_0,\ldots,y_k](x-x_0)(x-x_1)\cdots(x-x_{k-1}).$
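The definition above can be sketched in code as follows. This is a minimal illustration, not a reference implementation: the function names `divided_differences` and `newton_eval` and the sample data are assumptions made for this example. The coefficients $a_j = [y_0,\ldots,y_j]$ are built in place, and $N(x)$ is evaluated by nested multiplication.

```python
def divided_differences(xs, ys):
    """Return [a_0, ..., a_k] with a_j = [y_0, ..., y_j]."""
    coeffs = list(ys)
    n = len(xs)
    for level in range(1, n):
        # Work from the bottom up so lower-order entries are still available.
        for i in range(n - 1, level - 1, -1):
            coeffs[i] = (coeffs[i] - coeffs[i - 1]) / (xs[i] - xs[i - level])
    return coeffs


def newton_eval(xs, coeffs, x):
    """Evaluate N(x) = sum_j a_j n_j(x) by nested multiplication."""
    result = coeffs[-1]
    for j in range(len(coeffs) - 2, -1, -1):
        result = result * (x - xs[j]) + coeffs[j]
    return result


# Hypothetical sample data: four points on the cubic x**3 - 2x + 1.
xs = [0.0, 1.0, 2.0, 4.0]
ys = [x**3 - 2 * x + 1 for x in xs]
a = divided_differences(xs, ys)
value = newton_eval(xs, a, 3.0)  # the cubic itself gives 27 - 6 + 1 = 22
```

Because four points determine a cubic uniquely, the interpolant here reproduces the sampled cubic at any $x$.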

The Newton polynomial above can be expressed in a simplified form when the nodes $x_0, x_1, \dots, x_k$ are consecutive and equally spaced. Introducing the notation $h = x_{i+1}-x_i$ for each $i=0,1,\dots,k-1$ and $x=x_0+sh$, the difference $x-x_i$ can be written as $(s-i)h$, since $x_i = x_0 + ih$. So the Newton polynomial becomes:

\begin{align} N(x) &= [y_0] + [y_0,y_1]sh + \cdots + [y_0,\ldots,y_k] s (s-1) \cdots (s-k+1){h}^{k} \\ &= \sum_{i=0}^{k}s(s-1) \cdots (s-i+1){h}^{i}[y_0,\ldots,y_i] \\ &= \sum_{i=0}^{k}{s \choose i}i!{h}^{i}[y_0,\ldots,y_i] \end{align}

This is called the Newton forward divided difference formula.
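For equally spaced nodes the forward formula can be evaluated from ordinary forward differences, using the standard identity $\Delta^i y_0 = i!\,h^i\,[y_0,\ldots,y_i]$, so that each term is ${s \choose i}\Delta^i y_0$. The sketch below assumes that identity; the names `forward_differences` and `newton_forward` and the sample data are chosen only for illustration.

```python
def forward_differences(ys):
    """Return the leading forward differences of orders 0, 1, ..., k at y_0."""
    row = list(ys)
    leading = [row[0]]
    while len(row) > 1:
        row = [b - a for a, b in zip(row, row[1:])]
        leading.append(row[0])
    return leading


def newton_forward(x0, h, ys, x):
    """Evaluate sum_i C(s, i) * (i-th forward difference), s = (x - x0) / h."""
    s = (x - x0) / h
    total = 0.0
    binom = 1.0  # C(s, 0)
    for i, delta in enumerate(forward_differences(ys)):
        total += binom * delta
        binom *= (s - i) / (i + 1)  # C(s, i+1) = C(s, i) * (s - i) / (i + 1)
    return total


# Hypothetical data: x**2 sampled at 1.0, 1.5, 2.0, 2.5 (so h = 0.5).
ys = [x**2 for x in (1.0, 1.5, 2.0, 2.5)]
approx = newton_forward(1.0, 0.5, ys, 1.75)  # reproduces 1.75**2 = 3.0625
```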

If the nodes are reordered as ${x}_{k},{x}_{k-1},\dots,{x}_{0}$, the Newton polynomial becomes:

$N(x)=[y_k]+[{y}_{k}, {y}_{k-1}](x-{x}_{k})+\cdots+[{y}_{k},\ldots,{y}_{0}](x-{x}_{k})(x-{x}_{k-1})\cdots(x-{x}_{1})$

If ${x}_{k},\;{x}_{k-1},\;\dots,\;{x}_{0}$ are equally spaced with $x={x}_{k}+sh$ and ${x}_{i}={x}_{k}-(k-i)h$ for $i = 0, 1, \dots, k$, then

\begin{align} N(x) &= [{y}_{k}]+ [{y}_{k}, {y}_{k-1}]sh+\cdots+[{y}_{k},\ldots,{y}_{0}]s(s+1)\cdots(s+k-1){h}^{k} \\ &=\sum_{i=0}^{k}{(-1)}^{i}{-s \choose i}i!{h}^{i}[{y}_{k},\ldots,{y}_{k-i}] \end{align}

This is called the Newton backward divided difference formula.
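The backward formula admits a similar sketch. It assumes the standard identity $\nabla^i y_k = i!\,h^i\,[y_k,\ldots,y_{k-i}]$ for backward differences on equally spaced nodes, so each term is $\frac{s(s+1)\cdots(s+i-1)}{i!}\nabla^i y_k$. The function names and data are illustrative assumptions.

```python
def backward_differences(ys):
    """Return the backward differences of orders 0, 1, ..., k at the last sample."""
    row = list(ys)
    trailing = [row[-1]]
    while len(row) > 1:
        row = [b - a for a, b in zip(row, row[1:])]
        trailing.append(row[-1])
    return trailing


def newton_backward(xk, h, ys, x):
    """Evaluate the backward formula with s = (x - xk) / h."""
    s = (x - xk) / h
    total = 0.0
    factor = 1.0  # s (s+1) ... (s+i-1) / i!, which is 1 for i = 0
    for i, nabla in enumerate(backward_differences(ys)):
        total += factor * nabla
        factor *= (s + i) / (i + 1)
    return total


# Hypothetical data: x**2 sampled at 1.0, 1.5, 2.0, 2.5; the last node is 2.5.
ys = [x**2 for x in (1.0, 1.5, 2.0, 2.5)]
approx = newton_backward(2.5, 0.5, ys, 2.25)  # reproduces 2.25**2 = 5.0625
```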

## Significance

Newton's formula is of interest because it is the straightforward and natural differences-version of Taylor's polynomial. Taylor's polynomial tells where a function will go, based on its y value, and its derivatives (its rate of change, and the rate of change of its rate of change, etc.) at one particular x value. Newton's formula is Taylor's polynomial based on finite differences instead of instantaneous rates of change.

As with other difference formulas, the degree of Newton's interpolating polynomial can be increased by adding more terms and points without discarding existing ones. Newton's form has the simplicity that the new points are always added at one end: Newton's forward formula can add new points to the right, and Newton's backward formula can add new points to the left. Unfortunately, the accuracy of polynomial interpolation depends on how close the interpolated point is to the middle of the x values of the set of points used; as Newton's form always adds new points at the same end, an increase in degree cannot be used to increase the accuracy anywhere but at that end. Gauss, Stirling, and Bessel all developed formulae to remedy that problem.

Gauss's formula alternately adds new points at the left and right ends, thereby keeping the set of points centered near the same place (near the evaluated point). When so doing, it uses terms from Newton's formula, with data points and x values renamed in keeping with one's choice of what data point is designated as the $x_0$ data point.
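The renaming idea can be illustrated by feeding the nodes to Newton's form in an alternating right/left order around a chosen central node. This is only a sketch in the spirit of Gauss's formula, not Gauss's formula in its classical difference notation; `zigzag_order` and the data are hypothetical. With all points used, every ordering yields the same polynomial, which the example checks.

```python
def divided_differences(xs, ys):
    """Return [a_0, ..., a_k] with a_j = [y_0, ..., y_j]."""
    coeffs = list(ys)
    n = len(xs)
    for level in range(1, n):
        for i in range(n - 1, level - 1, -1):
            coeffs[i] = (coeffs[i] - coeffs[i - 1]) / (xs[i] - xs[i - level])
    return coeffs


def newton_eval(xs, coeffs, x):
    """Evaluate the Newton form by nested multiplication."""
    result = coeffs[-1]
    for j in range(len(coeffs) - 2, -1, -1):
        result = result * (x - xs[j]) + coeffs[j]
    return result


def zigzag_order(n, center):
    """Indices center, center+1, center-1, center+2, ... kept within 0..n-1."""
    order = [center]
    step = 1
    while len(order) < n:
        for idx in (center + step, center - step):
            if 0 <= idx < n:
                order.append(idx)
        step += 1
    return order


# Hypothetical data: 2**x on five equally spaced nodes, evaluated near the middle.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0**x for x in xs]
x_eval = 2.3

order = zigzag_order(len(xs), center=2)  # [2, 3, 1, 4, 0]
xs_c = [xs[i] for i in order]
ys_c = [ys[i] for i in order]

full_plain = newton_eval(xs, divided_differences(xs, ys), x_eval)
full_centered = newton_eval(xs_c, divided_differences(xs_c, ys_c), x_eval)
```

The two full-degree values agree up to rounding; the orderings differ only in which low-degree partial sums they produce along the way.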

Stirling's formula remains centered about a particular data point, for use when the evaluated point is nearer to a data point than to a middle of two data points. Bessel's formula remains centered about a particular middle between two data points, for use when the evaluated point is nearer to a middle than to a data point. They achieve that by sometimes using the average of two differences where Newton's or Gauss's would use just one difference. Stirling's does that in odd-degree terms; Bessel's does that in even-degree terms. Calculating and averaging two differences need not involve extra work, since it can be done by formula in advance: the expression for the averaged difference is no more complicated than that of the simple difference.

## Strengths and weaknesses of various formulae

The suitability of Stirling's, Bessel's and Gauss's formulae depends on 1) the importance of the small accuracy gain given by average differences; and 2) if greater accuracy is necessary, whether the interpolated point is closer to a data point or to a middle between two data points.

In general, the difference methods can be a good choice when one does not know in advance how many points, and hence what degree of interpolating polynomial, will be needed for the desired accuracy, and when one wants to look first at linear and other low-degree interpolation, successively judging accuracy by the difference in the results of two successive polynomial degrees. Lagrange's formula (not a difference formula) allows that also, but going to the next higher degree without redoing work requires that each term's value be recorded, which is not a problem with a computer but may be awkward with a calculator.
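Judging accuracy from successive degrees is straightforward in the Newton form, because the difference between the degree-$j$ and degree-$(j-1)$ results is exactly the $j$-th term. The following sketch stops once that term falls below a tolerance; the function `interpolate_adaptive`, the tolerance, and the data are assumptions for illustration, and a small last term is only a heuristic indicator of accuracy.

```python
import math


def divided_differences(xs, ys):
    """Return [a_0, ..., a_k] with a_j = [y_0, ..., y_j]."""
    coeffs = list(ys)
    n = len(xs)
    for level in range(1, n):
        for i in range(n - 1, level - 1, -1):
            coeffs[i] = (coeffs[i] - coeffs[i - 1]) / (xs[i] - xs[i - level])
    return coeffs


def interpolate_adaptive(xs, ys, x, tol):
    """Add Newton terms one at a time; stop when the newest term is below tol.

    Returns (value, degree used).
    """
    coeffs = divided_differences(xs, ys)
    value = coeffs[0]
    basis = 1.0
    for j in range(1, len(coeffs)):
        basis *= x - xs[j - 1]   # n_j(x) = n_{j-1}(x) * (x - x_{j-1})
        term = coeffs[j] * basis
        value += term            # change between degree j-1 and degree j
        if abs(term) < tol:
            return value, j
    return value, len(coeffs) - 1


# Hypothetical data: exp sampled on six nodes, evaluated near the first node.
xs = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
ys = [math.exp(x) for x in xs]
value, degree = interpolate_adaptive(xs, ys, 0.05, tol=1e-4)
```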

Other than that, Lagrange is easier to calculate than the difference methods, and is (probably rightly) regarded by many as the best choice when one already knows what polynomial degree will be needed. And when all the interpolation will be done at one x value, with only the data points' y values varying from one problem to another, Lagrange's formula becomes so much more convenient that it begins to be the only choice to consider.

Lagrange's formula's ease of calculation is best achieved by its "barycentric forms". Its 2nd barycentric form might be the most efficient of all when using a computer, but its 1st barycentric form might be more convenient when using a calculator.
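For comparison, the second barycentric form can be sketched as below. It evaluates $\sum_j \frac{w_j}{x-x_j} y_j \big/ \sum_j \frac{w_j}{x-x_j}$ with weights $w_j = 1/\prod_{m\ne j}(x_j-x_m)$. The function names and sample data are illustrative assumptions.

```python
def barycentric_weights(xs):
    """Return w_j = 1 / prod_{m != j} (x_j - x_m)."""
    weights = []
    for j, xj in enumerate(xs):
        w = 1.0
        for m, xm in enumerate(xs):
            if m != j:
                w /= xj - xm
        weights.append(w)
    return weights


def barycentric_eval(xs, ys, weights, x):
    """Evaluate the interpolant with the second barycentric form."""
    num = 0.0
    den = 0.0
    for xj, yj, wj in zip(xs, ys, weights):
        if x == xj:
            return yj  # avoid division by zero exactly at a node
        t = wj / (x - xj)
        num += t * yj
        den += t
    return num / den


# Hypothetical data: four points on the cubic x**3 - 2x + 1.
xs = [0.0, 1.0, 2.0, 4.0]
ys = [x**3 - 2 * x + 1 for x in xs]
w = barycentric_weights(xs)
value = barycentric_eval(xs, ys, w, 3.0)  # the cubic gives 22
```

The weights depend only on the nodes, so they can be reused when only the $y$ values change from one problem to another.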

With the Newton form of the interpolating polynomial, a compact and effective algorithm exists for combining the terms to find the coefficients of the polynomial.[1]
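One possible way to combine the terms is to expand the nested form from the inside out, multiplying by $(x - x_j)$ and adding $a_j$ at each step. The sketch below shows this approach; it is not necessarily the algorithm of the cited reference, and `newton_to_monomial` is a name chosen here.

```python
def newton_to_monomial(xs, coeffs):
    """Return [c_0, ..., c_k] such that N(x) = sum_i c_i * x**i."""
    poly = [coeffs[-1]]
    for j in range(len(coeffs) - 2, -1, -1):
        shifted = [0.0] + poly           # poly * x
        for i, c in enumerate(poly):
            shifted[i] -= xs[j] * c      # minus x_j * poly
        shifted[0] += coeffs[j]          # plus a_j
        poly = shifted
    return poly


# Hypothetical data: Newton coefficients of x**3 - 2x + 1 on nodes 0, 1, 2, 4.
xs = [0.0, 1.0, 2.0, 4.0]
newton_coeffs = [1.0, -1.0, 3.0, 1.0]
monomial = newton_to_monomial(xs, newton_coeffs)  # [1.0, -2.0, 0.0, 1.0]
```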

### Accuracy

When a particular data point is designated as $x_0$, then as the evaluated point approaches that data point, the difference formula terms after the constant term tend toward zero. Therefore, Stirling's formula is at its best in the region where it is less needed. Bessel's is at its best when the evaluated point is near the middle between two data points, and therefore Bessel's is at its best when the added accuracy is most needed. So, Bessel's formula could be said to be the most consistently accurate difference formula, and, in general, the most consistently accurate of the familiar polynomial interpolation formulas.

It should be added that, when Bessel's or Stirling's gains a little accuracy over Gauss's and Lagrange's, it would be unusual for that extra accuracy to be needed. No one should quit using Lagrange's or Gauss's because of it.

When, with Stirling's or Bessel's, the last term used includes the average of two differences, then one more point is being used than Newton's or other polynomial interpolations would use for the same polynomial degree. So, in that instance, Stirling's or Bessel's is not putting an N−1 degree polynomial through N points, but is, instead, trading equivalence with Newton's for better centering and accuracy, giving those methods sometimes potentially greater accuracy, for a given polynomial degree, than other polynomial interpolations.

The other difference formulas, such as those of Stirling, Bessel and Gauss, can be derived from Newton's, using Newton's terms, with data points and x values renamed in keeping with the choice of $x_0$, and based on the fact that they must add up to the same sum as Newton's. (With Stirling's, that is so when the polynomial degree is even; with Bessel's, when the polynomial degree is odd.)

## General case

For the special case of $x_i = i$, there is a closely related set of polynomials, also called the Newton polynomials, that are simply the binomial coefficients for general argument. That is, one also has the Newton polynomials $p_n(z)$ given by

$p_n(z)={z \choose n}= \frac{z(z-1)\cdots(z-n+1)}{n!}$

In this form, the Newton polynomials generate the Newton series. These are in turn a special case of the general difference polynomials which allow the representation of analytic functions through generalized difference equations.
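As a small illustration, a finite Newton series $\sum_n \Delta^n f(0)\, p_n(z)$ can be evaluated from samples $f(0), f(1), \ldots$ as sketched below. The function names and the choice of $f(i) = i^3$ are assumptions for this example; for a polynomial $f$ the finite sum reproduces $f$ exactly.

```python
def binomial_poly(z, n):
    """p_n(z) = z (z-1) ... (z-n+1) / n!, the binomial coefficient for real z."""
    value = 1.0
    for i in range(n):
        value *= (z - i) / (i + 1)
    return value


def newton_series(samples, z):
    """Sum over n of (n-th forward difference at 0) * p_n(z); samples[i] = f(i)."""
    row = list(samples)
    total = 0.0
    n = 0
    while row:
        total += row[0] * binomial_poly(z, n)
        row = [b - a for a, b in zip(row, row[1:])]
        n += 1
    return total


# Hypothetical data: f(i) = i**3 for i = 0, ..., 4.
samples = [float(i**3) for i in range(5)]
value = newton_series(samples, 2.5)  # 2.5**3 = 15.625
```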

## Main idea

Solving an interpolation problem leads to a problem in linear algebra where we have to solve a system of linear equations. Using a standard monomial basis for our interpolation polynomial we get the very complicated Vandermonde matrix. By choosing another basis, the Newton basis, we get a system of linear equations with a much simpler lower triangular matrix which can be solved faster.

For k + 1 data points we construct the Newton basis as

$n_j(x) := \prod_{i=0}^{j-1} (x - x_i) \qquad j=0,\ldots,k.$

Using these polynomials as a basis for $\Pi_k$ we have to solve

$\begin{bmatrix} 1 & & \ldots & & 0 \\ 1 & x_1-x_0 & & & \\ 1 & x_2-x_0 & (x_2-x_0)(x_2-x_1) & & \vdots \\ \vdots & \vdots & & \ddots & \\ 1 & x_k-x_0 & \ldots & \ldots & \prod_{j=0}^{k-1}(x_k - x_j) \end{bmatrix} \begin{bmatrix} a_0 \\ \\ \vdots \\ \\ a_{k} \end{bmatrix} = \begin{bmatrix} y_0 \\ \\ \vdots \\ \\ y_{k} \end{bmatrix}$

to solve the polynomial interpolation problem.

This system of equations can be solved recursively by solving

$\sum_{i=0}^{j} a_{i} n_{i}(x_j) = y_j \qquad j = 0,\dots,k.$
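Because the matrix is lower triangular, the recursion amounts to forward substitution: each $a_j$ follows from the $j$-th equation once $a_0,\ldots,a_{j-1}$ are known. A minimal sketch, with function names and data chosen for illustration:

```python
def newton_basis(xs, j, x):
    """n_j(x) = prod_{i<j} (x - x_i), with n_0(x) = 1."""
    value = 1.0
    for i in range(j):
        value *= x - xs[i]
    return value


def solve_newton_system(xs, ys):
    """Solve sum_{i<=j} a_i n_i(x_j) = y_j for j = 0..k by forward substitution."""
    coeffs = []
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        # Contribution of the already known coefficients a_0, ..., a_{j-1}.
        known = sum(a * newton_basis(xs, i, xj) for i, a in enumerate(coeffs))
        coeffs.append((yj - known) / newton_basis(xs, j, xj))
    return coeffs


# Hypothetical data: four points on the cubic x**3 - 2x + 1.
xs = [0.0, 1.0, 2.0, 4.0]
ys = [1.0, 0.0, 5.0, 57.0]
a = solve_newton_system(xs, ys)  # [1.0, -1.0, 3.0, 1.0]
```

The diagonal entries $n_j(x_j)$ are nonzero exactly when the nodes are distinct, which is why the system always has a unique solution.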

## Taylor polynomial

The limit of the Newton polynomial if all nodes coincide is a Taylor polynomial, because the divided differences become derivatives.

\begin{align} &\lim_{(x_0,\dots,x_n)\to(z,\dots,z)} \Big( f[x_0] + f[x_0,x_1]\cdot(\xi-x_0) + \dots + f[x_0,\dots,x_n]\cdot(\xi-x_0)\cdot\dots\cdot(\xi-x_{n-1}) \Big) \\ &\qquad = f(z) + f'(z)\cdot(\xi-z) + \dots + \frac{f^{(n)}(z)}{n!}\cdot(\xi-z)^n \end{align}
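The limit can be observed numerically by clustering the nodes near $z$ and comparing the divided differences with the Taylor coefficients $f^{(j)}(z)/j!$. The sketch below assumes $f = \exp$, $z = 0.5$, and a node spacing of $10^{-3}$; these values are arbitrary, and much smaller spacings would suffer from floating-point cancellation.

```python
import math


def divided_differences(xs, ys):
    """Return [f[x_0], f[x_0,x_1], ..., f[x_0,...,x_n]]."""
    coeffs = list(ys)
    n = len(xs)
    for level in range(1, n):
        for i in range(n - 1, level - 1, -1):
            coeffs[i] = (coeffs[i] - coeffs[i - 1]) / (xs[i] - xs[i - level])
    return coeffs


z, h = 0.5, 1e-3
xs = [z + i * h for i in range(4)]             # nodes clustered near z
coeffs = divided_differences(xs, [math.exp(x) for x in xs])
taylor = [math.exp(z) / math.factorial(j) for j in range(4)]
errors = [abs(c - t) for c, t in zip(coeffs, taylor)]
```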

## Application

As can be seen from the definition of the divided differences, new data points can be added to the data set to create a new interpolation polynomial without recalculating the old coefficients. And when a data point changes, we usually do not have to recalculate all coefficients. Furthermore, if the $x_i$ are distributed equidistantly, the calculation of the divided differences becomes significantly easier. Therefore, the Newton form of the interpolation polynomial is usually preferred over the Lagrange form for practical purposes, although, in fact (and contrary to widespread claims), Lagrange, too, allows calculation of the next higher degree interpolation without redoing previous calculations, and is considerably easier to evaluate.
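Adding a point without touching the old coefficients can be sketched as follows. The new coefficient is obtained from the condition that the extended polynomial pass through the new point, $a_{k+1} = \big(y_{\text{new}} - N(x_{\text{new}})\big) / n_{k+1}(x_{\text{new}})$. The function name `extend_newton` and the data are assumptions for this example.

```python
def extend_newton(xs, coeffs, x_new, y_new):
    """Append one node; existing coefficients are reused unchanged."""
    value = 0.0   # accumulates N(x_new) for the current polynomial
    basis = 1.0   # accumulates n_j(x_new)
    for xj, aj in zip(xs, coeffs):
        value += aj * basis
        basis *= x_new - xj
    # After the loop, basis equals n_{k+1}(x_new).
    return xs + [x_new], coeffs + [(y_new - value) / basis]


# Hypothetical data: a quadratic interpolant on nodes 0, 1, 2 ...
xs = [0.0, 1.0, 2.0]
coeffs = [1.0, -1.0, 3.0]
# ... extended by the point (4, 57).
xs2, coeffs2 = extend_newton(xs, coeffs, 4.0, 57.0)  # coeffs2 ends with 1.0
```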

### Example

The divided differences can be written in the form of a table. For example, suppose a function $f$ is to be interpolated at the points $x_0, \ldots, x_n$. Write

$\begin{matrix} x_0 & f(x_0) & & \\ & & {f(x_1)-f(x_0)\over x_1 - x_0} & \\ x_1 & f(x_1) & & {{f(x_2)-f(x_1)\over x_2 - x_1}-{f(x_1)-f(x_0)\over x_1 - x_0} \over x_2 - x_0} \\ & & {f(x_2)-f(x_1)\over x_2 - x_1} & \\ x_2 & f(x_2) & & \vdots \\ & & \vdots & \\ \vdots & & & \vdots \\ & & \vdots & \\ x_n & f(x_n) & & \\ \end{matrix}$

Then the interpolating polynomial is formed as above using the topmost entries in each column as coefficients.

For example, suppose we are to construct the interpolating polynomial to f(x) = tan(x) using divided differences, at the points

$\begin{matrix} x_0=-\tfrac{3}{2} & x_1=-\tfrac{3}{4} & x_2=0 & x_3=\tfrac{3}{4} & x_4=\tfrac{3}{2} \\ f(x_0)=-14.1014 & f(x_1)=-0.931596 & f(x_2)=0 & f(x_3)=0.931596 & f(x_4)=14.1014 \end{matrix}$

Using six digits of accuracy, we construct the table

$\begin{matrix} -\tfrac{3}{2} & -14.1014 & & & &\\ & & 17.5597 & & &\\ -\tfrac{3}{4} & -0.931596 & & -10.8784 & &\\ & & 1.24213 & & 4.83484 & \\ 0 & 0 & & 0 & & 0\\ & & 1.24213 & & 4.83484 &\\ \tfrac{3}{4} & 0.931596 & & 10.8784 & &\\ & & 17.5597 & & &\\ \tfrac{3}{2} & 14.1014 & & & &\\ \end{matrix}$

Thus, the interpolating polynomial is

\begin{align} &-14.1014+17.5597(x+\tfrac{3}{2})-10.8784(x+\tfrac{3}{2})(x+\tfrac{3}{4}) +4.83484(x+\tfrac{3}{2})(x+\tfrac{3}{4})(x)+0(x+\tfrac{3}{2})(x+\tfrac{3}{4})(x)(x-\tfrac{3}{4}) \\ &\qquad =-0.00005-1.4775x-0.00001x^2+4.83484x^3 \end{align}

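Given more digits of accuracy in the table, the first and third coefficients (the constant term and the coefficient of $x^2$) will be found to be zero, as expected because $\tan$ is an odd function and the nodes are symmetric about zero.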
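The worked example can be checked in double precision with the sketch below. The helper names are chosen for this illustration; the computed values agree with the six-digit table only up to the rounding used there.

```python
import math


def divided_differences(xs, ys):
    """Return the topmost entries of the divided-difference table."""
    coeffs = list(ys)
    n = len(xs)
    for level in range(1, n):
        for i in range(n - 1, level - 1, -1):
            coeffs[i] = (coeffs[i] - coeffs[i - 1]) / (xs[i] - xs[i - level])
    return coeffs


def newton_to_monomial(xs, coeffs):
    """Expand the Newton form into coefficients of 1, x, x**2, ..."""
    poly = [coeffs[-1]]
    for j in range(len(coeffs) - 2, -1, -1):
        shifted = [0.0] + poly
        for i, c in enumerate(poly):
            shifted[i] -= xs[j] * c
        shifted[0] += coeffs[j]
        poly = shifted
    return poly


xs = [-1.5, -0.75, 0.0, 0.75, 1.5]
ys = [math.tan(x) for x in xs]
coeffs = divided_differences(xs, ys)       # compare with the table's top diagonal
monomial = newton_to_monomial(xs, coeffs)  # constant and x**2 terms are near zero
```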