# Loss of significance

Loss of significance is an undesirable effect in calculations using floating-point arithmetic. It occurs when an operation on two numbers increases relative error substantially more than it increases absolute error, for example in subtracting two nearly equal numbers (known as catastrophic cancellation). The effect is that the number of accurate (significant) digits in the result is reduced unacceptably. Ways to avoid this effect are studied in numerical analysis.

## Demonstration of the problem

The effect can be demonstrated with decimal numbers. The following example shows loss of significance for a decimal floating-point data type with 10 significant digits:

Consider the decimal number

   0.1234567891234567890


A floating-point representation of this number on a machine that keeps 10 floating-point digits would be

   0.1234567891


which is fairly close – the difference is very small in comparison with either of the two numbers.

Now perform the calculation

   0.1234567891234567890 − 0.1234567890


The answer, accurate to 10 digits, is

   0.0000000001234567890


However, on the 10-digit floating-point machine, the calculation yields

   0.1234567891 − 0.1234567890 = 0.0000000001


Whereas the original numbers are accurate in all of the first (most significant) 10 digits, their floating-point difference is only accurate in its first nonzero digit. This amounts to loss of significance.
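The 10-digit machine above can be simulated with Python's `decimal` module (a sketch: the unary plus rounds each stored value to the context precision, mimicking how the machine would store it):

```python
from decimal import Decimal, getcontext

getcontext().prec = 10  # a 10-digit decimal floating-point "machine"

# Unary plus applies the context rounding, so each number is stored
# with only 10 significant digits, as on the hypothetical machine.
x = +Decimal("0.1234567891234567890")  # stored as 0.1234567891
y = +Decimal("0.1234567890")           # stored as 0.1234567890

print(x - y)  # 1E-10 -- only one significant digit survives
```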

## Workarounds

It is possible to do computations using an exact fractional representation of rational numbers and keep all significant digits, but this is often prohibitively slow compared with floating-point arithmetic. Furthermore, it usually only postpones the problem: what if the data are accurate to only ten digits? The same effect will occur.
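For illustration, Python's `fractions.Fraction` performs such exact rational arithmetic; a sketch using the numbers from the example above:

```python
from fractions import Fraction

# Exact rational arithmetic: every digit of the difference is kept,
# at the cost of carrying arbitrarily large integers.
x = Fraction("0.1234567891234567890")
y = Fraction("0.1234567890")

print(x - y)  # 123456789/1000000000000000000, exactly 1.23456789e-10
```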

One of the most important parts of numerical analysis is to avoid or minimize loss of significance in calculations. If the underlying problem is well-posed, there should be a stable algorithm for solving it. The art is in finding it.

## Loss of significant bits

Let $x$ and $y$ be positive normalized floating-point numbers.

In the subtraction $x - y$, $r$ significant bits are lost, where

$q \le r \le p$
$2^{-p} \le 1 - \frac{y}{x} \le 2^{-q}$

for some positive integers p and q.
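Rearranging the bound above, the number of bits lost is roughly $-\log_2(1 - y/x)$. A minimal Python sketch of this estimate (the function name `significant_bits_lost` is my own):

```python
import math

def significant_bits_lost(x: float, y: float) -> float:
    """Estimate the bits of significance lost in x - y, for 0 < y < x.

    From the bound 2**-p <= 1 - y/x <= 2**-q, roughly
    -log2(1 - y/x) bits cancel in the subtraction.
    """
    return -math.log2(1.0 - y / x)

# Nearly equal operands lose many bits; distant ones lose almost none.
print(significant_bits_lost(1.000001, 1.0))  # about 20 bits
print(significant_bits_lost(2.0, 1.0))       # 1 bit
```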

## Instability of the quadratic equation

For example, consider the venerable quadratic equation:

$a x^2 + b x + c = 0$,

with the two exact solutions:

$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.$

This formula may not always produce an accurate result. For example, when c is very small, loss of significance can occur in either of the root calculations, depending on the sign of b.

The case $a = 1$, $b = 200$, $c = -0.000015$ will serve to illustrate the problem:

$x^2 + 200 x - 0.000015 = 0.$

We have

$\sqrt{b^2 - 4 a c} = \sqrt{200^2 + 4 \times 1 \times 0.000015} = 200.00000015...$

In real arithmetic, the roots are

$( -200 - 200.00000015 ) / 2 = -200.000000075,$
$( -200 + 200.00000015 ) / 2 = 0.000000075.$

In 10-digit floating-point arithmetic,

$( -200 - 200.0000001 ) / 2 = -200.00000005,$
$( -200 + 200.0000001 ) / 2 = 0.00000005.$

Notice that the solution of greater magnitude is accurate to ten digits, but the first nonzero digit of the solution of lesser magnitude is wrong.

Because of the subtraction in the quadratic formula, it does not constitute a stable algorithm for calculating the two roots.
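The same instability appears in ordinary IEEE double precision, where the cancellation destroys roughly half of the 16 available digits (a sketch; the function name is my own):

```python
import math

def quadratic_naive(a, b, c):
    """Textbook quadratic formula -- unstable when b*b >> 4*a*c."""
    d = math.sqrt(b * b - 4 * a * c)
    return (-b - d) / (2 * a), (-b + d) / (2 * a)

# a = 1, b = 200, c = -0.000015, the example above.
big, small = quadratic_naive(1.0, 200.0, -0.000015)
print(big)    # close to -200.000000075, nearly full accuracy
print(small)  # close to 7.5e-08, but the low digits are corrupted
              # by cancellation in -b + d
```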

## A better algorithm

A better algorithm for solving quadratic equations is based on two observations: that one solution is always accurate when the other is not, and that given one solution of the quadratic, the other is easy to find.

If

\begin{alignat}{3} & x_1 && = \frac{-b + \sqrt{b^2 - 4ac}}{2a} \qquad & \text{(1)} \\ \end{alignat}

and

\begin{alignat}{3} & x_2 && = \frac{2c}{-b + \sqrt{b^2 - 4ac}} \qquad & \text{(2)} \\ \end{alignat}

then we have the identity (one of Viète's formulas for a second degree polynomial)

$x_1 x_2 = c / a.$

Formulas (1) and (2) avoid cancellation when the coefficient $b$ is negative: $-b$ is then positive, so the numerator $-b + \sqrt{b^2 - 4ac}$ adds two positive quantities and no significant digits are lost. When $b$ is positive, however, that numerator subtracts two nearly equal quantities, and a different pair of formulas must be used:

\begin{alignat}{3} & x_1 && = \frac{-b - \sqrt{b^2 - 4ac}}{2a} \qquad & \text{(3)} \\ \end{alignat}

and

\begin{alignat}{3} & x_2 && = \frac{2c}{-b - \sqrt{b^2 - 4ac}} \qquad & \text{(4)} \\ \end{alignat}

In formulas (3) and (4), $-b$ is negative when $b$ is positive, so $-b - \sqrt{b^2 - 4ac}$ again combines two quantities of the same sign and involves no cancellation. In our example the coefficient $b = 200$ is positive, so formulas (3) and (4) are the ones to use.

The algorithm is as follows: use the appropriate formula to compute the solution of greater magnitude, which does not suffer from loss of precision, then use the identity $x_1 x_2 = c / a$ to calculate the other root. Since no subtraction is involved, no loss of precision occurs.

Applying this algorithm to our problem, and using 10-digit floating-point arithmetic, the solution of greater magnitude, as before, is $x_1 = -200.00000005.$ The other solution is then

$x_2 = c / (-200.00000005) = 0.000000075,$

which is accurate. However, there remains a further possible source of cancellation in computing $b^2 - 4ac$, and it can cost up to half of the significant figures; correcting for this requires computing $b^2 - 4ac$ in extended precision, with twice the precision of the final result [1] (see quadratic equation for details).
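The stable algorithm can be sketched in double precision as follows (a sketch, not a definitive implementation; `math.copysign` is used to select whichever of formulas (1) or (3) avoids the cancellation):

```python
import math

def quadratic_stable(a, b, c):
    """Solve a*x**2 + b*x + c = 0 without catastrophic cancellation.

    Assumes a != 0 and a nonzero larger root. The larger root is
    computed with the sign chosen so that -b and the square root
    combine without cancellation; the other follows from the
    Viete identity x1 * x2 = c / a.
    """
    d = math.sqrt(b * b - 4 * a * c)
    # copysign(d, b) gives -b - d when b > 0 and -b + d when b < 0.
    x1 = (-b - math.copysign(d, b)) / (2 * a)
    x2 = c / (a * x1)
    return x1, x2

big, small = quadratic_stable(1.0, 200.0, -0.000015)
print(big, small)  # both roots now carry full double-precision accuracy
```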