NaN
From Wikipedia, the free encyclopedia
In computing, NaN, which stands for Not a Number, is a value or symbol that is usually produced as the result of an operation on invalid input operands, especially in floating-point calculations. For example, most floating-point units are unable to explicitly calculate the square root of negative numbers, and will instead indicate that the operation was invalid and return a NaN result. NaNs may also be used to represent missing values in computations.[1][2]
NaNs in floating point
In floating-point calculations, NaN is not the same as infinity, although both are typically handled as special cases in floating-point representations of real numbers as well as in floating-point operations. An invalid operation is also not the same as an arithmetic overflow (which might return an infinity) or an arithmetic underflow (which would return the smallest normal number, a denormal number, or zero).
IEEE 754 NaNs are represented with the exponent field filled with ones and some non-zero number in the significand. A bit-wise example of an IEEE floating-point standard single precision NaN: x111 1111 1axx xxxx xxxx xxxx xxxx xxxx where x means don't care. If a = 1, it is a quiet NaN; otherwise it is a signalling NaN.
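The layout above can be checked mechanically. The following Python sketch classifies a single precision bit pattern supplied as an integer. The function name classify_binary32 is hypothetical, and the quiet-bit test assumes the convention described above (a = 1 means quiet).

```python
def classify_binary32(bits: int) -> str:
    """Classify a 32-bit IEEE 754 single precision pattern given as an integer."""
    exponent = (bits >> 23) & 0xFF   # the 8 exponent bits
    fraction = bits & 0x7FFFFF       # the 23 fraction (significand) bits
    if exponent != 0xFF:
        return "finite"
    if fraction == 0:
        return "infinity"            # exponent all ones, fraction zero
    # The top fraction bit is the 'a' bit in the pattern above.
    return "quiet NaN" if fraction & 0x400000 else "signalling NaN"


for pattern in (0x7FC00000, 0x7F800001, 0x7F800000, 0x3F800000):
    print(f"{pattern:#010x}: {classify_binary32(pattern)}")
```

The sign bit is ignored throughout, matching the leading x in the pattern.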
A NaN does not compare equal to any floating-point number or NaN, even if the latter has an identical representation. One can therefore test whether a variable holds a NaN by comparing it to itself: if x = x gives false then x is a NaN. Likewise, x ≠ x gives true. Some compilers may incorrectly transform these expressions to true and false respectively, following the rules of ordinary arithmetic. A function such as IsNaN(x) should therefore be made available in systems offering such facilities. If two values are to be compared, the simple test if x = y then ... does not catch the case where x and y are both NaN, even if they have identical bit patterns. Although in many cases an algorithm devised before NaNs were considered will still do more or less the "right thing" when NaNs are possible, odd effects can arise. Search algorithms will have particular difficulty.
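As a minimal Python sketch of the self-comparison test and of the search problem: the helper names is_nan and find are hypothetical, while math.isnan is the standard library's equivalent of an IsNaN function.

```python
import math


def is_nan(x: float) -> bool:
    """Self-comparison test: only a NaN is unequal to itself."""
    return x != x


def find(values, target):
    """Naive linear search that relies on ==, as older algorithms might."""
    for index, value in enumerate(values):
        if value == target:
            return index
    return -1


nan = float("nan")
print(is_nan(nan), is_nan(1.0))      # the self-comparison test
print(math.isnan(nan))               # the library-provided test
print(find([1.0, nan, 2.0], nan))    # the NaN at index 1 is never matched
```

The search fails because the equality test inside the loop is false for every element, including the NaN itself.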
In the IEEE floating-point standard, arithmetic operations involving NaN always produce NaN, allowing the value to propagate through a calculation so that errors can be detected at the end without extensive testing during intermediate stages.
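A small Python sketch of this propagation follows. The polynomial function is an arbitrary illustrative calculation, not taken from any source; the point is that a single check at the end is enough to detect the invalid input.

```python
import math


def polynomial(x: float) -> float:
    """An arbitrary multi-step calculation with no intermediate checks."""
    return 3.0 * x * x - 2.0 * x + 1.0


results = [polynomial(x) for x in (0.0, 1.0, math.nan)]

# One test at the end finds the result contaminated by a NaN input.
bad = [i for i, r in enumerate(results) if math.isnan(r)]
print(results[:2], bad)
```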
In the revised IEEE 754 standard the same rule applies, except that a few anomalous functions (such as the maxnum function, which returns the maximum of two operands which are expected to be numbers) favor numbers: if just one of the operands is a NaN then the value of the other operand is returned.
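The number-favoring rule can be sketched in Python as below. The function max_num is a hypothetical illustration of the rule as described here, not an implementation taken from the standard or from any library.

```python
import math


def max_num(a: float, b: float) -> float:
    """Return the larger operand, preferring a number over a NaN."""
    if math.isnan(a):
        return b          # also covers the case where both are NaN
    if math.isnan(b):
        return a
    return a if a >= b else b


print(max_num(1.0, math.nan))
print(max_num(math.nan, 2.0))
print(max_num(math.nan, math.nan))
```

By contrast, Python's built-in max is order dependent when a NaN is present, because it is built on ordinary comparisons, all of which are false for a NaN.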
The NaN 'toolbox' for GNU Octave and MATLAB goes one step further and skips all NaNs. NaNs are assumed to represent missing values and so the statistical functions ignore NaNs in the data instead of propagating them. Every computation in the NaN toolbox is based on the non-NaN data only.
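The difference between propagating and skipping NaNs can be sketched in Python as follows. The function nan_mean is a hypothetical stand-in for this style of statistic and is not the toolbox's actual API.

```python
import math


def nan_mean(values):
    """Mean of the non-NaN entries; NaN if nothing is left."""
    kept = [v for v in values if not math.isnan(v)]
    return sum(kept) / len(kept) if kept else math.nan


data = [1.0, math.nan, 3.0]
propagating = sum(data) / len(data)   # the NaN contaminates the result
skipping = nan_mean(data)             # the NaN is treated as missing
print(propagating, skipping)
```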
Since the representation of a NaN code has many bits whose value is unspecified, one possibility would be to agree that certain bits would be set to on for different causative occasions: one for x/0, another for the square root of negative numbers, and so on. Then, when one NaN condition is to be combined with another, these bits would be subjected to an or operation so that at the end of a sequence, the final NaN code would give some indication of the provenance of the errors encountered along the way. Similarly, it could be agreed that other bits would be available for the user's indications of errors. All this would require formalisms for inspecting and manipulating the component fields.
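The idea can be sketched in Python for double precision values. Everything here is an assumption made for illustration: the cause codes DIV_ZERO and SQRT_NEG, the helper names, and the expectation that payload bits survive a round trip through struct (typical on common platforms, but not guaranteed). Hardware arithmetic does not combine payloads this way; the combination is done explicitly in software.

```python
import math
import struct

QUIET_NAN = 0x7FF8000000000000      # exponent all ones plus the quiet bit
PAYLOAD_MASK = 0x0007FFFFFFFFFFFF   # the 51 bits below the quiet bit

# Hypothetical cause codes, one bit each.
DIV_ZERO = 0b01
SQRT_NEG = 0b10


def to_bits(x: float) -> int:
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def from_bits(bits: int) -> float:
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def make_nan(causes: int) -> float:
    """Build a quiet NaN carrying the given cause bits in its payload."""
    return from_bits(QUIET_NAN | (causes & PAYLOAD_MASK))


def causes_of(x: float) -> int:
    """Extract the payload bits of a NaN."""
    return to_bits(x) & PAYLOAD_MASK


def combine(a: float, b: float) -> float:
    """OR the payloads of two NaNs, as the text proposes."""
    return make_nan(causes_of(a) | causes_of(b))


both = combine(make_nan(DIV_ZERO), make_nan(SQRT_NEG))
print(math.isnan(both), bin(causes_of(both)))
```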
How is a NaN created?
The following practices may cause NaNs:[3]
- All mathematical operations with a NaN as at least one operand
- The divisions 0/0, ∞/∞, ∞/-∞, -∞/∞, and -∞/-∞
- The multiplications 0×∞ and 0×-∞
- The additions ∞ + (-∞), (-∞) + ∞ and equivalent subtractions.
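- Applying a function to arguments outside its domain, including:
  - Taking the square root of a negative number
  - Taking the logarithm of a negative number
  - Taking the tangent of an odd multiple of 90 degrees (or π/2 radians); in binary floating point π/2 is not exactly representable, so in practice the tangent of the nearest representable value is a large finite number rather than a NaN
  - Taking the inverse sine or cosine of a number which is less than -1 or greater than +1.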
In the last case, NaN conditions may result during the function's calculations simply as a consequence of the arithmetic, or, the function's calculations might have been revised so as to deal with and generate special codes in an organized manner. These special codes are yet to become standard across all (or most) computer designs, and until their usage becomes standardized general purpose computer languages will not have their standards expanded to incorporate them. Thus, the descriptions of standard functions such as square root seldom discuss these non-standard issues.
In addition to the above, NaNs may also be assigned explicitly to variables. This is particularly useful as a representation for missing values. Prior to the IEEE standard, programmers often used a special value (such as -99999999) to represent missing values, but this is always risky since legitimate data matching the special value could be misidentified.[1]
It is important to realize that these NaNs are not necessarily generated by the processor itself. For quiet NaNs, the first item in the list holds on every processor; the others may not. For example, on Intel Architecture processors, the FPU never creates a NaN except in the first case, unless the corresponding floating-point exception mask bits have been set.[4] Otherwise the other items cause exceptions, not NaNs. However, the software exception handler may examine the operands and decide to return a NaN (e.g. in the case of 0/0).
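The split between NaN results and exceptions can be observed from a high-level language as well. The Python sketch below reflects CPython's behaviour specifically: arithmetic on infinities yields NaNs, whereas 0.0/0.0 and out-of-domain math functions raise exceptions instead. The helper name raises is hypothetical.

```python
import math

inf = math.inf

# Operations that return a NaN directly.
nan_results = {
    "inf - inf": inf - inf,
    "inf / inf": inf / inf,
    "0 * inf": 0.0 * inf,
    "nan + 1": math.nan + 1.0,
}


def raises(exc_type, func, *args) -> bool:
    """Report whether calling func(*args) raises exc_type."""
    try:
        func(*args)
    except exc_type:
        return True
    return False


# Cases that CPython turns into exceptions rather than NaNs.
zero_div = raises(ZeroDivisionError, lambda: 0.0 / 0.0)
sqrt_neg = raises(ValueError, math.sqrt, -1.0)
log_neg = raises(ValueError, math.log, -1.0)
asin_big = raises(ValueError, math.asin, 2.0)

# math.pi / 2 is only an approximation of π/2, so the tangent is finite.
tan_half_pi = math.tan(math.pi / 2)

print({k: math.isnan(v) for k, v in nan_results.items()})
print(zero_div, sqrt_neg, log_neg, asin_big, math.isfinite(tan_half_pi))
```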
Quiet NaNs
Quiet NaNs, or qNaNs, do not raise any additional exceptions as they propagate through most operations. The exceptions are where the NaN cannot simply be passed through unchanged to the output, such as in format conversions or certain comparison operations (which do not "expect" a NaN input).
Signalling NaNs
Signalling NaNs, or sNaNs, are special forms of a NaN which when consumed by most operations should raise an invalid exception and then, if appropriate, be "quieted" into a qNaN which may then propagate. They were introduced in IEEE 754. There have been several ideas for how these might be used:
- Filling uninitialized memory with signalling NaNs would produce an invalid exception if the data is used before it is initialized
- Using an sNaN as a placeholder for a more complicated object such as:
  - a representation of a number which has underflowed
  - a representation of a number which has overflowed
  - a number in a higher-precision format
  - a complex number
When the sNaN is encountered, a trap handler could decode it and return an index to the computed result. In practice this approach is faced with many complications. The treatment of the sign bit of NaNs for some simple operations (such as absolute value) is different from that for arithmetic operations. Traps are not required by the standard. There are other approaches to this sort of problem which would be more portable.
There were questions about whether signalling NaNs should continue to be required in the revised standard. In the end they were retained in IEEE 754-2008.
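Python's binary floats do not expose signalling NaNs conveniently, but its decimal module does, so it can serve for a sketch of the behaviour described in this section. The helper add_one is hypothetical. It performs one addition under a local context and reports whether the invalid-operation condition trapped, was merely flagged, or did not occur.

```python
from decimal import Decimal, InvalidOperation, localcontext


def add_one(value: Decimal, trap: bool):
    """Add 1 to value; report how the invalid-operation signal behaved."""
    with localcontext() as ctx:
        ctx.traps[InvalidOperation] = trap
        ctx.clear_flags()
        try:
            result = value + 1
        except InvalidOperation:
            return "trapped", None
        flagged = bool(ctx.flags[InvalidOperation])
        return ("signalled" if flagged else "quiet"), result


print(add_one(Decimal("NaN"), trap=True))    # a qNaN propagates silently
print(add_one(Decimal("sNaN"), trap=True))   # an sNaN raises when trapped
print(add_one(Decimal("sNaN"), trap=False))  # untrapped: flagged, then quieted
```

With the trap disabled, the result of the third call is a quiet NaN, which corresponds to the "quieted" sNaN described above.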
NaNs in function definitions
There are differences of opinion about the proper definition for the result of a numeric function which receives a (quiet) NaN as input. One view is that the NaN should propagate to the output of the function in all cases to propagate the indication of an error. Another view is that if the function has multiple arguments and the output is uniquely determined by all the non-NaN inputs, then that value should be the result.
For example, if we define pow(x, y) = x ** y, what is pow(1, NaN)?
The first view is that the output should be NaN since one of the inputs is. The second view is that since pow(1, y) = 1 for any real number y, and even if y is infinity or -infinity, it is appropriate to return 1 for the case of pow(1, NaN). This is the approach in many math libraries. However, 1^∞ is an indeterminate form: a limit of this form can tend to any number or to infinity, and therefore NaN is a better answer in some circumstances.
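Python's math library is one that takes the second view, which makes the two positions easy to compare. In the sketch below, strict_pow is a hypothetical wrapper that follows the first view instead.

```python
import math


def strict_pow(x: float, y: float) -> float:
    """First view: any NaN input gives a NaN output."""
    if math.isnan(x) or math.isnan(y):
        return math.nan
    return math.pow(x, y)


nan = math.nan
library_answer = math.pow(1.0, nan)    # second view: determined by x alone
operator_answer = 1.0 ** nan           # the ** operator agrees
strict_answer = strict_pow(1.0, nan)   # first view
print(library_answer, operator_answer, strict_answer)
print(math.pow(2.0, nan))              # not determined, so NaN either way
```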
A similar concern is for the test
(x <= infinity)
which is true for all extended real values of x. But in IEEE 754 (NaN <= infinity) is false.
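A short Python check of this test follows; le_infinity is a hypothetical name for the comparison above. It also shows that a false result cannot be read as "greater than infinity", because every ordered comparison with a NaN is false.

```python
import math


def le_infinity(x: float) -> bool:
    """The test (x <= infinity) from the text."""
    return x <= math.inf


values = [-math.inf, 0.0, 1e308, math.inf, math.nan]
print([le_infinity(v) for v in values])

# Both directions are false for a NaN, so negating one does not give the other.
print(math.nan > math.inf, not (math.nan <= math.inf))
```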
NaNs in integers
Most fixed sized integer formats do not have any way of explicitly indicating invalid data.
Perl's Math::BigInt package uses "NaN" as the result of converting strings which don't represent valid integers.
>perl -mMath::BigInt -e "print Math::BigInt->new('foo')"
NaN
Displaying NaN
Note that the software libraries of different operating systems and programming languages will have different string representations of NaN.
nan NaN NaN% NAN NaNQ NaNS qNaN sNaN 1.#SNAN 1.#QNAN -1.#IND
Since, in practice, encoded NaNs have both a sign and optional 'diagnostic information' (sometimes called a payload), these will often be found in string representations of NaNs, too, for example:
-NaN NaN12345 -sNaN12300
(other variants exist)
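As one concrete data point, the Python sketch below shows how CPython formats and parses NaNs, for both binary floats and the decimal module. These results describe CPython only; the helper parse_or_none is hypothetical.

```python
import math
from decimal import Decimal


def parse_or_none(text: str):
    """Try to parse text as a float; return None if it is rejected."""
    try:
        return float(text)
    except ValueError:
        return None


nan = math.nan
rendered = [str(nan), repr(nan), format(nan, "F"), "%f" % nan]
print(rendered)

# Decimal NaNs keep their sign, kind, and payload digits when printed.
decimal_rendered = [str(Decimal("NaN12345")), str(Decimal("-sNaN12300"))]
print(decimal_rendered)

# Parsing is case-insensitive for "nan", but other spellings are rejected.
for text in ("nan", "NaN", "NAN", "1.#QNAN", "NaNQ"):
    print(text, parse_or_none(text) is not None)
```

A program that exchanges text with other systems therefore cannot assume that a NaN written by one library will be read back by another.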
NaN encodings
The original IEEE 754 standard did not specify how to distinguish a signalling NaN from a quiet NaN, which has led to at least two variant encodings. The current IEEE 754 standard recommends (for binary encodings, in section 6.2.1) that the first fraction bit of the significand be set to one for a qNaN and be zero for an sNaN. Preferably, one other bit in the significand would also be set to one. This ensures that when an sNaN is converted to a qNaN (by inverting that bit) the result is guaranteed to still be a NaN (rather than perhaps an infinity, should all remaining bits of the significand be zero).
On processors from Intel and AMD the first fraction bit of a binary significand is set to one for a qNaN and is zero for an sNaN. Other vendors use different schemes.
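The reason the choice of convention matters can be sketched with integer bit patterns for single precision. The helper names below are hypothetical. Under the convention just described, quieting an sNaN sets a bit and can never leave the fraction empty; under the opposite convention, quieting clears the bit and may leave the pattern of an infinity.

```python
EXPONENT_MASK = 0x7F800000
FRACTION_MASK = 0x007FFFFF
QUIET_BIT = 0x00400000      # first fraction bit of a single precision value


def is_nan_bits(bits: int) -> bool:
    """Exponent all ones and a non-zero fraction."""
    return (bits & EXPONENT_MASK) == EXPONENT_MASK and (bits & FRACTION_MASK) != 0


def set_quiet_bit(bits: int) -> int:
    """Quieting when 1 means quiet (the recommended convention)."""
    return bits | QUIET_BIT


def clear_quiet_bit(bits: int) -> int:
    """Quieting when 0 means quiet (the opposite convention)."""
    return bits & ~QUIET_BIT


snan_recommended = 0x7F800001   # quiet bit clear, so another bit must be set
snan_opposite = 0x7FC00000      # quiet bit set, no other fraction bits

print(hex(set_quiet_bit(snan_recommended)), is_nan_bits(set_quiet_bit(snan_recommended)))
print(hex(clear_quiet_bit(snan_opposite)), is_nan_bits(clear_quiet_bit(snan_opposite)))
```

The second case produces 0x7f800000, the encoding of positive infinity, which is why a further fraction bit should be set when that convention is in use.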
For the IEEE 754 decimal encodings, infinities and NaNs are distinguished at a 'higher level', and so there is no confusion between NaNs and infinities. Therefore, in this case, a 1 is used (in the equivalent position) to indicate sNaN, because turning this to 0 still indicates a (quiet) NaN. Hence, an initialization of all-ones sets any storage for these encodings to signalling NaN, which is an appropriate setting for 'uninitialized' numeric data.
References
- [1] Bowman, Kenneth (2006). An Introduction to Programming with IDL: Interactive Data Language. Academic Press. p. 26. ISBN 012088559X.
- [2] Press, William H.; Teukolsky, Saul A.; Vetterling, William T. (2007). Numerical Recipes: The Art of Scientific Computing. Cambridge University Press. p. 34. ISBN 0521880688.
- [3] Goldberg, David. "What Every Computer Scientist Should Know About Floating-Point Arithmetic". http://docs.sun.com/source/806-3568/ncg_goldberg.html.
- [4] "Intel 64 and IA-32 Architectures Software Developer's Manual Volume 1: Basic Architecture". April 2008. pp. 118–125, 266–267, 334–335. http://www.intel.com/products/processor/manuals/index.htm.