Unspecified behavior

Unspecified behavior is behavior that may vary between implementations of a programming language. A program contains unspecified behavior when its source code may produce an executable that behaves differently when compiled with a different compiler, or with the same compiler under different settings. The language standard or specification may limit the range of possible behaviors, but the exact behavior depends on the implementation and cannot always be determined by examining the program's source code.[1] Unspecified behavior often does not affect a program's externally visible behavior, but it can sometimes lead to differing outputs or results, causing portability problems.

Definition

To enable compilers to produce optimal code for their respective target platforms, programming language standards do not always impose one specific behavior for a given source code construct.[2] Not defining the exact behavior of every possible program is not considered an error or weakness in the language specification; defining it exhaustively would be infeasible.[1] In the C and C++ languages, such non-portable constructs are generally grouped into three categories: implementation-defined, unspecified, and undefined behavior.[3]

The exact definition of unspecified behavior varies between languages. The C++ Standard defines it as "behavior, for a well-formed program construct and correct data, that depends on the implementation",[4] and notes that the range of possible behaviors is usually provided.[4] Unlike implementation-defined behavior, there is no requirement for the implementation to document its behavior.[4] Similarly, the C Standard defines it as behavior for which the standard "provides two or more possibilities and imposes no further requirements on which is chosen in any instance".[5] Unspecified behavior is different from undefined behavior, which is typically the result of an erroneous program construct or erroneous data, and for which no requirements are placed on translation or execution.[6]

Examples

Order of evaluation of subexpressions

Many programming languages do not specify the order in which the sub-expressions of a full expression are evaluated. If one or more of the sub-expressions has side effects, the result of evaluating the full expression may depend on that order.[1] For example, given

a = f(b) + g(b);

where f and g both modify b, the result stored in a may differ depending on whether f(b) or g(b) is evaluated first.[1] In the C and C++ languages, this also applies to function arguments. Example:[2]

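#include <iostream>

// f and g each print a line so that the order of the calls is visible.
int f() {
  std::cout << "In f\n";
  return 3;
}

int g() {
  std::cout << "In g\n";
  return 4;
}

int sum(int i, int j) {
  return i + j;
}

int main() {
  // Whether f() or g() is called first is unspecified.
  return sum(f(), g());
}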

The resulting program will write its two lines of output in an unspecified order.[2] In other languages, such as Java, the order of evaluation of operands and function arguments is explicitly defined.[7]

Pointer comparisons

In C and C++, the comparison of pointers to objects is only strictly defined if the pointers point to members of the same object, or elements of the same array.[8] Example:

int main(void)
{
  int a = 0;
  int b = 0;
  return &a < &b; /* unspecified behavior in C++, undefined in C */
}

References

  1. ISO/IEC (2009-05-29). ISO/IEC PDTR 24772.2: Guidance to Avoiding Vulnerabilities in Programming Languages through Language Selection and Use.
  2. Becker, Pete (2006-05-16). "Living By the Rules". Dr. Dobb's Journal. Retrieved 26 November 2009.
  3. Henricson, Mats; Nyquist, Erik (1997). Industrial Strength C++. Prentice Hall. ISBN 0-13-120965-5.
  4. ISO/IEC (2003). ISO/IEC 14882:2003(E): Programming Languages - C++. §1.3.13 unspecified behavior [defns.unspecified].
  5. ISO/IEC (1999). ISO/IEC 9899:1999(E): Programming Languages - C. §3.4.4 para. 1.
  6. ISO/IEC (2003). ISO/IEC 14882:2003(E): Programming Languages - C++. §1.3.12 undefined behavior [defns.undefined].
  7. Gosling, James; Joy, Bill; Steele, Guy; Bracha, Gilad (2005). The Java Language Specification, Third Edition. Addison-Wesley. ISBN 0-321-24678-0.
  8. ISO/IEC (2003). ISO/IEC 14882:2003(E): Programming Languages - C++. §5.9 Relational operators [expr.rel] para. 2.