Jump to content

Short-circuit evaluation: Difference between revisions

From Wikipedia, the free encyclopedia
Content deleted Content added
Bluhd (talk | contribs)
m →‎Example: Whoops, missed something
No edit summary
Line 44: Line 44:
| Last value
| Last value
|-
|-
| [[OCaml]]
| [[OCaml]], [[Haskell]]
| ''none''
| ''none''
| <code>&amp;&amp;</code> , <code><nowiki>||</nowiki></code>
| <code>&amp;&amp;</code> , <code><nowiki>||</nowiki></code>

Revision as of 01:24, 1 January 2009

Short-circuit evaluation or minimal evaluation denotes the semantics of some boolean operators in some programming languages in which the second argument is only executed or evaluated if the first argument does not suffice to determine the value of the expression: when the first argument of and evaluates to false, the overall value must be false; and when the first argument of or evaluates to true, the overall value must be true. In some programming languages (Lisp), the usual boolean operators are short-circuit. In others (C, Ada), both short-circuit and standard boolean operators are available.

The short-circuit operator x Sand y is equivalent to the conditional expression if x then y else false; x Sor y is equivalent to if x then true else y.

Typically short-circuit operators are, in effect, control structures rather than simple arithmetic operators, as they are not strict. However ALGOL 68 used "proceduring" to achieve user defined short-circuit operators & procedures.

In loosely-typed languages which have more than the two truth-values True and False, short-circuit operators may return the last evaluated subexpression, so that x Sand y is actually equivalent to if x then x else y (without actually evaluating x twice). This is called "Last value" in the table below.

In languages that use lazy evaluation by default (like Haskell), all functions effectively "short-circuit", and special short-circuit operators are unnecessary.

Boolean operators in various languages
Language Standard operators Short-circuit operators Value
Ada, Eiffel and , or and then , or else Boolean
ALGOL 68 and/&/∧ , or/∨ andf , orf - user defined Boolean
C1, C#, Java & , | && , || Boolean
Erlang and, or andalso, orelse Boolean
Javascript & , | && , || Last value
Lisp, Lua, Scheme none and , or Last value
OCaml, Haskell none && , || Boolean
Pascal and, or2 and_then, or_else3 Boolean
Perl, Ruby & , | && , and , || , or Last value
PHP & , | && , and , || , or Boolean
Python & , | and , or Last value
Smalltalk & , | and: , or: Boolean
Visual Basic And , Or AndAlso , OrElse Boolean

1 C does not actually have a distinct boolean type, but the bit-wise operations & and | function as standard boolean operators on the values returned by the relational operators.
2 ISO Pascal allows but does not require short-circuiting; most implementations do not short-circuit.[1]
3 ISO-10206 Extended Pascal supports and_then and or_else.[2]

Example

Usual example.

int denom = 0;
if (denom && nom/denom) {
        oops_i_just_divided_by_zero(); // never happens       
}

Consider the following example using the C programming language:

int a = 0;
if (a && myfunc(b)) {
    do_something();
}

In this example, short-circuit evaluation guarantees that myfunc(b) is never called. This is because a evaluates to false. This feature permits two useful programming constructs. Firstly, if the first sub-expression checks whether an expensive computation is needed and the check evaluates to false, one can eliminate expensive computation in the second argument. Secondly, it permits a construct where the first expression guarantees a condition without which the second expression may cause a runtime error, such as in the following C code where minimal evaluation prevents a null pointer dereference:

int IsThreeCharsLong(const char *p) {
  return (p != NULL) && (strlen(p) == 3);
}

Despite these benefits, minimal evaluation may cause problems for programmers who do not realize (or forget) it is happening. For example, in the code

if (expressionA && myfunc(b)) {
    do_something();
}

if myfunc(b) is supposed to perform some required operation regardless of whether do_something() is executed, such as allocating system resources, and expressionA evaluates as false, then myfunc(b) will not execute, which could cause problems. Some programming languages, such as Java, have two operators, one that employs minimal evaluation and one that does not, to avoid this problem.

It is worth noting that these expressions may be considered a more compact way of expressing a nested branch or sequence of branches. For example, consider the following block of code, when minimal evaluation is in use:

if (CONDITION_A && CONDITION_B) {
   // TRUE_BRANCH
} else {
   // FALSE_BRANCH
}

becomes (through logical AND expansion)

if (CONDITION_A) {
    if (CONDITION_B) {
        // BOTH_TRUE
    } else {
        // B_FALSE
    }
} else {
    // A_FALSE
}

Note that, due to the nature of the boolean logic involved, both the A_FALSE and B_FALSE code branches are represented by the single FALSE branch in the first example. However, following the more verbose second example does, at least, have one possible advantage — different code can be executed depending upon which condition, if any, is FALSE. Note however that if both conditions are false, only A_FALSE will be executed, and not B_FALSE.

A compiler using the common subexpression elimination and constant propagation optimizations might generate the same code for all such versions.

Minimal evaluation can also be used to conditionally execute some code for side effects, as in the common Perl idioms:

some_condition or die;    # Abort execution if some_condition is false
some_condition and die;   # Abort execution if some_condition is true

There are some cases where short-circuit evaluation leads to slower code. This is because, if both CONDITION_A and CONDITION_B are simple (i.e. ordering or equality check), it is actually faster to evaluate both conditions combined in boolean AND and check only one condition. Short-circuiting can lead to errors in branch prediction on modern processors, and dramatically reduce performance (a notable example is highly optimized ray with axis aligned box intersection code in raytracing). Some compilers can detect such cases and emit faster code, but it is not always possible due to possible violations of the C standard. Highly optimized code should use other ways for doing this (like manual usage of assembly code).

ALGOL 68

The original language of the 1968 Final Report differs in syntax from the 1973 Revised Report. The original had the feature of proceduring, i.e. coercing the value of a term into a procedure which evaluates the term. Proceduring effectively can make evaluations lazy. The most useful application could have been the short-circuited evaluation of boolean operators.

Examples of user defined operators:

op andf = (bool a, proc bool b)bool: if a then b() else a fi; ¢ b is only evaluated, if a is true.  ¢
op orf  = (bool a, proc bool b)bool: if a then a else b() fi; ¢ b is only evaluated, if a is false. ¢

Hence before the 1973 Revised Report, the programmer could decide to have the arguments of an operator (and indeed a procedure) evaluated serially instead of collaterally.

As defined in ALGOL 68, it did not work as expected. Most implementations emulate the Short-circuit behaviour for this special case by an extension of andf/orf or andth/orel to the "standard preclude".

References

See also