Constant folding

Constant folding and constant propagation are related compiler optimizations used by many modern compilers.^[1] An advanced form of constant propagation known as sparse conditional constant propagation can more accurately propagate constants and simultaneously remove dead code.

Constant folding[edit]

Constant folding is the process of recognizing and evaluating constant expressions at compile time rather than computing them at runtime. Terms in constant expressions are typically simple literals, such as the integer literal 2, but they may also be variables whose values are known at compile time. Consider the statement:

  i = 320 * 200 * 32;

Most compilers would not actually generate two multiply instructions and a store for this statement. Instead, they identify constructs such as these and substitute the computed values at compile time (in this case, 2,048,000).

Constant folding can make use of arithmetic identities. If x is numeric, the value of 0 * x is zero even if the compiler does not know the value of x (note that this is not valid for IEEE floats since x could be Infinity or NaN. Still, some environments that favor performance such as GLSL shaders allow this for constants, which can occasionally cause bugs).

Constant folding may apply to more than just numbers. Concatenation of string literals and constant strings can be constant folded. Code such as "abc" + "def" may be replaced with "abcdef".

Constant folding and cross compilation[edit]

In implementing a cross compiler, care must be taken to ensure that the behaviour of the arithmetic operations on the host architecture matches that on the target architecture, as otherwise enabling constant folding will change the behaviour of the program. This is of particular importance in the case of floating point operations, whose precise implementation may vary widely.

Constant propagation[edit]

Constant propagation is the process of substituting the values of known constants in expressions at compile time. Such constants include those defined above, as well as intrinsic functions applied to constant values. Consider the following pseudocode:

  int x = 14;
  int y = 7 - x / 2;
  return y * (28 / x + 2);

Propagating x yields:

  int x = 14;
  int y = 7 - 14 / 2;
  return y * (28 / 14 + 2);

Continuing to propagate yields the following (which would likely be further optimized by dead-code elimination of both x and y.)

  int x = 14;
  int y = 0;
  return 0;

Constant propagation is implemented in compilers using reaching definition analysis results. If all variable's reaching definitions are the same assignment which assigns a same constant to the variable, then the variable has a constant value and can be replaced with the constant.

Constant propagation can also cause conditional branches to simplify to one or more unconditional statements, when the conditional expression can be evaluated to true or false at compile time to determine the only possible outcome.

The optimizations in action[edit]

Constant folding and propagation are typically used together to achieve many simplifications and reductions, and their interleaved, iterative application continues until those effects cease.

Consider this unoptimized pseudocode returning a number unknown pending analysis:

  int a = 30;
  int b = 9 - (a / 5);
  int c;

  c = b * 4;
  if (c > 10) {
     c = c - 10;
  }
  return c * (60 / a);

Applying constant propagation once, followed by constant folding, yields:

  int a = 30;
  int b = 3;
  int c;

  c = b * 4;
  if (c > 10) {
     c = c - 10;
  }
  return c * 2;

Repeating both steps twice produces:

  int a = 30;
  int b = 3;
  int c;

  c = 12;
  if (true) {
     c = 2;
  }
  return c * 2;

Having replaced all uses of variables a and b with constants, the compiler's dead-code elimination applies to those variables, leaving:

  int c;
  c = 12;

  if (true) {
     c = 2;
  }
  return c * 2;

(Boolean constructs vary among languages and compilers, but their details—such as the status, origin, and representation of true—do not affect these optimization principles.)

Traditional constant propagation produces no further optimization; it does not restructure programs.

However, a similar optimization, sparse conditional constant propagation, goes further by selecting the appropriate conditional branch,^[2] and removing the always-true conditional test. Thus, variable c becomes redundant, and only an operation on a constant remains:

  return 4;

If that pseudocode constitutes a function body, the compiler knows the function evaluates to integer constant 4, allowing replacement of calls to the function with 4, and further increasing program efficiency.

References[edit]

^ Steven Muchnick; Muchnick and Associates (15 August 1997). Advanced Compiler Design Implementation. Morgan Kaufmann. ISBN 978-1-55860-320-2. constant propagation OR constant folding.
^ Wegman, Mark N; Zadeck, F. Kenneth (April 1991), "Constant Propagation with Conditional Branches", ACM Transactions on Programming Languages and Systems, 13 (2): 181–210, CiteSeerX 10.1.1.130.3532, doi:10.1145/103135.103136, S2CID 52813828

v t e Compiler optimizations
Basic block	Peephole optimization Local value numbering
Loop	Automatic parallelization Automatic vectorization Induction variable Loop fusion Loop-invariant code motion Loop inversion Loop interchange Loop nest optimization Loop splitting Loop unrolling Loop unswitching Software pipelining Strength reduction
Data-flow analysis	Available expression Common subexpression elimination Constant folding Dead store elimination Induction variable recognition and elimination Live-variable analysis Use-define chain
SSA-based	Global value numbering Sparse conditional constant propagation
Code generation	Instruction scheduling Instruction selection Register allocation Rematerialization
Functional	Deforestation Tail-call elimination
Global	Interprocedural optimization
Other	Bounds-checking elimination Compile-time function execution Dead-code elimination Expression templates Inline expansion Jump threading Partial evaluation Profile-guided optimization
Static analysis	Alias analysis Array-access analysis Control-flow analysis Data-flow analysis Dependence analysis Escape analysis Pointer analysis Shape analysis Value range analysis