Loop splitting

Loop splitting is a compiler optimization technique. It attempts to simplify a loop or eliminate dependencies by breaking it into multiple loops which have the same bodies but iterate over different contiguous portions of the index range.

Loop peeling

Loop peeling is a special case of loop splitting which splits any problematic first (or last) few iterations from the loop and performs them outside of the loop body.

Suppose a loop was written like this:

 int p = 10;
 for (int i=0; i<10; ++i)
 {
   y[i] = x[i] + x[p];
   p = i;
 }

Notice that p = 10 only for the first iteration, and for all other iterations, p = i - 1. A compiler can take advantage of this by unwinding (or "peeling") the first iteration from the loop.

After peeling the first iteration, the code would look like this:

 y[0] = x[0] + x[10];
 for (int i=1; i<10; ++i)
 {
   y[i] = x[i] + x[i-1];
 }

This equivalent form eliminates the need for the variable p inside the loop body.

Loop peeling was introduced in gcc in version 3.4. More generalised loop splitting was added in GCC 7.^[1]

Brief history of the term

Apparently the term "peeling" was for the first time used by Cannings, Thompson and Skolnick^[2] in their 1976 paper on computational models for (human) inheritance. There the term was used to denote a method for collapsing phenotypic information onto parents. From there the term was used again in their papers, including their seminal paper on probability functions on complex pedigrees.^[3]

In compiler technology, the term first turned up in late 1980s papers on VLIW and superscalar compilation, including ^[4] and.^[5]

References

^ GCC 7 Release Series — Changes, New Features, and Fixes - GNU Project
^ Cannings, C.; Thompson, E. A.; Skolnick, H. H. (1976). "The recursive derivation of likelihoods on complex pedigrees". Advances in Applied Probability. 8 (4): 622–625. doi:10.2307/1425918. JSTOR 1425918.
^ Cannings, C.; Thompson, E. A.; Skolnick, H. H. (1978). "Probability functions on complex pedigrees". Advances in Applied Probability. 10 (1): 26–61. doi:10.2307/1426718. JSTOR 1426718.
^ Callahan, D.; Kennedy, Ken (1988). "Compiling Programs for Distributed-memory Multiprocessors". The Journal of Supercomputing. 2 (2): 151–169. doi:10.1007/BF00128175. S2CID 10214341.
^ Mahlke, S. A.; Lin, D. C.; Chen, W. Y.; Hank, R. E.; Bringman, R. A. (1992). Effective compiler support for predicated execution using the hyperblock. 25th Annual International Symposium on Microarchitecture. pp. 45–54.

v t e Compiler optimizations
Basic block	Peephole optimization Local value numbering
Loop	Automatic parallelization Automatic vectorization Induction variable Loop fusion Loop-invariant code motion Loop inversion Loop interchange Loop nest optimization Loop splitting Loop unrolling Loop unswitching Software pipelining Strength reduction
Data-flow analysis	Available expression Common subexpression elimination Constant folding Dead store elimination Induction variable recognition and elimination Live-variable analysis Upwards exposed uses Use-define chain Reaching definitions
SSA-based	Global value numbering Sparse conditional constant propagation
Code generation	Instruction scheduling Instruction selection Register allocation Rematerialization
Functional	Deforestation Tail-call elimination
Global	Interprocedural optimization
Other	Bounds-checking elimination Compile-time function execution Dead-code elimination Expression templates Inline expansion Jump threading Partial evaluation Profile-guided optimization
Static analysis	Alias analysis Array-access analysis Control-flow analysis Data-flow analysis Dependence analysis Escape analysis Pointer analysis Shape analysis Value range analysis

Loop peeling

Brief history of the term

References

Further reading