Instruction path length
In computer performance, the instruction path length is the number of machine code instructions required to execute a section of a computer program. The total path length for the entire program can be regarded as a measure of the algorithm's performance on particular computer hardware. The path length of a simple conditional instruction would normally be considered to be 2,[citation needed] one instruction to perform the comparison and another to take a branch if the condition is satisfied. The time taken to execute each instruction is not normally considered when determining path length, so path length is only an indication of relative performance, not an absolute measure.
When executing a benchmark program, most of the instruction path length is typically inside the program's inner loop.
Before the introduction of caches, path length was an approximation of running time. On modern CPUs with caches it can be a much worse approximation: a load instruction may take hundreds of cycles when the data is not in cache and be orders of magnitude faster when it is, even for the same instruction on a different iteration of a loop.
Assembly programs
Because there is typically a one-to-one relationship between assembly instructions and machine instructions, the instruction path length is frequently taken as the number of assembly instructions required to perform a function or a particular section of code. Performing a simple table lookup on an unsorted list of 1,000 entries might require perhaps 2,000 machine instructions (on average, assuming a uniform distribution of input values), while performing the same lookup on a sorted list using a binary search algorithm might require only about 40 machine instructions. In this instance the instruction path length is reduced by a factor of 50, which is one reason why actual instruction timings may be a secondary consideration compared with choosing an algorithm that requires a shorter path length.
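As a rough illustration of the comparison above, the following Python sketch counts loop iterations rather than machine instructions, on the assumption that each iteration costs a small, roughly constant number of instructions. The function names and the 1,000-entry table of integers are illustrative only and do not come from any particular program.

```python
def linear_search_steps(items, target):
    """Scan left to right; return (index, loop iterations performed)."""
    steps = 0
    for i, value in enumerate(items):
        steps += 1
        if value == target:
            return i, steps
    return -1, steps


def binary_search_steps(sorted_items, target):
    """Halve the search range each time; return (index, loop iterations)."""
    steps = 0
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid, steps
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, steps


# Assumed workload: every entry is looked up exactly once (uniform inputs).
table = list(range(1000))
linear_avg = sum(linear_search_steps(table, t)[1] for t in table) / len(table)
binary_max = max(binary_search_steps(table, t)[1] for t in table)

print(f"linear search, average iterations: {linear_avg}")
print(f"binary search, worst-case iterations: {binary_max}")
```

On this table the linear scan averages 500.5 iterations per successful lookup, while the binary search never needs more than 10. Multiplying each figure by a few instructions per iteration gives path lengths of the same order as the estimates above.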
The instruction path length of an assembly language program is generally very different from the number of source lines of code for that program, because the instruction path length counts only the instructions in the control flow actually executed for a given input. It excludes code that is not relevant to that input, as well as unreachable code.
High-level language (HLL) programs
Since a single statement written in a high-level language can produce a variable number of machine instructions, it is not always possible to determine the instruction path length without a tool such as an instruction set simulator, which can count the number of instructions 'executed' during simulation. If the compiler for the high-level language can optionally produce an assembly listing, it is sometimes possible to estimate the instruction path length by examining this listing.
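The same idea can be sketched for an interpreted language. The example below is only an analogy: it counts CPython bytecode instructions, not machine instructions, using the standard `dis` module for a static listing and `sys.settrace` with opcode events for the executed count. The helper `count_executed` and the sample function `sum_below` are hypothetical names introduced for illustration.

```python
import dis
import sys


def count_executed(func, *args):
    """Run func(*args) and count the bytecode instructions it executes."""
    executed = 0

    def tracer(frame, event, arg):
        nonlocal executed
        if event == "call":
            # Ask the interpreter for one event per bytecode instruction.
            frame.f_trace_opcodes = True
        elif event == "opcode":
            executed += 1
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)  # always restore the earlier tracer
    return result, executed


def sum_below(n):
    """Sample workload: add the integers 0 .. n-1 in a loop."""
    total = 0
    for i in range(n):
        total += i
    return total


# Static view: how many instructions appear in the listing.
static_count = sum(1 for _ in dis.get_instructions(sum_below))

# Dynamic view: how many instructions are executed for a given input.
_, executed_10 = count_executed(sum_below, 10)
_, executed_100 = count_executed(sum_below, 100)

print(static_count, executed_10, executed_100)
```

The listing size stays fixed while the executed count grows with the input, which is why a listing alone gives only an estimate of path length. Exact counts depend on the interpreter version, so none are quoted here.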
Factors determining instruction path length
- in-line code versus the overheads of calling and returning from a function, procedure, or method containing the same statements
- order of items in unsorted lookup list – most frequently occurring items should be placed first to avoid long searches
- choice of algorithm – indexed, binary or linear (item-by-item) search
- recalculating a value afresh versus retaining an earlier result (memoization) – may avoid repeating complex iterations
- reading tables into memory once versus reading them afresh from external storage each time – avoids the high path length of multiple I/O function calls
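The effect of item order in an unsorted lookup list can be sketched as follows. The keys, the request mix, and the function name `scan_steps` are invented for illustration, and loop iterations again stand in for machine instructions.

```python
def scan_steps(table, key):
    """Return how many entries a left-to-right scan examines to find key."""
    for steps, entry in enumerate(table, start=1):
        if entry == key:
            return steps
    return len(table)  # key absent: every entry was examined


# Hypothetical workload in which key "d" is requested far more often.
requests = ["d"] * 90 + ["a"] * 5 + ["b"] * 3 + ["c"] * 2

alphabetical = ["a", "b", "c", "d"]
# Same entries, most frequently requested first.
by_frequency = sorted(alphabetical, key=requests.count, reverse=True)

total_alphabetical = sum(scan_steps(alphabetical, r) for r in requests)
total_by_frequency = sum(scan_steps(by_frequency, r) for r in requests)

print(total_alphabetical, total_by_frequency)  # prints "377 117"
```

For this assumed request mix, placing the most common key first cuts the total number of entries examined from 377 to 117 without changing the search algorithm.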
Use of instruction path lengths
From the above, it can be seen that knowledge of instruction path lengths can be used:
- to choose an appropriate algorithm to minimize overall path lengths for programs in any language
- to monitor how well a program has been optimized in any language
- to determine how efficient particular statements are in any high-level language
- as an approximate measure of overall computer performance