Thunk (functional programming)
In computer science, a thunk (also suspension, suspended computation or delayed computation) is a parameterless closure created to prevent the evaluation of an expression until forced at a later time. In lazy languages thunks are created and forced implicitly. In eager languages thunks can be created explicitly by wrapping an expression in a parameterless anonymous function and forced by calling it in order to emulate lazy evaluation. Some functional languages require functions to take at least one parameter, requiring the anonymous function to take a dummy parameter, usually of unit type, instead.
Call by name
Implementations of the call by name and call by need evaluation strategies typically use a thunk to refer to an incompletely evaluated expression. In this context, the thunk is simply a placeholder for a part of computation which, when executed, returns either the fully evaluated expression, or returns a tag, which shows that there is a part of the expression which is yet to be instantiated, in which case blind, mindless computation cannot yet be performed (because, for example, some variable in the above-mentioned algebraic formula is not yet mapped to a number — hence the formula does not yet reduce to a concrete number).
Call by need
In call-by-need, the thunk is replaced by its return value after its first complete execution. In languages with late binding (for example, Haskell), the "computation" performed by the thunk may include a lookup in the run-time context of the program to determine the current binding of a variable.
Thunks were invented by Peter Zilahy Ingerman in 1961. According to the inventor, the name "thunk" came about because it could be optimized by the compiler by "thinking about it", so "thunk", according to its inventor, "is the past tense of 'think' at two in the morning"
An early implementation of thunks for call-by-name was in Algol 60.
begin integer idx; real procedure sum (i, lo, hi, term); value lo, hi; integer i, lo, hi; real term; comment term is passed by-name, and so is i; begin real temp; temp := 0; for i := lo step 1 until hi do temp := temp + term; sum := temp end; print (sum (idx, 1, 100, 1/idx)) end
The above example (see Jensen's Device) relies on the fact that the actual parameters
1/idx are passed "by name", so that the program is equivalent to the "inlined" version
begin integer idx; real sum; begin real temp; temp := 0; for idx := 1 step 1 until 100 do temp := temp + 1/idx; sum := temp end; print (sum) end
Notice that the expression
1/idx is not evaluated at the point of the call to
sum; instead, it is evaluated anew each time the formal parameter
term is mentioned in the definition of
sum. A compiler using thunks to implement call-by-name would process the original code as if it had been written using function pointers or lambdas (represented below in Algol-like pseudocode):
begin integer idx; real procedure sum (i_lvalue, lo, hi, term_rvalue); value lo, hi; integer lo, hi; thunk i_lvalue; thunk term_rvalue; begin real temp; temp := 0; for i_lvalue() := lo step 1 until hi do temp := temp + term_rvalue(); sum := temp end; procedure lvalue_of_idx (); begin lvalue_of_idx := address of idx end; procedure rvalue_of_1_over_idx (); begin rvalue_of_1_over_idx := 1/idx end; print (sum (lvalue_of_idx, 1, 100, rvalue_of_1_over_idx)) end
rvalue_of_1_over_idx would be generated automatically by the compiler whenever a call-by-name actual parameter was encountered. These automatically generated procedures would be called thunks.
- P.Z. Ingerman (January 1961). "Thunks - A Way of Compling Procedure Statements with Some Comments on Procedure Declarations". IFIP ALGOL Bulletin & CACM Vol 4 No1. pp. 55–58. Retrieved March 3, 2011.
- P.Z. Ingerman. "The New Hacker's Dictionary Third Edition (personal communication to one contributor)". pp. 444–445.
- Thunk at the Haskell Wiki