Memoization

From Wikipedia, the free encyclopedia
Revision as of 09:17, 26 October 2006

In computing, memoization is an optimization technique used primarily to speed up computer programs by storing the results of function calls for later reuse, rather than recomputing them at each invocation of the function. Memoization has also been used in other contexts (and for other purposes than speed gains), such as in parsing. Although related to caching, memoization refers to a specific case of this optimization, distinguishing it from forms of caching such as buffering or page replacement.

Overview

The term "memoization" was coined by Donald Michie in 1968 [1] and is derived from the Latin word memorandum (to be remembered), and thus carries the meaning of turning [the results of] a function into something to be remembered. While memoization might be confused with memorization (because of the shared cognate), memoization has a specialized meaning in computing.

A memoized function stores the results of previous calls with the same parameters; if and when the function is later called with those same parameters, it returns the stored result rather than recalculating it, thus moving the primary cost of a call with given parameters to the first call made with those parameters. A function can be memoized only if it is referentially transparent; that is, only if calling the function has exactly the same effect as replacing the call with its return value. (Special-case exceptions to this restriction exist, however.) Although memoization typically uses lookup tables in its implementation, it differs from pure table lookup in that its tables are populated transparently, on an as-needed basis.

Memoization is a means of lowering a function's time cost in exchange for space cost; that is, memoized functions become optimized for speed in exchange for a higher use of computer memory space. The time/space "cost" of algorithms has a specific name in computing: computational complexity. All functions have a computational complexity in time (i.e. they take time to execute) and in space.

Although a trade-off occurs (space is spent to gain speed), this differs from some other optimizations that involve time-space trade-offs, such as strength reduction, in that memoization is a runtime rather than a compile-time optimization. Moreover, strength reduction potentially replaces an expensive operation such as multiplication with a less expensive operation such as addition, and the resulting savings can be highly non-portable across machines, whereas memoization is a machine-independent strategy.

Consider the following pseudocode function to calculate the factorial of n:

 function factorial (n is a non-negative integer)
    if n is 0 then
     return 1 [by the convention that 0! = 1]
    else   
     return factorial(n - 1) times n [recursively invoke factorial with the parameter 1 less than n]
    end if
 end function
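The pseudocode above translates directly into Python. As a sketch, a `calls` counter (not part of the pseudocode) is added here so that the n + 1 invocations discussed below can be observed:

```python
# Plain (non-memoized) factorial. `calls` counts how many times the
# function body is entered, illustrating that computing n! takes
# n + 1 invocations.
calls = 0

def factorial(n):
    global calls
    calls += 1
    if n == 0:
        return 1                      # by the convention that 0! = 1
    return factorial(n - 1) * n       # recursively invoke with n - 1

print(factorial(3))   # 6
print(calls)          # 4 invocations: factorial(3), (2), (1), (0)
```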

For every integer n such that n ≥ 0, the final result of the function factorial is invariant; if invoked as x = factorial(3), the result is such that x will always be assigned the value 6. A non-memoized version of the above, given the nature of the recursive algorithm involved, would require n + 1 invocations of factorial to arrive at a result, and each of these invocations, in turn, has an associated cost in the time it takes the function to return the value computed. Depending on the machine, this cost might be the sum of:

  1. The cost to set up the function call's stack frame.
  2. The cost to compare n to 0.
  3. The cost to subtract 1 from n.
  4. The cost to set up the recursive call stack frame. (As above.)
  5. The cost to multiply the result of the recursive call to factorial by n.
  6. The cost to store the return result so that it may be used by the calling context.

In a non-memoized implementation, every top-level call to factorial includes the cumulative cost of steps 2 through 6 proportional to the initial value of n.

A memoized version of the factorial function follows:

 function factorial (n is a non-negative integer)
    allocate temporary integer variable x
 
    if n is in lookup-table then
     return lookup-table-value-for-n;
    else if n is 0 then
     return 1 [by the convention that 0! = 1]
    else   
     x = factorial(n - 1) times n [recursively invoke factorial with the parameter 1 less than n]
    end if
 
    store x in lookup-table in the nth slot [remember the result of n! for later]
 
    return x
 end function

In the memoized version above, lookup-table is assumed to be a persistent storage space, such as an associative array that uses n as its key. Since this lookup table uses space (that is, computer memory), the time that would otherwise be spent recomputing factorial on a second call with the same parameter has been traded for the memory required to store the lookup table.

In this particular example, if factorial is first invoked with 5, and then invoked later with any value less than or equal to five, those return values will also have been memoized, since factorial will have been called recursively with the values 5, 4, 3, 2, 1, and 0, and the return values for each of those will have been stored. If it is then called with a number greater than 5, such as 7, only 2 recursive calls will be made (7 and 6), and the value for 5! will have been stored from the previous call. In this way, memoization allows a function to become more time-efficient the more often it is called, thus resulting in eventual overall speed up.
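The memoized pseudocode can be sketched in Python with a dict as the lookup-table. A `calls` counter (again, an addition of this sketch) makes the reuse described above observable: a first call with 5 fills the table for 5 down to 0, and a later call with 7 performs only two new computations:

```python
# Memoized factorial, a Python sketch of the pseudocode above.
# `lookup_table` plays the role of the persistent associative array;
# `calls` counts actual (non-cached) computations.
lookup_table = {}
calls = 0

def factorial(n):
    global calls
    if n in lookup_table:
        return lookup_table[n]        # answered from the table
    calls += 1
    x = 1 if n == 0 else n * factorial(n - 1)
    lookup_table[n] = x               # remember the result of n! for later
    return x

print(factorial(5))   # 120; computes 5!, 4!, ..., 0!   (calls == 6)
print(factorial(3))   # 6; answered from the table      (calls still 6)
print(factorial(7))   # 5040; only 7! and 6! are new    (calls == 8)
```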

Some other considerations

Automatic memoization

While memoization may be added to functions internally and explicitly by a computer programmer in much the same way the above memoized version of factorial is implemented, referentially transparent functions may also be automatically memoized externally.[2] The techniques employed by Norvig have application not only in Common Lisp (the language in which his paper demonstrated automatic memoization), but in various other programming languages. Applications of automatic memoization have also been formally explored in the study of term rewriting[3] and artificial intelligence.[4]

In those programming languages where functions are first or second-class objects (such as Lua, with its first-class functions), automatic memoization can be implemented by replacing (at runtime) a function with its calculated value once a value has been calculated for a given set of parameters. The function that does this value-for-function-object replacement can generically wrap any referentially transparent function. Consider the following pseudocode (where it is assumed that functions are first-class values):

  function memoized-call (F is a function object parameter)
     if F has no attached array values then
        allocate an associative array called values;
        attach values to F;
     end if;
 
     if F.values[arguments] is empty then
        F.values[arguments] = F(arguments);
     end if;
 
     return F.values[arguments];     
  end function

In order to call an automatically memoized version of factorial using the above strategy, rather than calling factorial directly, code invokes memoized-call(factorial, n). Each such call first checks to see if a holder array has been allocated to store results, and if not, attaches that array. If no entry exists at the position values[arguments] (where arguments are used as the key of the associative array), a real call is made to factorial with the supplied arguments. Finally, the entry in the array at the key position is returned to the caller.
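In Python, where functions are first-class objects that can carry attributes, the explicit-wrapping strategy above can be sketched as follows. Since Python has no implicit "arguments" variable, the arguments are passed explicitly alongside the function object, and `slow_square` is a hypothetical example function:

```python
# A sketch of the memoized-call wrapper: the `values` table is
# attached to the function object itself as an attribute.
def memoized_call(f, *args):
    if not hasattr(f, 'values'):      # allocate and attach the table once
        f.values = {}
    if args not in f.values:
        f.values[args] = f(*args)     # the only real call for these arguments
    return f.values[args]

def slow_square(n):                   # hypothetical example function
    return n * n

print(memoized_call(slow_square, 12))   # 144, computed
print(memoized_call(slow_square, 12))   # 144, from slow_square.values
```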

The above strategy requires explicit wrapping at each call to a function that is to be memoized. In those languages that allow closures, memoization can be effected implicitly by a functor factory that returns a wrapped memoized function object. In pseudocode, this can be expressed as follows:

 function construct-memoized-functor (F is a function object parameter)
    allocate a function object called memoized-version;
 
    let memoized-version(arguments) be
       if self has no attached array values then [self is a reference to this object]
          allocate an associative array called values;
          attach values to self;
       end if;

       if self.values[arguments] is empty then
          self.values[arguments] = F(arguments);
       end if;

       return self.values[arguments];     
    end let;
 
    return memoized-version;
 end function

Rather than call factorial, a new function object memfact is created as follows:

 memfact = construct-memoized-functor(factorial)

The above example assumes that the function factorial has already been defined before the call to construct-memoized-functor is made. From this point forward, memfact(n) is called whenever the factorial of n is desired. In languages such as Lua, more sophisticated techniques exist which allow a function to be replaced by a new function with the same name, which would permit:

 factorial = construct-memoized-functor(factorial)

Essentially, such techniques involve attaching the original function object to the created functor and forwarding calls to the original function being memoized via an alias when a call to the actual function is required (to avoid endless recursion), as illustrated below:

 function construct-memoized-functor (F is a function object parameter)
    allocate a function object called memoized-version;
 
    let memoized-version(arguments) be
       if self has no attached array values then [self is a reference to this object]
          allocate an associative array called values;
          attach values to self;
          allocate a new function object called alias;
          attach alias to self; [for later ability to invoke F indirectly]
          self.alias = F;
       end if;

       if self.values[arguments] is empty then
          self.values[arguments] = self.alias(arguments); [not a direct call to F]
       end if;

       return self.values[arguments];     
    end let;
 
    return memoized-version;
 end function

(Note: Some of the steps shown above may be implicitly managed by the implementation language and are provided for illustration.)
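In Python, for example, the closure itself serves as the alias: the wrapper captures a private reference to the original function object, so the public name can be rebound without causing endless recursion (a sketch; the names are illustrative):

```python
def construct_memoized_functor(f):
    values = {}
    alias = f  # private reference to the original function; never the public name

    def memoized_version(*arguments):
        if arguments not in values:
            values[arguments] = alias(*arguments)  # indirect call, not via the public name
        return values[arguments]

    return memoized_version

def factorial(n):
    return 1 if n <= 1 else n * factorial(n - 1)

# Replace the function under its own name. Recursive calls inside the
# original body resolve to the (now memoized) public name, so the
# intermediate factorials are cached as well.
factorial = construct_memoized_functor(factorial)
factorial(12)
```

Because the original body looks up the name factorial at call time, each recursive step now passes through the memoized wrapper, while the wrapper itself always calls the captured alias.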

==Parsers==

Memoization was explored as a parsing strategy in 1991 by Norvig, who demonstrated that an algorithm similar to Earley's algorithm could be generated by introducing automatic memoization to a simple backtracking recursive descent parser.[2] It was again explored in the context of parsing in 1995 by Johnson and Dörre.[5][6] In 2002, it was examined in considerable depth by Ford in the form called packrat parsing.[7]

While Norvig increased the power of the parser through memoization, the augmented parser still had the same time complexity as Earley's algorithm, which demonstrates a use of memoization for something other than speed optimization. Johnson and Dörre[6] demonstrate another such non-speed application of memoization: the use of memoization to delay linguistic constraint resolution to a point in a parse where sufficient information has been accumulated to resolve those constraints. By contrast, in the speed optimization application of memoization, Ford demonstrated that memoization could guarantee that parsing expression grammars could parse in linear time even those languages that resulted in worst-case backtracking behavior.[7]

Consider the following grammar:

 S → (A c) | (B d)
 A → X (a|b)
 B → X b
 X → x [X]

(Notation note: In the above example, the production S → (A c) | (B d) reads: "An S is either an A followed by a c or a B followed by a d." The production X → x [X] reads "An X is an x followed by an optional X.")

This grammar generates one of three strings: xac, xbc, or xbd (where x here is understood to mean one or more x 's). Next, consider how this grammar, used as a parse specification, might effect a top-down, left-to-right parse of the string xxxxxbd:

The rule A will recognize xxxxxb (by first descending into X to recognize one x, and again descending into X until all the x 's are consumed, and then recognizing the b), and then return to S, and fail to recognize a c. The next clause of S will then descend into B, which in turn again descends into X and recognizes the x 's by means of many recursive calls to X, and then a b, and returns to S and finally recognizes a d.

The key concept here is inherent in the phrase again descends into X. The process of looking forward, failing, backing up, and then retrying the next alternative is known in parsing as backtracking, and it is primarily backtracking that presents opportunities for memoization in parsing. Consider a function RuleAcceptsSomeInput(Rule, Position, Input), where the parameters are as follows:

* Rule is the name of the rule under consideration.
* Position is the offset currently under consideration in the input.
* Input is the input under consideration.

Let the return value of the function RuleAcceptsSomeInput be the length of the input accepted by Rule, or 0 if that rule does not accept any input at that offset in the string. In a backtracking scenario with such memoization, the parsing process is as follows:

When the rule A descends into X at offset 0, it memoizes the length 5 against that position and the rule X. When S's first alternative then fails (a c was expected, but a d was found), B, rather than descending again into X, queries the memoization engine for position 0 against rule X, is returned a length of 5, and is thus saved from actually descending into X again; it carries on as if it had descended into X as many times as before.

In the above example, one or many descents into X may occur, allowing for strings such as xxxxxxxxxxxxxxxxbd. In fact, there may be any number of x 's before the b. While the call to S must recursively descend into X as many times as there are x 's, B will never have to descend into X at all, since the return value of RuleAcceptsSomeInput(X, 0, xxxxxxxxxxxxxxxxbd) will be 16 (in this particular case).
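The scheme described above can be sketched as a memoized backtracking recursive descent parser for this grammar (Python; the function and rule names are illustrative, and failure is signalled by a length of 0 as in the text — sufficient here because X must match at least one x):

```python
memo = {}  # (rule name, position) -> accepted length; reset before each new input

def rule_accepts(rule, pos, inp):
    """Length of input accepted by rule at pos, or 0 on failure."""
    key = (rule, pos)
    if key not in memo:
        memo[key] = RULES[rule](pos, inp)
    return memo[key]

def parse_X(pos, inp):  # X -> x [X]
    if pos < len(inp) and inp[pos] == 'x':
        return 1 + rule_accepts('X', pos + 1, inp)
    return 0

def parse_A(pos, inp):  # A -> X (a|b)
    n = rule_accepts('X', pos, inp)
    if n and pos + n < len(inp) and inp[pos + n] in 'ab':
        return n + 1
    return 0

def parse_B(pos, inp):  # B -> X b
    n = rule_accepts('X', pos, inp)  # answered from memo after A has run
    if n and pos + n < len(inp) and inp[pos + n] == 'b':
        return n + 1
    return 0

def parse_S(pos, inp):  # S -> (A c) | (B d)
    n = rule_accepts('A', pos, inp)
    if n and pos + n < len(inp) and inp[pos + n] == 'c':
        return n + 1
    n = rule_accepts('B', pos, inp)  # backtrack: A c failed, try B d
    if n and pos + n < len(inp) and inp[pos + n] == 'd':
        return n + 1
    return 0

RULES = {'X': parse_X, 'A': parse_A, 'B': parse_B, 'S': parse_S}

rule_accepts('S', 0, 'xxxxxbd')  # B's query for X at offset 0 is a cache hit
```

Parsing xxxxxbd succeeds via the second alternative of S; by the time B runs, the entry for (X, 0) already holds the length 5 memoized during A's descent.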

Those parsers that make use of syntactic predicates can memoize the results of predicate parses as well, thereby reducing such constructions as:

 S → (A)? A
 A → /* some rule */

to one descent into A.

If a parser builds a parse tree during a parse, it must memoize not only the length of the input that matches at some offset against a given rule, but also the sub-tree that is generated by that rule at that offset in the input, since subsequent calls to the rule by the parser will not actually descend and rebuild that tree. For the same reason, memoized parser algorithms that generate calls to external code (sometimes called a semantic action routine) when a rule matches must use some scheme to ensure that such rules are invoked in a predictable order.
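A memo entry that carries the subtree alongside the matched length can be sketched as follows (Python; the tuple-based tree representation and the rule X → x [X] from the earlier grammar are used for illustration):

```python
memo = {}  # (rule name, position) -> (matched length, subtree); (0, None) = failure

def apply_rule(name, pos, inp):
    key = (name, pos)
    if key not in memo:
        memo[key] = RULES[name](pos, inp)
    return memo[key]  # a cache hit returns the stored tree without re-descending

def parse_X(pos, inp):  # X -> x [X], building a tree as a nested tuple
    if pos < len(inp) and inp[pos] == 'x':
        n, sub = apply_rule('X', pos + 1, inp)
        return 1 + n, ('X', 'x', sub)  # node records the rule, the token, the child
    return 0, None

RULES = {'X': parse_X}

apply_rule('X', 0, 'xxx')  # length 3 plus the full nested subtree
```

A later query for (X, 0) returns the cached pair, so the caller receives the already-built subtree rather than triggering a second descent.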

Since not every grammar needs backtracking or predicate checks, even in a parser capable of them, the overhead of storing each rule's parse results against every offset in the input (and storing the parse tree if the parsing process does that implicitly) may actually slow down a parser. This effect can be mitigated by explicitly selecting the rules the parser will memoize.[8]

==References and notes==

  1. ^ Michie, Donald, "Memo Functions and Machine Learning," Nature, No. 218, pp. 19-22, 1968.
  2. ^ a b Norvig, Peter, "Techniques for Automatic Memoization with Applications to Context-Free Parsing," Computational Linguistics, Vol. 17 No. 1, pp. 91-98, March 1991.
  3. ^ Hoffman, Berthold, "Term Rewriting with Sharing and Memoïzation," Algebraic and Logic Programming: Third International Conference, Proceedings, H. Kirchner and G. Levi (eds.), pp. 128-142, Volterra, Italy, 2-4 September 1992.
  4. ^ Mayfield, James, et al., "Using Automatic Memoization as a Software Engineering Tool in Real-World AI Systems," TBD, 1995.
  5. ^ Johnson, Mark, "Memoization of Top-Down Parsing," Computational Linguistics, Vol. 21 No. 3, pp. 405-417, September 1995.
  6. ^ a b Johnson, Mark & Dörre, Jochen, "Memoization of Coroutined Constraints," Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics, Cambridge, Massachusetts, 1995.
  7. ^ a b Ford, Bryan, Packrat Parsing: a Practical Linear-Time Algorithm with Backtracking, Master's thesis, Massachusetts Institute of Technology, September 2002.
  8. ^ Acar, Umut A., et al., "Selective Memoization," Proceedings of the 30th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, New Orleans, Louisiana, pp. 14-25, 15-17 January 2003.

==See also==

* Examples of memoization in various programming languages