Closure (computer programming): Difference between revisions

Content deleted Content added

Inline

Revision as of 09:09, 25 August 2009

In computer science, a closure is a first-class function with free variables. Such a function is said to be "closed over" its free variables. A closure is defined within the scope of its free variables, and the extent of those variables is at least as long as the lifetime of the closure itself. The explicit use of closures is associated with functional programming and with languages such as ML, Lisp and Perl. Closures are used to implement continuation passing style, and in this manner, hide state. Constructs such as objects and monads can thus be implemented with closures.

The concept of closures was developed in the 1960s and was first fully implemented as a language feature in the programming language Scheme. Since then, many languages have been designed to support closures.

In some languages, a closure may occur when a function is defined within another function, and the inner function refers to local variables of the outer function. At runtime, when the outer function executes, a closure is formed, consisting of the inner function’s code and references to any variables of the outer function required by the closure; such variables are called the upvalues of the closure.

The term closure is often mistakenly used to mean anonymous function. This is probably because most languages implementing anonymous functions allow them to form closures and programmers are usually introduced to both concepts at the same time. These are, however, distinct concepts.

Function objects are sometimes also called closures.

Closures and state representation

A closure can be used to associate a function with a set of "private" variables, which persist over several invocations of the function. The scope of the variable encompasses only the closed-over function, so it cannot be accessed from other program code. However, the variable is of indefinite extent, so a value established in one invocation remains available in the next. As a consequence, closures can be used to hide state, and thus to implement paradigms for state representation.

However, closures used in this way no longer have referential transparency, and are thus no longer pure functions. Nevertheless, they are commonly used in "near-functional" languages such as Scheme.

Closures and first-class functions

Closures typically appear in languages in which functions are first-class values—in other words, such languages allow functions to be passed as arguments, returned from function calls, bound to variable names, etc., just like simpler types such as strings and integers. For example, consider the following Scheme function:

; Return a list of all books with at least THRESHOLD copies sold.
(define (best-selling-books threshold)
  (filter 
    (lambda (book) (>= (book-sales book) threshold))
    book-list))

In this example, the lambda expression (lambda (book) (>= (book-sales book) threshold)) appears within the function best-selling-books. When the lambda expression is evaluated, Scheme creates a closure consisting of the code for the lambda and a reference to the threshold variable, which is a free variable inside the lambda.

The closure is then passed to the filter function, which calls it repeatedly to determine which books are to be added to the result list and which are to be discarded. Because the closure itself has a reference to threshold, it can use that variable each time filter calls it. The function filter itself might be defined in a completely separate file.

Here is the same example rewritten in ECMAScript (JavaScript), another popular language with support for closures:

// Return a list of all books with at least 'threshold' copies sold.
function bestSellingBooks(threshold) {
  return bookList.filter(
      function (book) { return book.sales >= threshold; }
    );
}

The function keyword is used here instead of lambda, and an Array.filter method^[1] instead of a global filter function, but otherwise the structure and the effect of the code are the same.

A function may create a closure and return it, as in the following example:

// Return a function that approximates the derivative of f
// using an interval of dx, which should be appropriately small.
function derivative(f, dx) {
  return function (x) {
    return (f(x + dx) - f(x)) / dx;
  };
}

Because the closure in this case outlives the scope of the function that creates it, the variables f and dx live on after the function derivative returns. In languages without closures, the lifetime of a local variable coincides with the execution of the scope where that variable is declared. In languages with closures, variables must continue to exist as long as any existing closures have references to them. This is most commonly implemented using some form of garbage collection.

Uses of closures

Closures have many uses:

Because closures delay evaluation—i.e., they do not "do" anything until they are called—they can be used to define control structures. For example, all Smalltalk's standard control structures, including branches (if/then/else) and loops (while and for), are defined using objects whose methods accept closures. Users can easily define their own control structures as well.
Multiple functions can be produced that close over the same environment, enabling them to communicate privately by altering that environment (in languages that allow assignment). In Scheme:

(define foo #f)
(define bar #f)

(let ((secret-message "none"))
  (set! foo (lambda (msg) (set! secret-message msg)))
  (set! bar (lambda () secret-message)))

(display (bar)) ; prints "none"
(newline)
(foo "meet me by the docks at midnight")
(display (bar)) ; prints "meet me by the docks at midnight"

Closures can be used to implement object systems.^[2]

Note: Some speakers call any data structure that binds a lexical environment a closure, but the term usually refers specifically to functions.

Differences in semantics

As different languages do not always have a common definition of the lexical environment, their definitions of closure may vary as well. The commonly held minimalist definition of the lexical environment defines it as a set of all bindings of variables in the scope, and that is also what closures in any language have to capture. It should be noted, though, that the meaning of a variable binding also differs. In imperative languages, variables bind to relative locations in memory that can store values. Although the relative location of a binding does not change at runtime, the value in the bound location can. In such languages, since closure captures the binding, any operation on the variable, whether done from the closure or not, is performed on the same relative memory location. Here is an example illustrating the concept in ECMAScript, which is one such language:

var f, g;
function foo() {
  var x = 0;
  f = function() { return ++x; };
  g = function() { return --x; };
  x = 1;
  print(f()); // "2"
}
foo();
print(g()); // "1"
print(f()); // "2"

Note how function foo and the closures referred to by variables f and g all use the same relative memory location signified by local variable x.

On the other hand, many functional languages, such as ML, bind variables directly to values. In this case, since there is no way to change the value of the variable once it is bound, there is no need to share the state between closures—they just use the same values.

Yet another subset, lazy functional languages such as Haskell, bind variables to a result of a computation in the future. Consider this example in Haskell:

 
foo x y = let r = x / y
          in (\z -> z + r)
f = foo 1 0
main = do putStr (show (f 123))

The binding of r captured by the closure defined within function foo is to the computation (x / y) - which in this case results in division by zero. However, since it is the computation that is captured, and not the value, the error only manifests itself when the closure is invoked, and actually attempts to use the captured binding.

Yet more differences manifest themselves in the behavior of other lexically-scoped constructs, such as return, break and continue statements. Such constructs can, in general, be considered in terms of invoking an escape continuation established by an enclosing control statement (in case of break and continue, such interpretation requires looping constructs to be considered in terms of recursive function calls). In some languages, such as ECMAScript, return refers to the continuation established by the closure lexically innermost with respect to the statement—thus, a return within a closure transfers control to the code that called it. In Smalltalk, however, the superficially similar ^ operator invokes the escape continuation established for the method invocation, ignoring the escape continuations of any intervening nested closures. The escape continuation of a particular closure can only be invoked in Smalltalk implicitly by reaching the end of the closure's code. The following examples in ECMAScript and Smalltalk highlight the difference:

"Smalltalk"
foo
  | xs |
  xs := #(1 2 3 4).
  xs do: [:x | ^x].
  ^0
bar
  Transcript show: (self foo printString) "prints 1"

// ECMAScript
function foo() {
  var xs = new Array(1, 2, 3, 4);
  xs.forEach(function(x) { return x; });
  return 0;
}
print(foo()); // prints 0

// C#
int foo()
{
  var xs = new[] { 1, 2, 3, 4 };            
  xs.Select((el, x) => el);
  return 0;
}
...
Console.WriteLine(foo());

The above code snippets will behave differently because the Smalltalk ^ operator and the JavaScript return operator are not analogous. In the ECMAScript example, return x will leave the inner closure to begin a new iteration of the forEach loop, whereas in the Smalltalk example, ^x will abort the loop and return from the method foo.

Common Lisp provides a construct that can express either of the above actions: Smalltalk ^x behaves as (return-from foo x), while JavaScript return x behaves as (return-from nil x). Hence, Smalltalk makes it possible for a captured escape continuation to outlive the extent in which it can be successfully invoked. Consider:

foo
    ^[ :x | ^x ]
bar
    | f |
    f := self foo.
    f value: 123 "error!"

When the closure returned by the method foo is invoked, it attempts to return a value from the invocation of foo that created the closure. Since that call has already returned and the Smalltalk method invocation model does not follow the spaghetti stack discipline to allow multiple returns, this operation results in an error.

Some languages, such as Ruby, allow the programmer to choose the way return is captured. An example in Ruby:

# ruby
def foo
  f = Proc.new { return "return from foo from inside proc" }
  f.call # control leaves foo here
  return "return from foo"
end

def bar
  f = lambda { return "return from lambda" }
  f.call # control does not leave bar here
  return "return from bar"
end
  
puts foo # prints "return from foo from inside proc"
puts bar # prints "return from bar"

Both Proc.new and lambda in this example are ways to create a closure, but semantics of the closures thus created are different with respect to the return statement.

In Scheme, definition and scope of the return control statement is explicit (and only arbitrarily named 'return' for the sake of the example). The following is a direct translation of the Ruby sample.

(define call/cc call-with-current-continuation)

(define (foo)
  (call/cc 
   (lambda (return)
     (define (f) (return "return from foo from inside proc"))
     (f) ; control leaves foo here
     (return "return from foo"))))

(define (bar)
  (call/cc
   (lambda (return)
     (define (f) (call/cc (lambda (return) (return "return from lambda"))))
     (f) ; control does not leave bar here
     (return "return from bar"))))

(display (foo)) ; prints "return from foo from inside proc"
(newline)
(display (bar)) ; prints "return from bar"

Implementation and theory

Closures are typically implemented with a special data structure that contains a pointer to the function code, plus a representation of the function's lexical environment (e.g., the set of available variables and their values) at the time when the closure was created.

A language implementation cannot easily support full closures if its run-time memory model allocates all local variables on a linear stack. In such languages, a function's local variables are deallocated when the function returns. However, a closure requires that the free variables it references survive the enclosing function's execution. Therefore, those variables must be allocated so that they persist until no longer needed. This explains why, typically, languages that natively support closures also use garbage collection. The alternative is for the language to accept that certain use cases will lead to undefined behaviour, as in the proposal for lambda expressions in C++.^[3]

In ML, local variables are allocated on a linear stack. When a closure is created, it copies the values of those variables that are needed by the closure into the closure's data structure.

A typical modern Scheme implementation allocates local variables that might be used by closures dynamically and stores all other local variables on the stack.

Closures are closely related to Actors in the Actor model of concurrent computation where the values in the function's lexical environment are called acquaintances. An important issue for closures in concurrent programming languages is whether the variables in a closure can be updated and, if so, how these updates can be synchronized. Actors provide one solution.^[4]

Closure-like constructs in other languages

In C, libraries that support callbacks sometimes allow a callback to be registered using two values: a function pointer and a separate void* pointer to arbitrary data of the user's choice. Each time the library executes the callback function, it passes in the data pointer. This allows the callback to maintain state and to refer to information captured at the time it was registered. The idiom is similar to closures in functionality, but not in syntax.

Several object-oriented techniques and language features simulate some features of closures. For example:

Eiffel

Eiffel includes inline agents defining closures. An inline agent is an object representing a routine, defined by giving the code of the routine in line. For example, in

OK_button.click_event.subscribe(
	agent(x, y: INTEGER) do
		country := map.country_at_coordinates(x, y)
		country.display
	end
)

the argument to subscribe is an agent, representing a procedure with two arguments; the procedure finds the country at the corresponding coordinates and displays it. The whole agent is "subscribed" to the event type click_event for a certain button, so that whenever an instance of the event type occurs on that button - because a user has clicked the button - the procedure will be executed with the mouse coordinates being passed as arguments for x and y.

The main limitation of Eiffel agents, which distinguishes them from true closures, is that they cannot reference local variables from enclosing scope, but this can easily be worked around by providing additional closed operands to the agent. Only Current (a reference to current object, analogous to this in Java), its features, and arguments of the agent itself can be accessed from within the agent body.

C++

C++ allows defining function objects by overloading operator(). These objects behave somewhat like functions in a functional programming language. They may be created at runtime and may contain state, but they do not implicitly capture local variables as closures do. Two proposals to introduce C++ language support for closures (both proposals call them lambda functions) are being considered by the C++ Standards Committee [1], [2]. The main difference between these proposals is that one stores a copy of all the local variables in a closure by default, and another stores references to original variables. Both provide functionality to override the default behaviour. If some form of these proposals is accepted, one would be able to write

void foo(string myname) {
	typedef vector<string> names;
	int y;
	names n;
	// ...
	names::iterator i =
	 find_if(n.begin(), n.end(), <>(const string& s){return s != myname && s.size() > y;});
	// <tt>i</tt> is now either <tt>n.end()</tt> or points to the first string in <tt>n</tt>
	// which is not equal to <tt> myname</tt> and which length is greater than <tt>y</tt>
}

Java

Java allows defining "anonymous classes" inside a method; an anonymous class may refer to names in lexically enclosing classes, or final (i.e., read-only) variables in the lexically enclosing method, as long as they are not shadowed by members of the anonymous class.

class CalculationWindow extends JFrame {
	private JButton btnSave;
	...

	public final void calculateInSeparateThread(final URI uri) {
		// The expression "new Runnable() { ... }" is an anonymous class.
		Runnable runner = new Runnable() {
			void run() {
				// It can access final local variables:
				calculate(uri);
				// It can access private fields of the enclosing class:
				// Always update the Graphic components into the Swing Thread
				SwingUtilities.invokeLater(new Runnable() {
					public void run() {
						btnSave.setEnabled(true);
					}
				});
			}
		};
		new Thread(runner).start();
	}
}

Note that some features of closures can be emulated by using a final reference to a single-element array. The inner class will not be able to change the value of the reference itself, but it will be able to change the value of the element in the array referenced. This technique is not limited to Java, and will work in other languages with similar restrictions.

A language extension providing fully featured closures for Java is under consideration for Java SE 7.

References

^ "filter". Mozilla Developer Center. 29 November 2007. Retrieved 2008-12-23.
^ "Re: FP, OO and relations. Does anyone trump the others?". 29 December 1999. Retrieved 2008-12-23.
^ Lambda Expressions and Closures C++ Standards Committee. 29 February 2008.
^ Foundations of Actor Semantics Will Clinger. MIT Mathematics Doctoral Dissertation. June 1981.

External links

The Original "Lambda Papers": A classic series of papers by Guy Steele and Gerald Sussman discussing, among other things, the versatility of closures in the context of Scheme (where they appear as lambda expressions).
Closures: An article about closures in dynamically-typed imperative languages, by Martin Fowler.
Collection closure methods: An example of a technical domain where using closures is convenient, by Martin Fowler.
The beauty of closures: An article about using closures in Java and .NET
What are closures: A post on closures in Javascript.

Closure in Delphi

Nick Hodges, "Delphi 2009 Reviewers Guide," October 2008, CodeGear Developer Network, CodeGear.
Craig Stuntz, "Understanding Anonymous Methods," October 2008
Dr. Bob, "Delphi 2009 Anonymous Methods"

[1] "filter". Mozilla Developer Center. 29 November 2007. Retrieved 2008-12-23.

[2] "Re: FP, OO and relations. Does anyone trump the others?". 29 December 1999. Retrieved 2008-12-23.

[3] Lambda Expressions and Closures C++ Standards Committee. 29 February 2008.

[4] Foundations of Actor Semantics Will Clinger. MIT Mathematics Doctoral Dissertation. June 1981.

[1]

[2]

[3]

[4]

@@ Line 33: / Line 33: @@
 function bestSellingBooks(threshold) {
   return bookList.filter(
-      function(book) { return book.sales >= threshold; }
+      function (book) { return book.sales >= threshold; }
     );
 }
@@ Line 40: / Line 40: @@
 The <code>function</code> keyword is used here instead of <code>lambda</code>, and an <code>Array.filter</code> method<ref>{{cite web | url = http://developer.mozilla.org/en/docs/Core_JavaScript_1.5_Reference:Objects:Array:filter | title = filter | work = Mozilla Developer Center | date = 29 November 2007 | accessdate = 2008-12-23}}</ref> instead of a global <code>filter</code> function, but otherwise the structure and the effect of the code are the same.
+A function may create a closure and return it, as in the following example:
-And in [[Scala (programming language)|Scala]]:
-<source lang="scala">
-//# Return a list of all books with at least 'threshold' copies sold.
-def bestSellingBooks(threshold: Int) = bookList.filter(book => book.sales >= threshold)
-//# or
-def bestSellingBooks(threshold: Int) = bookList.filter(_.sales >= threshold)
-</source>
-Scala's type inference system automatically recognizes the argument to <code>filter</code> to be a function that takes a book and returns a Boolean value and turns it into a closure.[[Scala (programming language)|Scala]] always returns the last expression in a block, which in this case is the list returned by the filter method and since this block contains only one expression the curly brackets can be omitted. The second method is the same as the first one but written more concisely.
-A function may create a closure and return it.  The following example is a function that returns a function.
-In Scheme:
-<source lang="scheme">
-; Return a function that approximates the derivative of f
-; using an interval of dx, which should be appropriately small.
-(define (derivative f dx)
-  (lambda (x) (/ (- (f (+ x dx)) (f x)) dx)))
-</source>
-In [[C Sharp (programming language)|C#]]:
-<source lang="csharp">
-public static class Example
-{
-   /// <summary>
-   /// Return a function that approximates the derivative of f
-   /// using an interval of dx, which should be appropriately small.
-   /// </summary>
-   /// <param name="f">Function for which a derivative is returned.</param>
-   /// <param name="dx">Step interval.</param>
-   /// <returns>Fuction representing the derivative.</returns>
-   private static Func<double, double> Derivative(Func<double, double> f, double dx)
-   {
-      return (x) => (f(x + dx) - f(x)) / dx;
-   }
-}
-</source>
-In [[Scala (programming language)|Scala]]:
-<source lang="scala">
-//# Return a function that approximates the derivative of f
-//# using an interval of dx, which should be appropriately small.
-def derivative(f: Double => Double, dx: Double) = (x: Double) => (f(x + dx) - f(x)) / dx
-</source>
-In ECMAScript:
 <source lang="javascript">
 // Return a function that approximates the derivative of f
 // using an interval of dx, which should be appropriately small.
 function derivative(f, dx) {
-  return function(x) {
+  return function (x) {
     return (f(x + dx) - f(x)) / dx;
   };
 }
 </source>
-In Erlang:
-<source lang="text">
-% Return a function that approximates the derivative of F
-% using an interval of Dx, which should be appropriately small.
-derivative(F, Dx) ->
-  fun(X) ->
-    (F(X + Dx) - F(X)) / Dx
-  end.
-</source>
-In PHP (5.3.0+):
-<source lang="php">
-// Return a function that approximates the derivative of f
-// using an interval of dx, which should be appropriately small.
-function derivative($f, $dx) {
-  return function($x) use($f, $dx) {
-    return ($f($x + $dx) - $f($x)) / $dx;
-  };
-}
-</source>
 Because the closure in this case outlives the [[Scope (programming)|scope]] of the function that creates it, the variables <code>f</code> and <code>dx</code> live on after the function <code>derivative</code> returns.  In languages without closures, the lifetime of a local variable coincides with the execution of the scope where that variable is declared.  In languages with closures, variables must continue to exist as long as any existing closures have references to them.
 This is most commonly implemented using some form of [[garbage collection (computer science)|garbage collection]].
-While this is not always clarified, a closure need not be formed using an [[anonymous function]]. The [[Python (programming language)|Python]] programming language, for example, has very limited support for anonymous functions but fully supports closures. For example, one way the above ECMAScript example could be implemented in Python is:
-<source lang="python">
-# Return a function that approximates the derivative of f
-# using an interval of dx, which should be appropriately small.
-def derivative(f, dx):
-    def gradient(x):
-        return (f(x + dx) - f(x)) / dx
-    return gradient
-</source>
-In this example, the function named ''gradient'' forms a closure together with the variables ''f'' and ''dx''. This closure is then returned by the outer function named ''derivative''.
-Note: Python also provides lambda forms. Although these are restricted to a single expression, they can completely replace named functions in the above example like this:
-<source lang="python">
-# Return a function that approximates the derivative of f
-# using an interval of dx, which should be appropriately small.
-derivative = lambda f, dx: lambda x: (f(x + dx) - f(x)) / dx
-</source>
 ==Uses of closures==