Function object

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Elibarzilay (talk | contribs) at 07:33, 23 August 2011 (→‎In Lisp). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

A function object, also called a functor, functional, or functionoid,[1] is a computer programming construct allowing an object to be invoked or called as though it were an ordinary function, usually with the same syntax.

Description

A typical use of a function object is in writing callback functions. A callback in procedural languages, such as C, may be performed by using function pointers. However, it can be difficult or awkward to pass state into or out of the callback function, and this restriction also inhibits more dynamic behavior of the function. A function object solves those problems because the function is really a façade for a full object and thus carries its own state.
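As a rough illustration of a callback that carries its own state, the following Python sketch passes a callable object to the built-in sorted() as its key callback. The class name CountingKey and its calls attribute are hypothetical names chosen for this example; Python is used here only because it keeps the sketch short.

```python
class CountingKey:
    """Callable object used as a sort key; it remembers how often it was called."""

    def __init__(self):
        self.calls = 0  # state carried along with the callback

    def __call__(self, item):
        self.calls += 1
        return abs(item)  # order items by absolute value


key = CountingKey()
result = sorted([-3, 1, -2], key=key)

print(result)     # the sorted list
print(key.calls)  # how many times the callback was invoked
```

The state (calls) travels with the object, so the caller can inspect it after the sort without any global variable or extra "user data" pointer.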

Many languages, both modern and older, including C++, Lisp, Perl, PHP, Python, and Ruby, support first-class function objects and may even make significant use of them. Functional programming languages additionally support closures, i.e. first-class functions which can 'close over' variables in their surrounding environment at creation time. During compilation, a transformation known as lambda lifting converts the closures into function objects.

In C and C++

Consider the example of a sorting routine which uses a callback function to define an ordering relation between a pair of items. A C program using function pointers may appear as:

#include <stdlib.h>

/* Callback function */
int compare_ints_function(void *A, void *B) {
  return *((int *)(A)) < *((int *)(B));
}
...
/* Declaration of C sorting function */
void sort(void *first_item, size_t item_size, void *last_item, int (*cmpfunc)(void *, void *) );
...
int main(void) {
    int items[] = {4, 3, 1, 2};
    sort((void *)(items), sizeof(int), (void *)(items + 3), compare_ints_function);
    return 0;
}

In C++ a function object may be used instead of an ordinary function by defining a class which overloads the function call operator by defining an operator() member function. In C++ this is called a class type functor, and may appear as follows:

struct compare_class {
  bool operator()(int A, int B) const {
    return A < B;
  }
};
...
// Declaration of C++ sorting function.
template <class ComparisonFunctor> 
void sort_ints(int* begin_items, int num_items, ComparisonFunctor c);
...
int main() {
    int items[] = {4, 3, 1, 2};
    sort_ints(items, sizeof(items)/sizeof(items[0]), compare_class());
}

Notice that the syntax for providing the callback to the sort_ints() function is identical, but an object is passed instead of a function pointer. When invoked, the callback function is executed just as any other member function, and therefore has full access to the other members (data or functions) of the object.

It is possible to use function objects in situations other than as callback functions (although the shortened term functor is normally not used). Continuing the example, where a and b are two int values:

  compare_class Y;
  bool result = Y( a, b );

In addition to class type functors, other kinds of function objects are also possible in C++. They can take advantage of C++'s member-pointer or template facilities. The expressiveness of templates allows some functional programming techniques to be used, such as defining function objects in terms of other function objects (like function composition). Much of the C++ Standard Template Library (STL) makes heavy use of template-based function objects.
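Defining function objects in terms of other function objects is not specific to C++ templates. The following is a minimal sketch of function composition written in Python rather than C++; Compose, AddN and Scale are hypothetical names invented for this example and do not correspond to any STL component.

```python
class AddN:
    """Function object that adds a fixed amount to its argument."""

    def __init__(self, n):
        self.n = n

    def __call__(self, x):
        return x + self.n


class Scale:
    """Function object that multiplies its argument by a fixed factor."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, x):
        return x * self.factor


class Compose:
    """Function object built from two others: Compose(f, g)(x) == f(g(x))."""

    def __init__(self, outer, inner):
        self.outer = outer
        self.inner = inner

    def __call__(self, x):
        return self.outer(self.inner(x))


add_then_double = Compose(Scale(2), AddN(3))
print(add_then_double(4))  # (4 + 3) * 2
```

The composed object is itself callable, so it can be composed again or passed wherever the simpler function objects are accepted.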

Performance

An advantage of function objects in C++ is performance because, unlike a function pointer, a function object can be inlined.[2] For example, consider a simple function which increments its argument implemented as a function object:

struct IncrementFunctor {
  void operator()(int &i) { ++i; }
};

and as a free function:

void increment_function(int &i) { ++i; }

Recall the standard library function std::for_each():

template<typename InputIterator, typename Function>
Function for_each(InputIterator first, InputIterator last, Function f) {
  for ( ; first != last; ++first)
    f(*first);
  return f;
}

Suppose we apply std::for_each() like so:

int A[] = {1, 4, 2, 8, 5, 7};
const int N = sizeof(A) / sizeof(A[0]);
for_each(A, A + N, IncrementFunctor());
for_each(A, A + N, increment_function);

Both calls to for_each() will work as expected. The first call will be to this version:

IncrementFunctor for_each<int*,IncrementFunctor>(int*, int*, IncrementFunctor)

the second will be to this version:

void (*for_each<int*,void(*)(int&)>(int*, int*, void(*)(int&)))(int&)

Within for_each<int*,IncrementFunctor>(), the compiler can inline the call because the function being invoked is determined by the type of the function object and is therefore known at compile time. Within for_each<int*,void(*)(int&)>(), the function to call is known only through a pointer value supplied at run time, so the call generally cannot be inlined unless the compiler also inlines for_each() itself and can see that the pointer is constant.

Maintaining state

Another advantage of function objects is their ability to maintain state which affects operator() between calls. Inconveniently, because STL algorithms are allowed to copy the function objects passed to them, copies of a function object must share their state to work correctly. For example, the following code defines a generator counting from 10 upwards that is invoked 11 times; it holds its counter by reference so that every copy updates the same variable.

#include <iostream>
#include <iterator>
#include <algorithm>

class countfrom {
private:
  int &count;
public:
  countfrom(int &n) : count(n) {}
  int operator()() { return count++; }
};

int main() {
  int state(10);
  std::generate_n(std::ostream_iterator<int>(std::cout, "\n"), 11, countfrom(state));
  return 0;
}
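The effect of copying a stateful function object can be sketched outside C++ as well. The following Python example is an illustration only: copy.copy() stands in for the copies an algorithm might make, and the class names CountFromByValue and CountFromShared are hypothetical. The first class keeps its counter inside the object, so each copy counts independently; the second keeps it in a shared one-element list, which plays the role of the int& member above.

```python
import copy


class CountFromByValue:
    """Each copy owns its own counter."""

    def __init__(self, start):
        self.count = start

    def __call__(self):
        value = self.count
        self.count += 1
        return value


class CountFromShared:
    """All copies refer to the same one-element list holding the counter."""

    def __init__(self, cell):
        self.cell = cell

    def __call__(self):
        value = self.cell[0]
        self.cell[0] += 1
        return value


original = CountFromByValue(10)
duplicate = copy.copy(original)          # shallow copy with its own count
by_value = [original(), duplicate(), original()]

cell = [10]
shared = CountFromShared(cell)
shared_copy = copy.copy(shared)          # shallow copy refers to the same list
shared_values = [shared(), shared_copy(), shared()]

print(by_value)       # the copy repeats a value already produced
print(shared_values)  # the sequence continues across copies
```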

In C#

In C#, function objects are declared via delegates.

In D

D provides several ways to declare function objects: Lisp/Python-style via closures or C#-style via delegates, respectively:

bool find(T)(T[] haystack, bool delegate(T) needle_test) {
  foreach ( straw; haystack ) {
    if ( needle_test(straw) )
      return true;
  }
  return false;
}

void main() {
    int[] haystack = [345,15,457,9,56,123,456];
    int   needle = 123;
    bool needleTest(int n) {
      return n == needle;
    }
    assert(
      find(haystack, &needleTest)
    );
}

The distinction between a delegate and a closure in D is determined automatically and conservatively by the compiler. D also supports function literals, which allow a lambda-style definition:

void main() {
    int[] haystack = [345,15,457,9,56,123,456];
    int   needle = 123;
    assert(
       find(haystack, (int n) { return n == needle; })
    );
}

To allow the compiler to inline the code (see above), function objects can also be specified C++-style via operator overloading:

bool find(T,F)(T[] haystack, F needle_test) {
  foreach ( straw; haystack ) {
    if ( needle_test(straw) )
      return true;
  }
  return false;
}

void main() {
    int[] haystack = [345,15,457,9,56,123,456];
    int   needle = 123;
    class NeedleTest {
      int needle;
      this(int n) { needle = n; }
      bool opCall(int n) {
        return n == needle;
      }
    }
    assert(
      find(haystack, new NeedleTest(needle))
    );
}

In Eiffel

In the Eiffel software development method and language, operations and objects are always seen as separate concepts. However, the agent mechanism facilitates the modeling of operations as runtime objects. Agents satisfy the range of application attributed to function objects, such as being passed as arguments in procedural calls or specified as callback routines. The design of the agent mechanism in Eiffel attempts to reflect the object-oriented nature of the method and language. An agent is an object which generally is a direct instance of one of the two library classes which model the two types of routines in Eiffel: PROCEDURE and FUNCTION. These two classes descend from the more abstract ROUTINE.

Within software text, the language keyword agent allows agents to be constructed in a compact form. In the following example, the goal is to add the action of stepping the gauge forward to the list of actions to be executed in the event that a button is clicked.

            my_button.select_actions.extend (agent my_gauge.step_forward)

The routine extend referenced in the example above is a feature of a class in a graphical user interface (GUI) library to provide event-driven programming capabilities.

In other library classes, agents are seen to be used for different purposes. In a library supporting data structures, for example, a class modeling linear structures effects universal quantification with a function for_all of type BOOLEAN which accepts an agent, an instance of FUNCTION, as an argument. So, in the following example, my_action is executed only if all members of my_list contain the character '!':

    my_list: LINKED_LIST [STRING]
        ...
            if my_list.for_all (agent {STRING}.has ('!')) then
                my_action
            end
        ...

When agents are created, the arguments to the routines they model and even the target object to which they are applied can be either closed or left open. Closed arguments and targets are given values at agent creation time. The assignment of values for open arguments and targets is deferred until some point after the agent is created. The routine for_all expects as an argument an agent representing a function with one open argument or target which conforms to the actual generic parameter for the structure (STRING in this example).

When the target of an agent is left open, the class name of the expected target, enclosed in braces, is substituted for an object reference as shown in the text agent {STRING}.has ('!') in the example above. When an argument is left open, the question mark character ('?') is coded as a placeholder for the open argument.

The ability to close or leave open targets and arguments is intended to improve the flexibility of the agent mechanism. Consider a class that contains the following procedure to print a string on standard output after a new line:

    print_on_new_line (s: STRING)
            -- Print `s' preceded by a new line
        do
            print ("%N" + s)
        end

The following snippet, assumed to be in the same class, uses print_on_new_line to demonstrate the mixing of open arguments and open targets in agents used as arguments to the same routine.

    my_list: LINKED_LIST [STRING]
        ...
            my_list.do_all (agent print_on_new_line (?))
            my_list.do_all (agent {STRING}.to_lower)
            my_list.do_all (agent print_on_new_line (?))
        ...

This example uses the procedure do_all for linear structures, which executes the routine modeled by an agent for each item in the structure.

The sequence of three instructions prints the strings in my_list, converts the strings to lowercase, and then prints them again.

Procedure do_all iterates across the structure executing the routine substituting the current item for either the open argument (in the case of the agents based on print_on_new_line), or the open target (in the case of the agent based on to_lower).

Open and closed arguments and targets also allow the use of routines which call for more arguments than are required by closing all but the necessary number of arguments:

            my_list.do_all (agent my_multi_arg_procedure (closed_arg_1, ?, closed_arg_2, closed_arg_3))

The Eiffel agent mechanism is detailed in the Eiffel ISO/ECMA standard document.

In Java

Java has no first-class functions, so function objects are usually expressed by an interface with a single method (most commonly the Callable interface), typically with the implementation being an anonymous inner class.

For an example from Java's standard library, java.util.Collections.sort() takes a List and a functor whose role is to compare objects in the List. But because Java does not have first-class functions, the function is part of the Comparator interface. This could be used as follows.

List<String> list = Arrays.asList("10", "1", "20", "11", "21", "12");
		
Comparator<String> numStringComparator = new Comparator<String>() {
    public int compare(String o1, String o2) {
        return Integer.valueOf(o1).compareTo(Integer.valueOf(o2));
    }
};

Collections.sort(list, numStringComparator);

In JavaScript

In JavaScript, functions are first-class objects. JavaScript also supports closures.

Compare the following with the subsequent Python example.

function Accumulator(start) {
  var current = start;
  return function (x) {
    current += x;
    return current;
  };
}

An example of this in use:

var a = Accumulator(4);
var x = a(5); //x has value 9
x = a(2);     //x has value 11
 
var b = Accumulator(42);
x = b(7);    //x has value 49 (current=42 in closure b)
x = a(7);    //x has value 18 (current=11 in closure a)

In Lisp and Scheme

In Common Lisp, Scheme and other languages in the Lisp family, functions are objects, just like strings, vectors, lists, and numbers. A closure-constructing operator creates a function object from a piece of the program itself: the piece of code given as an argument to the operator is part of the function, and so is the lexical environment: the bindings of the lexically visible variables are "captured" and stored in the function object, which is more commonly called a closure. The captured bindings play the role of "member variables", and the code part of the closure plays the role of the "anonymous member function", just like operator () in C++.

The closure constructor has the syntax (lambda (parameters ...) code ...). The (parameters ...) part allows an interface to be declared, so that the function takes the declared parameters. The code ... part consists of expressions that are evaluated when the functor is called.

Many uses of functors in languages like C++ are simply emulations of the missing closure constructor. Since the programmer cannot directly construct a closure, he or she must define a class which has all of the necessary state variables, and also a member function. Then, construct an instance of that class instead, ensuring that all the member variables are initialized through its constructor. The values are derived precisely from those local variables that ought to be captured directly by a closure.

A function-object using the class system, no use of closures:

(defclass counter ()
  ((value :initarg :value :accessor value-of)))

(defmethod functor-call ((c counter))
  (incf (value-of c)))

(defun make-counter (initial-value)
  (make-instance 'counter :value initial-value))

;;; use the counter:
(defvar *c* (make-counter 10))
(functor-call *c*) ; --> 11
(functor-call *c*) ; --> 12

Since there is no standard way to make funcallable objects in Lisp, we fake it by defining a generic function called FUNCTOR-CALL. This can be specialized for any class whatsoever. The standard FUNCALL function is not generic; it only takes function objects.

It is this FUNCTOR-CALL generic function which gives us function objects in the sense used in this article: objects that can be invoked as if they were ordinary functions, with almost the same syntax (FUNCTOR-CALL instead of FUNCALL). Some Lisps provide "funcallable" objects as a simple extension. Making objects callable using the same syntax as functions is a fairly trivial business: making a function call operator work with different kinds of "function things", whether they be class objects or closures, is no more complicated than making a + operator that works with different kinds of numbers, such as integers, reals or complex numbers.

Now, a counter implemented using a closure. This is much more brief and direct. The INITIAL-VALUE argument of the MAKE-COUNTER factory function is captured and used directly. It does not have to be copied into some auxiliary class object through a constructor. It is the counter. An auxiliary object is created, but that happens "behind the scenes".

(defun make-counter (value)
  (lambda () (incf value)))

;;; use the counter
(defvar *c* (make-counter 10))
(funcall *c*) ; --> 11
(funcall *c*) ; --> 12

Scheme makes closures even simpler, and Scheme code tends to use such higher-order programming somewhat more idiomatically.

(define (make-counter value)
  (lambda () (set! value (+ value 1)) value))
;;; use the counter
(define c (make-counter 10))
(c) ; --> 11
(c) ; --> 12

More than one closure can be created in the same lexical environment. A vector of closures, each implementing a specific kind of operation, can quite faithfully emulate an object that has a set of virtual operations. That type of single dispatch object-oriented programming can be done entirely with closures.
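The idea of several closures sharing one lexical environment can be sketched in Python as follows. This is an illustrative variation, not a translation of any Lisp code above: a dictionary of closures is used instead of a vector, and the operation names increment, peek and reset are hypothetical.

```python
def make_counter(value):
    """Return a table of closures that all share the same captured variable."""

    def increment():
        nonlocal value
        value += 1
        return value

    def peek():
        return value

    def reset(new_value=0):
        nonlocal value
        value = new_value
        return value

    # The table plays the role of an object's set of virtual operations.
    return {"increment": increment, "peek": peek, "reset": reset}


c = make_counter(10)
first = c["increment"]()
second = c["increment"]()
seen = c["peek"]()        # observes the changes made by increment
c["reset"]()
after_reset = c["increment"]()

print(first, second, seen, after_reset)
```

Each call to make_counter() creates a fresh environment, so two tables created separately do not interfere with each other.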

Thus there exists a kind of tunnel being dug from both sides of the proverbial mountain. Programmers in OOP languages discover function objects by restricting objects to have one "main" function to "do" that object's functional purpose, and even eliminate its name so that it looks like the object is being called! While programmers who use closures are not surprised that an object is called like a function, they discover that multiple closures sharing the same environment can provide a complete set of abstract operations like a virtual table for single dispatch type OOP.

In Objective-C

In Objective-C a function object can be created from the NSInvocation class. Construction of a function object requires a method signature, the target object, and the target selector. Here is an example for creating an invocation to the current object's myMethod:

// Construct a function object
SEL sel = @selector(myMethod);
NSInvocation* inv = [NSInvocation invocationWithMethodSignature:
                     [self methodSignatureForSelector:sel]];
[inv setTarget:self];
[inv setSelector:sel];

// Do the actual invocation
[inv invoke];

An advantage of NSInvocation is that the target object can be modified after creation. A single NSInvocation can be created and then called for each of any number of targets, for instance from an observable object. An NSInvocation can be created from only a protocol, but it is not straightforward.

In Perl

In Perl, a function object can be created either from a class's constructor returning a function closed over the object's instance data, blessed into the class:

package Acc1;
sub new {
        my $class = shift;
        my $arg = shift;
        my $obj = sub {
                my $num = shift;
                $arg += $num;
        };
        bless $obj, $class;
}
1;

or by overloading the &{} operator so that the object can be used as a function:

package Acc2;
use overload 
   '&{}' => 
        sub {
           my $self = shift;
           sub {
                my $num = shift;
                $self->{arg} += $num;
           }
        };
           
sub new {
        my $class = shift;
        my $arg = shift;
        my $obj = { arg =>  $arg };
        bless $obj, $class;
}
1;

In both cases the function object can be used either using the dereferencing arrow syntax $ref->(@arguments):

use Acc1;
my $a = Acc1->new(42);
# prints '52'
print $a->(10), "\n";
# prints '60'
print $a->(8), "\n";

or using the coderef dereferencing syntax &$ref(@arguments):

use Acc2;
my $a = Acc2->new(12);
# prints '22'
print &$a(10), "\n";
# prints '30'
print &$a(8), "\n";

In PowerShell

In the Windows PowerShell programming language, a script block is a collection of statements or expressions that can be used as a single unit. A script block can accept arguments and return values. A script block is an instance of a Microsoft .NET Framework type System.Management.Automation.ScriptBlock.

Function Get-Accumulator($x) {
    { 
        param($y)        
        return $script:x += $y         
    }.GetNewClosure()
}

PS C:\> $a = Get-Accumulator 4

PS C:\> & $a 5

9

PS C:\> & $a 2

11

PS C:\> $b = Get-Accumulator 32

PS C:\> & $b 10

42

In Python

In Python, functions are first-class objects, just like strings, numbers, lists etc. This feature eliminates the need to write a function object in many cases. Any object with a __call__() method can be called using function-call syntax.

An example is this Accumulator class (based on Paul Graham's study on programming language syntax and clarity):[3]

class Accumulator(object):
    def __init__(self, n):
        self.n = n
    def __call__(self, x):
        self.n += x
        return self.n

An example of this in use (using the interactive interpreter):

>>> a = Accumulator(4)
>>> a(5)
9
>>> a(2)
11
>>> b = Accumulator(42)
>>> b(7)
49
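Because an Accumulator instance is callable, it can be handed to any code that expects a plain function. The following sketch repeats the class so that it runs on its own and passes an instance to the built-in map(); the variable names running and totals are illustrative only.

```python
class Accumulator:
    def __init__(self, n):
        self.n = n

    def __call__(self, x):
        self.n += x
        return self.n


running = Accumulator(0)

# map() only requires something callable; it does not care that
# `running` is an object rather than a function.
totals = list(map(running, [1, 2, 3, 4]))

print(callable(running))  # True for any object defining __call__
print(totals)             # running sums of the input
print(running.n)          # the state remains accessible afterwards
```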

A function object using a closure in Python 3:

def accumulator(n):
    def inc(x):
        nonlocal n
        n += x
        return n
    return inc
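A short usage sketch for the closure version, mirroring the interactive session shown for the class version. The function is repeated here so the example is self-contained; the inspection through __closure__ at the end is included only to show where the captured value is stored.

```python
def accumulator(n):
    def inc(x):
        nonlocal n  # rebind the variable captured from accumulator()
        n += x
        return n
    return inc


a = accumulator(4)
first = a(5)
second = a(2)

b = accumulator(42)  # an independent closure with its own n
third = b(7)
fourth = a(7)        # a's state was not affected by b

print(first, second, third, fourth)

# The captured variable is stored in a cell attached to the function object.
print(a.__closure__[0].cell_contents)
```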

In Ruby

In Ruby, several objects can be considered function objects, in particular Method and Proc objects. Ruby also has two kinds of objects that can be thought of as semi-function objects: UnboundMethod and block. UnboundMethods must first be bound to an object (thus becoming a Method) before they can be used as a function object. Blocks can be called like function objects, but to be used in any other capacity as an object (e.g. passed as an argument) they must first be converted to a Proc. More recently, symbols (accessed via the literal unary indicator :) can also be converted to Procs. Using Ruby's unary & operator (equivalent to calling to_proc on an object, assuming that method exists), the Ruby Extensions Project created a simple hack.

class Symbol
   def to_proc
      proc { |obj, *args| obj.send(self, *args) }
   end
end

Now, method foo can be a function object, i.e. a Proc, via &:foo and used via takes_a_functor(&:foo). Symbol.to_proc was officially added to Ruby on June 11, 2006, during RubyKaigi2006. [1]

Because of the variety of forms, the term Functor is not generally used in Ruby to mean a function object. Rather, it has come to represent a type of dispatch delegation introduced by the Ruby Facets project, the most basic definition of which is:

class Functor
  def initialize(&func)
    @func = func
  end
  def method_missing(op, *args, &blk)
    @func.call(op, *args, &blk)
  end
end

This usage is more akin to that used by functional programming languages, like ML, and the original mathematical terminology.

Other meanings

In a more theoretical context a function object may be considered to be any instance of the class of functions, especially in languages such as Common Lisp in which functions are first-class objects.

In some functional programming languages, such as ML and Haskell, the term functor has a different meaning; it represents a mapping from modules to modules, or from types to types and is a technique for reusing code. Functors used in this manner are analogous to the original mathematical meaning of functor in category theory, or to the use of generic programming in C++, Java or Ada.

In Prolog and related languages, functor is a synonym for function symbol.

Further reading

  • David Vandevoorde & Nicolai M. Josuttis (2006). C++ Templates: The Complete Guide, ISBN 0-201-73484-2. Specifically, chapter 22 is entirely devoted to function objects.