Corecursion

From Wikipedia, the free encyclopedia
Jump to: navigation, search

In computer science, corecursion is a type of operation that is dual to recursion. Corecursion and codata allow total languages to work with infinite data structures such as streams. Corecursion is often used in conjunction with lazy evaluation. Corecursion can produce both finite and infinite data structures as result, and may employ self-referential data structures. Put simply, it is about algorithms using the data which they themselves produce, bit by bit, as they become available, and needed, to produce further bits of data.

Contents

[edit] Definition

Initial data types can be defined as being the least fixpoint (up to isomorphism) of some type equation; the isomorphism is then given by an initial algebra. Dually, final (or terminal) data types can be defined as being the greatest fixpoint of a type equation; the isomorphism is then given by a final coalgebra.

If the domain of discourse is the category of sets and total functions, then final data types may contain infinite, non-wellfounded values, whereas initial types do not.[1][2] On the other hand, if the domain of discourse is the category of complete partial orders and continuous functions, which corresponds roughly to the Haskell programming language, then final types coincide with initial types, and the corresponding final coalgebra and initial algebra form an isomorphism.[3]

Corecursion is then a technique for recursively defining functions whose range (codomain) is a final data type, dual to the way that ordinary recursion recursively defines functions whose domain is an initial data type.[4]

The discussion below provides several examples in Haskell that distinguish corecursion. Roughly speaking, if one were to port these definitions to the category of sets, they would still be corecursive. This informal usage is consistent with existing textbooks about Haskell.[5] Also note that the examples used in this article predate the attempts to define corecursion and explain what it is.

[edit] Discussion

The rule for primitive corecursion on codata is the dual to that for primitive recursion on data. Instead of descending on the argument by pattern-matching on its constructors (that were called up before, somewhere, so we receive a ready-made datum and get at its constituent sub-parts, i.e. "fields"), we ascend on the result by filling-in its "destructors" (or "observers", that will be called afterwards, somewhere - so we're actually calling a constructor, creating another bit of the result to be observed later on). Thus corecursion creates (potentially infinite) codata, whereas ordinary recursion analyses (necessarily finite) data. Ordinary recursion might not be applicable to the codata because it might not terminate. Conversely, corecursion is not strictly necessary if the result type is data, because data must be finite.

Here is an example in Haskell. The following definition produces the list of Fibonacci numbers in linear time:

fibs = 0 : 1 : next fibs
  where
    next (a: t@(b:_)) = (a+b):next t

This infinite list depends on lazy evaluation; elements are computed on an as-needed basis, and only finite prefixes are ever explicitly represented in memory. This feature allows algorithms on parts of codata to terminate; such techniques are an important part of Haskell programming.

This example employs a self-referential data structure. Ordinary recursion makes use of self-referential functions, but does not accommodate self-referential data. However, this is not essential to the Fibonacci example. It can be rewritten as follows:

fibs = fibgen (0,1)
fibgen (x,y) = x : fibgen (y,x+y)

This employs only self-referential function to construct the result. If it were used with strict list constructor it would be an example of runaway recursion, but with non-strict list constructor it (corecursively) produces an indefinitely defined list.

This shows how it can be done in Python:[6]

from itertools import tee, chain, islice, imap
 
def add(x,y): return (x + y)
 
def fibonacci():
    def deferred_output():
        for i in output:
            yield i
    result, c1, c2 = tee(deferred_output(), 3)
    paired = imap(add, c1, islice(c2, 1, None))
    output = chain([0, 1], paired)
    return result
 
for i in islice(fibonacci(),20):
  print i

Corecursion need not produce an infinite object; a corecursive queue[7] is a particularly good example of this phenomenon. The following definition produces a breadth-first traversal of a binary tree in linear time:

data Tree a b = Leaf a  |  Branch b (Tree a b) (Tree a b)
 
bftrav :: Tree a b -> [Tree a b]
bftrav tree = queue
  where
    queue = tree : trav 1 queue
 
    trav  0   q                 =         []
    trav len (Leaf   _     : q) =         trav (len-1) q
    trav len (Branch _ l r : q) = l : r : trav (len+1) q

This definition takes an initial tree and produces list of subtrees. This list serves a dual purpose as both the queue and the result. It is finite if and only if the initial tree is finite. The length of the queue must be explicitly tracked in order to ensure termination; this can safely be elided if this definition is applied only to infinite trees. Even if the result is finite, this example depends on lazy evaluation due to the use of self-referential data structures.

Another particularly good example gives a solution to the problem of breadth-first labelling.[8] The function label visits every node in a binary tree in a breadth first fashion, and replaces each label with an integer, each subsequent integer is bigger than the last by one. This solution employs a self-referential data structure, and the binary tree can be finite or infinite.

label :: Tree a b -> Tree Int Int 
label t = t'
    where
          (t',ns) = label' t (1:ns)
 
label' :: Tree a b    -> [Int]  -> (Tree Int Int, [Int])
label'    (Leaf   _    ) (n:ns) = (Leaf   n       , n+1 : ns  )
label'    (Branch _ l r) (n:ns) = (Branch n l' r' , n+1 : ns'')
                                where
                                  (l',ns' ) = label' l ns
                                  (r',ns'') = label' r ns'

An apomorphism is a form of corecursion in the same way that a paramorphism is a form of recursion.

The Coq proof assistant supports corecursion and coinduction using the CoFixpoint command.

[edit] See also

[edit] Notes

  1. ^ Barwise and Moss 1996.
  2. ^ Moss and Danner 1997.
  3. ^ Smyth and Plotkin 1982.
  4. ^ Gibbons and Hutton 2005.
  5. ^ Doets and van Eijck 2004.
  6. ^ Hettinger 2009.
  7. ^ Allison 1989; Smith 2009.
  8. ^ Jones and Gibbons 1992.

[edit] References

Personal tools
Namespaces
Variants
Actions
Navigation
Interaction
Toolbox
Print/export
Languages