Algebraic data type

An algebraic data type is a datatype each of whose values is data from other datatypes wrapped in one of the constructors of the datatype. Any wrapped data is arguments to the constructor. In contrast to other datatypes, the constructor is not executed and the only way to operate on the data is to unwrap the constructor using pattern matching.

The most common algebraic data type is a list with two constructors: Nil or [] for an empty list, and Cons (an abbreviation of constructor), ::, or : for the combination of a new element with a shorter list (for example (Cons 1 '(2 3 4)) or 1:[2,3,4]).

Special cases of algebraic types are product types (only one constructor) and enumeration types (many constructors with no arguments). Algebraic types are one kind of constructed type (i.e. a type formed by combining other types).

An algebraic data type may also be an abstract data type (ADT) if it is exported from a module without its constructors. Values of such a type can only be manipulated using functions defined in the same module as the type itself.

In set theory the equivalent of an algebraic data type is a discriminated union - a set whose elements consist of a tag (equivalent to a constructor) and an object of a type corresponding to the tag (equivalent to the constructor arguments).

An example

For example, in Haskell we can define a new algebraic data type, Tree:

data Tree = Empty 
          | Leaf Int 
          | Node Tree Tree

or in OCaml syntax:

type tree = Empty 
          | Leaf of int 
          | Node of tree * tree

Here, Empty, Leaf and Node are the constructors. Somewhat similar to a function, a constructor is applied to arguments of an appropriate type, then yielding an instance of the data type to which the constructor belongs. For instance, Leaf has the something like a "functional type" Int -> Tree meaning that giving an integer as an argument to Leaf produces a value of the type Tree. As Node takes two arguments of the type Tree itself, the datatype is recursive.

Operations on algebraic data types can be defined by using pattern matching to retrieve the arguments. For example, consider a function to find the depth of a Tree:

depth :: Tree -> Int
depth Empty	  = 0
depth (Leaf n)	  = 1
depth (Node l r) = 1 + max (depth l) (depth r)

Thus, a Tree given to depth can be constructed using any of Empty, Leaf or Node and we must match for any of them respectively to deal with all cases. In case of Node, the pattern extracts the subtrees l and r for further processing.

Theory

A general algebraic data type is a (possibly recursive) sum type of product types. Each constructor tags a product type to separate it from others, or if there is only one constructor, the data type is a product type. Further, the parameter types of a constructor are the factors of the product type. A parameterless constructor corresponds to the empty product. If a datatype is recursive, the entire sum of products is wrapped in a recursive type, and each constructor also rolls the datatype into the recursive type.

For example, the Haskell datatype:

  data List a = Nil | Cons a (List a)

is represented in type theory as $\lambda a.\mu \beta .1+\alpha \times \beta$ with constructors $\mathrm {nil} _{\alpha }=\mathrm {roll} \ (\mathrm {inl} \ \langle \rangle )$ and $\mathrm {cons} _{\alpha }\ x\ l=\mathrm {roll} \ (\mathrm {inr} \ \langle x,l\rangle )$ .

Reference

Algebraic data type in The Free On-line Dictionary of Computing, Editor Denis Howe.

This article is based on material taken from the Free On-line Dictionary of Computing prior to 1 November 2008 and incorporated under the "relicensing" terms of the GFDL, version 1.3 or later.

An example

Theory

See also

Reference