# Church encoding

In mathematics, Church encoding is a means of representing data and operators in the lambda calculus. The Church numerals are a representation of the natural numbers using lambda notation. The method is named for Alonzo Church, who first encoded data in the lambda calculus this way.

Terms that are usually considered primitive in other notations (such as integers, booleans, pairs, lists, and tagged unions) are mapped to higher-order functions under Church encoding. The Church-Turing thesis asserts that any computable operator (and its operands) can be represented under Church encoding. In the untyped lambda calculus the only primitive data type is the function. The lambda calculus is itself Turing complete.

The Church encoding is not intended as a practical implementation of primitive data types. Its use is to show that other primitive data types are not required to represent any calculation. The completeness is representational. Additional functions are needed to translate the representation into common data types, for display to people. It is not possible in general to decide if two functions are extensionally equal, due to the undecidability of equivalence from Church's theorem. The translation may apply the function in some way to retrieve the value it represents, or look up its value as a literal lambda term.

## Church numerals

Church numerals are the representations of natural numbers under Church encoding. The higher-order function that represents natural number $n$ is a function that maps any function $f$ to its n-fold composition. In simpler terms, the "value" of the numeral is equivalent to the number of times the function encapsulates its argument.

$f^n = \underbrace{f \circ f \circ \cdots \circ f}_{n\text{ times}}.\,$

All Church numerals are functions that take two parameters. Church numerals 0, 1, 2, ..., are defined as follows in the lambda calculus: starting with 0, which does not apply the function at all, then 1, which applies the function once, then 2, which applies it twice, and so on.

| Number | Function definition | Lambda expression |
|---|---|---|
| 0 | $0\ f\ x = x$ | $0 = \lambda f.\lambda x.x$ |
| 1 | $1\ f\ x = f\ x$ | $1 = \lambda f.\lambda x.f\ x$ |
| 2 | $2\ f\ x = f\ (f\ x)$ | $2 = \lambda f.\lambda x.f\ (f\ x)$ |
| 3 | $3\ f\ x = f\ (f\ (f\ x))$ | $3 = \lambda f.\lambda x.f\ (f\ (f\ x))$ |
| ... | ... | ... |
| n | $n\ f\ x = f^n\ x$ | $n = \lambda f.\lambda x.f^n\ x$ |

The Church numeral 3 represents the action of applying any given function three times to a value. The supplied function is first applied to a supplied parameter and then successively to its own result. The end result is not the numeral 3 (unless the supplied parameter happens to be 0 and the function is a successor function). The function itself, and not its end result, is the Church numeral 3. The Church numeral 3 means simply to do anything three times. It is an ostensive demonstration of what is meant by "three times".
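
The idea that a numeral means "do something n times" can be tried out directly. The following is a minimal Haskell sketch, not a definitive implementation; the names `threeIncrements` and `threeExclamations` are illustrative and do not come from the text above.

```haskell
-- A Church numeral takes a function and a start value,
-- and applies the function n times to that value.
type Church a = (a -> a) -> a -> a

zero, one, two, three :: Church a
zero  _ x = x
one   f x = f x
two   f x = f (f x)
three f x = f (f (f x))

-- "Do anything three times": the same numeral used at two different types.
threeIncrements :: Int
threeIncrements = three (+ 1) 0          -- 3

threeExclamations :: String
threeExclamations = three (++ "!") "hi"  -- "hi!!!"
```

Only the first use yields the number 3, because the supplied function happens to be a successor and the supplied value happens to be 0.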

### Calculation with Church numerals

Arithmetic operations on numbers may be represented by functions on Church numerals. These functions may be defined in lambda calculus, or implemented in most functional programming languages (see converting lambda expressions to functions).

The addition function $\operatorname{plus}(m, n)= m+n$ uses the identity $f^{(m+n)}(x)=f^m(f^n(x))$.

$\operatorname{plus} \equiv \lambda m.\lambda n.\lambda f.\lambda x. m\ f\ (n\ f\ x)$

The successor function $\operatorname{succ}(n)=n+1$ is β-equivalent to $(\operatorname{plus}\ 1)$.

$\operatorname{succ} \equiv \lambda n.\lambda f.\lambda x. f\ (n\ f\ x)$
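
As a check on the β-equivalence stated above, a short reduction using only the definitions of $\operatorname{plus}$ and the numeral 1 gives:

$\operatorname{plus}\ 1$
$= (\lambda m.\lambda n.\lambda f.\lambda x. m\ f\ (n\ f\ x))\ (\lambda f.\lambda x.f\ x)$
$= \lambda n.\lambda f.\lambda x. (\lambda f.\lambda x.f\ x)\ f\ (n\ f\ x)$
$= \lambda n.\lambda f.\lambda x. (\lambda x.f\ x)\ (n\ f\ x)$
$= \lambda n.\lambda f.\lambda x. f\ (n\ f\ x) = \operatorname{succ}$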

The multiplication function $\operatorname{mult}(m, n) = m*n$ uses the identity $f^{(m*n)}(x) = (f^n)^m(x)$.

$\operatorname{mult} \equiv \lambda m.\lambda n.\lambda f. m\ (n\ f)$

The exponentiation function $\operatorname{exp}(m, n) = m^n$ is given by the definition of Church numerals: $n\ f\ x = f^n\ x$. In the definition substitute $f \to m, x \to f$ to get $n\ m\ f = m^n\ f$ and,

$\operatorname{exp}\ m\ n = m^n = n\ m$

which gives the lambda expression,

$\operatorname{exp} \equiv \lambda m.\lambda n. n\ m$

The $\operatorname{pred}(n)$ function is more difficult to understand.

$\operatorname{pred} \equiv \lambda n.\lambda f.\lambda x. n\ (\lambda g.\lambda h. h\ (g\ f))\ (\lambda u. x)\ (\lambda u. u)$

A Church numeral applies a function n times. The predecessor function must return a function that applies its parameter n - 1 times. This is achieved by building a container around f and x, which is initialized in a way that omits the application of the function the first time. See the derivation of the predecessor function for a more detailed explanation.

The subtraction function can be written based on the predecessor function.

$\operatorname{minus} \equiv \lambda m.\lambda n. (n \operatorname{pred})\ m$
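
The arithmetic definitions above can be transcribed into Haskell almost symbol for symbol. This is a sketch under some assumptions: the `newtype` wrapper and the `RankNTypes` extension are choices made here so that one numeral can be used at several types, and the names `succC`, `expC`, `predC` and `toInt` are hypothetical, picked to avoid clashing with Prelude functions.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Wrapper giving each numeral a rank-2 type, so a single numeral can be
-- instantiated at several types (needed by expC, predC and minus).
newtype Church = Church { runChurch :: forall a. (a -> a) -> a -> a }

zero :: Church
zero = Church (\_ x -> x)

-- succ = λn.λf.λx. f (n f x)
succC :: Church -> Church
succC (Church n) = Church (\f x -> f (n f x))

-- plus = λm.λn.λf.λx. m f (n f x)
plus :: Church -> Church -> Church
plus (Church m) (Church n) = Church (\f x -> m f (n f x))

-- mult = λm.λn.λf. m (n f)
mult :: Church -> Church -> Church
mult (Church m) (Church n) = Church (\f -> m (n f))

-- exp = λm.λn. n m
expC :: Church -> Church -> Church
expC (Church m) (Church n) = Church (n m)

-- pred = λn.λf.λx. n (λg.λh. h (g f)) (λu. x) (λu. u)
predC :: Church -> Church
predC (Church n) = Church (\f x -> n (\g h -> h (g f)) (\_ -> x) (\u -> u))

-- minus = λm.λn. (n pred) m
minus :: Church -> Church -> Church
minus m (Church n) = n predC m

one, two, three :: Church
one   = succC zero
two   = succC one
three = succC two

-- Helper for inspection only: count applications of (+ 1) starting at 0.
toInt :: Church -> Integer
toInt (Church n) = n (+ 1) 0
```

With these definitions, `toInt (expC two three)` evaluates to 8 and `toInt (minus two three)` evaluates to 0, since subtraction on Church numerals bottoms out at zero.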

### Table of functions on Church numerals

| Function | Algebra | Identity | Function definition | Lambda expressions | |
|---|---|---|---|---|---|
| Successor | $n + 1$ | $f^{n+1}\ x = f (f^n x)$ | $\operatorname{succ}\ n\ f\ x = f\ (n\ f\ x)$ | $\lambda n.\lambda f.\lambda x.f\ (n\ f\ x)$ | ... |
| Addition | $m + n$ | $f^{m+n}\ x = f^m (f^n x)$ | $\operatorname{plus}\ m\ n\ f\ x = m\ f\ (n\ f\ x)$ | $\lambda m.\lambda n.\lambda f.\lambda x.m\ f\ (n\ f\ x)$ | $\lambda m.\lambda n.n \operatorname{succ} m$ |
| Multiplication | $m * n$ | $f^{m*n}\ x = (f^n)^m\ x$ | $\operatorname{multiply}\ m\ n\ f\ x = m\ (n\ f) \ x$ | $\lambda m.\lambda n.\lambda f.\lambda x.m\ (n\ f) \ x$ | $\lambda m.\lambda n.\lambda f.m\ (n\ f)$ |
| Exponentiation | $m^n$ | $n\ m\ f = m^n\ f$[1] | $\operatorname{exp} \ m\ n\ f\ x = (n\ m)\ f\ x$ | $\lambda m.\lambda n.\lambda f.\lambda x.(n\ m)\ f\ x$ | $\lambda m.\lambda n.n\ m$ |
| Predecessor* | $n - 1$ | $\operatorname{inc}^n \operatorname{con} = \operatorname{val} (f^{n-1} x)$ | $\operatorname{pred}$ | ... | $\lambda n.\lambda f.\lambda x.n\ (\lambda g.\lambda h.h\ (g\ f))\ (\lambda u.x)\ (\lambda u.u)$ |
| Subtraction* | $m - n$ | $f^{m-n}\ x = (f^{-1})^n (f^{m} x)$ | $\operatorname{minus}\ m\ n = (n \operatorname{pred})\ m$ | ... | $\lambda m.\lambda n.n \operatorname{pred} m$ |

* Note that in the Church encoding,

• $\operatorname{pred}(0) = 0$
• $m < n \to m - n = 0$

### Translation with other representations

Most real-world languages have support for machine-native integers; the church and unchurch functions convert between nonnegative integers and their corresponding Church numerals. The functions are given here in Haskell, where the \ corresponds to the λ of lambda calculus. Implementations in other languages are similar.

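```haskell
-- A Church numeral: given a function, return its n-fold composition.
type Church a = (a -> a) -> (a -> a)

-- Convert a nonnegative Integer to its Church numeral.
-- (The recursion does not terminate for negative input.)
church :: Integer -> Church Integer
church 0 = \f -> \x -> x
church n = \f -> \x -> f (church (n-1) f x)

-- Convert back by applying (+ 1) n times, starting from 0.
unchurch :: Church Integer -> Integer
unchurch n = n (\x -> x + 1) 0
```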


## Church Booleans

Church Booleans are the Church encoding of the Boolean values true and false. Some programming languages use these as an implementation model for Boolean arithmetic; examples are Smalltalk and Pico.

Boolean logic may be considered as a choice. The Church encodings of true and false are functions of two parameters:

• true chooses the first parameter.
• false chooses the second parameter.

The two definitions are known as Church Booleans;

$\operatorname{true} \equiv \lambda a.\lambda b.a$
$\operatorname{false} \equiv \lambda a.\lambda b.b$

This definition allows predicates (i.e. functions returning logical values) to directly act as if-clauses. A function returning a Boolean, which is then applied to two parameters, returns either the first or the second parameter;

$\operatorname{predicate}\ x\ \operatorname{then-clause}\ \operatorname{else-clause}$

evaluates to then-clause if predicate x evaluates to true, and to else-clause if predicate x evaluates to false.

Because true and false choose the first or second parameter they may be combined to provide logic operators,

$\operatorname{and} = \lambda p.\lambda q.p\ q\ p$
$\operatorname{or} = \lambda p.\lambda q.p\ p\ q$
$\operatorname{not}_1 = \lambda p.\lambda a.\lambda b.p\ b\ a$ - This is only a correct implementation if the evaluation strategy is applicative order.
$\operatorname{not}_2 = \lambda p.p\ (\lambda a.\lambda b. b)\ (\lambda a.\lambda b. a) = \lambda p.p\ \operatorname{false}\ \operatorname{true}$ - This is only a correct implementation if the evaluation strategy is normal order.
$\operatorname{xor} = \lambda a.\lambda b.a\ (\operatorname{not}\ b)\ b$
$\operatorname{if} = \lambda p.\lambda a.\lambda b.p\ a\ b$

Some examples:

$\operatorname{and} \operatorname{true} \operatorname{false} = (\lambda p.\lambda q.p\ q\ p)\ \operatorname{true}\ \operatorname{false} = \operatorname{true} \operatorname{false} \operatorname{true} = (\lambda a.\lambda b.a) \operatorname{false} \operatorname{true} = \operatorname{false}$
$\operatorname{or} \operatorname{true} \operatorname{false} = (\lambda p.\lambda q.p\ p\ q)\ (\lambda a.\lambda b.a)\ (\lambda a.\lambda b.b) = (\lambda a.\lambda b.a)\ (\lambda a.\lambda b.a)\ (\lambda a.\lambda b.b) = (\lambda a.\lambda b.a) = \operatorname{true}$
$\operatorname{not}_1\ \operatorname{true} = (\lambda p.\lambda a.\lambda b.p\ b\ a) (\lambda a.\lambda b.a) = \lambda a.\lambda b.(\lambda a.\lambda b.a)\ b\ a = \lambda a.\lambda b.(\lambda x.b)\ a = \lambda a.\lambda b.b = \operatorname{false}$
$\operatorname{not}_2\ \operatorname{true} = (\lambda p.p\ (\lambda a.\lambda b. b) (\lambda a.\lambda b. a)) (\lambda a.\lambda b. a) = (\lambda a.\lambda b. a) (\lambda a.\lambda b. b) (\lambda a.\lambda b. a) = (\lambda b. (\lambda a.\lambda b. b))\ (\lambda a.\lambda b. a) = \lambda a.\lambda b.b = \operatorname{false}$
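
The Boolean operators can also be checked mechanically. Below is a minimal Haskell sketch, assuming a `newtype` wrapper with the `RankNTypes` extension; the names `CBool`, `andC`, `orC`, `xorC`, `ifC` and `toBool` are hypothetical and chosen to avoid Prelude names. Here `not2` follows the form $\lambda p.p\ \operatorname{false}\ \operatorname{true}$ used in the worked example.

```haskell
{-# LANGUAGE RankNTypes #-}

-- A Church Boolean chooses between two values of any type.
newtype CBool = CBool { runBool :: forall a. a -> a -> a }

true, false :: CBool
true  = CBool (\a _ -> a)   -- chooses the first parameter
false = CBool (\_ b -> b)   -- chooses the second parameter

-- and = λp.λq. p q p
andC :: CBool -> CBool -> CBool
andC p q = runBool p q p

-- or = λp.λq. p p q
orC :: CBool -> CBool -> CBool
orC p q = runBool p p q

-- not_1 = λp.λa.λb. p b a
not1 :: CBool -> CBool
not1 p = CBool (\a b -> runBool p b a)

-- not_2 = λp. p false true
not2 :: CBool -> CBool
not2 p = runBool p false true

-- xor = λa.λb. a (not b) b
xorC :: CBool -> CBool -> CBool
xorC a b = runBool a (not1 b) b

-- if = λp.λa.λb. p a b
ifC :: CBool -> a -> a -> a
ifC p a b = runBool p a b

-- Helper for inspection only: translate to a native Bool.
toBool :: CBool -> Bool
toBool p = runBool p True False
```

For instance, `toBool (andC true false)` is `False` and `ifC true "then" "else"` is `"then"`, matching the reductions shown above.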

## Predicates

A predicate is a function that returns a Boolean value. The most fundamental predicate is $\operatorname{IsZero}$, which returns $\operatorname{true}$ if its argument is the Church numeral $0$, and $\operatorname{false}$ if its argument is any other Church numeral:

$\operatorname{IsZero} = \lambda n.n\ (\lambda x.\operatorname{false})\ \operatorname{true}$

The following predicate tests whether the first argument is less-than-or-equal-to the second:

$\operatorname{LEQ} = \lambda m.\lambda n.\operatorname{IsZero}\ (\operatorname{minus}\ m\ n)$,

Because of the identity,

$x = y \equiv (x \le y \land y \le x)$

The test for equality may be implemented as,

$\operatorname{EQ} = \lambda m.\lambda n.\operatorname{and}\ (\operatorname{LEQ}\ m\ n)\ (\operatorname{LEQ}\ n\ m)$
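
These predicates can be sketched in Haskell by combining the numeral and Boolean encodings. The wrapper types, the `RankNTypes` extension, and the helper names `church`, `toBool`, `predC` and `andC` are assumptions made for this illustration.

```haskell
{-# LANGUAGE RankNTypes #-}

newtype Church = Church { runChurch :: forall a. (a -> a) -> a -> a }
newtype CBool  = CBool  { runBool  :: forall a. a -> a -> a }

true, false :: CBool
true  = CBool (\a _ -> a)
false = CBool (\_ b -> b)

andC :: CBool -> CBool -> CBool
andC p q = runBool p q p

predC :: Church -> Church
predC (Church n) = Church (\f x -> n (\g h -> h (g f)) (\_ -> x) (\u -> u))

minus :: Church -> Church -> Church
minus m (Church n) = n predC m

-- IsZero = λn. n (λx.false) true
-- Any application of the function discards its input and yields false.
isZero :: Church -> CBool
isZero (Church n) = n (\_ -> false) true

-- LEQ = λm.λn. IsZero (minus m n)
leq :: Church -> Church -> CBool
leq m n = isZero (minus m n)

-- EQ = λm.λn. and (LEQ m n) (LEQ n m)
eq :: Church -> Church -> CBool
eq m n = andC (leq m n) (leq n m)

-- Helpers for inspection only (not part of the encoding).
church :: Integer -> Church
church k
  | k <= 0    = Church (\_ x -> x)
  | otherwise = Church (\f x -> f (runChurch (church (k - 1)) f x))

toBool :: CBool -> Bool
toBool p = runBool p True False
```

The test `leq` relies on truncated subtraction: `minus m n` is zero exactly when m does not exceed n.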

## Church pairs

Church pairs are the Church encoding of the pair (two-tuple) type. The pair is represented as a function that takes a function argument. When given its argument it will apply the argument to the two components of the pair. The definition in lambda calculus is,

$\operatorname{pair} \equiv \lambda x.\lambda y.\lambda z.z\ x\ y$
$\operatorname{first} \equiv \lambda p.p\ (\lambda x.\lambda y.x)$
$\operatorname{second} \equiv \lambda p.p\ (\lambda x.\lambda y.y)$

For example,

$\operatorname{first}\ (\operatorname{pair}\ a\ b)$
$= (\lambda p.p\ (\lambda x.\lambda y.x))\ ((\lambda x.\lambda y.\lambda z.z\ x\ y)\ a\ b)$
$= (\lambda p.p\ (\lambda x.\lambda y.x))\ (\lambda z.z\ a\ b)$
$= (\lambda z.z\ a\ b)\ (\lambda x.\lambda y.x)$
$= (\lambda x.\lambda y.x)\ a\ b = a$
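
The same construction can be sketched in Haskell. The `newtype` wrapper and the `RankNTypes` extension are assumptions made here so that one pair can be consumed at different result types; they are not part of the lambda-calculus definition.

```haskell
{-# LANGUAGE RankNTypes #-}

-- A pair is a function that hands both components to its argument.
newtype Pair a b = Pair { runPair :: forall c. (a -> b -> c) -> c }

-- pair = λx.λy.λz. z x y
pair :: a -> b -> Pair a b
pair x y = Pair (\z -> z x y)

-- first = λp. p (λx.λy. x)
first :: Pair a b -> a
first p = runPair p (\x _ -> x)

-- second = λp. p (λx.λy. y)
second :: Pair a b -> b
second p = runPair p (\_ y -> y)
```

For example, `first (pair 'a' True)` yields `'a'` and `second (pair 'a' True)` yields `True`.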

## List encodings

An (immutable) list is constructed from list nodes. The basic operations on the list are:

| Function | Description |
|---|---|
| nil | Construct an empty list. |
| isnil | Test if list is empty. |
| cons | Prepend a given value to a (possibly empty) list. |
| head | Get the first element of the list. |
| tail | Get the rest of the list. |

Three different representations of lists are given.

• Build each list node from two pairs (to allow for empty lists).
• Build each list node from one pair.
• Represent the list using the right fold function.

### Two pairs as a list node

A nonempty list can be implemented by a Church pair:

• First contains the head.
• Second contains the tail.

However this does not give a representation of the empty list, because there is no "null" pointer. To represent null, the pair may be wrapped in another pair, giving three values:

• First - Is the null pointer (empty list).
• Second.First contains the head.
• Second.Second contains the tail.

Using this idea the basic list operations can be defined like this:[2]

| Expression | Description |
|---|---|
| $\operatorname{nil} \equiv \operatorname{pair}\ \operatorname{true}\ \operatorname{true}$ | The first element of the pair is true meaning the list is null. |
| $\operatorname{isnil} \equiv \operatorname{first}$ | Retrieve the null (or empty list) indicator. |
| $\operatorname{cons} \equiv \lambda h.\lambda t.\operatorname{pair} \operatorname{false}\ (\operatorname{pair} h\ t)$ | Create a list node, which is not null, and give it a head h and a tail t. |
| $\operatorname{head} \equiv \lambda z.\operatorname{first}\ (\operatorname{second} z)$ | second.first is the head. |
| $\operatorname{tail} \equiv \lambda z.\operatorname{second}\ (\operatorname{second} z)$ | second.second is the tail. |

In a nil node second is never accessed, provided that head and tail are only applied to nonempty lists.
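
As a quick check of these definitions, using only the pair identities $\operatorname{first}\ (\operatorname{pair}\ a\ b) = a$ and $\operatorname{second}\ (\operatorname{pair}\ a\ b) = b$:

$\operatorname{isnil}\ \operatorname{nil} = \operatorname{first}\ (\operatorname{pair}\ \operatorname{true}\ \operatorname{true}) = \operatorname{true}$
$\operatorname{isnil}\ (\operatorname{cons}\ h\ t) = \operatorname{first}\ (\operatorname{pair}\ \operatorname{false}\ (\operatorname{pair}\ h\ t)) = \operatorname{false}$
$\operatorname{head}\ (\operatorname{cons}\ h\ t) = \operatorname{first}\ (\operatorname{second}\ (\operatorname{pair}\ \operatorname{false}\ (\operatorname{pair}\ h\ t))) = \operatorname{first}\ (\operatorname{pair}\ h\ t) = h$
$\operatorname{tail}\ (\operatorname{cons}\ h\ t) = \operatorname{second}\ (\operatorname{second}\ (\operatorname{pair}\ \operatorname{false}\ (\operatorname{pair}\ h\ t))) = \operatorname{second}\ (\operatorname{pair}\ h\ t) = t$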

### One pair as a list node

Alternatively, define [3]

$\operatorname{cons} \equiv \operatorname{pair}$
$\operatorname{head} \equiv \operatorname{first}$
$\operatorname{tail} \equiv \operatorname{second}$
$\operatorname{nil} \equiv \operatorname{false}$
$\operatorname{isnil} \equiv \lambda l.l (\lambda h.\lambda t.\lambda d.\operatorname{false}) \operatorname{true}$

where the last definition is a special case of the general

$\operatorname{process-list} \equiv \lambda l.l (\lambda h.\lambda t.\lambda d. \operatorname{head-and-tail-clause}) \operatorname{nil-clause}$
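
To see why this $\operatorname{isnil}$ works, recall that $\operatorname{false} = \lambda a.\lambda b.b$ and $\operatorname{pair}\ h\ t = \lambda z.z\ h\ t$. Reducing both cases:

$\operatorname{isnil}\ \operatorname{nil} = \operatorname{false}\ (\lambda h.\lambda t.\lambda d.\operatorname{false})\ \operatorname{true} = \operatorname{true}$
$\operatorname{isnil}\ (\operatorname{cons}\ h\ t) = (\lambda z.z\ h\ t)\ (\lambda h.\lambda t.\lambda d.\operatorname{false})\ \operatorname{true} = (\lambda h.\lambda t.\lambda d.\operatorname{false})\ h\ t\ \operatorname{true} = \operatorname{false}$

In the nonempty case the extra parameter d absorbs the trailing nil-clause, which is why the head-and-tail clause takes three arguments.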

### Represent the list using right fold

As an alternative to the encoding using Church pairs, a list can be encoded by identifying it with its right fold function. For example, a list of three elements x, y and z can be encoded by a higher-order function which when applied to a combinator c and a value n returns c x (c y (c z n)).

$\operatorname{nil} \equiv \lambda c.\lambda n.n$
$\operatorname{isnil} \equiv \lambda l.l\ (\lambda h.\lambda t.\operatorname{false})\ \operatorname{true}$
$\operatorname{cons} \equiv \lambda h.\lambda t.\lambda c.\lambda n.c\ h\ (t\ c\ n)$
$\operatorname{head} \equiv \lambda l.l\ (\lambda h.\lambda t.h)\ \operatorname{false}$
$\operatorname{tail} \equiv \lambda l. \operatorname{first}\ (l\ (\lambda x.\lambda p.\operatorname{pair}\ (\operatorname{second}\ p)\ (\operatorname{cons}\ x\ (\operatorname{second}\ p)))\ (\operatorname{pair}\ \operatorname{nil}\ \operatorname{nil}))$
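
A Haskell sketch of the right-fold encoding follows. It takes some liberties that should be read as assumptions: native `Bool`, `Maybe` and tuples stand in for Church Booleans, the `false` returned by `head` on an empty list, and Church pairs, and the names `headL`, `tailL` and `toList` are hypothetical.

```haskell
{-# LANGUAGE RankNTypes #-}

-- A list is identified with its own right fold.
newtype List a = List { foldList :: forall r. (a -> r -> r) -> r -> r }

-- nil = λc.λn. n
nil :: List a
nil = List (\_ n -> n)

-- cons = λh.λt.λc.λn. c h (t c n)
cons :: a -> List a -> List a
cons h t = List (\c n -> c h (foldList t c n))

-- isnil = λl. l (λh.λt. false) true
isnil :: List a -> Bool
isnil l = foldList l (\_ _ -> False) True

-- head = λl. l (λh.λt. h) false, with Nothing in place of false
headL :: List a -> Maybe a
headL l = foldList l (\h _ -> Just h) Nothing

-- tail: fold into a pair (tail so far, whole list so far) and keep the first.
tailL :: List a -> List a
tailL l = fst (foldList l (\x p -> (snd p, cons x (snd p))) (nil, nil))

-- Helper for inspection only: convert to a native list.
toList :: List a -> [a]
toList l = foldList l (:) []
```

Note that `tailL` walks the entire list, so it costs time linear in the length, unlike the pair-based encodings where tail is a single projection.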

## Derivation of predecessor function

The predecessor function used in the Church encoding is,

$\operatorname{pred}(n) = \begin{cases} 0 & \mbox{if }n=0, \\ n-1 & \mbox{otherwise}\end{cases}$.

To build the predecessor we need a way of applying the function one fewer time. A numeral n applies the function f n times to x. The predecessor function must use the numeral n to apply the function n-1 times.

Before implementing the predecessor function, here is a scheme that wraps the value in a container function. We will define new functions to use in place of f and x, called inc and init. The container function is called value. The left hand side of the table shows a numeral n applied to inc and init.

| Number | Using init | Using const |
|---|---|---|
| 0 | $\operatorname{init} = \operatorname{value}\ x$ | |
| 1 | $\operatorname{inc}\ \operatorname{init} = \operatorname{value}\ (f\ x)$ | $\operatorname{inc}\ \operatorname{const} = \operatorname{value}\ x$ |
| 2 | $\operatorname{inc}\ (\operatorname{inc}\ \operatorname{init}) = \operatorname{value}\ (f\ (f\ x))$ | $\operatorname{inc}\ (\operatorname{inc}\ \operatorname{const}) = \operatorname{value}\ (f\ x)$ |
| 3 | $\operatorname{inc}\ (\operatorname{inc}\ (\operatorname{inc}\ \operatorname{init})) = \operatorname{value}\ (f\ (f\ (f\ x)))$ | $\operatorname{inc}\ (\operatorname{inc}\ (\operatorname{inc}\ \operatorname{const})) = \operatorname{value}\ (f\ (f\ x))$ |
| ... | ... | ... |
| n | $n \operatorname{inc}\ \operatorname{init} = \operatorname{value}\ (f^n\ x) = \operatorname{value}\ (n\ f\ x)$ | $n \operatorname{inc}\ \operatorname{const} = \operatorname{value}\ (f^{n-1}\ x) = \operatorname{value}\ ((n-1)\ f\ x)$ |

The general recurrence rule is,

$\operatorname{inc}\ (\operatorname{value}\ v) = \operatorname{value}\ (f\ v)$

If there is also a function to retrieve the value from the container (called extract),

$\operatorname{extract}\ (\operatorname{value}\ v) = v$

Then extract may be used to define the samenum function as,

$\operatorname{samenum} = \lambda n.\lambda f.\lambda x.\operatorname{extract}\ (n \operatorname{inc} \operatorname{init}) = \lambda n.\lambda f.\lambda x.\operatorname{extract}\ (\operatorname{value}\ (n\ f\ x)) = \lambda n.\lambda f.\lambda x.n\ f\ x = \lambda n.n$

The samenum function is not intrinsically useful. What we want is the initial value to skip an application of f. Call this new initial container const. The right hand side of the above table shows the expansions of n inc const. Then by replacing init with const in the expression for the samenum function we get the predecessor function,

$\operatorname{pred} = \lambda n.\lambda f.\lambda x.\operatorname{extract}\ (n \operatorname{inc} \operatorname{const}) = \lambda n.\lambda f.\lambda x.\operatorname{extract}\ (\operatorname{value}\ ((n-1)\ f\ x)) = \lambda n.\lambda f.\lambda x.(n-1)\ f\ x = \lambda n.(n-1)$

As explained below the functions inc, init, const, value and extract may be defined as,

| $\operatorname{value}$ | $\operatorname{extract}\ k$ | $\operatorname{inc}$ | $\operatorname{init}$ | $\operatorname{const}$ |
|---|---|---|---|---|
| $\lambda v.(\lambda h.h\ v)$ | $k\ \lambda u.u$ | $\lambda g.\lambda h.h\ (g\ f)$ | $\lambda h.h\ x$ | $\lambda u.x$ |

Which gives the lambda expression for pred as,

$\operatorname{pred} = \lambda n.\lambda f.\lambda x.n\ (\lambda g.\lambda h.h\ (g\ f))\ (\lambda u.x)\ (\lambda u.u)$

### Value container

The value container applies a function to its value. It is defined by,

$\operatorname{value}\ v\ h = h\ v$

so,

$\operatorname{value} = \lambda v.(\lambda h.h\ v)$

### Inc

The inc function should take a value containing v, and return a new value containing f v.

$\operatorname{inc}\ (\operatorname{value}\ v) = \operatorname{value}\ (f\ v)$

Letting g be the value container,

$g = \operatorname{value}\ v$

then,

$g\ f = \operatorname{value}\ v\ f = f\ v$

so,

$\operatorname{inc}\ g = \operatorname{value}\ (g\ f)$
$\operatorname{inc} = \lambda g.\lambda h.h\ (g\ f)$

### Extract

The value may be extracted by applying the identity function,

$I = \lambda u.u$

Using I,

$\operatorname{value}\ v\ I = v$

so,

$\operatorname{extract}\ k = k\ I$

### Const

To implement pred the init function is replaced with the const that does not apply f. We need const to satisfy,

$\operatorname{inc}\ \operatorname{const} = \operatorname{value}\ x$
$\lambda h.h\ (\operatorname{const}\ f) = \lambda h.h\ x$

Which is satisfied if,

$\operatorname{const}\ f = x$

Or as a lambda expression,

$\operatorname{const} = \lambda u.x$
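
The whole derivation can be replayed in Haskell as a rough sketch. Several details are assumptions of this illustration: `inc`, `initC` and `constC` take f or x as explicit arguments (in the lambda terms they are free variables bound by the enclosing $\lambda f.\lambda x$), the names `initC` and `constC` avoid the Prelude's `init` and `const`, and `church` and `toInt` are helpers for inspection only.

```haskell
{-# LANGUAGE RankNTypes #-}

newtype Church = Church { runChurch :: forall a. (a -> a) -> a -> a }

-- A container holding v is a function that hands v to whatever it is given.
type Container a = (a -> a) -> a

-- value = λv.λh. h v
value :: a -> Container a
value v = \h -> h v

-- extract k = k (λu. u)
extract :: Container a -> a
extract k = k (\u -> u)

-- inc = λg.λh. h (g f), so inc (value v) = value (f v)
inc :: (a -> a) -> Container a -> Container a
inc f g = \h -> h (g f)

-- init = λh. h x, the container that already holds x
initC :: a -> Container a
initC x = value x

-- const = λu. x, which ignores f once, so inc const = value x
constC :: a -> Container a
constC x = \_ -> x

-- samenum = λn.λf.λx. extract (n inc init)
samenum :: Church -> Church
samenum (Church n) = Church (\f x -> extract (n (inc f) (initC x)))

-- pred = λn.λf.λx. extract (n inc const)
predC :: Church -> Church
predC (Church n) = Church (\f x -> extract (n (inc f) (constC x)))

-- Helpers for inspection only.
church :: Integer -> Church
church k
  | k <= 0    = Church (\_ x -> x)
  | otherwise = Church (\f x -> f (runChurch (church (k - 1)) f x))

toInt :: Church -> Integer
toInt (Church n) = n (+ 1) 0
```

Under these definitions `toInt (samenum (church 3))` is 3 while `toInt (predC (church 3))` is 2; the only difference between the two functions is the starting container.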