# User:WillWare/Rete algorithm

The Rete algorithm is an efficient pattern matching algorithm for implementing production rule systems. The Rete algorithm was designed by Dr Charles L. Forgy of Carnegie Mellon University, first published in a working paper in 1974, and later elaborated in his 1979 Ph.D. thesis and a 1982 paper (see References). Rete has become the basis for many popular expert system shells, including CLIPS, Jess, Drools, BizTalk, Rules Engine and Soar.

This description of the Rete algorithm is distilled from R. Doorenbos's PhD dissertation, Production Matching for Large Learning Systems which also describes a variant named Rete/UL, optimized for large systems.

A Rete is a dataflow network in two sections, an Alpha network and a Beta network.

The Alpha network populates working memories for each pattern of each rule, exploiting repetitions of patterns where possible. When a new fact is presented to the Alpha network, the fact is matched to each pattern, and if a match is possible, the resulting variable bindings are recorded. Pattern matching is sped up using a decision tree, with decisions based on constants in the pattern. Because the nodes of the decision tree must represent constants in the pattern, and the triplet's second element (its predicate) is often a constant, the decision tree's first branchpoint is often the predicate. A pattern with no constants, e.g. (?x ?y ?z), will match all facts.

The Beta part of the network primarily contains join nodes and beta memories. Join nodes test for consistency of variable bindings between conditions. Beta memories store "tokens" which match some but not all of the conditions of a production.

## The Alpha network

The mechanism for populating the working memory for a particular pattern works like this.

```fact ------> binary search
on first variable --(no match)---> discard fact
|
(match found)
|
V
binary search
on second variable --(no match)---> discard fact
|
(etc)
+--------> save binding in working memory
```

The operation of the Alpha network is then to present facts sequentially to these mechanisms. They operate entirely independently so this presents an opportunity for parallelism.

## The Beta network

### Bindings, compatibility, and merging

The Beta network works with bindings, which are sets of variables and the values assigned to them. Here are some examples of bindings.

```{'?a': 'Al', '?b': 'eats', '?c': 'burgers'}
{'?a': 'Bob', '?b': 'golf'}
{'?b': 'wife', '?c': 'Deb'}
```

Two bindings are compatible if they do not assign the same variable to two different values. The bindings above are incompatible because ?a is assigned to Al, Bob, and Deb in different places. Likewise ?b and ?c are assigned different values. Here are a set of bindings that are all mutually compatible because the values assigned to the variables are consistent throughout.

```{'?a': 'Al', '?b': 'eats', '?c': 'burgers'}
{'?a': 'Al', '?d': 'wife', '?e': 'Deb'}
{'?e': 'Deb', '?b': 'eats', '?f': 'salad'}
```

Once a group of bindings has been determined to be compatible, they can be merged into a single binding. This simply involves grouping all the assignments into one binding without repetition. With the previous example, the result of a merge would look like this.

```{'?a': 'Al', '?b': 'eats', '?c': 'burgers',
'?d': 'wife', '?e': 'Deb', '?f': 'salad'}
```

The Beta network requires only the compatibility and merging of two bindings at a time.

### Machinery

Nodes marked "Cn" represent the working memories compiled by the Alpha network, and nodes marked "Dn" represent the results of joining pairs of working memories. Each "Join" node has the job of taking the contents of two working memories (each being a collection of bindings), and merging each compatible pair into a single binding.

```Rule1: C1, C2, C3
Rule2: C3
Rule3: C2, C4

C1 --- Join ---> D1 --- Join --> Rule1
/                /
C2 ----*-----*          /
\        /
C3 ------------\------*--------> Rule2
\
C4 ----------- Join -----------> Rule3
```

Joining is a potential performance problem. Given two working memories with N bindings each, the most obvious approach would require O(N2) time, which is likely to be unacceptable. In fact it is possible to join two sets of bindings in O(N) time, as follows.

Let the two sets of bindings be {Xi} and {Yi} respectively....

Give some kind of overview of how the Beta stuff is working below.

## Opportunities for parallel computation

Two sorts of parallel computation are currently in common use. One is a networked cluster, usually of Linux boxes for ease of administration. This approach is appropriate when tasks can be broken into reasonably large subtasks. Some work has been done in modifying the Linux kernel to accelerate relevant network protocols, but much of this work is not publicly available.

An increasingly popular approach to parallel computing, arguably more cost-effective, is the use of GPU boards. Costing usually only a few hundred dollars, these boards offer dozens or hundreds of processor cores. There are usually mildly esoteric constraints and restrictions to be considered when programming in this domain. These typically involve the size and organization of memory available to the GPU cores, and the overhead of moving data between CPU memory and GPU memory.

It is of considerable interest to review the Rete algorithm code with the goal in mind of parallelizing it for execution on a GPU board. Because GPUs are programmed in C, not Python, it will be necessary to identify pieces of code that can be translated to C, and then to provide a C extension callable from the remaining Python code.

## Source code: rete.py

```"""The Rete algorithm is interesting. I've been studying it and
keeping notes at
http://en.wikipedia.org/wiki/User:WillWare/Rete_algorithm
I plotted a few points and it looks like this is about O(N**1.2), way
better than O(N**2).
"""

import prodsystem

def mark(foo=[ ]):
import time
import inspect
if len(foo) == 0:
foo.append(time.time())
diff = 0
else:
diff = time.time() - foo[0]
c = inspect.currentframe().f_back
print diff, c.f_lineno

######################################################
# Alpha network

class AlphaNetwork:
def processFact(self, fact):
#
#  TODO: figure out how the alpha network will derive
#  all this stuff from its rules
#  * what C lists it needs to have
#  * how to set up the decision tree
#
self.C1 = [ ]
self.C2 = [ ]
self.C3 = [ ]
S, P, O = fact.args
if P == 'is':
if O == 'Chevy':
binding = { '?car': S }
self.C1.append(binding)
elif P == 'vintage':
binding = { '?car': S, '?year': O }
self.C2.append(binding)
elif P == 'color':
if O == 'blue':
binding = { '?car': S }
self.C3.append(binding)

######################################################
# Beta network

class BetaNetwork:
def join(self, Yset, Xset):
"""join(Xset, Yset) -> Zset

Xset and Yset are lists of bindings, Zset is the list of bindings
gotten by merging every compatible (x,y) pair. A more
understandable reference implementation looks like this:

def join(Yset, Xset):
def compatible(x, y):
for k in x.keys():
if y.has_key(k) and y[k] != x[k]:
return False
return True
result = [ ]
for x in Xset:
for y in Yset:
if compatible(x, y):
result.append(merge(x, y))
return result
"""
# first determine varlist, the list of all variables used in
# all the bindings in Xset
# then determine all its non-empty subsets, including itself
varlist = { }
for x in Xset:
for var in x.keys():
varlist[var] = 1
varlist = varlist.keys()
varlist.sort()
subsets = [ ]
for i in range(1, 1 << len(varlist)):
subset = [ ]
for j in range(len(varlist)):
if (1 << j) & i:
subset.append(varlist[j])
subsets.append(subset)

def getValuesKey(binding, subset):
r = ""
for var in subset:
if binding.has_key(var):
r += binding[var]
else:
r += '<>'
return r

# each subset gets its own lookup table
d = [ ]
for subset in subsets:
dpiece = { }
for x in Xset:
vkey = getValuesKey(x, subset)
if not dpiece.has_key(vkey):
dpiece[vkey] = [ x ]
else:
dpiece[vkey].append(x)
d.append(dpiece)

def search(y):
result = [ ]
for dpiece, subset in map(None, d, subsets):
vkey = getValuesKey(y, subset)
if dpiece.has_key(vkey):
for x in dpiece[vkey]:
if x not in result:
result.append(x)
return result

def merge(dct1, dct2):
dct3 = dct1.copy()
for x, y in dct2.items():
dct3[x] = y
return dct3

result = [ ]
for y in Yset:
for x in search(y):
result.append(merge(y, x))
return result

def go(self, alpha, factList):
#
#  TODO: figure out how the beta network will derive
#  a dataflow diagram from its rules
#
D1 = self.join(alpha.C1, alpha.C2)
D2 = self.join(D1, alpha.C3)
for binding in D2:
rule.applyBinding(binding, factList)

##################################
# Type: python gencars.py N > manycars.txt
# for some interestingly large N

RETE = True

rule = prodsystem.Rule(
"""Old blue Chevys are rusty,
?car is Chevy,
?car vintage ?year,
?car color blue,
?year <= 1965
-->
assert ?car is rusty
""")

print rule
factList = [ ]
inf = open("manycars.txt")
L = L.strip()[1:-1]
if L:
f = prodsystem.Fact(L)
factList.append(f)
inf.close()

if not RETE:
rule.apply(factList)
else:
alpha = AlphaNetwork()
beta = BetaNetwork()

for f in factList:
alpha.processFact(f)
beta.go(alpha, factList)
```