# Alpha algorithm

The α-algorithm is an algorithm used in process mining, aimed at reconstructing causality from a set of sequences of events. It was first put forward by van der Aalst, Weijters and Măruşter.[1] Several extensions or modifications of it have since been presented, which will be listed below.

It constructs P/T nets with special properties (workflow nets) from event logs (as might be collected by an ERP system). Each transition in the net corresponds to an observed task.

## Short description

The algorithm takes a workflow log ${\displaystyle W\subseteq T^{*}}$ as input and results in a workflow net being constructed.

It does so by examining causal relationships observed between tasks. For example, one specific task might always precede another specific task in every execution trace, which would be useful information.

### Definitions used

• A workflow trace or execution trace is a string over an alphabet ${\displaystyle T}$ of tasks.
• A workflow log is a set of workflow traces.

## Description

Declaratively, the algorithm can be presented as follows. Three sets of tasks are determined:

• ${\displaystyle T_{W}}$ is the set of all tasks which occur in at least one trace
• ${\displaystyle T_{I}}$ is the set of all tasks which occur trace-initially
• ${\displaystyle T_{O}}$ is the set of all tasks which occur trace-terminally

Basic ordering relations are determined (${\displaystyle \succ _{W}}$ first, the latter three can be constructed therefrom)

• ${\displaystyle a\succ _{W}b}$ iff ${\displaystyle a}$ directly precedes ${\displaystyle b}$ in some trace
• ${\displaystyle a\rightarrow _{W}b}$ iff ${\displaystyle a\succ _{W}b\wedge b\not \succ _{W}a}$
• ${\displaystyle a\#{}_{W}b}$ iff ${\displaystyle a\not \succ _{W}b\wedge b\not \succ _{W}a}$
• ${\displaystyle a\Vert _{W}b}$ iff ${\displaystyle a\succ _{W}b\wedge b\succ _{W}a}$

Places are discovered. Each place is identified with a pair of sets of tasks, in order to keep the number of places low.

• ${\displaystyle Y_{W}}$ is the set of all pairs ${\displaystyle (A,B)}$ of maximal sets of tasks such that
• Neither ${\displaystyle A\times A}$ and ${\displaystyle B\times B}$ contain any members of ${\displaystyle \succ _{W}}$ and
• ${\displaystyle A\times B}$ is a subset of ${\displaystyle \rightarrow _{W}}$
• ${\displaystyle P_{W}}$ contains one place ${\displaystyle p_{(A,B)}}$ for every member of ${\displaystyle Y_{W}}$, plus the input place ${\displaystyle i_{W}}$ and the output place ${\displaystyle o_{W}}$

The flow relation ${\displaystyle F_{W}}$ is the union of the following:

• ${\displaystyle \{(a,p_{(A,B)})|(A,B)\in Y_{W}\wedge a\in A\}}$
• ${\displaystyle \{(p_{(A,B)},b)|(A,B)\in Y_{W}\wedge b\in B\}}$
• ${\displaystyle \{(i_{W},t)|t\in T_{I}\}}$
• ${\displaystyle \{(t,i_{O})|t\in T_{O}\}}$

The result is

• a petri net structure ${\displaystyle \alpha (W)=(P_{W},T_{W},F_{W})}$
• with one input place ${\displaystyle i_{W}}$ and one output place ${\displaystyle o_{W}}$
• because every transition of ${\displaystyle T_{W}}$ is on a ${\displaystyle F_{W}}$-path from ${\displaystyle i_{W}}$ to ${\displaystyle o_{W}}$, it is indeed a workflow net.

## Properties

It can be shown [2] that in the case of a complete workflow log generated by a sound SWF net, the net generating it can be reconstructed. Complete means that its ${\displaystyle \succ _{W}}$ relation is maximal. It is not required that all possible traces be present (which would be countably infinite for a net with a loop).

## Limitations

General workflow nets may contain several types of constructs [3] which the α-algorithm cannot rediscover.

Constructing ${\displaystyle Y_{W}}$ takes exponential time in the number of tasks, since ${\displaystyle \succ _{W}}$ is not constrained and arbitrary subsets of ${\displaystyle T_{W}}$ must be considered.

## Extensions

for example [4] [5]

## References

1. ^ van der Aalst, W M P and Weijters, A J M M and Maruster, L (2003). "Workflow Mining: Discovering process models from event logs", IEEE Transactions on Knowledge and Data Engineering, vol 16
2. ^ van der Aalst et al. 2003
3. ^ A. de Medeiros, A K and van der Aalst, W M P and Weijters, A J M M (2003). "Workflow Mining: Current Status and Future Directions". in: "volume 2888 of Lecture Notes in Computer Science", Springer-Verlag
4. ^ A. de Medeiros, A K and van Dongen, B F and van der Aalst, W M P and Weijters, A J M M (2004). "Process mining: extending the α-algorithm to mine short loops"
5. ^ Wen, L and van der Aalst, W M P and Wang, J and Sun, J (2007). "Mining process models with non-free-choice constructs", "Data Mining and Knowledge Discovery" vol 15, p. 145--180, Springer-Verlag