E-graph

In computer science, an e-graph is a data structure that stores an equivalence relation over terms of some language.

Definition and operations

Let $f,g,\ldots$ range over the function symbols of the term language, and let $t_{1},\ldots$ range over the terms. Let $\mathbb {id}$ be a countable set of opaque identifiers that may be compared for equality, called e-class IDs. Then an e-node is an n-ary function symbol together with $n$ e-class IDs. An e-class is a set of e-nodes. An e-graph contains a union-find structure $U$ over e-class IDs with standard operations $\mathrm {add}$ , $\mathrm {find}$ , and $\mathrm {merge}$ .

An e-class ID $e$ is canonical if $\mathrm {find} (U,e)=e$ . An e-node $f(i_{1},\ldots ,i_{n})$ (with $i_{1},\ldots ,i_{n}\in \mathbb {id}$ ) is canonical if each $i_{j}$ is canonical ( $j$ in $1,\ldots ,n$ ).

An e-graph is the combination of:^[1]

the union-find structure $U$ ,
a hashcons $H$ (i.e. a mapping) from canonical e-nodes to e-class IDs, and
an e-class map $M$ that maps e-class IDs to e-classes, such that $M$ maps equivalent IDs to the same set of e-nodes: $\forall i,j\in \mathbb {id} ,M[i]=M[j]\Leftrightarrow \mathrm {find} (U,i)=\mathrm {find} (U,j)$

Invariants

In addition to the above structure, a valid e-graph conforms to several data structure invariants.^[2] Two e-nodes are equivalent if they are in the same e-class. The congruence invariant states that an e-graph must ensure that equivalence is closed under congruence, where two e-nodes $f(i_{1},\ldots ,i_{n}),f(j_{1},\ldots ,j_{n})$ are congruent when $\mathrm {find} (U,i_{k})=\mathrm {find} (U,j_{k}),k\in \{1,\ldots ,n\}$ . The hashcons invariant states that the hashcons maps canonical e-nodes to their e-class ID.

Operations

E-graphs expose wrappers around the $\mathrm {add}$ , $\mathrm {find}$ , and $\mathrm {merge}$ operations from the union-find that preserve the e-graph invariants. Additionally, the e-matching operation maps "patterns" (terms with variables) to tuples of substitutions (from patterns to e-class IDs) and e-class IDs such that the returned e-classes contain nodes that "match" the pattern after applying the substitution.

Equality saturation

Equality saturation is a technique for building optimizing compilers using e-graphs.^[3] It operates by applying a set of rewrites using e-matching until the e-graph is saturated, a timeout is reached, an e-graph size limit is reached, a fixed number of iterations is exceeded, or some other halting condition is reached. After rewriting, an optimal term is extracted from the e-graph according to some cost function, usually related to AST size or performance considerations.

Applications

E-graphs are used in automated theorem proving. They are a crucial part of modern SMT solvers such as Z3^[4] and CVC4.

Equality saturation is used in specialized optimizing compilers,^[5] e.g. for deep learning^[6] and linear algebra.^[7] Equality saturation has also been used for translation validation applied to the LLVM toolchain.^[8]

References

^ (Willsey 2021)
^ (Willsey 2021)
^ (Tate 2009)
^ de Moura, Leonardo; Bjørner, Nikolaj (2008). Ramakrishnan, C. R.; Rehof, Jakob (eds.). "Z3: An Efficient SMT Solver". Tools and Algorithms for the Construction and Analysis of Systems. Lecture Notes in Computer Science. 4963. Berlin, Heidelberg: Springer: 337–340. doi:10.1007/978-3-540-78800-3_24. ISBN 978-3-540-78800-3.
^ Joshi, Rajeev; Nelson, Greg; Randall, Keith (2002-05-17). "Denali: a goal-directed superoptimizer". ACM SIGPLAN Notices. 37 (5): 304–314. doi:10.1145/543552.512566. ISSN 0362-1340.
^ Yang, Yichen; Phothilimtha, Phitchaya Mangpo; Wang, Yisu Remy; Willsey, Max; Roy, Sudip; Pienaar, Jacques (2021-03-17). "Equality Saturation for Tensor Graph Superoptimization". arXiv:2101.01332 [cs.AI].
^ Wang, Yisu Remy; Hutchison, Shana; Leang, Jonathan; Howe, Bill; Suciu, Dan (2020-12-22). "SPORES: Sum-Product Optimization via Relational Equality Saturation for Large Scale Linear Algebra". arXiv:2002.07951 [cs.DB].
^ Stepp, Michael; Tate, Ross; Lerner, Sorin (2011). Gopalakrishnan, Ganesh; Qadeer, Shaz (eds.). "Equality-Based Translation Validator for LLVM". Computer Aided Verification. Lecture Notes in Computer Science. 6806. Berlin, Heidelberg: Springer: 737–742. doi:10.1007/978-3-642-22110-1_59. ISBN 978-3-642-22110-1.

de Moura, Leonardo; Bjørner, Nikolaj (2007). Pfenning, Frank (ed.). "Efficient E-Matching for SMT Solvers". Automated Deduction – CADE-21. Lecture Notes in Computer Science. 4603. Berlin, Heidelberg: Springer: 183–198. doi:10.1007/978-3-540-73595-3_13. ISBN 978-3-540-73595-3.
Willsey, Max; Nandi, Chandrakana; Wang, Yisu Remy; Flatt, Oliver; Tatlock, Zachary; Panchekha, Pavel (2021-01-04). "egg: Fast and extensible equality saturation". Proceedings of the ACM on Programming Languages. 5 (POPL): 23:1–23:29. arXiv:2004.03082. doi:10.1145/3434304. S2CID 226282597.
Tate, Ross; Stepp, Michael; Tatlock, Zachary; Lerner, Sorin (2009-01-21). "Equality saturation: a new approach to optimization". Proceedings of the 36th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. POPL '09. Savannah, GA, USA: Association for Computing Machinery: 264–276. doi:10.1145/1480881.1480915. ISBN 978-1-60558-379-2. S2CID 2138086.

External links

[1] (Willsey 2021)

[2] (Willsey 2021)

[3] (Tate 2009)

[4] Moura, Leonardo; Bjørner, Nikolaj (2008). Ramakrishnan, C. R.; Rehof, Jakob (eds.). "Z3: An Efficient SMT Solver". Tools and Algorithms for the Construction and Analysis of Systems. Lecture Notes in Computer Science. 4963. Berlin, Heidelberg: Springer: 337–340. doi:10.1007/978-3-540-78800-3_24. ISBN 978-3-540-78800-3.

[5] Joshi, Rajeev; Nelson, Greg; Randall, Keith (2002-05-17). "Denali: a goal-directed superoptimizer". ACM SIGPLAN Notices. 37 (5): 304–314. doi:10.1145/543552.512566. ISSN 0362-1340.

[6] Yang, Yichen; Phothilimtha, Phitchaya Mangpo; Wang, Yisu Remy; Willsey, Max; Roy, Sudip; Pienaar, Jacques (2021-03-17). "Equality Saturation for Tensor Graph Superoptimization". arXiv:2101.01332 [cs.AI].

[7] Wang, Yisu Remy; Hutchison, Shana; Leang, Jonathan; Howe, Bill; Suciu, Dan (2020-12-22). "SPORES: Sum-Product Optimization via Relational Equality Saturation for Large Scale Linear Algebra". arXiv:2002.07951 [cs.DB].

[8] Stepp, Michael; Tate, Ross; Lerner, Sorin (2011). Gopalakrishnan, Ganesh; Qadeer, Shaz (eds.). "Equality-Based Translation Validator for LLVM". Computer Aided Verification. Lecture Notes in Computer Science. 6806. Berlin, Heidelberg: Springer: 737–742. doi:10.1007/978-3-642-22110-1_59. ISBN 978-3-642-22110-1.

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]