# Conjunctive grammar

Jump to navigation Jump to search

Conjunctive grammars are a class of formal grammars studied in formal language theory. They extend the basic type of grammars, the context-free grammars, with a conjunction operation. Besides explicit conjunction, conjunctive grammars allow implicit disjunction represented by multiple rules for a single nonterminal symbol, which is the only logical connective expressible in context-free grammars. Conjunction can be used, in particular, to specify intersection of languages. A further extension of conjunctive grammars known as Boolean grammars additionally allows explicit negation.

The rules of a conjunctive grammar are of the form

${\displaystyle A\to \alpha _{1}\And \ldots \And \alpha _{m}}$

where ${\displaystyle A}$ is a nonterminal and ${\displaystyle \alpha _{1}}$, ..., ${\displaystyle \alpha _{m}}$ are strings formed of symbols in ${\displaystyle \Sigma }$ and ${\displaystyle N}$ (finite sets of terminal and nonterminal symbols respectively). Informally, such a rule asserts that every string ${\displaystyle w}$ over ${\displaystyle \Sigma }$ that satisfies each of the syntactical conditions represented by ${\displaystyle \alpha _{1}}$, ..., ${\displaystyle \alpha _{m}}$ therefore satisfies the condition defined by ${\displaystyle A}$.

## Formal definition

A conjunctive grammar ${\displaystyle G}$ is defined by the 4-tuple ${\displaystyle G=(V,\Sigma ,R,S)}$ where

1. V is a finite set; each element ${\displaystyle v\in V}$ is called a nonterminal symbol or a variable. Each variable represents a different type of phrase or clause in the sentence. Variables are also sometimes called syntactic categories. Each variable defines a sub-language of the language defined by G.
2. Σ is a finite set of terminals, disjoint from V, which make up the actual content of the sentence. The set of terminals is the alphabet of the language defined by the grammar G.
3. R is a finite set of productions. The members of R are called the rules or productions of the grammar.
4. S is the start variable (or start symbol), used to represent the whole sentence (or program). It must be an element of V.

It is common to list all right-hand sides for the same left-hand side on the same line, using | (the pipe symbol) to separate them. Rules ${\displaystyle A\rightarrow \alpha _{1}\&\ldots \&\alpha _{m}}$ and ${\displaystyle A\rightarrow \beta _{1}\&\ldots \&\beta _{m}}$ can hence be written as ${\displaystyle A\rightarrow \alpha _{1}\&\ldots \&\alpha _{m}\ |\ \beta _{1}\&\ldots \&\beta _{m}}$.

Two equivalent formal definitions of the language specified by a conjunctive grammar exist. One definition is based upon representing the grammar as a system of language equations with union, intersection and concatenation and considering its least solution. The other definition generalizes Chomsky's generative definition of the context-free grammars using rewriting of terms over conjunction and concatenation.

### Definition by derivation

For any strings ${\displaystyle u,v\in (V\cup \Sigma )^{*}}$, we say u directly yields v, written as ${\displaystyle u\Rightarrow v\,}$, if * either there is a rule ${\displaystyle A\rightarrow \alpha _{1}\&\ldots \&\alpha _{m}\in R}$ such that ${\displaystyle u\,=u_{1}Au_{2}}$ and ${\displaystyle v\,=u_{1}(\alpha _{1}\&\ldots \&\alpha _{m})u_{2}}$, * or there exists a string ${\displaystyle w\in (V\cup \Sigma )^{*}}$ such that ${\displaystyle u\,=u_{1}(w\&\ldots \&w)u_{2}}$ and ${\displaystyle v\,=u_{1}wu_{2}}$.

For any string ${\displaystyle w\in (V\cup \Sigma )^{*},}$ we say G generates w, written as ${\displaystyle S{\stackrel {*}{\Rightarrow }}w}$ if ${\displaystyle \exists k\geq 1\,\exists \,u_{1},\cdots ,u_{k}\in (V\cup \Sigma \cup \{(,)\})^{*}}$ such that ${\displaystyle S=\,u_{1}\Rightarrow u_{2}\Rightarrow \cdots \Rightarrow u_{k}\,=w}$.

The language of a grammar ${\displaystyle G=(V,\Sigma ,R,S)}$ is the set of all strings it generates.

### Example

The grammar ${\displaystyle G=(\{S\},\{a,b\},P,S)}$, with productions

${\displaystyle S\rightarrow AB\&DC}$,
${\displaystyle A\rightarrow aA\ |\ \epsilon }$,
${\displaystyle B\rightarrow bBc\ |\ \epsilon }$,
${\displaystyle C\rightarrow cC\ |\ \epsilon }$,
${\displaystyle D\rightarrow aDb\ |\ \epsilon }$,

is conjunctive. A typical derivation is :${\displaystyle S\Rightarrow (AB\&DC)\Rightarrow (aAB\&DC)\Rightarrow (aB\&DC)\Rightarrow (abBc\&DC)\Rightarrow (abc\&DC)\Rightarrow (abc\&aDbC)\Rightarrow (abc\&abC)\Rightarrow (abc\&abcC)\Rightarrow (abc\&abc)\Rightarrow abc}$ This makes it clear that ${\displaystyle L(G)=\{a^{n}b^{n}c^{n}:n\geq 0\}}$. The language is not context-free, proved by the pumping lemma.

## Parsing algorithms

Though the expressive power of conjunctive grammars is greater than those of context-free grammars, conjunctive grammars retain some practically useful properties of the latter. Most importantly, there are generalizations of the main context-free parsing algorithms, including the linear-time recursive descent, the cubic-time generalized LR, the cubic-time Cocke-Kasami-Younger, as well as Valiant's algorithm running as fast as matrix multiplication.

## Theoretical properties

A number of theoretical properties of conjunctive grammars have been researched, including the expressive power of grammars over a one-letter alphabet and numerous undecidable decision problems. This work provided a basis for the study language equations of a more general form.

## Synchronized alternating pushdown automata

Aizikowitz and Kaminski[1] introduced a new class of pushdown automata (PDA) called synchronized alternating pushdown automata (SAPDA). They proved it to be equivalent to conjunctive grammars in the same way as nondeterministic PDAs are equivalent to context-free grammars.

## References

1. ^ Aizikowitz, Tamar; Kaminski, Michael (2011). "LR(0) Conjunctive Grammars and Deterministic Synchronized Alternating Pushdown Automata". 6651: 345–358. doi:10.1007/978-3-642-20712-9_27. ISSN 0302-9743.