Dyck language

From Wikipedia, the free encyclopedia
Jump to: navigation, search
Lattice of the 14 Dyck words of length 8 - [ and ] interpreted as up and down

In the theory of formal languages of computer science, mathematics, and linguistics, the Dyck language is the language consisting of balanced strings of parentheses [ and ]. It is important in the parsing of expressions that must have a correctly nested sequence of parentheses, such as arithmetic or algebraic expressions. It is named after the mathematician Walther von Dyck.

Formal definition[edit]

Let \Sigma = \{ [, ] \} be the alphabet consisting of the symbols [ and ] and let \Sigma^{*} denote its Kleene closure. For any element u \in \Sigma^{*} with length | u | we define partial functions insert : \Sigma^{*} \times \mathbb{N} \cup \{ 0 \} \rightarrow \Sigma^{*} and delete : \Sigma^{*} \times \mathbb{N} \rightarrow \Sigma^{*} by

insert(u, j) is u with "[]" inserted into the jth position
delete(u, j) is u with "[]" deleted from the jth position

with the understanding that insert(u, j) is undefined for j > |u| and delete(u, j) is undefined if j > |u| - 2. We define an equivalence relation R on \Sigma^{*} as follows: for elements a, b \in \Sigma^{*} we have (a, b) \in R if and only if there exists a finite sequence of applications of the insert and delete functions starting with a and ending with b, where the empty sequence is allowed. That the empty sequence is allowed accounts for the reflexivity of R. Symmetry follows from the observation that any finite sequence of applications of insert to a string can be undone with a finite sequence of applications of delete. Transitivity is clear from the definition.

The equivalence relation partitions the language \Sigma^{*} into equivalence classes. If we take \epsilon to denote the empty string, then the language corresponding to the equivalence class \operatorname{Cl}(\epsilon) is called the Dyck language.

Alternative definition[edit]

An alternative definition of the Dyck language can be formulated when we introduce the imbalance : \Sigma^{*} \rightarrow \left(\mathbb{N} \cup \{0\}\right) function.

imbalance ( u ) = |u|_{[} - |u|_{]} for any u \in \Sigma^{*}.

where |u|_{[} and |u|_{]} are respectively the number of [ and ] in u. I.e. imbalance counts the imbalance of [ over ]. If imbalance(u) is positive then u has more [ than ].

Now, the Dyck language can be defined as the language

\{ u \in \Sigma^{*} \vert imbalance(u) = 0 \text{ and } imbalance(v) \ge 0 \text{ for all prefixes } v \text{ of } u \}


  • The Dyck language is closed under the operation of concatenation.
  • By treating \Sigma^{*} as an algebraic monoid under concatenation we see that the monoid structure transfers onto the quotient \Sigma^{*} / R, resulting in the syntactic monoid of the Dyck language. The class \operatorname{Cl}(\epsilon) will be denoted 1.
  • The syntactic monoid of the Dyck language is not commutative: if u = \operatorname{Cl}([) and v = \operatorname{Cl}(]) then uv = \operatorname{Cl}([]) = 1 \ne \operatorname{Cl}(][) = vu.
  • With the notation above, uv = 1 but neither u nor v are invertible in \Sigma^{*} / R.
  • The syntactic monoid of the Dyck language is isomorphic to the bicyclic semigroup by virtue of the properties of \operatorname{Cl}([) and \operatorname{Cl}(]) described above.
  • By the Chomsky–Schützenberger representation theorem, any context-free language is a homomorphic image of the intersection of some regular language with a homomorphic preimage of the Dyck language on two parentheses.
  • The Dyck language with two distinct types of parentheses can be recognized in the complexity class TC^{0}.


For Example : As you move from left to right , insert a '(' for every times it goes up and a ')'for every times go down you end up with a string such as (()(())). Two diagrams are connected if swapping one set of two adjacent parentheses produces the other. i.e. in the top, (((()))) is connected to ((()())) and vice versa by swapping the middle two. It's major uses are in combinatorics.

See also[edit]