Parikh's theorem

From Wikipedia, the free encyclopedia
Jump to navigation Jump to search

Parikh's theorem in theoretical computer science says that if one looks only at the number of occurrences of each terminal symbol in a context-free language, without regard to their order, then the language is indistinguishable from a regular language.[1] It is useful for deciding whether or not a string with a given number of some terminals is accepted by a context-free grammar.[2] It was first proved by Rohit Parikh in 1961[3] and republished in 1966.[4]

Definitions and formal statement[edit]

Let be an alphabet. The Parikh vector of a word is defined as the function , given by[1]

, where denotes the number of occurrences of the letter in the word .

A subset of is said to be linear if it is of the form

for some vectors .

A subset of is said to be semi-linear if it is a union of finitely many linear subsets.

The formal statement of Parikh's theorem is as follows. Let be a context-free language. Let be the set of Parikh vectors of words in , that is, . Then is a semi-linear set.

Two languages are said to be commutatively equivalent if they have the same set of Parikh vectors. If is any semi-linear set, the language of words whose Parikh vectors are in is commutatively equivalent to some regular language. Thus, every context-free language is commutatively equivalent to some regular language.

Significance[edit]

The theorem has multiple interpretations. It shows that a context-free language over a singleton alphabet must be a regular language and that some context-free languages can only have ambiguous grammars[further explanation needed]. Such languages are called inherently ambiguous languages. From a formal grammar perspective, this means that some ambiguous context-free grammars cannot be converted to equivalent unambiguous context-free grammars.

References[edit]

  1. ^ a b Kozen, Dexter (1997). Automata and Computability. New York: Springer-Verlag. ISBN 3-540-78105-6. 
  2. ^ Håkan Lindqvist. "Parikh's theorem" (PDF). Umeå Universitet. 
  3. ^ Parikh, Rohit (1961). "Language Generating Devices". Quartly Progress Report, Research Laboratory of Electronics, MIT. 
  4. ^ Parikh, Rohit (1966). "On Context-Free Languages". Journal of the Association for Computing Machinery. 13 (4).