Pumping lemma for context-free languages

From Wikipedia, the free encyclopedia
Jump to: navigation, search

The pumping lemma for context-free languages, also known as the Bar-Hillel lemma, is a lemma that gives a property shared by all context-free languages.

Contents

[edit] Formal statement

If a language L is context-free, then there exists some integer p ≥ 1 such that any string s in L with |s| ≥ p (where p is a pumping length) can be written as

s = uvxyz

with substrings u, v, x, y and z, such that

1. |vxy| ≤ p,
2. |vy| ≥ 1, and
3. uv nxy nz is in L for every integer n ≥ 0.

[edit] Informal statement and explanation

The pumping lemma for context-free languages (called just "the pumping lemma" for the rest of this article) describes a property that all context-free languages are guaranteed to have.

The property is a property of all strings in the language that are of length at least p, where p is a constant—called the pumping length -- that varies between context-free languages.

Say s is a string of length at least p that is in the language.

The pumping lemma states that s can be split into five substrings, s = uvxyz, where vy is non-empty and the length of vxy is at most p, such that repeating v and y any (and the same) number of times in s produces a string that is still in the language (it's possible and often useful to repeat zero times, which removes v and y from the string). This process of "pumping up" additional copies of v and y is what gives the pumping lemma its name.

Note that finite languages (which are regular and hence context-free) obey the pumping lemma trivially by having p equal to the maximum string length in L plus one. As there are no strings of this length the pumping lemma is not violated.

The pumping lemma is often used to prove that a given language is non-context-free by showing that for each p, we can find some string s of length at least p in the language that does not have the properties outlined above, i.e. that it cannot be "pumped" without producing some strings that are not in the language.

[edit] Usage of the lemma

The pumping lemma for context-free languages can be used to show that certain languages are not context-free.

For example, we can show that language L = \lbrace a^ib^ic^i|i>0\rbrace is not context-free by using the pumping lemma in a proof by contradiction. First, assume that L is context free. By the pumping lemma, there exists an integer p which is the pumping length of language L. Consider the string s = a^pb^pc^p in L. The pumping lemma tells us that s can be written in the form s=uvxyz, where u, v, x, y, and z are substrings, such that |vxy| \le p, |vy|\ge 1, and uv^ixy^iz is in L for every integer i\ge 0. By our choice of s and the fact that |vxy| \le p, it is easily seen that the substring vxy can contain no more than two distinct letters. That is, we have one of five possibilities for vxy:

  1. vxy = a^j for some j \le p.
  2. vxy = a^jb^k for some j and k with  j+k \le p.
  3. vxy = b^j for some j \le p.
  4. vxy = b^jc^k for some j and k with  j+k \le p.
  5. vxy = c^j for some j \le p.

For each case, it is easily verified that uv^ixy^iz does not contain equal numbers of each letter for any i\ne 1. Thus, uv^2xy^2z does not have the form a^ib^ic^i. This contradicts the definition of L. Therefore, our initial assumption that L is context free must be false.

While the pumping lemma is often a useful tool to prove that a given language is not context-free, it does not give a complete characterization of the context-free languages. If a language does not satisfy the condition given by the pumping lemma, we have established that it is not context-free. On the other hand, there are languages that are not context-free, but still satisfy the condition given by the pumping lemma. There are more powerful proof techniques available, such as Ogden's lemma, but also these techniques do not give a complete characterization of the context-free languages.

[edit] See also

[edit] References

  • Bar-Hillel, Y.; Perles, M. and Shamir, E. (1961). "On formal properties of simple phrase-structure grammars". Zeitschrift fur Phonetik, Sprachwissenschaft, und Kommunikationsforschung 14: 143–177. 
  • Michael Sipser (1997). Introduction to the Theory of Computation. PWS Publishing. ISBN 0-534-94728-X.  Section 1.4: Nonregular Languages, pp. 77–83. Section 2.3: Non-context-free Languages, pp. 115–119.
Personal tools
Namespaces

Variants
Actions
Navigation
Interaction
Toolbox
Print/export
Languages