Indexed grammar
An indexed grammar is a formal grammar which describes indexed languages. They have three disjoint sets of symbols: the usual terminals and nonterminals, as well as index symbols, which appear only in intermediate derivation steps on a stack associated with the non-terminals of that step.
Contents |
[edit] Definition
Formally, an indexed grammar[1] is a 5-tuple
where
is a finite alphabet of variables or nonterminal symbols
is a finite alphabet of terminal symbols
is the finite set of so-called flags (each flag is itself a set of so-called index productions)
is the finite set of productions
is the sentence symbol
Direct derivations are as follows:
- A production
from
matches a nonterminal
followed by its (possibly empty) string of flags
. In context,
, via
, derives to
, where
if
was a nonterminal and the empty word otherwise. The old flags of
are therefore copied to each new nonterminal produced by 
- An index production
matches
(the flag it comes from must match the first symbol following the nonterminal) and copies the remaining index string
to each new nonterminal:
derives to
, where
is the empty word when
is a nonterminal and
otherwise.
[edit] Description
A string with a non-terminal symbol
with some index symbols
on its stack, can be denoted as
(using the [ and ] as metasymbols to denote the stack). In an indexed grammar, the application of a production rule such as
would rewrite the string by replacing
with
, which copies the stack symbols
from
to each non-terminal in its replacement —
and
in this case. In the process, a symbol can be pushed to, or popped from, the stack before the stack is copied to the introduced non-terminals, which would be specified in the rule for the rewriting operation.
[edit] Example
In practice, stacks of indices can count and remember what rules were applied and in which order. For example, indexed grammars can describe the context-sensitive language of word triples
:
![S[\sigma] \to S[\sigma f] ~|~ S[\sigma g]](http://upload.wikimedia.org/math/4/c/a/4ca1a3f756b787d2d2689d14168d9d99.png)
![S[\sigma] \to T[\sigma]T[\sigma]T[\sigma]](http://upload.wikimedia.org/math/8/e/1/8e158ba04af5544c9e0b4057aa6167e4.png)
![T[\sigma f] \to T[\sigma]a](http://upload.wikimedia.org/math/7/f/a/7fa329c35caa652646e7b2ef9f315b06.png)
![T[\sigma g] \to T[\sigma]b](http://upload.wikimedia.org/math/3/9/8/3981f9e8012c3305f21e89411a2218e7.png)
![T[] \to \epsilon](http://upload.wikimedia.org/math/1/5/d/15d9b2a23ed055db450ed6bcb46ec984.png)
The derivation of
is then
![S[] \to S[f] \to S[fg] \to S[fgg] \to T[fgg]T[fgg]T[fgg]](http://upload.wikimedia.org/math/8/5/2/852c258ca89534afbb7f2e494452a4b2.png)
[edit] Linear indexed grammars
Gerald Gazdar has defined a second class, the linear indexed grammars, by requiring that at most one nonterminal in each production be specified as receiving the stack, whereas in a normal indexed grammar, all nonterminals receive copies of the stack. He showed that this new class of grammars defines a strictly smaller class of languages, the mildly context sensitive languages. Membership in a linear indexed grammar can be decided in polynomial time.[2]
The language
is generable by an indexed grammar, but not by a linear indexed grammar, while
is generable by a linear indexed grammar.
[edit] Example
Letting
denote an arbitrary collection of stack symbols, we can define a grammar for the language
[3] as
![S[\sigma] \to aS[\sigma f]c](http://upload.wikimedia.org/math/c/8/e/c8edcad24d74402ae2d4943f158e659f.png)
![S[\sigma] \to T[\sigma]](http://upload.wikimedia.org/math/7/2/e/72e7506fe8529af0ef22fb7ec842e67a.png)
![T[\sigma f] \to T[\sigma]b](http://upload.wikimedia.org/math/e/9/3/e93801dda482de38994123b664cc5951.png)
![T[] \to \epsilon](http://upload.wikimedia.org/math/1/5/d/15d9b2a23ed055db450ed6bcb46ec984.png)
To derive the string
we have the steps
.
Similarly, for
:
.
[edit] Computational Power
The linearly indexed languages are a subset of the indexed languages, and thus all LIGs can be recoded as IGs, making the LIGs strictly less powerful than the IGs. A conversion from a LIG to an IG is relatively simple. LIG rules in general look approximately like
, modulo the push/pop part of a rewrite rule. The symbols
and
represent strings of terminal and/or non-terminal symbols, and any non-terminal symbol in either must have an empty stack, by the definition of a LIG. This is, of course, counter to how IGs are defined: in an IG, the non-terminals whose stacks are not being pushed to or popped from must have exactly the same stack as the rewritten non-terminal. Thus, somehow, we need to have non-terminals in
and
which, despite having non-empty stacks, behave as if they had empty stacks.
Let's consider the rule
as an example case. In converting this to an IG, the replacement for
must be some
that behaves exactly like
regardless of what
is. To achieve this, we can simply have a pair of rules that takes any
where
is not empty, and pops symbols from the stack. Then, when the stack is empty, it can be rewritten as
.
![Y^{\prime}[\sigma f] \to Y^{\prime}[\sigma]](http://upload.wikimedia.org/math/6/f/c/6fcc167dc9c56fb65c9e43b622e93164.png)
![Y^{\prime}[] \to Y[]](http://upload.wikimedia.org/math/2/4/e/24e4bc0ecf837f1ef8bdbafa660b25a9.png)
We can apply this in general to derive an IG from an LIG. So for example if the LIG for the language
is as follows:
![S[\sigma] \to T[\sigma]V[]](http://upload.wikimedia.org/math/f/1/e/f1e7dcb45f5d9346f6efe166a6874f7a.png)
![V[] \to d ~|~ dV[]](http://upload.wikimedia.org/math/a/5/e/a5ebeda3baae5ba9420c44a8d43210d5.png)
![T[\sigma] \to aT[\sigma f]c ~|~ U[\sigma]](http://upload.wikimedia.org/math/b/3/6/b368c590b011bf1a65021edcd4593911.png)
![U[\sigma f] \to bU[\sigma]](http://upload.wikimedia.org/math/d/6/9/d69757528a489e3025f103ed26d45f3f.png)
![U[] \to \epsilon](http://upload.wikimedia.org/math/d/1/8/d1841f74b630dae9b80debf7b3f9c94c.png)
The sentential rule here is not an IG rule, but using the above conversion algorithm, we can define new rules for
, changing the grammar to:
![S[\sigma] \to T[\sigma]V^{\prime}[\sigma]](http://upload.wikimedia.org/math/9/a/8/9a89b24f2e10ca2997176c15a9fb1a96.png)
![V^{\prime}[\sigma f] \to V^{\prime}[\sigma]](http://upload.wikimedia.org/math/c/2/c/c2cd6c173082eec0b2156234bc7cea96.png)
![V^{\prime}[] \to V[]](http://upload.wikimedia.org/math/0/4/b/04bf72ac90c36e3704d2fc14da7a6adc.png)
![V[] \to d ~|~ dV[]](http://upload.wikimedia.org/math/a/5/e/a5ebeda3baae5ba9420c44a8d43210d5.png)
![T[\sigma] \to aT[\sigma f]c ~|~ U[\sigma]](http://upload.wikimedia.org/math/b/3/6/b368c590b011bf1a65021edcd4593911.png)
![U[\sigma f] \to bU[\sigma]](http://upload.wikimedia.org/math/d/6/9/d69757528a489e3025f103ed26d45f3f.png)
![U[] \to \epsilon](http://upload.wikimedia.org/math/d/1/8/d1841f74b630dae9b80debf7b3f9c94c.png)
Each rule now fits the definition of an IG, in which all the non-terminals in the right hand side of a rewrite rule receive a copy of the rewritten symbol's stack. The indexed grammars are therefore able to describe all the languages that linearly indexed grammars can describe.
[edit] Equivalencies
Vijay-Shanker and Weir (1994)[4] demonstrates that Linear Indexed Grammars, Combinatory Categorial Grammars, Tree-adjoining Grammars, and Head Grammars are weakly equivalent formalisms, in that they all define the same string languages.
[edit] Distributed Index (DI) grammars
Another form of indexed grammars, introduced by Staudacher (1993),[5] is the class of Distributed Index grammars (DIGs). What distinguishes DIGs from Aho's Indexed Grammars is the propagation of indexes. Unlike Aho's IGs, which distribute the whole symbol stack to all non-terminals during a rewrite operation, DIGs divide the stack into substacks and distributes the substacks to selected non-terminals.
The general rule schema for a binarily distributing rule of DIG is the form
![X[f_1 \dotso f_i f_j \dotso f_n] \to \alpha Y[f_1 \dotso f_i] \beta Z[f_j \dotso f_n] \gamma](http://upload.wikimedia.org/math/f/a/d/fad0ca56c89f1a6d2495647df17f33aa.png)
Where
,
, and
are arbitrary terminal strings. For a ternarily distributing string:
![X[f_1 \dotso f_i f_j \dotso f_k f_l \dotso f_n] \to \alpha Y[f_1 \dotso f_i] \beta Z[f_j \dotso f_k] \gamma W[f_l \dotso f_n] \eta](http://upload.wikimedia.org/math/1/2/6/1262c3e4524d57be68a8fa456c99d2e3.png)
And so forth for higher numbers of non-terminals in the right hand side of the rewrite rule. In general, if there are
non-terminals in the right hand side of a rewrite rule, the stack is partitioned
ways and distributed amongst the new non-terminals. Notice that there is a special case where a partition is empty, which effectively makes the rule a LIG rule. The Distributed Index languages are therefore a superset of the Linearly Indexed languages.
[edit] See also
[edit] References
- ^ Aho, Alfred (1968). "Indexed grammars—an extension of context-free grammars". Journal of the ACM 15 (4): 647–671. doi:10.1145/321479.321488.
- ^ Gazdar, Gerald (1988). "Applicability of Indexed Grammars to Natural Languages". In U. Reyle and C. Rohrer. Natural Language Parsing and Linguistic Theories. pp. 69–94.
- ^ Hopcroft, John; Jeffrey Ullman (1979). Introduction to automata theory, languages, and computation. Addison-Wesley. p. 390. ISBN 0-201-02988-X.
- ^ Vijay-Shanker, K. and Weir, David J. 1994. The Equivalence of Four Extensions of Context-Free Grammars. Mathematical Systems Theory 27(6): 511-546.
- ^ Staudacher, Peter. 1993. New frontiers beyond context-freeness: DI-grammars and DI-automata. In Sixth Conference of the European Chapter of the Association for Computational Linguistics (EACL '93).
[edit] External links
|
||||||||
is a finite alphabet of terminal symbols
is the finite set of so-called flags (each flag is itself a set of so-called index productions)
is the finite set of
is the sentence symbol
from
matches a nonterminal
followed by its (possibly empty) string of flags
. In context,
, via
, derives to
, where
if
was a nonterminal and the empty word otherwise. The old flags of
are therefore copied to each new nonterminal produced by
matches
(the flag it comes from must match the first symbol following the nonterminal) and copies the remaining index string
to each new nonterminal:
derives to
, where
is the empty word when ![\to T[fg]bT[fgg]T[fgg] \to T[f]bbT[fgg]T[fgg] \to T[]abbT[fgg]T[fgg]](http://upload.wikimedia.org/math/9/3/6/93602206ed9de9b828461e3659fab5f9.png)
![\to abbT[fgg]T[fgg] \to \dotsb \to abbabbT[fgg] \to abbabbabb](http://upload.wikimedia.org/math/8/7/4/87474bf94339129af33b726129524817.png)