To-do list for Context-sensitive grammar:
(Contrary to a previous version of this article, the decision problem is not undecidable.)
- 1 Natural Language?
- 2 Contradicts definition?
- 3 Wrong grammar for language
- 4 Confusing
- 5 can you please explain this textbook problem
- 6 Is context-dependent the same thing as context-sensitive?
- 7 Formal Definition is not accurate
- 8 On definitions and types of equivalence
- 9 Continuing issue With the listed grammar
- 10 Missing info and possibly the missing link
- 11 Duplication of properties etc. with the CSL page
- 12 Kuroda normal form
- 13 Is the grammar in the example a context-sensitive one?
- 14 External links modified
The article says that Chomsky invented CSG for natural languages. Are CSGs really used in linguistics? I've only seen context-free grammars (or some mild extensions) in that context.
- Yes, context-sensitive rewrite rules have been used in linguistics, but I do not know whether they are still in use today. Rp 13:36, 25 July 2007 (UTC)
Added by Paul Ogilvie: in computer science, algorithms have been developed only for easily parsing context-free languages. These are now common, such as the yacc compiler-generator, which takes a CFL definition and generates a program that can recognize the CF language. See Aho, Sethi, Ullman, Compilers: Principles, Techniques, and Tools, 1986. No algorithms have been developed (to my knowledge) that can parse a CSL, except heuristically.
- Yacc doesn't parse arbitrary context-free languages, but only LALR(1) ones. An example of a general context-free parsing framework is ASF+SDF. Rp 13:36, 25 July 2007 (UTC)
Yes, there are natural languages that are not context free (take for example verbs in Swiss German). The algorithm for parsing CSL is quite straightforward (the complexity is high, obviously).
The example has the rule
cB → Bc
but that means α = c on the left-hand side, but α is empty on the right-hand side? Or is the example simply using the alternative definition of context-sensitive? As you can see, I've only studied context-free grammars so far :)
- I think the example is using a monotonic grammar for the sake of simplicity (as an equivalent context-sensitive grammar can be constructed, but it probably wouldn't be as simple).
- A monotonic grammar and a context-sensitive one aren't necessarily the same, so maybe the difference between the grammar and the generated language should be further illuminated. --Bernhard Bauer 00:46, 28 July 2006 (UTC)
- I think that the following grammar will work:
- S -> aRc
- R -> aRT | b
- bTc -> bbcc
- bTT -> bbUT
- UT -> UU
- UUc -> VUc -> Vcc
- UV -> VV
- bVc -> bbcc
- bVV -> bbWV
- WV -> WW
- WWc -> TWc -> Tcc
- WT -> TT
- As you can see, it is rather more complicated than the one in the article. 184.108.40.206 00:39, 20 January 2007 (UTC)
There is a more standard definition of context sensitive rules (used in most textbooks): a rule x -> y is context sensitive iff |x| <= |y|.
- Why should people believe those two definitions are equivalent? Liberulo (talk) 21:06, 30 December 2012 (UTC)
According to this definition, Aa -> aA, a is terminal and A non terminal symbol, is context sensitive. Moreover, the authors claim that these 2 definitions are equivalent! How can this be possible? —Preceding unsigned comment added by 220.127.116.11 (talk) 21:01, 11 September 2008 (UTC)
- I second this question: I'd like to know if and how these two definitions of CSGs are equivalent. Liberulo (talk) 21:06, 30 December 2012 (UTC)
- This is the definition I have seen, for example in Peter Linz, 'An Introduction to Formal Languages and Automata (2nd Ed.)', Chapter 11.3 (1997). Note that if the empty word is to be in the language that needs to be directly specified as a special case, as 'non-contracting' rules obviously can't specify it. The reference Linz cites 'for example' to show that such a grammar is indeed 'context-sensitive' in the sense of the current state of this article is A. Salomaa, 'Formal Languages' (1973). If anyone can track this down it might be the best reference. 18.104.22.168 (talk) 05:08, 18 September 2013 (UTC)
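To make the difference between the two definitions concrete, here is a minimal Python sketch (not taken from any of the textbooks cited above) that tests a single rule against each of them. The function names are hypothetical, and the convention that uppercase letters are nonterminals and lowercase letters are terminals is an assumption made for illustration; the special case for the empty word is ignored.

```python
def is_noncontracting(lhs: str, rhs: str) -> bool:
    """Length-based definition: the rule x -> y satisfies |x| <= |y|."""
    return len(lhs) <= len(rhs)


def is_context_sensitive(lhs: str, rhs: str) -> bool:
    """Context-based definition: lhs = alpha A beta and rhs = alpha gamma beta,
    where A is a single nonterminal and gamma is nonempty.

    Assumption: one character per symbol, uppercase = nonterminal.
    """
    for i, symbol in enumerate(lhs):
        if not symbol.isupper():
            continue  # only a nonterminal can be the rewritten symbol A
        alpha, beta = lhs[:i], lhs[i + 1:]
        gamma_len = len(rhs) - len(alpha) - len(beta)
        # The context alpha ... beta must reappear unchanged around gamma.
        if gamma_len >= 1 and rhs.startswith(alpha) and rhs.endswith(beta):
            return True
    return False


if __name__ == "__main__":
    for lhs, rhs in [("Aa", "aA"), ("cB", "Bc"), ("bB", "bb"), ("S", "aSBC")]:
        print(f"{lhs} -> {rhs}: noncontracting={is_noncontracting(lhs, rhs)}, "
              f"context-sensitive={is_context_sensitive(lhs, rhs)}")
```

Under these assumptions, Aa -> aA and cB -> Bc pass only the length test, while bB -> bb and S -> aSBC pass both. So the two definitions accept different sets of grammars; the equivalence claimed in the textbooks concerns the languages that can be generated, not the individual rules.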
Wrong grammar for language 
I suggest the following context-sensitive grammar which does apply to the definition given.
Again the derivation for "aaa bbb ccc" is:
This might be a dumb question, but is a context-sensitive grammar allowed to "crash"? As in, end up with non-terminals and have no valid rules to follow. If it is not, then I think I found a derivation that would cause it to crash. Again, sorry if this is legal, I'm just learning about these now and it was not mentioned in class; my assumption was that a valid grammar had to always return a string of all terminals. Here's the derivation that would "crash" it:
Again, sorry if this is a dumb question, I just had to answer this question for a problem set and came up with a different answer (that I believe is correct), that does not ever "crash" like this one.
- It seems that the grammar does not produce only the intended language. For example, this grammar can produce "aaa bb cccc":
- <--- ("H" in question below was introduced here)
- <--- "H" stemming from a "B" ...
- <--- but transformed to "C"
(Or am I missing something?)
It seems however that the following grammar works:
but it does not follow the rule given in the page. Ref: http://www.cs.cmu.edu/~./FLAC/pdf/ContSens-6up.pdf
- I agree with your derivation of aaabbcccc. My informal understanding of rules 3-5 is that they are used to swap BC to CB, and that the total count of Bs and Cs can't be changed by these rules if H is counted as B or C in an appropriate way (i.e. 3: CB→HB read as CB→CB, 4: HB→HC read as CB→BC, 5: HC→BC read as BC→BC). Based on that, I tried to locate the "error", and came up with my flagging of your derivation above. The original idea of the grammar's author might have been that the "meaning" of an H (i.e. whether it is to be counted as B or as C) is always determined from the nonterminal immediately right to it: count H as B in HC, count H as C in HB. However, in your derivation, the nonterminal immediately right to the "critical" H is changed from C to B due to some unexpected swapping.
- I wonder what the source of the flawed grammar is; it doesn't appear in Hopcroft+Ullman 1979, which is the only text on CSG I have. If it remains unsourced, it should eventually be removed, anyway. When I've time, I could elaborate H+U's example as a replacement.
- I had a look at your Ref: Sutner has (on slide 4-->p.1) the same restriction as the wikipedia article, and his rule cB→Bc doesn't satisfy them (the left and right embedding context of B on the rule's left-hand side, viz. c and ε, respectively, should reappear on its right-hand side, but c doesn't). Maybe that is what Sutner expects as an answer to his question "Right?" on slide 7-->p.2. Moreover, not even the Kuroda normal form (slide 11-->p.2) fits into the scheme. Probably Sutner implicitly used the notion of a Noncontracting grammar. The wikipedia article contains his grammar as well as the Kuroda NF, and claims equivalence to CSG. - Jochen Burghardt (talk) 09:14, 12 February 2014 (UTC)
- I got a look into the Mateescu & Salomaa (1997) cited by the Noncontracting grammar article and explained their transformation of noncontracting grammars to context-sensitive grammars, using the language as an example. The resulting grammar is different from that you revealed as flawed. - Jochen Burghardt (talk) 16:38, 12 February 2014 (UTC)
- Today, I changed the grammar to the grammar from Noncontracting_grammar#Transforming into context-sensitive grammar, simplified by
- replacing [a] by a, [b] by b, [c] by c, Z1 by W, Z2 by X, and
- contracting the last four rules into bB → bb.
- I hope the simplifications (and the source grammar) are correct; please cross-check. Apparently, Metaxal's above derivation of aaabbcccc doesn't work any longer now. - Jochen Burghardt (talk) 20:33, 1 April 2014 (UTC)
Using a grammar that contradicts the definition is highly confusing. What is a monotonic grammar?
- I'm moving the definition of "monotonic" grammar to its own page. It's wrong to include it here, since this page is not about classes of grammars that happen to describe the context-sensitive languages, but about context-sensitive grammars proper. Rp 13:38, 25 July 2007 (UTC)
HERE IS THE EASY ANSWER:
S → aSBC | abc
CB → BC
aB → ab
bB → bb
bC → bc
cC → cc
- The problem with your solution lies in the clause "S -> abc". Since "abc" terminates your recursion, the structure "aabcBC" always gets created whenever you try to use recursion. The way to go is to use "S -> aBC" instead, which makes the grammar equivalent to the one in the Wiki page. --AdamDi (talk) 11:11, 29 April 2012 (UTC)
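The objection above can be checked mechanically. The following is a rough Python sketch, not an established parsing algorithm from the literature: a brute-force breadth-first search over sentential forms. It relies on the assumption that every rule is noncontracting, so any form longer than the bound can be discarded. The function name `generate`, the bound `max_len`, and the one-character-per-symbol encoding (uppercase = nonterminal) are choices made for this illustration.

```python
from collections import deque


def generate(rules, start="S", max_len=9):
    """Return all terminal strings of length <= max_len derivable from start.

    Assumes every rule is noncontracting (len(lhs) <= len(rhs)), so sentential
    forms longer than max_len can be pruned without losing any result.
    """
    seen = {start}
    queue = deque([start])
    words = set()
    while queue:
        form = queue.popleft()
        if form.islower():  # only terminals left: a word of the language
            words.add(form)
            continue
        for lhs, rhs in rules:
            pos = form.find(lhs)
            while pos != -1:  # try the rule at every position where it matches
                new = form[:pos] + rhs + form[pos + len(lhs):]
                if len(new) <= max_len and new not in seen:
                    seen.add(new)
                    queue.append(new)
                pos = form.find(lhs, pos + 1)
    return words


common = [("CB", "BC"), ("aB", "ab"), ("bB", "bb"), ("bC", "bc"), ("cC", "cc")]
proposed = [("S", "aSBC"), ("S", "abc")] + common  # the "easy answer" above
fixed = [("S", "aSBC"), ("S", "aBC")] + common     # with S -> aBC instead

if __name__ == "__main__":
    print(sorted(generate(proposed)))
    print(sorted(generate(fixed)))
```

With the bound of 9 symbols, the proposed grammar yields only "abc", because every use of the recursive rule leaves a c directly in front of a B and no rule rewrites cB. The variant with S -> aBC yields abc, aabbcc and aaabbbccc. Derivations that get stuck in a mixed string simply contribute nothing to the result.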
can you please explain this textbook problem
Is context-dependent the same thing as context-sensitive?
In the Turing completeness article, there is a redlink for context-dependent grammar. If that is the same thing as context-sensitive grammar, please fix the link. Paul Foxworthy (talk) 06:51, 10 June 2010 (UTC)
Formal Definition is not accurate
Specifically, "...and S does not appear on the right-hand side of any rule..." It seems the definition is not accurate, and I would like to propose changing it to a more accurate one: the length of the left-hand side of each rule is less than or equal to the length of the right-hand side, and the grammar cannot be represented in Chomsky Normal Form (i.e. there must be at least one string on the right that is non-contracting and has at least three symbols). Thus, the start symbol can still appear on the right-hand side of a rule as long as those conditions are met. Allowing the start symbol on the right would simplify many essentially non-contracting context-sensitive grammars, compared with equivalent constructions in which the start symbol is not allowed on the right.
Consider the quick and dirty example I thought of below: writing it without the start symbol on the right would require many more production rules, but the start symbol on the right does not affect the fact that the grammar is essentially non-contracting and context-sensitive.
Not having the S symbol on the right is more of a rule of thumb or maybe a notation convention, not a formal definition. As it might be beneficial for the student to not write it as such for confusion resulting from the following situation:
Which might appear to be context sensitive, but isn't, because it could be rewritten in Chomsky Normal Form.
Thus, I suggest we change or clarify the definition in this article.
Beginning of Formalism:
A context-sensitive grammar G is a quadruple (V, Σ, R, S) where V is a finite set of symbols, Σ is the subset of V which contains only the terminal symbols, and S is the start symbol in V \ Σ.
R is a finite set of production rules of the form α → β such that α and β are strings over V and |α| ≤ |β|, where |x| is the length of x.
End of Formalism
Also, we should note that an essentially non-contracting non-context sensitive grammar is that context sensitive grammar which can be represented in such a way where no Start symbol appears on the right of the production Rule set. This is in contrast to what is written now, which states that a grammar cannot be context sensitive if there is a Start symbol on the right hand side of the production rule set which can go to the empty string. The actual formal definition of context sensitive grammars is broader based on the references cited.
- "Membership for Growing Context Sensitive Grammars is Polynomial" Dahlhaus and Warmuth, Journal of Computer and System Sciences, 1986
- "Uniform Recognition for Acyclic Context Sensitive Grammars is NP-Complete" Erik Aarts
- Hi Jmark13.
- I don't think it's obvious why your references should be used to change the definition. Some other search results support the tendency of the current definition.
- However, I agree that the exceptional rule concerning S → ε is a bit informal (and even inaccurate, because it does not clearly state that S may appear on the right side if there is no rule S → ε). In addition, there are more quite important authors who defined context-sensitive grammars as non-contracting (Aho and Ullman, for example, and Ullman and John Hopcroft in Introduction to Automata Theory, Languages, and Computation). So it seems legitimate to adopt their definition in this article.
- Still, since there are authors who distinguish context-sensitive grammars and monotonous grammars (e.g. Grzegorz Rozenberg and Arto Salomaa), I would object to replacing one definition by another. Rather than claiming that some authors additionally require non-contracting rules, it should be mentioned that the current definition implies non-contracting rules (except the exception) and that the generative power is the same (see my first link).
- There's one thing I don't understand: Why do you mention Chomsky Normal Form as a negative criterion? Consider a grammar with two rules, , and , the grammar clearly is context-sensitive according to both definitions, yet it is in CNF – or am I missing something?
- --Zahnradzacken (talk) 22:57, 19 August 2013 (UTC)
Thank You, Zahnradzacken, I don't think you are missing anything, but I do think something in the formal definition of context-sensitive grammars is a bit ambiguous. And no, we should definitely not change the formal definition in so far as it agrees with the sources mentioned.
However, a less ambiguous definition of CSG would be one that would be formalized in terms of LBA, and be those re-writing rules that form the languages accepted by an LBA, since it has been proven that the languages produced are equivalent.
In terms of CNF, I was incorrect in my assertion, as languages produced by context free grammars is a strict subset of those produced by context sensitive grammars. I apologize for any confusion.
All this said, essentially non-contracting context sensitive grammars is itself a subset of uniform context sensitive grammars, which may be "mildly contracting" (not to be confused with mildly context sensitive) in that there is a contraction, but the length of the right hand side of every rule, even after contraction, is still strictly greater than or equal to the left... The languages produced by this definition should be obviously equivalent to those languages produced by essentially non-contracting grammars.
In my opinion, a set of grammar production rules should be in its simplest form, i.e. the smallest number of rules that produce all strings in the language. However, this often requires "mild contraction" on a set of rules that would otherwise be context-sensitive and essentially non-contracting. This might seem like splitting hairs, but many theorems depend on grammars that are essentially non-contracting and context-sensitive, and their results would have to be re-proved for a "mildly contracting" context-sensitive grammar. Still, despite the S on the right, it should be obvious that these mildly contracting context-sensitive grammars can be rewritten as essentially non-contracting context-sensitive grammars (just add more rules and replace each S symbol accordingly), and have equivalent expressive power.
So, "What is true?" If it is true that mildly contracting CSGs produce the same languages as essentially non-contracting CSGs, then the source definition, which includes that S cannot be on the right-hand side, isn't a uniform or optimal definition, but applies only to essentially non-contracting CSGs. By eliminating this extra rule, and observing the class of difference between Recursively Enumerable languages with unrestricted grammars that are not CSGs with mildly contracting CSGs, we may simplify our rule sets when appropriate and still have a CSG.
Anyway, if you see the logic above, and the benefits of being able to reduce some grammar rule sets to a mildly contracting CSG, then I do propose to at least add to the article that only some definitions add the extra requirement that S not occur on the right hand side and that these grammars are called "essentially non-contracting".
On definitions and types of equivalence
There seems to be some confusion about the equivalence between context-sensitive grammars and noncontracting grammars. It's true that CSGs and noncontracting grammars are equivalent in the sense that they can describe the same sets of languages. But the definitions of the grammars aren't equivalent.
A definition is basically a sentence that talks about mathematical objects (formally speaking, it's a formula, as sentences are formulas without free variables). An example of a definition is "an integer x such that there exists integer y such that y*2 = x". The defined concept (even numbers) consists of those objects from the universe of discourse which yield a true sentence when substituted for x in the definition. Another definition of even numbers is "an integer x such that there exists integer y such that y + y = x". Those two definitions are equivalent. Any object either satisfies both definitions or doesn't satisfy either of them.
The definitions of CSGs and noncontracting grammars aren't equivalent, because e.g. the grammar with productions (S -> Bc; Bc -> Bd; Bd -> bd) satisfies the definition of noncontracting grammars but doesn't satisfy the definition of context-sensitive grammars, as the middle production changes a terminal symbol. That's that.
CSGs and noncontracting grammars are equivalent in their ability to describe languages. For any language L which is generated by a context-sensitive grammar G, there exists a noncontracting grammar G' which generates language L. For any language L generated by a noncontracting grammar G, there exists a CSG G' which also generates L.
Some writers may equivocate between those two kinds of equivalence and say that two definitions are equivalent when in fact they define distinct concepts with equal expressive power. However, that should be limited to situations when the definitions differ in minor details and the expressive equivalence of the defined concepts is trivial to see. This is not quite the case here. 22.214.171.124 (talk) 18:19, 28 June 2014 (UTC)
- Thank you for your explanations; I agree with you. We should distinguish between equality of grammars and equality of their languages. The notion of weak equivalence (formal languages) could be used for the latter relation.
- However, I wonder why you changed the grammar in section "Examples". The former version was obtained from Noncontracting grammar#Example and Noncontracting grammar#Transforming into context-sensitive grammar which is based on Mateescu & Salomaa (1997, see full ref. at Noncontracting grammar#References); however I'd forgotten to mention the source up to some minutes ago. There was a lot of confusion about that example (see section "Wrong grammar for language " above), which we shouldn't repeat. Also, following Mateescu & Salomaa (1997), sect.3.1, p.29-30, there is no need to forbid S in a production's rhs, unless a production S→ε exists. - Jochen Burghardt (talk) 17:27, 29 June 2014 (UTC)
- Today, I changed "equivalent" to "weakly equivalent" where appropriate in the article, but restored the original grammar version. - Jochen Burghardt (talk) 11:43, 9 July 2014 (UTC)
Continuing issue With the listed grammar
There is a problem with the grammar listed on the site as of now. As I'm a new user, I'm reluctant to edit the page without the consensus of the group.
This leads to problems like:
There is no available non-terminal to fix this; it looks like this could be fixed with the additional rule:
- The definitions of Context-sensitive grammar, derivation, and language don't require every derivation to result in a string of only terminal symbols, so your example is not an issue of the given grammar. However, the notion of a derivation isn't mentioned at all in the article, nor is the notion of the language of a grammar explained there, so your example reveals an issue of the article.
- I intend to fix this soon, also hinting at Garden path sentences, a related phenomenon known from natural languages (e.g. the sentence "The horse raced past the barn fell" tempts one to build a derivation that gets stuck, similar to your example; however, there is another one that properly derives the sentence, and the same applies to your example). - Jochen Burghardt (talk) 08:52, 8 August 2014 (UTC)
- On second thought, sentences like "The horse raced past the barn fell" are quite a different kind of garden path, since there a terminal symbol string is given that should be derived from S; in your example, no such string is given. So, I didn't refer to Garden path sentences in the article, but just defined "⇒", "⇒*", and "L(G)", and explicitly stated that derivations that get stuck in a mixed string of nonterminal and terminal symbols are allowed, but don't contribute to L(G). - Jochen Burghardt (talk) 10:29, 10 August 2014 (UTC)
I just noticed too that the example contradicts the one given (from a reliable source) in Noncontracting grammar which doesn't have the extra non-terminals and rules. I don't have time right now to figure it all out, but I don't see anything wrong with simpler grammar right now. JMP EAX (talk) 00:05, 16 August 2014 (UTC)
- I see the distinction now between grammar and language, but I do have to wonder if everyone defines CSG like this. This article is basically citing only one source... JMP EAX (talk) 00:22, 16 August 2014 (UTC)
It seems that the distinction between Context-sensitive grammar and Noncontracting grammar is a source of confusion for many readers. Probably, many authors use the name "context-sensitive grammar" for what wikipedia calls a "noncontracting grammar"; the sentence "Some definitions of a context-sensitive grammar only require that for any production rule of the form u → v, the length of u shall be less than or equal to the length of v." in Context-sensitive grammar#Formal definition tries to make that clear, but it might be necessary to rephrase it (e.g. to "Some authors define ...") to give it more emphasis. Another possibility could be to merge the articles Noncontracting grammar and Context-sensitive grammar. Hopcroft+Ullman define (on p.223-224) a CSG as wikipedia does in Noncontracting grammar, mentioning in their next sentence that the definition at Context-sensitive grammar#Formal definition is a normal form for them, and leaving the proof as exercise 9.9 (p.230); I think that is a reasonable treatment of the issue.
I would like to discuss the example issues, but I didn't understand which example you found to contradict which other one. You didn't mean Noncontracting grammar#Example (which is simpler, but not context-sensitive in the wikipedia sense) vs. Context-sensitive grammar#Examples, did you? Jochen Burghardt (talk) 09:22, 16 August 2014 (UTC)
- I think it would help to move/copy to the lead the equivalence to non-contracting grammars and the "some authors [consequently] define CSG this way". A brief survey of the textbooks that Google Books indexes finds that it's not uncommon to have CSG defined as non-contracting (there are several examples on the 1st page of hits). This includes authors like Martin Davis, who are normally very scrupulous about historical accuracy and such (def as non-contracting on p. 189, the Chomskyan def given way later on p. 330). JMP EAX (talk) 12:03, 16 August 2014 (UTC)
The so-called left-context and right-context grammars, which have rules of the form -> (and the dual), are [weakly-only I assume] equivalent to CSG. I do have to wonder what you get if you use "forced swaps" like ->. Probably the same thing. JMP EAX (talk) 00:42, 16 August 2014 (UTC)
- By duplicating each nonterminal A to A and A2 and transforming each rule αA→αγ to αA→A2α and A2α→αγ it should at least be possible to establish that you get no less expressive power by forced swapping. Vice versa, since the forced swapping rules are noncontracting (I assume), you can't get more, either. - Jochen Burghardt (talk) 09:30, 16 August 2014 (UTC)
On a slightly different tack, It would be interesting to find and add historical info about: when (1) Chomsky defined his CSG, (2) who[ever] gave the non-contracting def, (3) equivalence to LBA was proven. JMP EAX (talk) 11:44, 16 August 2014 (UTC)
- I put Hopcroft+Ullman's "Bibliographic notes" (p.232) here, in order not to interfere with your article editing. Please insert it where appropriate.
The Chomsky hierarchy was defined in Chomsky (1956, 1959). (...) Kuroda (1964) showed the equivalence of LBA's and CSG's. Previously, Myhill (1960) had defined deterministic LBA's, and Landweber (1963) showed that deterministic LBA languages are contained in the CSL's. Chomsky (1959) showed that the r.e. sets are equivalent to the languages generated by type-0 grammars. (...)
- Mateescu+Salomaa prove the equivalence of noncontracting and context-sensitive grammars on p.187; they refer to Salomaa (1973) for details.
- In Chomsky (1956), I found on p.118 (=p.6 in the pdf file) the quote: "A rule of the form Z X W → Z Y W indicates that X can be rewritten as Y only in the context Z--W." That seems to indicate that Chomsky had wikipedia's CSG definition in mind, not the noncontracting grammar definition.
- Noam Chomsky (1956). "Three models for the description of language" (PDF). IRE Transactions on Information Theory (PGIT). 2: 113–124.
- Noam Chomsky (1959). "On Certain Formal Properties of Grammars" (PDF). Information and Control. 2: 137–167.
- Sige-Yuki Kuroda (1964). "Classes of languages and linear-bounded automata". Information and Control. 7 (2): 207–223.
- J. Myhill (1960). Linear Bounded Automata (Technical Report). Wright Patterson AFB, Ohio. pp. 60–165.
- P.S. Landweber (1963). "Three Theorems on Phrase Structure Grammars of Type 1" (PDF). Information and Control. 6 (2): 131–136.
- Arto Salomaa (1973). Formal Languages. New York: Academic Press.
- - Jochen Burghardt (talk) 17:55, 16 August 2014 (UTC)
- In 1963 Chomsky gave the non-contracting def too. See the history section I added to noncontracting grammar. JMP EAX (talk) 00:50, 17 August 2014 (UTC)
- Alas, his notion of strong equivalence appears to have little practical relevance, so I'm not sure it is worth qualifying equivalence with "weakly" every time, because hardly anyone seems to consider the strong one interesting. JMP EAX (talk) 01:24, 17 August 2014 (UTC)
Duplication of properties etc. with the CSL page
I'm not really sure what to do about that; there's more at context-sensitive language, but the two pages evolved independently so they aren't really a super-set of each other. But except for the normal forms, I'm not sure there are really any properties that are of CSGs per se but don't belong to the CSL page (too). JMP EAX (talk) 13:20, 16 August 2014 (UTC)
- The CFG vs CFL pages have the same issue. I've started a centralized discussion. JMP EAX (talk) 15:24, 16 August 2014 (UTC)
Kuroda normal form
Maybe I'm missing something again, but it seems to me that the "Kuroda normal form" is not really a normal form for CSG as defined by Chomsky. The first rule AB → CD doesn't seem to fit the CSG template of expanding a single non-terminal. JMP EAX (talk) 13:24, 16 August 2014 (UTC)
Is the grammar in the example a context-sensitive one?
Hello fellow Wikipedians,
I have just modified one external link on Context-sensitive grammar. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:
- Added archive https://web.archive.org/web/20110708224600/https://danielmattosroberts.com/earley/context-sensitive-earley.pdf to http://danielmattosroberts.com/earley/context-sensitive-earley.pdf