Talk:Backus–Naur Form

From Wikipedia, the free encyclopedia
Jump to: navigation, search
WikiProject Philosophy (Rated Start-class)
WikiProject icon This article is within the scope of WikiProject Philosophy, a collaborative effort to improve the coverage of content related to philosophy on Wikipedia. If you would like to support the project, please visit the project page, where you can get more details on how you can help, and where you can join the general discussion about philosophy content on Wikipedia.
Start-Class article Start  This article has been rated as Start-Class on the project's quality scale.
 ???  This article has not yet received a rating on the project's importance scale.

The Proper Title[edit]

BNF stands for Backus Normal Form, not Backus-Naur Form Here is a discussion on the topic:

The bottom line is that Peter Naur simply made a few syntax changes to the language format to rid it of characters not available on a common keyboard. —Preceding unsigned comment added by Paulsnx2 (talkcontribs) 04:48, 9 February 2010 (UTC)

Wikipedia should list terms as they are commonly used; and both interpretations are. Rp (talk) 10:25, 14 August 2010 (UTC)
After researching BNF it is really unclear how much Naur contributed. True. He only made minor changes to two meta symbols of the language. But it was Naur that made the changes to the ALGOL report.
Who decided that meta variables should identify language constructs and be used in textual descriptions. Who decided to use the patten replacement form of non-context free grammars in defining the comment grammar rules. We do not know how much the actual use of BNF was Naur. We do not know what Backus gave Naur in the direction of how BNF was actually to be used. It may have been Naur's decision to describe comment syntax with patten replacement rules. Who decided that metalinguistic variable names should be descriptive Specifically to be able to be used in the textual descriptions or specifications. Naur explained this in his ALGOL class notes in 1963.

Not really BNF[edit]

After doing research into the original use of BNF in the ALGOL 60 report I have to say that this topic totally ignores how it was actually used in the ALGOL 60 report. It was definitely, as used in the ALGOL report not a Chomsky grammar. This was a surprise to me. As described by Peter Naur it is a reduction grammar. Mostly. It was used in a pattern replacement(non-context free) form describing comment grammar. There is no record as to how Backus defined the language. There are two documents by Naur describing2 BNF. In the ALGOL 60 report and in class material developed by Naur the ::= metasymbol is described, to be inturpeted as "is defined as". A BNF rule defines a "metalangustic variable", on the left of the ::= to be a sequence of metalangustic symbols and/or variable on its right. This is the description of an analytical or reductive rule. Further in some cases metalangustic variables are not defined by a rule. They are nonterminal symbols yet never appear on the left of a ::=. From historical document we know that Backus introduced the linguistic rule concept, in written form, to the ALGOL working group. Naur applied them to then existing language specification. The outcome is that the metalangustic variables, that later were called classes, by Naur, were specified to be meaningful language construct descriptive names as also used it the natural language descriptions of said language construct. The defination of a reductive grammar can be argued as indeterminate as to describing an analytical or productive grammar. Perhaps that is the intent. The ALGOL report is intended to both specify the language implementation and use.

But in describing comments at the end of section 2.3 of the ALGOL 60 report we have comments described by BNF patterns and their equilavant replacement string:

The sequence of basic symbols: is equivalent to
 ; comment <any sequence not containing ;>;  ;
begin comment <any sequence not containing ;>; begin
end <any sequence not containing end or ; or else> end

Specifically, in table form, in the ALGOL 60 report, comments are described using a pattern replacement rules that are like a context sensitive grammar rule. Also of note: "<any sequence not containing ;>" an obvious self-defining "Chomsky non-terminal" never appears on the left of ::=. There are other cases were a series of language symbols is described within bounding < >, never to be defined by a rule.

It is lost history as to who actually created the form of BNF actually used in the ALGOL report. Naur is the one who actually wrote up the BNF ALGOL 60 specifications. How much of its potential use was actually comunacated to Naur by Backus isnt known. It may be that Naur contributed more then the minor symbol changes.

I think that as used in the ALGOL 60 report, BNF is a publication metalanguage. That is, it is not a proper or formal grammar. BNF as used to describe ALGOL 60 describes concepts. In section 2.3 we have BNF defining delimiters:

<arithmetic operators> ::= +|-|×|/|÷|^

<arithmetic operators> lists all arithmetic operators regardless of hiarchy. But in section 3.3.1 we have separate definations:

<adding operators> ::= +|-


<multiping operators> ::= ×|/|÷

In the one section the interest is simply to talk about delimiters. In the other we are describing an arithmetic expression were hiarchy is important. I already mentioned the comment rules defined in a table form of BNF equilavance. These and others examples can be found if one studies the ALGOL 60 reports. In some instances it is an analytical or reductive grammar. Others a production grammar. After studying ALGOL's BNF it seams that today's BNF is a different language. Today's BNF is a subset of the BNF metalanguage used in the ALGOL report.

As Naur is the one who wrote the ALGOL BNF specifications, it may be that the actual use in in the report results from converting natural language descriptions to rule forms. The comment rules for example do not fit the simple BNF rule form described. BNF was still used in a pattern equilavances form.

It does seam to be the case that later BNF descriptions have ignored the features that do not fit the Chomsky description.

Why is that?

Sense at some point the language has been substantially changed to match a Chomsky CFG. It should be called BNCF. Backus Naur Chomsky Form. As used today it is not the same language used in the ALGOL 60 report.

I first learned BNF in the late 60's from ALGOL manuals. I attended many L.A. ACM SegPLAN meetings back then. BNF was commonly used by SegPLAN members. Chomsky was never mentioned by anybody back then. UCI (University of California Irvine) had a leading computer science department. I knew many of the professors and deta center people in the late 60's. We installed a DEC-SYSTEM-10 at Cerritos Collage were I was working at the time. I used the UCI DEC-10 developing a Honeywell H200 COBOL to DEC-10 COBOL conversion program. My interest was compilers and operating systems. I knew many professors at UCI and we talked about compilers. Chomsky never was mentioned. This was in the early 70's. When did Chomsky first become associated with BNF and computer science?

Steamerandy (talk) 16:31, 23 April 2015 (UTC)


The BNF of BNF is redundant. <Expression> is unnecessary, as its only production is <whitespace> and an <Or-Expression>. <Or-Expression> also begins with <whitespace>. Correct me if I'm wrong, but I think <Expression> should be eliminated and <Or-Expression> should be retitled <Expression>.

--Michael A. Ball 18:47, 17 July 2006 (UTC)

Hirzel, I think the "see also" is a bit redundant, but since you put it in, I'll leave it. It is my opinion that a "see also X" is nice but embedding the link in the article and explaining there *why* one should "also see X", is better. Having both seems a bit redundant to me, but that's just me. -- Jan Hidders 03:44 Aug 14, 2002 (PDT)

This translates into English as:

A postal address consists of a name-part, followed by a street-address part, followed by a zip-code part.
A personal-part consists of either a first name or an initial followed by a dot.
A name-part consists of either: a personal-part followed by a last name followed by an optional "jr-part" (Jr., Sr., or dynastic number) and end-of-line, or a personal part followed by a name part (this rule illustrates the use of recursion in BNFs, covering the case of people who use multiple first and middle names and/or initials).
A street address consists of an optional apartment specifier, followed by a street number, followed by a street name.
A zip-part consists of a town-name, followed by a comma, followed by a state code, followed by a ZIP-code followed by an end-of-line."

Note that many things (such as the format of a personal-part, apartment specifier, or ZIP-code) are left unspecified here. If necessary, they may be described using additional BNF rules, or left as abstraction if irrelevant for the purpose at hand.

Common US Postal rules (Publication 28, put the apartment number at the end of the line, not the front as shown here. Also, they specifically request that the last line not include a comma between the city and state. And there are various other nitpicky errors or omissions.
But this is all getting into the detail of postal addresses, which is not the point here. This is all included just as an illustration of BNF, using an example that is common enough for people to easily understand it. T-bonham (talk) 07:45, 29 March 2008 (UTC)
OK, I'm taking the comma out and putting the apt number at the end, as requested by the US Post Office. I agree that *talking* about these details in the article is unnecessary, but I think fixing these errors leaves us with an example that is just as easy to understand. -- (talk) 22:06, 12 April 2009 (UTC)

Isn't personal-part specified? I didn't want to remove the mention of it being unspecified because I'm a beginner, but this one seems pretty obvious...

The "See Also" link Syntax Diagram page has been deleted, page history shows multiple deletes (with rubbish content) Should this link be kept?

The formal syntax is under "Further Examples"[edit]

The formal syntax is specified in thechapter "Further Examples", without exlanation.

I would prefer to have a formal syntax definition in one seperate chapter (named as such), and an additional informal explanation.

The informal explanation could be similar to what I found in the UML2.0 spec:

  • All non-terminals are in italics and enclosed between angle brackets (e.g. <non-terminal>)
  • All terminals (keywords, strings, etc.), are enclosed between single quotes (e.g., ‘or’)
  • Non-terminal production rule definitions are signified with the ‘::=’ operator
  • Repetition of an item is signified by an asterisk placed after that item: ‘*’
  • Alternative choices in a production are separated by the ‘|’ symbol (e.g., <alternative-A> | <alternative-B>)
  • Items that are optional are enclosed in square brackets (e.g., [<item-x>])
  • Where items need to be grouped they are enclosed in simple parenthesis; for example:
(<item-1> | <item-2>) *
signifies a sequence of one or more items, each of which is <item-1> or <item-2>

—The preceding unsigned comment was added by (talkcontribs) .

Missing EOL in Example?[edit]

Shouldn't both options of the name branch include EOLs, or should the EOL be after the end of both options?
~ender 2007-05-12 19:40:PM MST —The preceding unsigned comment was added by (talk) 02:35, 13 May 2007 (UTC).

I agree. I have now added the EOL. Hendrik. 16:47, 20 June 2007 (UTC)
At first glance, it looked like a typo to me also -- the name-part should produce an EOL no matter which option is chosen.
But after closer inspection, I see that one branch includes a literal EOL, while the other branch of the name-part ends in a rule-name -- a rule-name that eventually produces a EOL. Adding *another* EOL at the end of that branch of the name-part (after that final rule-name) would produce a double EOL. -- (talk) 22:06, 12 April 2009 (UTC)
Yes. Because the rule is using recursion (ie the "name-parts" definition includes "name-part" itself), it might be misleading to think of the rule as a simple branch of two options. The bottom branch recurses into the top branch, which does have the necessary EOL. This allows any number of passages through the bottom branch, but it always finishes through the top branch. For example, "John Schmidt" is a valid production through the rule that only requires one path through the top branch. "John Jacob Schmidt" needs one trip through the bottom branch for "John", and then a recursive restart from the beginning through the top branch to add the "Jacob Smith" and (just one) EOL Petershank (talk) 18:32, 25 February 2013 (UTC)

While it is Backus-Naur Form its proper name is Backus-Naur Format[edit]

w3, NASA --Adam1213 Talk 01:52, 2 August 2007 (UTC)

Your second link actually uses both terms. —RuakhTALK 02:15, 2 August 2007 (UTC)
Format is more accurate, perhaps, but hardly ever used (I've never seen or heard it before.) Rp (talk) 10:27, 14 August 2010 (UTC)

"<personal-part>" in Address Example[edit]

Unless the notation is counter-intuitive, the <personal-part> in the example given appears to me to be incorrect. I would expect that the definition would wish to apply either the whole of what appears after the OR "¦" symbol (an initial and a ".") or that which appears before the OR symbol (a first name). But not to glibly concatenate the "." to the result of what precedes this "." .

So <personal-part> (really a <first-name or initial part>) as I believe it was intended to be in the example, should not in my view be expressed: <personal-part> ::= <first-name> | <initial> "."

(Mail addresses are not yet specified the same way as some email addresses (e.g. christian-name.surname@emaildomain))

The <personal part>(= here 'first-name or initial part') of a snail mail address should surely be specified (in England anyway):

<personal-part> ::= <first-name> | <initial ".">

(or something like that? I'm (still) not knowledgable on this notation).

Perhaps its easier in words:

<personal-part> is:


( The Initial followed by "." )



The First name [not followed by  "."]


i.e. NOT


( (

The First name 



( The Initial )


followed by "." . The important part to read in the above is the bracketing level. The layout unfortunately doesn't come out as I typed it.

This admittedly doesn't allow for multiple initials, but neither does the example given. I think my version is less wrong than the original.

I appreciate that this Wiki topic is not really about mail addresses, but it does not aid the education process if something secondary to the main argument is not right as the learner may think she has not understood the prime argument.

Thinkact 11:48, 9 September 2007 (UTC)

Perhaps, as you say, the notation is counter-intuitive; but I do not see that we have a choice but to stick with it, as the article is in some sense about the notation. (Personally, I think it's quite intuitive that in a BNF a b | c d means (a b) | (c d), not a (b | c) d, but perhaps that's only because I'm used to it. *shrug*) —RuakhTALK 16:34, 9 September 2007 (UTC)

The BNF of BNF won't parse itself. The rule

<opt-whitespace> ::= " " <opt-whitespace> | "" 

contains a list of expressions. The rightmost expression ("") does not fit the definition of expression in the grammar. "" is not a literal.

It would be better to allow a list to be an empty list:

<list> ::= <term> | <term> <opt-whitespace> <list> | 

and then define opt-whitespace as:

<opt-whitespace> ::= " " <opt-whitespace> | 

--Sérgio Carvalho 16:25, 3 March 2008 (UTC)

BNF of BNF[edit]

In the BNF of BNF, does the rule:

<line-end> ::= <opt-whitespace> <EOL> | <line-end> <line-end>

make sense? Are two consecutive <line-end>s really an option?

Why not hoist the left branch up to the <rule> rule:

<rule> ::= <opt-whitespace> "<" <rule-name> ">" 
           <opt-whitespace> "::=" 
           <opt-whitespace> <expression>  
           <opt-whitespace> <EOL>

and delete the <line-end> rule? (talk) 01:03, 28 May 2008 (UTC)


In section "Variants", there is this item:

- Alternative choices in a production are separated by the ‘|’ symbol. E.g., <alternative-A> | <alternative-B>

I did not understand this sentence. The symbol '|' is not part of the BNF syntax? This symbol was used throughout the article, and I think it is in fact part of the original BNF syntax.

I think this sentence should be removed, or the article should be revised to avoid the use of '|'. —Preceding unsigned comment added by Sqmedeiros (talkcontribs) 13:50, 5 May 2009 (UTC)

I think you've misread the text. '|' is part of the syntax, and this sentence is trying to explain how to use it. When we want to represent alternatives (i.e. A or B), we write A | B. Thus, alternative choices are separated by '|'. Gareth Jones (talk) 17:05, 5 May 2009 (UTC)

I think the section "Variants" should only mention things that are not strictly BNF (like EBNF, ABNF etc). So, I think is misleading to talk about "|" (part of BNF) and "+" (not part of BNF) in the same section. Another approach would be to make a clear separation of the items in the section: BNF, EBNF, none of both. Sqmedeiros (talk) 16:54, 6 May 2009 (UTC)

Chomsky's influence[edit]

When talking about BNF history, some sources[1] say that Backus was influenced by Noam Chomsky's work. This is also the case of older versions of this article, which contained the statement "John Backus ... adopted Chomsky's generative rules".

However, the credibility of this statement should be examined better, maybe by creating new sub-chapter for this topic?

Contrary to the statement, we have these facts:

  • Backus had not cited Chomsky's work in his original article[2], but he acknowledged the cooperation and works of many other people - see page 20 of his article. If he had known Chomsky, what would be the reason for not citing him as well?
  • Chomsky's book Syntactic_Structures was published in 1957. Can we find whether Backus really had this book in 1958 before he came with the BNF? If not, what probability there is, given the fact that Chomsky was working in completely different area of science? Consider how many IT engineers/IT scientists are able to follow and study the works of "soft" sciences like linguistics - we are not able to follow even the IT science area...
  • Aside of this, the "metalinguistic formulas" used by Backus are similar to pre-Chomsky concepts, even the Pāṇini's Sanskrit grammar[3]

In the interview with Grady Booch in 2006[4], the 81-year old Backus explained the background of the "Chomsky influence" story:

Booch: How did you get drawn into the ALGOL process then? 
Backus: Well, I don’t know that I was very much drawn into it.
Booch: Your BNF work came from that of course when you were specifying the structure of the ALGOL.
Backus: Yeah, but I-- It was a big committee deal,
Booch: Sure.
Backus: --and I don’t think I contributed very much to it.
Booch: How did you and Peter get together, Peter Naur, to specify the syntax?
Backus: Well, I had come to one of these meetings with this BNF description. It was a little paper and I
        handed it- I hand carried it because it was so late in the game, so I had these copies that I dragged to the
        meeting and passed out to people and nobody paid any attention to it.
Booch: Really.
Backus: No. Except Peter Naur. When he came to write up the thing, he used this descriptive method,
        and improved it in the process.
Booch: In retrospect, I can’t imagine specifying a language without using BNF. 
       So what were they doing before that time frame? It must have been—
Backus: They were just writing English.
Booch: And giving examples, and stuff like that. 
       I had read you were influenced by some of the work of Noam Chomsky, that led you to that.
Backus: Yeah, well, that’s a funny story. 
        That’s what I said and what I believed, and yet… Who was it? Somebody sort of proved that I was wrong about it, 
        that I hadn’t got it from Noam Chomsky, because the dates were all wrong somehow. But-- God, who was that?

Unfortunatelly, Backus had died in 2007 so we should try to answer his question by ourselves.

The exact timing of the Backus form development can be derived from the Backus interview above and from Naur's description [5][6] of the events:

Naur is describing his participation in the December 1959 meeting of the European part of the ALGOL committee.
   "The use of Backus's notation could only be mentioned in passing; my recommendation to that effect appears as one of 55 brief notes 
   on revisions of the Zurich report as follows: '15) Change the syntactical  description'" 
   "The decisive action concerning the development of the new style of description was taken during the weeks following the Mainz meeting, 
   when I worked out the results of the meeting according to the notions of language description that I had formed. 
   In order to press the matter forward as much as possible, on January 2 I sent all other committee members a document... 
   [which] contains the first appearance of my slightly revised form of Backus's notation."

Using google, the actual dates can be added:

  • Zurich meeting was held on 27 to June 1, 1958. Here, the task for creating better syntax notation was formulated
  • Mainz meeting was held in November 1958[7]. Here, Naur probably noticed the Backus form of grammar and in coming weeks, he reshaped it into the current BNF.
  • January 2, 1959: Naur had sent his modified BNF document to the committee

Now, the question is: Could Backus read the Chomsky's work about rules and grammars in between June and November 1958? Will somebody help me with this question?

--Dulik (talk) 04:36, 11 May 2011 (UTC)

It isn't necessary for Backus to have read the Chomsky book itself. He could have heard about the Chomsky approach one way or another. Backus clearly says in the interview that he believed he was influenced by Chomsky ("That’s what I said and what I believed"). Then he says that someone else claimed that this wasn't possible. Can we find that someone else? --Macrakis (talk) 13:46, 11 May 2011 (UTC)
A quick Google search finds an interesting biography -- probably not a reliable source as it is apparently an unpublished student paper, but it contains some useful and well-referenced quotes indicating that Backus thought in the 60's that he'd been influenced by Emil Post's notation (not Chomsky's) and then another quotation indicating that he might have been mistaken about Post. --Macrakis (talk) 13:57, 11 May 2011 (UTC)
It seems that Backus was very humble guy, always trying the give the credits for BNF to someone else. I have found that document parallel to you and then had editing conflict when submitting this :-) - on page 18, Backus says he was inspired by Martin Davis's course of Emil Post "production" concept, but Martin Davis later declares that he made the course as late as in 1960 (2 years after BNF was born) and that he (Martin) did not know Chomsky even then. But that does not mean Backus could not encounter E. Post somewhere else.
If Backus tries to credit everybody around - including Chomsky - (and lessen his own credit for the BNF), we can't suspect him from trying to hide his source of inspiration.
From what I read, Backus is not sure about the true source which inspired him, and the IT giants of his time (like Donald Knuth) consider BNF as invention independent on other works (like the Chomsky's).
I also found this article discussing the dates and procedures of the Chomsky's first book in 1957 - eventhough it was published by small unknown publisher in Europe, it received very good review on a prominent place in one of the next issues of Language_(journal). It is possible that Backus had read the review there, but I think the more probable version is that Backus encountered the Chomsky's theory way after he created BNF, but because during the first years after, he didn't consider BNF as an invention, he could mixup the timeline?

Aside of these speculations, can we find the first article/book, which came with the statetment about the Chomsky's influence to the BNF? --Dulik (talk) 14:56, 11 May 2011 (UTC)

The BNF language is footnoted in the ALGOL 60 report: ACM-GAMM conference. Proc. Internat. Conf. Inf. Pro ., UNESCO, Paris, June 1959.

BNF was not used in the ALGOL 58 report. It was first used in the ALGOL 60 report. Read the ALGOL 58 report. I think it may be down loadable from the ACM. The history is wrong. It was created by John Backus and Peter Nuar only used it the the ALGOL 60 report. Nuar said that he only changed a couple of characters. BNF only uses 4 meta symbols ::=,|,<,> and , isn't one of them. I do not think the | or symbol was one he changed. So that leaves the < > that identify a class name or ::= "is defined as" operator. Why has this article been so polluted by linguists?
Ok after looking at the referanced the Naur changes are:
or was changed to the |.
 := (= with 3 bars) was changed to the ::=.
That does not seam a significant change. The 59 document is a preliminary working ALGOL 60 paper.
Steamerandy (talk) 06:27, 25 March 2015 (UTC)

Email with explanation from Peter Naur[edit]

Original question[edit]

Subject: For Mr. Naur: Question about Backus-Naur-Form for Wikipedia article
Datum: Wed, 11 May 2011 14:41:19 +0200
From: Tomas Dulik

Dear Mr. Naur,

I have one question and you are probably the only man on Earth who can answer it.
The answer is quite important for the future of education in the computer science, especially in the subjects like "Theory of Compilers" etc.

The question is:
Can you say anything about the statement that Backus's "metalinguistic formulas" (which later became BNF) were adopted from the the earlier work of Noam Chomsky?
Is it true or not?

Thanks for your reply,
Best regards,
Tomas Dulik

Reply from Peter Naur[edit]

Date: Mon, 16 May 2011 15:06:11 +0200
From: Erik Frøkjær
To: Tomas Dulik

Reply to Tomas Dulik on behalf of Peter Naur, no email address:

Tomas Dulik

About Question about Backus-Naur-Form.

In R. L. Wexelblat (ed.): Proceedings of the History of Programming Languages Conference, Los Angeles, Calif., 1-3 June 1978. Academic Press, New York, 1981, ISBN 0-12-745040-8, you find in ALGOL Session, Transcript of question and answer session, page 162, a question: 'Did the ideas of the formal linguists like Chomsky have a genuine effect ...?',

This question John Backus answered, saying: 'As to where the idea came from-it came from a class that I took from Martin Davis ... talking about the work of Emit Post and the idea of a production. It was only in trying to describe ALGOL 58 that I realized that there was trouble about syntax decription. It was obvious that Post's productions were just the thing, and I hastily adapted them to that use.'

(Rest of email contains contacts to Peter Naur and Erik Frøkjær) --Dulik (talk) 13:24, 16 May 2011 (UTC)

BNF as used in the ALGOL 60 report[edit]

Original description of BNF - Backus Normal Form from the ALGOL reports.

Source of BNF comes from ALGOL 60[edit]

The following is from "Programming Systems and Languages" Edited by Saul Rosen Professor of Mathematics and Computer Science Purdue University McGraw-Hill 1967.

The book is a wealth of information on early computer language and operatoring system development. It contains a colection of early published papers by many authors. Many from the ACM.

This article provides an incite as to the function of BNF as used in the early ALGOL reports.

"The ALGOL Programming Language" by Saul Rosen Pages 48 - 78 and "Revised Report on Algorithmic Language--ALGOL 60" Pages 79 - 118.

BNF was created by John Backus to specify the ALGOL programming language syntax. He at one time said it came out of meeting notes he had taken. On another it was from a course by Martin Davis. I can relate to that. I have a highly analytical mind but a poor memory. I retain things I use a lot. But I repeat things when I am writing forgotten what that I had already covered it. I forget what I have written. My spelling is atrocious. But what I do is know how things work. Call it scientific intuition. Anyway the point is Backus was a highly intelligent man. And may not of remembered where he got the idea. In fact if he had gotten the idea from Chomsky or were ever it was not a production grammar.

Anyway Naur had little to do with inventing BNF. He used it in the ALGOL 60 report to describe language constructs. It was not explained what could have been changed. BNF only has four metasymbols. ::=,|,<,> The commas are not a part of BNF but English puntuation marks. As Saul Rosen puts it they are part of the metalanguage English being used to describe BNF.

While Saul Rosen was not listed in the ALGOL 60 report. According to his biography he was a member of the ACM working group. His description of BNF is much more interesting in lite of his participation in ACM ALGOL working group.

The ALGOL 60 report was written by J. W. Backus, F.L. Bauer, J. Green, C. Katz, J. McCarthy, A. J. Perlis, H. Rutishauser, K. Samelson, B. Vauquios, J. H. Wegstein, A. Van Wijugaargden and M. Woodger, Peter Naur(Editor)

Where did the idea of BNF come from? The actual history is mostly lost. This book I think gives some insite on that question. Martin Davis was primarly a mathematician. Maybe as rummored it was a course by Davis. Soul Rosen also a mathematician discribs it as a analytical language. The ALGOL 60 report uses BNF as a reductive grammar metalanguage. The term reductive came from some ware. It is used in early meta compiler documents.

Saul Rosen became active in the Association for Computing Machinery (ACM) in 1947, first on the languages committee that eventually led to the ALGOL programming language, and then as first managing editor of the Communications of the ACM. He wrote extensively on practical systems programming and authored his major book, Programming Systems and Languages (McGraw-Hill, New York), in 1967.

Reading carfully the ALGOL 60 report and Saul Rosen's ALGOL article it seams that BNF is a reductive grammar. Reductive being the polar opsite of productive. Basically the original term for what some call an analytical grammar.[8]

That is a bit vague but maybe looking at Saul's explanation and the ALGOL 60 report it can be refined. BNF is used to describe ALGOL language constructs and to attach semantic meaning to them. A named language construct is defined by a BNF rule having the same name. Every BNF rule names a language construct.

<language construct name>::=<construct pattern>


<arithmetic expression>::=<arithmetic term>|<arithmetic expression><adding operator><arithmetic term>

The ALGOL paper uses descriptive names. The full <arithmetic expression> would be given including <arithmetic term> etc. Following the BNF syntax rules would be examples and a semantic description describing it's operation and valid value ranges etc.

They were not only used to describe the syntax but the allowable variable types etc. An integer rule specifically defined how a integer should be implemente hardware dependently. The number of bits or digits matching the hardware implementation.

<arithmetic expression> had semantic description describing conversion rules to higher types. Integer to floating. Upscaling etc.

Description of BNF - by Saul Rosen[edit]

The syntax of a language is a set of rules by means of which it is possible to determine whether any given string is a valid string in the language.[8]

The syntax rules define classes of admissible strings, and, in a language of any degree of complexity, will provide rules by means of which more complex classes can be built up from (defined in turns of) simpler class. The semantics of a language is a set of rules that assigns meaning to valid strings in the language.

In BNF a class name is delimited by angle brackets. i.e. <class name>. A class name defines the format of a valid string. Class names are unique and descriptions of their meaning were given in textual descriptions in the ALGOL 60 report.

The metasymbols used in BNF are ::=, |, <, >. There are only four symbols defined here. The , and . are part of the metalanguage, english, in which we are describing the Backus Normal Form.

We write:
<digit> ::= 0|1|2|3|4|5|6|7|8|9

The metasymbols < > are used as delimiters to enclose the name of a class. The metasymbol ::= may be read as "is defined as" or "consists of". The | is read as "or". The above phrase defines the class digit. There are ten alternatives which satisfy the defination and these alternatives are listed explicity and seperated by |. We could define a class <digit pair> as follows.

<digit pair> ::= <digit>,<digit>

Here <digit pair> is defined in terms of the previously defined class <digit>. The class name <digit> appearing to the right of the metasymbol ::= is read as "any member of the class <digit>". The comma between the two apperances of <digit> stands for itself. In words a <digit pair> consists of any member of the class <digit> followed by a comma followed by any member of the <digit> class. Thus 3,4 and 3,3 and 0,0 are all members of the <digit pair> class. 36 and 44 are not members of that class.

Note. The above descriibes a reductive grammar as described in the McGraw-Hill technical term dictionarry. [8]

ALGOL 60 report BNF description[edit]

The syntax will be described with the aid of metalinguistic formula. Their interpretation is best explained by an example.

<ab> ::= (|[|<ab>(|<ab><d>

Sequences of characters enclosed in brackets <> represent metalinguistic variables whose values are sequences of symbols. The ::= and | (the latter with the meaning of or) are metalinguistic connectives. Any mark in a formula, which ia not a variable or a conective, denotes itself (or the class of marks which are simular to it). Juxtaposition of marks and/or variables in a formula signifies a juxtaposition of the sequence denoted.Thus the formula above gives a recursive rule for the formation of values of the variable <ab.. It indicates that <ab> may have the value ( or [ or given some legitimate value of <ab>, another may be formed by following it with the character ( or by following it withe some value of the variable <d>. If the values of <d> are the decimal digits, some values of <ab> are:


The BNF for an arithmetic expression from the ALGOL 60 report below illistrates how classes are described using simpler classes.

<arithmetic expression> ::= <simple arithmetic expression>|
<if clause><simple arithmetic expression>else
<arithmetic expression>
<if clause> ::= if <boolean expression> then
<simple arithmetic expression> ::= <term>|
<adding operator><term>|<simple arithmetic expression>
<adding operator><term>
<term> ::= <factor>|<term> <multiplying operator><factor>
<factor> ::= <primary>|<factor>^<primary>
<primary> ::= <unsigned number>|<variable>|
<function designator>|(<arithmetic expression>)
<multiplying operator> ::= ×|/|÷
<adding operator> ::= +|-

Textual description of the semantic meaning were given for each major class. Alternative definations of a class are seperated by the | character.

An example from the ALGOL report:

2.4 Identifiers.
2.4.1 Syntax
2.4.2 Examples

2.4.3 Semantics

Identifiers have no inherent meaning, but serve for the identification of simple variables, arrays, labels, switchs, and procedures. They may be chosen freely (cf. section 2.7 quantities, kinds and scopes, and section 5. Declarations).

BNF origionally reductive/analytical[edit]

As described by Saul Rosen and used in the ALGOL 60 reports BNF was not using productive rules. It is not described in terms of terminal and nonterminals. In fact a rule defining a named class must be unique.

<class name> ::= <pattern>

There can be only one rule defining a <class name>. The rule is a specification of a valid language construct. The <pattern> specifying all possible language construct patterns of the named class.

In most cases it makes no difference which it is called. The rules would be identical.

Productive rewrite rules[edit]

Using production rules to generate a string in the language, one begins with a string consisting of only a single start symbol, and then successively applies the rules (any number of times, in any order) to rewrite this string.

In programming recursion is a function calling it self. In a production grammar rule that is not the case. There is no recursive calling of a rule. The grammar generation algorithm must be understood as a simple loop as explained. Consider the context sensitive grammar where , , is the start symbol, and consists of the following production rules:


This grammar defines the language where denotes a string of n consecutive 's. Thus, the language is the set of strings that consist of 1 or more 's, followed by the same number of 's, followed by the same number of 's.

Chomsky grammar rules have a pattern to be replaced by a string. i.e.

<pattern> <replacement string>

The context sensitive grammar above illistrates the major differance between a productive form above and reductive rules as explained in the ALGOL papers that are of the form:

<class name> ::= <pattern>
  • Chomsky productions allow multipal rules having the same pattern.
  • Chomsky production rules are string replacement rule. They are looking at an existing string matching a <pattern> that is replaced by the <replacement string> repeated until the string contains no non-termonals.

The question I put to you. Are Chomsky production rules an effective metalanguage in describing programming languages.

Do Chomsky production rules piovide a consise way of providing semantic descriptions of language constructs. Compare the <pattern> of Chomsky rules with the <descriptive language construct (class name)> in the BNF of the ALGOL specification.

For example in the ALGOL 60 specification paper we have:

<arithmetic expression> ::= ...

Then following we have examples and the semantic description of it implementation.

Same for:

<boolean expression> ::= ...

<arithmetic expression> and <boolean expression> are not non terminal symbols but language construct names that have semantic meaning attached. They have a single unique definition and rule describing the syntax.

Get the ALGOL 60 paper and check this out. The ALGOL compiler document we had for the DEC-10 described the language in the same way the ALGOL paper did.

Metalanguag usage[edit]

In ALGOL corrospondanse papers the ALGOL BNF is used in referencing language constructs. For instance talking about an <if_expression> construct. They are top-down descriptions.

<if_expression> ::= if <boolean_expression> then <expression> else <expression>

BNF was a metalanguage designed to talk about ALGOL.Steamerandy (talk) 19:05, 27 March 2015 (UTC)


I think the major problem of this BNF topic is not honering the creator of BNF. John Backus was the author of the Fortran language. He was a compiler writer. His interest was in describing the syntax in a way a compiler could be written from the specification of the language. FORTRAN stands for FORmula TRANslation. So maybe their was no outside influence. It could be that BNF came out of his experience writing compilers. Algebra and other math books described mathematical formula in a simular manner. Term and factor are older terms used in describing algebraic formule. In fact algebraic formula are a matalanguage used in talking about mathematical calculations.

BNF rules are more akin to algebraic formula then linguistic productions.

At the vary least it should be described as in the originating documents. Not in terms like terminal and non-terminal symbols. They are descriptive names of language constructs that can be described in a natural language. Otherwise this topic is not about BNF but some dialect and should be described as such. Steamerandy (talk) 21:51, 24 March 2015 (UTC)

Another source[edit]

In his book "Computers and Languages" (ISBN 0-444-70463-9), section "BNF vs. Contxt-Free" on p.207–210, Anton Nijholt gives another hint. Neither Chomsky nor Backus ever cite each other before the advent of an article by Saul Gorn in CACM (1962), p.62 mentioning a relationship between Chomsky's phrase structure languages (then not yet "context-free") and the languages specified by BNF. In letters to the editor of CACM, this point was discussed in the sequel. It looks as if Gorn was the first one to discover this relationship between the two concepts developed independently by Chomsky and Backus. Given the very short time-span both Chomsky and Backus had to notice each other, this looks plausible, and this the more so as their approaches were quite different.--Lantani (talk) 14:38, 14 February 2016 (UTC)

Canonical Normal Form[edit]

Further research shows another possible source. And that Knuth was wrong about it not being a normal form. In fact it was that comment that got me looking at where normal form might have came from if not from Chomsky.

Canonical normal form is another likely possibility. Searching ACM papers of that period and before Boolean Canonical normal form equations were commonly used in designing computer circuitry. There are many papers on using boolean algerbra in circuit design.

Canonical normal form and Backus Normal Form are boolean equations. A Canonical Normal Form equation is called a class defination in old documents. Naur described Backus equations as defining classes. Names in < > are classes. It is a possibility that BNF came from within IBM. Proprietary IBM research that he could not be disclose as his source. Although Canonical Normal Form has been around sense boolean algerbra was first developed.

I have to backtrack and acknowledge BNF as nether a productive or reductive grammar but linguistic logic equations that can be mathematically manipulated. They can be factored etc. The difference between Canonical Normal Form and Backus Normal Form is that "and" is implied and ordered.


In the above the classes a,b, and c are factors. <a><c> is an ordered "and" where a and c must both be true. If either is not than an alternatives must be true for x to be true.

Is was said that Peter Naur is the one who named it Backus Normal Form. Peter Naur was also a mathematician.

ALGOL is a mathematical algorithmic language. Does it not make more sense that BNF also had it's roots in mathmatics. Most of the ALGOL participants were mathematicians. They all quickly accepted and understud BNF.

As a math major in the late 60s I instantly recognized the BNF language as algebraic in nature. Steamerandy (talk) 04:24, 5 April 2015 (UTC)

Personal experience[edit]

I was born in 1947 and got interested in electronics in the late 50's. I learned about digital circuits around the time the ALGOL 60 report was being written. I learned boolean equations then. It was in electronic magazines of the day. My computer addiction started in 1965 when I took a FORTRAN course at Cerritos collage. At the time I was a high school senior taking Calculus in the evening. The next year when I started taking programming classes I got hold of an ALGOL manual. It basically was the ALGOL 60 report description with spicifics for the actual implementation. The BNF I recognized as boolean formule. My background was math, electronics, and physics. I had no linguistic background other than sentence diagrams. I noticed the simularaty of BNF to sentence diagrams.

My point is that Backus and Naur were mathamations having simular backgrounds. I learned most of my math from books my older brother gave me. He was 14 years my senior. The books were from the 40s and likely close to those Backus might have studied. Backus being a compiler developer for IBM would have worked close with the hardware engineers using boolean equations in circuit design. It is quite possible that BNF simply came out of his experience writing FORTRAN and knowing boolean algerbra.

It is not that far fetched.

I took CWIC to the next level. CWIC is an advanced meta compiler in the Schorre line of metacompilets. It was a real compiler writing system able to produce relocatable IBM 360 code. It was ment to be expandable to target other machines. But was locked to an 8 bit memory archaicteture. At that time we had a variety of addressable units on various computers. 6 bit character addressable memory to 48 bit words. Many different instruction formats etc.

I had been to an ACM meeting on the B-3500 that used microprogramed instructions that implemented machine instructions for spicifics languages. The hard ware implemented a bit addressable memory space.

I used that concept in creating SLIC. I designed a language to define machine instructions in a bit addressable memory space. Instructions we defined to start in some modular addressable unit. 8 bit byte aligned modular 8. Some computers like the TI-990 required instructions to be 16 bit word aligned and used word addresses in addressing instructions. instructions were modulo 16 while byte and character data and constants were modulo 8.

The point is that I took a hardware concept and applied it to a language design used to define assembly instruction and their translation to binary code. Actually they were procedures whose call was written in assembly format. These were used in pseudo code procedures that resembled high level assembly macros. These two sublanguages separated the hardware code production out of the parsing and tree crawling generator language of CWIC.

It's not that hard to see that BNF may of had no formal linguist roots and came solely from math and computer fields.

Naur named it Backus Normal Form. Could that have been because of it simularaty to Canonical normal form.

Steamerandy (talk) 21:14, 9 April 2015 (UTC)

The examples are BNF dialects[edit]

BNF has only 4 meta symbols ::=, |, < ,>. The commas are not BNF meta symbols but part of the metalanguage English being used to describe it. EBNF uses quote marks. BNF does not. The examples are wrong. BNF has no line break recognition ability or white space per say. That is part of the reasons it is wrong to clame BNF a production grammar and define it using linguistics terminology like terminal and non-terminal.

A BNF rule defines a named language construct class. The class name can then be used to talk about the rules meaning and other specifics not specified by the rule. White space and line breaks for example. A class may also be defined using natural language if necessary. BNF was a publication language for describing ALGOL. Steamerandy (talk) 00:22, 29 March 2015 (UTC)


->>There is something wrong here. The reference section is being generated. But index is messed up. Thought adding this section might fix it. But didn't. Can someone please take a look at this. Has to do with maybe the hidden refs below. Or maybe page as there is no edit at page level.


These references will appear in the article, but this list appears only on this page.
  1. ^ Fulton, Scott M., III (2007-03-20). "betanews". Retrieved August 14, 2010.  |contribution= ignored (help)
  2. ^ Backus, J.W. (1959). "Proceedings of the International Conference on Information Processing". UNESCO: 125–132.  |contribution= ignored (help)
  3. ^ Farrell, James A. (August 1995). "Extended Backus Naur Form". Retrieved May 11, 2011.  |contribution= ignored (help)
  4. ^ Booch, Grady (2006-09-05). "Oral History of John Backus" (PDF). Retrieved May 11, 2011.  |contribution= ignored (help)
  5. ^ Morrison, Kelly (2006-09-05). "Backus Normal Form vs. Backus Naur Form". Retrieved May 11, 2011.  |contribution= ignored (help)
  6. ^ Wexelblat, Richard L. (June 1981?). "History of Programming Languages". Retrieved May 11, 2011.  Check date values in: |date= (help); |contribution= ignored (help)
  7. ^ Naur, Peter (1958?). "Revised Report on the Algorithmic Language Algol 60". Retrieved May 11, 2011.  Check date values in: |date= (help); |contribution= ignored (help)
  8. ^ a b c
    Reductive grammar: (cogmputer science) A set of syntactic rules for the analysis of strings to determine whether the strings exist in a language. "Sci-Tech Dictionary McGraw-Hill Dictionary of Scientific and Technical Terms, 6th edition, published by The McGraw-Hill Companies, Inc.". 


please leave my entry alone. do not "competitively" delete it. if there is a list of software using bnf then this, fairly, is an item:

  • bnf2xml Markup input with XML tags using advanced BNF matching.

(cur | prev) 16:46, 24 May 2013‎ (talk)‎ . . (26,317 bytes) (-140)‎ . . (Undid revision 556521332 by (talk)) (undo)

deleted my little entry not just on this topic page but also on wholey different topic / page. not a coincidence.

The project you link to appears to be code written by a single contributor -- perhaps you? Please check WP:NOTADVERTISING and WP:COI before reinstating the links. Gareth Jones (talk) 13:24, 10 June 2013 (UTC)

bnf syntax highlighting lost[edit]

Since the switch from Geshi to Pygments for syntax highlighting (phab:T85794), support for 'bnf' was unfortunately dropped, as can be seen with the plain text formatting on this page and many others like Janus (programming language), IBM i Control Language, Flow chart language, Syntax diagram, Feed URI scheme and Talk:Terminal and nonterminal symbols. If you want specialised 'bnf' syntax highlight support again, it will need to be added to Pygments. I have put up a patch for 'bnf' to use the 'ebnf' handler, which does the right thing as far as I can see. Are there parts of bnf syntax which are not valid in ebnf? If there are any oddities with 'bnf' syntax which dont work with 'ebnf' handler, please let me know. John Vandenberg (chat) 11:42, 12 July 2015 (UTC)

A proper bnf lexer has been added to Pygments; it will be available in the next release of Pygments (no idea when that will be), which I'd expect to then be available to Wikipedia soon after that. John Vandenberg (chat) 20:08, 12 December 2015 (UTC)

External links modified[edit]

Hello fellow Wikipedians,

I have just modified one external link on Backus–Naur Form. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:

When you have finished reviewing my changes, please set the checked parameter below to true or failed to let others know (documentation at {{Sourcecheck}}).

Question? Archived sources still need to be checked

Cheers.—InternetArchiveBot (Report bug) 17:45, 13 September 2016 (UTC)