Talk:Regular expression


Cleanup needed

With all due respect to previous editors, I think the present state of this article is quite poor. I remember coming to this article several years ago, before I knew regexes, and leaving even more confused. It could almost do with a total rewrite in many places. Rather than trying to cover all possible aspects of regex syntax, it should present more information about regexes and defer to (say) Wikibooks (to which some of this content should be moved, if appropriate).

I might have a go at some stage, but the door is wide open for anyone keen for a spot of pruning and rewriting here. — This, that and the other (talk) 11:12, 19 July 2013 (UTC)

History

When did Ken Thompson build Kleene's notation into the editor QED? In other words, when were Regular Expressions first used in software? The QED page is more vague about that. Sam Tomato (talk) 20:33, 30 September 2013 (UTC)

I found this link: http://cm.bell-labs.com/who/dmr/qed.html The historical survey is written by Dennis Ritchie, a long-time colleague of Thompson. There it is noted that Thompson published the idea of converting regular expressions into IBM 7094 machine code (i.e., code corresponding to a nondeterministic finite automaton) in the CACM paper "Programming Techniques: Regular expression search algorithm", which appeared in June 1968. http://dl.acm.org/citation.cfm?doid=363347.363387

Hermel (talk) 19:45, 9 October 2013 (UTC)

Well, that makes three questions: (1) when did Thompson use regexes in QED, (2) when were regexes first used in software, and (3) does the link have anything to do with either of the first two questions? (I don't see "QED" or "editor" in the paper.) TEDickey (talk) 20:32, 9 October 2013 (UTC)

Examples

Why are all the examples in PHP? Shouldn't they be in pseudocode or in a language with a more common syntax? I find that the $ in the variable names can be confused with the regex syntax. 186.136.108.233 (talk) 13:52, 11 February 2014 (UTC)

I'm adding links for finding online regular expression testers and a link to one specific example. These testers are an excellent way to explore regular expressions with sufficiently equipped browsers, but they require Wikipedia:EL#Rich_media extensions. - Tatzelbrumm (talk) 10:52, 22 May 2014 (UTC)

Thomson-kleene-star.svg is wrong

The file Thomson-kleene-star.svg is wrong. There is a transition missing between the state to the right of q and the state to the left of f. Also, this could be written as a deterministic finite automaton with two states. Using an NFA is unnecessary and confusing to new readers.

Here is the offending file:

The Kleene star: "zero or more".

Rekahsoft (talk) 05:41, 30 May 2014 (UTC)

Thanks. Any chance of a fix? Or can you link to a webpage that has a correct state diagram? I have not thought about the issue, but in general you are correct that the article should have a simple and comprehensible diagram rather than some kind of optimisation. A complication is that the diagram is also used at Thompson's construction algorithm and that article is written confidently. Johnuniq (talk) 06:28, 30 May 2014 (UTC)
I agree the picture is difficult to understand; its caption is too brief, and I couldn't find an explanatory reference to it in the running text. However, after reading Thompson's construction algorithm, I think the automaton is not wrong. The oval labelled "N(s)" is supposed to denote the subautomaton corresponding to the regular expression s, while the whole picture shows an NFA for s*, i.e., for "zero or more of s". So if s is just a regular expression consisting of a single alphabet letter (e.g. "a"), then N(s) would contain the transition (labelled "a") you missed. If s is e.g. "a|b", then N(s) is more complex. - Jochen Burghardt (talk) 07:44, 30 May 2014 (UTC)
I colored the subautomata in the pictures from Thompson's construction algorithm, and rephrased the lead of Regular expression accordingly. Hope it's better now. - Jochen Burghardt (talk) 10:18, 30 May 2014 (UTC)
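
To make the N(s) point above concrete, here is a minimal Python sketch of the star construction as described in this thread. It is an illustration under assumptions, not code from the article or from Thompson's paper: the names NFA, literal, union, star and accepts are hypothetical, and the union wiring for "a|b" follows the usual textbook form of the construction. The labelled transition lives inside N(s), and star() only adds the four epsilon edges around it.

```python
from itertools import count

EPS = None  # label used for an epsilon (empty-string) transition

_ids = count()  # source of fresh state numbers


class NFA:
    """An NFA fragment with one start state and one accepting state."""

    def __init__(self, start, accept, edges):
        self.start = start
        self.accept = accept
        self.edges = edges  # list of (source, label, target) triples


def literal(ch):
    """N(s) for a single letter: two states joined by one labelled edge."""
    q, f = next(_ids), next(_ids)
    return NFA(q, f, [(q, ch, f)])


def union(left, right):
    """N(s) for "left|right": a new start and accept state around both parts."""
    q, f = next(_ids), next(_ids)
    edges = left.edges + right.edges + [
        (q, EPS, left.start), (q, EPS, right.start),
        (left.accept, EPS, f), (right.accept, EPS, f),
    ]
    return NFA(q, f, edges)


def star(inner):
    """NFA for s*: wrap N(s) (inner) with new states q and f."""
    q, f = next(_ids), next(_ids)
    edges = inner.edges + [
        (q, EPS, inner.start),             # enter N(s)
        (q, EPS, f),                       # skip N(s): zero repetitions
        (inner.accept, EPS, inner.start),  # loop back: one more repetition
        (inner.accept, EPS, f),            # leave N(s)
    ]
    return NFA(q, f, edges)


def _closure(nfa, states):
    """All states reachable from `states` using epsilon edges only."""
    seen, stack = set(states), list(states)
    while stack:
        s = stack.pop()
        for src, label, dst in nfa.edges:
            if src == s and label is EPS and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


def accepts(nfa, text):
    """Simulate the NFA on `text` by tracking the set of active states."""
    current = _closure(nfa, {nfa.start})
    for ch in text:
        moved = {dst for src, label, dst in nfa.edges
                 if src in current and label == ch}
        current = _closure(nfa, moved)
    return nfa.accept in current


a_star = star(literal("a"))               # a*
ab_star = star(union(literal("a"), literal("b")))  # (a|b)*
```

With these definitions, accepts(a_star, "") and accepts(a_star, "aaa") hold, while accepts(a_star, "ab") does not; swapping in union(literal("a"), literal("b")) for N(s) changes only the inner fragment, not the four epsilon edges drawn in the picture.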

Regular Expression development tools

I didn't see a section about regular expression tools. I recently found this interesting Windows program that lets a person click on fields to build an expression: http://www.ultrapico.com/Expresso.htm Does anyone know of other similar tools? Sbmeirow • Talk • 16:12, 1 July 2014 (UTC)