
Talk:Regular expression

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 90.190.113.12 (talk) at 16:05, 22 February 2013 (→‎Problem with "not preceded by"?). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

WikiProject Computing: C-class, High-importance. This article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia.
WikiProject Computer science: C-class, High-importance. This article is within the scope of WikiProject Computer science, a collaborative effort to improve the coverage of Computer science related articles on Wikipedia.

Problem with "not preceded by"?

One of the examples given is:

* the word "car" when not preceded by the word "motor"

Then it says "These examples are simple."

I don't think that's a simple example - in fact I can't think of a way to match that. Either I'm being dim (quite possible) or that example should be removed. Jj Banana (talk) 15:12, 13 February 2012 (UTC)

Doesn't "[^motor]car" work? -- Technical 13 (talk) 23:30, 17 April 2012 (UTC)
That excludes things like "rotomcar", i.e., "car" preceded by any single one of the characters m/o/t/r. TEDickey (talk) 00:12, 18 April 2012 (UTC)
You can express this using a regular expression like "([^m]otor|^otor|[^o]tor|^tor|[^t]or|^or|[^o]r|^r|[^r]|^)(car)" or "((((([^m]|^)o|[^o]|^)t|[^t]|^)o|[^o]|^)r|[^r]|^)(car)". Both are tedious to type, but their length is still polynomial (for the latter even linear) in the length of the original words ("motor" and "car"). In Perl, Java, and other flavors that support negative lookbehind, you can simply write "(?<!motor)car". 90.190.113.12 (talk) 16:05, 22 February 2013 (UTC)
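The patterns proposed in this thread can be compared side by side. The following is a minimal sketch in Python, whose `re` module supports fixed-width negative lookbehind. The helper `finds_car` and the sample strings are made up for illustration and do not come from the article.

```python
import re

# The character-class attempt: "car" preceded by one character
# that is not m, o, t, or r.
NAIVE = re.compile(r"[^motor]car")

# Negative lookbehind: "car" not immediately preceded by "motor".
LOOKBEHIND = re.compile(r"(?<!motor)car")

# The two lookbehind-free formulations quoted in the thread.
EXPANDED = re.compile(
    r"([^m]otor|^otor|[^o]tor|^tor|[^t]or|^or|[^o]r|^r|[^r]|^)(car)"
)
NESTED = re.compile(r"((((([^m]|^)o|[^o]|^)t|[^t]|^)o|[^o]|^)r|[^r]|^)(car)")


def finds_car(pattern, text):
    """Return True if the pattern matches anywhere in text."""
    return pattern.search(text) is not None


# Hypothetical test strings chosen to separate the patterns.
samples = ["car", "a car", "motorcar", "my motorcar", "rotomcar", "otorcar"]

for text in samples:
    results = [finds_car(p, text) for p in (NAIVE, LOOKBEHIND, EXPANDED, NESTED)]
    print(f"{text!r:14} naive={results[0]} lookbehind={results[1]} "
          f"expanded={results[2]} nested={results[3]}")
```

On these samples the character-class pattern rejects "rotomcar" and a bare "car" at the start of the string, while the other three patterns agree with each other. Note that the two lookbehind-free patterns consume the preceding characters as part of the match, so the word itself has to be read from the last capture group.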

Move formal definition to end

I think that the formal definition should be moved to the end of the article. Most people accessing this article want to see some examples, so the article should present the examples first and the formal definition later. -- Preceding unsigned comment added by 89.23.239.59 (talk) 18:19, 25 March 2012 (UTC)

Regex as C standard library

It is said in the article that: "For yet other languages, such as Object Pascal(Delphi) and C and C++, non-core libraries are available"...

But regex does actually exist as a C core library; "man regex" gives me this on a BSD system:

REGEX(3) BSD Library Functions Manual REGEX(3)

NAME

    regcomp, regerror, regexec, regfree -- regular-expression library

LIBRARY

    Standard C Library (libc, -lc)

69.80.96.81 (talk) 04:48, 28 May 2012 (UTC)

The term "core library" refers to whether a feature is part of the C language standard (or analogous POSIX standards), not whether some system happens to store the feature in libc. It is easy to find nonstandard functions in most libc implementations. TEDickey (talk) 09:28, 28 May 2012 (UTC)

Citation for original popularity of regular expressions following success of 'ed' and 'grep'

The third sentence of the article is flagged {citation needed}, but I don't know how to fix this within the article.

I would like to suggest this reference:

Mastering Regular Expressions, 2nd Edition, from O'Reilly, by Jeffrey E. F. Friedl; Chapter 3, "Overview of Regular Expression Features and Flavors", page 85, under the heading "The Origins of Regular Expressions", third and fourth paragraphs:

"Although there is evidence of earlier work, the first published computational use of regular expressions I have actually been able to find is Ken Thompson's 1968 article Regular Expression Search Algorithm in which he describes a regular-expression compiler that produced IBM 7094 object code. This led to his work on qed, an editor that formed the basis for the Unix editor ed.

ed's regular expressions were not as advanced as those in qed, but they were the first to gain widespread use in non-technical fields. ed had a command to display lines of the edited file that matched a given regular expression. The command, "g/Regular Expression/p", was read "Global Regular Expression Print." This particular function was so useful that it was made into its own utility, grep (after which egrep--extended grep--was later modeled)."

--hope this helps leeeoooooo [002012-06-05] -- Preceding unsigned comment added by Leeeoooooo (talk • contribs) 01:12, 6 June 2012 (UTC)

I have this book too and was actually planning to use the above quote as a citation. The book is well researched and, I would say, written by someone who knows regular expressions very well, so I hope no one objects to using it as a citation. 24.212.138.15 (talk) That was me gathima (talk)


The document http://genius.cat-v.org/brian-kernighan/articles/beautiful suggests the same:

"Regular expressions first appeared in a program setting in Ken Thompson's version of the QED text editor in the mid-1960's. In 1967, Ken applied for a patent on a mechanism for rapid text matching based on regular expressions; it was granted in 1971, one of the very first software patents [US Patent 3,568,156, Text Matching Algorithm, March 2, 1971]. [...] Regular expressions moved from QED to the Unix editor ed, and then to the quintessential Unix tool, grep, which Ken created by performing radical surgery on ed."

Also, roughly the same text appears in the book "Beautiful Code: Leading Programmers Explain How They Think" (the above link is a draft).

70.82.120.78 (talk) 19:23, 15 August 2012 (UTC)

Which algorithm is working behind REGEX...

Which pattern-matching algorithm actually does the work at the lowest level: is it (1) the KMP matching technique or (2) Rabin-Karp? The question arises because the text is converted internally into a char array and the pattern is matched against that. -- Preceding unsigned comment added by 106.51.151.241 (talk) 15:16, 2 December 2012 (UTC)