User:DOI bot/Zandbox: Difference between revisions

m [Pu399β]Citation maintenance: Fixing/testing bugs. Problems? Contact the bot's operator.
Deleting refs/
{{Lead missing|date=May 2010}}


==Notes==
==Libraries==
{| class="wikitable sortable" style="text-align: center; font-size: 85%; width: auto; table-layout: fixed;"
{{reflist
|+ List of regular expression libraries
| colwidth = 30em
|-
| refs =
! style="width: 12em" |

! Official website
! [[Programming language]]
! [[Software license]]
|-
! {{Rh}} | [[Boost C++ Libraries|Boost.Regex]] {{R|group=Note|boost_regex_formerly_regex}}
| [http://www.boost.org/libs/regex/doc/html/index.html Boost C++ Libraries]
| C++
| [http://www.boost.org/LICENSE_1_0.txt Boost Software License]
|-
! {{Rh}} | [[Boost C++ Libraries|Boost.Xpressive]]
| [http://boost-sandbox.sourceforge.net/libs/xpressive/doc/html/index.html Boost C++ Libraries]
| C++
| [http://www.boost.org/LICENSE_1_0.txt Boost Software License]
|-
! {{Rh}} | [[CL-PPCRE]]
| [http://weitz.de/cl-ppcre/ Edi Weitz]
| [[Common Lisp]]
| [[BSD licenses|BSD]]
|-
! {{Rh}} | [[cppre]]
| [http://jeff.bleugris.com/journal/projects/ Jeff Stuart]
| C++
| [[GPL]]
|-
! {{Rh}} | DEELX
| [http://www.regexlab.com/en/deelx/ RegExLab]
| C++
| "free for personal use and commercial use"
|-
! {{Rh}} | FREJ {{R|group=Note|fuzzy_regexp_libraries}}
| [http://frej.sf.net Fuzzy Regular Expressions for Java]
| [[Java (programming language)|Java]]
| [[LGPL]]
|-
! {{Rh}} | [[GLib]]/GRegex {{R|group=Note|glib_gregex_version}}
| [http://www.barisione.org/gregex/glib-Perl-compatible-regular-expressions.html Marco Barisione]
| C
| [[LGPL]]
|-
! {{Rh}} | [[GRETA]]
| [http://research.microsoft.com/projects/greta/ Microsoft Research]
| C++
| {{?}}
|-
! {{Rh}} | [[International Components for Unicode|ICU]]
| [http://www.icu-project.org/userguide/regexp.html International Components for Unicode]
| C/C++/[[Java (programming language)|Java]]
| [http://source.icu-project.org/repos/icu/icu/trunk/license.html ICU license]
|-
! {{Rh}} | [[Jakarta Project#Subprojects|Jakarta]]/Regexp
| [http://jakarta.apache.org/regexp/ The Apache Jakarta Project]
| [[Java (programming language)|Java]]
| [[Apache License]]
|-
! {{Rh}} | JRegex
| [http://jregex.sourceforge.net/ JRegex]
| [[Java (programming language)|Java]]
| [[BSD licenses|BSD]]
|-
! {{Rh}} | [[Oniguruma]]
| [http://www.geocities.jp/kosako3/oniguruma/ Kosako]
| C
| [[BSD licenses|BSD]]
|-
! {{Rh}} | Pattwo
| [http://www.javaregex.com/home.html Stevesoft]
| [[Java (programming language)|Java]] (compatible with Java 1.0)
| [[GNU Lesser General Public License|LGPL]]
|-
! {{Rh}} | [[Perl Compatible Regular Expression|PCRE]]
| [http://www.pcre.org/ Philip Hazel]
| C/C++{{R|group=Note|pcre_cpp}}
| [[BSD licenses|BSD]]
|-
! {{Rh}} | [[Qt (toolkit)|Qt]]/QRegExp
| [http://doc.trolltech.com/4.7/qregexp.html]
| C++
| [http://www.qtsoftware.com/products/licensing/licensing#qt-gnu-gpl-v Qt GNU GPL v. 3.0] / [http://www.qtsoftware.com/products/licensing/licensing#qt-gnu-lgpl-v Qt GNU LGPL v. 2.1] / [http://www.qtsoftware.com/products/licensing/licensing#qt-commercial-license Qt Commercial]
|-
! {{Rh}} | regex - [[Henry Spencer]]'s regular expression libraries
| [http://arglist.com/regex/ ArgList]
| C
| [[BSD licenses|BSD]]
|-
! {{Rh}} | [[re2]]
| [http://code.google.com/p/re2/ Google Code]
| C++
| [[BSD licenses|BSD]]
|-
! {{Rh}} | [[TRE (computing)|TRE]] {{R|group=Note|fuzzy_regexp_libraries}}
| [http://laurikari.net/tre/ Ville Laurikari]
| C
| [[BSD licenses|BSD]]
|-
! {{Rh}} | TPerlRegEx
| [http://www.regexbuddy.com/delphi.html TPerlRegEx VCL Component]
| [[Object Pascal]]
| [[Mozilla Public License|MPLv1.1]]
|-
! {{Rh}} | TRegExpr
| [http://regexpstudio.com/TRegExpr/TRegExpr.html RegExp Studio]
| [[Object Pascal]]
| double licensed: [[Freeware]] or [[LGPL]] with static linking exception
|-
! {{Rh}} | RGX
| [https://www.p6r.com/software/rgx.html RGX]
| C++ based component library
| [https://www.p6r.com/p6r-software-library-license-1-1.html P6R license]
|}
{{Reflist|group=Note|refs=
<ref name=boost_regex_formerly_regex>formerly called Regex++</ref>
<ref name=glib_gregex_version>included since version 2.13.0</ref>
<ref name=pcre_cpp>C++ bindings were developed by Google and became officially part of PCRE in 2006</ref>
<ref name=fuzzy_regexp_libraries>one of [[Regular expression#Fuzzy Regular Expressions|fuzzy regular expression]] engines</ref>
}}


==Languages==

{| class="wikitable sortable" style="text-align: center; font-size: 0.85em; line-height: 1.3em; width: auto; table-layout: fixed;"

|+ List of languages and frameworks coming with regular expression support
|-
! Language
! Official website
! [[Software license]]
! Remarks
|-
! {{Rh}} | [[.NET framework|.NET]]
| [http://msdn2.microsoft.com/en-us/library/system.text.regularexpressions.aspx MSDN]
| Proprietary
| style="text-align: left;" | {{-}}
|-
! {{Rh}} | [[C++0x|C++]]
|
|
| style="text-align: left;" | since ISO/IEC 14882:2011
|-
! {{Rh}} | [[D (programming language)|D]]
| [http://www.digitalmars.com/d/index.html D]
| [[Boost Software License]]{{R|group=Note|boost_mars}}
| style="text-align: left;" | {{-}}
|-
! {{Rh}} | [[Go (programming language)|Go]]
| [http://golang.org/pkg/regexp/ Golang.org]
| [http://golang.org/LICENSE BSD-style license]
| style="text-align: left;" | {{-}}
|-
! {{Rh}} | [[Haskell (programming language)|Haskell]]
| [http://haskell.org/haskellwiki/Regular_expressions Haskell.org]
| BSD3
| style="text-align: left;" | Not included in the language report; nor in GHC's Hierarchical Libraries
|-
! {{Rh}} | [[Java (programming language)|Java]]
| [http://www.java.com Java]
| [[GNU General Public License]]
| style="text-align: left;" | REs are written as strings in source code (all backslashes must be doubled, hurting readability).
|-
! {{Rh}} | [[JavaScript]]/[[ECMAScript]]
| {{-}}
| {{?}}
| style="text-align: left;" | Limited, but REs are first-class citizens of the language with a specific <code>/.../mod</code> syntax.
|-
! {{Rh}} | [[Lua (programming language)|Lua]]
| [http://www.lua.org Lua.org]
| [[MIT License]]
| style="text-align: left;" | Uses a simplified, limited dialect. Can be bound to a more powerful library, like PCRE or an alternative parser like LPeg.
|-
! {{Rh}} | [[Object Pascal]] ([[Free Pascal]])
| [http://www.freepascal.org www.freepascal.org]
| [[LGPL]] with static linking exception
| style="text-align: left;" | Free Pascal 2.6+ ships with TRegExpr from Sorokin as well as with 2 other regular expression libraries. See http://wiki.lazarus.freepascal.org/Regexpr
|-
! {{Rh}} | [[Objective-C]] ([[Cocoa (API)|Cocoa]] on iOS only)
| [http://developer.apple.com/library/ios/#documentation/Foundation/Reference/NSRegularExpression_Class/Reference/Reference.html Apple]
| [[Proprietary]]
| style="text-align: left;" | Currently only available on iOS 4+
|-
! {{Rh}} | [[OCaml]]
| [http://caml.inria.fr/pub/docs/manual-ocaml/libref/Str.html Caml]
| [[LGPL]]
| style="text-align: left;" | {{-}}
|-
! {{Rh}} | [[Perl]]
| [http://www.perl.com/doc/manual/html/pod/perlre.html Perl.com]
| [[Artistic License]] or the [[GNU General Public License]]
| style="text-align: left;" | Full, central part of the language.
|-
! {{Rh}} | [[PHP]]
| [http://www.php.net/manual/en/reference.pcre.pattern.syntax.php PHP.net]
| [[PHP License]]
| style="text-align: left;" | Has two implementations; PCRE is both the faster and the more feature-rich of the two.
|-
! {{Rh}} | [[Python (programming language)|Python]]
| [http://docs.python.org/lib/module-re.html python.org]
| [[Python Software Foundation License]]
| style="text-align: left;" | {{-}}
|-
! {{Rh}} | [[Ruby (programming language)|Ruby]]
| [http://www.ruby-doc.org/docs/ProgrammingRuby/html/ref_c_regexp.html ruby-doc.org]
| [[GNU Library General Public License]]
| style="text-align: left;" | Ruby 1.8 and 1.9 use different engines; Ruby 1.9 integrates Oniguruma.
|-
! {{Rh}} | [[SAP ABAP]]
| [http://www.sap.com SAP.com]
| {{?}}
| style="text-align: left;" | {{-}}
|-
! {{Rh}} | [[Tcl]] 8.4
| [http://www.tcl.tk/ tcl.tk]
| [http://www.tcl.tk/software/tcltk/license.html Tcl/Tk License]<br>(Permissive, similar to BSD)
| style="text-align: left;" | {{-}}
|-
! {{Rh}} | [[ActionScript]] 3
| {{?}}
| {{?}}
| style="text-align: left;" | {{-}}
|}
{{Reflist|group=Note|refs=
<ref name=boost_mars>http://www.digitalmars.com/d/2.0/phobos/std_regex.html</ref>
}}
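The remark on Java above concerns string-literal escaping rather than the regex engine itself. As a rough illustration only, the same effect can be sketched in Python, which offers both ordinary and raw string literals; the pattern and the sample text below are assumptions chosen for the example, not taken from the table.

```python
import re

# In an ordinary string literal every regex backslash must be doubled,
# which is the situation described for Java source code.
doubled = "\\d+\\.\\d+"

# A raw string literal passes backslashes through unchanged.
raw = r"\d+\.\d+"

# Both literals denote exactly the same pattern text.
same_pattern = doubled == raw

# Hypothetical sample input, used only to show the pattern at work.
match = re.search(raw, "version 3.14 released")
found = match.group(0) if match else None
```

Languages with a dedicated regex literal syntax (such as the <code>/.../mod</code> form noted for JavaScript) avoid the doubling in a similar way.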


==Language features==
'''NOTE:''' An application that uses a regular expression library does not necessarily expose the library's full feature set; for example, GNU Grep uses PCRE but does not offer lookahead support, although PCRE does.


===Part 1===


{| class="wikitable sortable" style="text-align: center; font-size: 85%; width: auto; table-layout: fixed;"
|+ Language feature comparison (part 1)
|-
! style="width: 12em" |
! "+" quantifier
! Negated character classes
! Non-greedy quantifiers{{R|group=Note|non_greedy}}
! Shy groups{{R|group=Note|shy}}
! Recursion
! Lookahead
! Lookbehind
! Backreferences{{R|group=Note|backref}}
! >9 indexable captures
|-
| {{Rh}} | [[Boost.Regex]]
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}} {{R|group=Note|boost_regex_recursion}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
|-
| {{Rh}} | [[Boost.Xpressive]]
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}} {{R|group=Note|xpressive_recursion}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
|-
| {{Rh}} | [[CL-PPCRE]]
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{No}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
|-
| {{Rh}} | [[EmEditor]]
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{No}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{No}}
|-
| {{Rh}} | FREJ
| {{No}} {{R|group=Note|frej_non_greedy}}
| {{No}}
| {{Some}} {{R|group=Note|frej_non_greedy}}
| {{yes}}
| {{No}}
| {{No}}
| {{No}}
| {{yes}}
| {{yes}}
|-
| {{Rh}} | [[GLib]]/GRegex
| {{yes}}
| {{?}}
| {{yes}}
| {{?}}
| {{No}}
| {{?}}
| {{?}}
| {{?}}
| {{?}}
|-
| {{Rh}} | [[Grep|GNU Grep]]
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{No}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{?}}
|-
| {{Rh}} | [[Haskell (programming language)|Haskell]]
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{No}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
|-
| {{Rh}} | [[Java (programming language)|Java]]
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{No}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
|-
| {{Rh}} | [[International Components for Unicode|ICU]] Regex
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{No}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
|-
| {{Rh}} | [[Just Great Software|JGsoft]]
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{No}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
|-
! {{Rh}} | [[.NET framework|.NET]]
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{No}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
|-
| {{Rh}} | [[OCaml]]
| {{yes}}
| {{yes}}
| {{no}}
| {{no}}
| {{no}}
| {{no}}
| {{no}}
| {{yes}}
| {{no}}
|-
| {{Rh}} | [[OmniOutliner]] 3.6.2
| {{yes}}
| {{yes}}
| {{yes}}
| {{No}}
| {{No}}
| {{No}}
| {{No}}
| {{?}}
| {{?}}
|-
| {{Rh}} | [[Perl Compatible Regular Expression|PCRE]]
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
|-
| {{Rh}} | [[Perl]]
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
|-
| {{Rh}} | [[PHP]]
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
|-
| {{Rh}} | [[Python (programming language)|Python]]
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{No}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
|-
| {{Rh}} | [[Qt (toolkit)|Qt]]/QRegExp
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{No}}
| {{yes}}
| {{No}}
| {{yes}}
| {{yes}}
|-
| {{Rh}} | [[re2]]
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{No}}
| {{No}}
| {{No}}
| {{No}}
| {{yes}}
|-
| {{Rh}} | [[Ruby (programming language)|Ruby]]
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
|-
| {{Rh}} | [[TRE (computing)|TRE]]
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{No}}
| {{No}}
| {{No}}
| {{yes}}
| {{No}}
|-
| {{Rh}} | [[Vim (text editor)|Vim]] {{Latest preview release/Vim}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{No}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{No}}
|-
| {{Rh}} | RGX
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{No}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
|-
| {{Rh}} | TRegExpr
| {{yes}}
| {{?}}
| {{yes}}
| {{?}}
| {{?}}
| {{?}}
| {{?}}
| {{?}}
| {{?}}
|}
{{Reflist|group=Note|refs=
<ref name=non_greedy>''Non-greedy'' quantifiers match as few characters as possible, rather than as many as possible (the default). Note that many older, pre-[[POSIX]] engines were non-greedy and had no greedy quantifiers at all.</ref>
<ref name=shy>''Shy groups'', also called ''non-capturing groups'', cannot be referred to with backreferences; they are used to speed up matching when the group's content need not be accessed later.</ref>
<ref name=backref>''Backreferences'' enable referring to previously matched groups in later parts of the regex and/or replacement string (where applicable). For instance, ''([ab]+)\1'' matches "abab" but not "abaab".</ref>
<ref name=xpressive_recursion>http://www.boost.org/doc/libs/1_47_0/doc/html/xpressive/user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.embedding_a_regex_by_reference</ref>
<ref name=boost_regex_recursion>http://www.boost.org/doc/libs/1_47_0/libs/regex/doc/html/boost_regex/syntax/perl_syntax.html#boost_regex.syntax.perl_syntax.recursive_expressions</ref>
<ref name=frej_non_greedy>FREJ has no repetition quantifiers, but it has an "optional" element that behaves similarly to a simple "?" quantifier.</ref>
}}
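Several of the features compared in Part 1 can be sketched with Python's <code>re</code> module, one of the engines listed in the table. This is only an illustration under assumptions: the sample strings are invented for the example, and the backreference check uses whole-string matching, since a substring search would find "aa" inside "abaab".

```python
import re

# Greedy vs. non-greedy: ".+" takes as much as it can, ".+?" as little.
greedy = re.match(r"<.+>", "<a><b>").group(0)
lazy = re.match(r"<.+?>", "<a><b>").group(0)

# A shy (non-capturing) group "(?:...)" groups without creating a capture,
# so only the explicitly captured "(c)" is reported.
captures = re.match(r"(?:ab)+(c)", "ababc").groups()

# Backreference: "\1" must repeat whatever group 1 matched.
# fullmatch is used so that the whole string has to fit the pattern.
backref_abab = re.fullmatch(r"([ab]+)\1", "abab") is not None
backref_abaab = re.fullmatch(r"([ab]+)\1", "abaab") is not None

# Lookahead and lookbehind assert context without consuming it.
lookahead = re.findall(r"\w+(?=:)", "key: value")
lookbehind = re.findall(r"(?<=\$)\d+", "cost $42")
```

Recursion is left out of the sketch because the table lists it as unsupported for Python.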


===Part 2===


{| class="wikitable sortable" style="text-align: center; font-size: 85%; width: auto; table-layout: fixed;"
|+ Language feature comparison (part 2)
|-
! style="width: 12em" |
! Directives {{R|group=Note|directives_explanation}}
! Conditionals
! Atomic groups {{R|group=Note|atomic_grouping_explanation}}
! Named capture {{R|group=Note|named_groups_explanation}}
! Comments
! Embedded code
! Partial matching{{Clarify|date=July 2011}}
! [[Fuzzy string searching|Fuzzy matching]]
! [[Unicode]] property support [http://www.unicode.org/reports/tr18/]
|-
| {{Rh}} | [[Boost.Regex]]
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{No}}
| {{yes}}
| {{No}}
| {{Some}} {{R|group=Note|unicode_optional}} {{R|group=Note|properties_limited}}
|-
| {{Rh}} | [[Boost.Xpressive]]
| {{yes}}
| {{No}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{No}}
| {{yes}}
| {{No}}
| {{No}}
|-
| {{Rh}} | [[CL-PPCRE]]
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{?}}
| {{No}}
| {{No}}
|-
| {{Rh}} | [[EmEditor]]
| {{yes}}
| {{yes}}
| {{?}}
| {{?}}
| {{yes}}
| {{No}}
| {{yes}}
| {{No}}
| {{?}}
|-
| {{Rh}} | FREJ
| {{No}}
| {{No}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{No}}
| {{No}}
| {{yes}}
| {{?}}
|-
| {{Rh}} | [[GLib]]/GRegex
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{No}}
| {{yes}}
| {{No}}
| {{Some}} {{R|group=Note|unicode_optional}} {{R|group=Note|properties_limited}}
|-
| {{Rh}} | [[Grep|GNU Grep]]
| {{yes}}
| {{yes}}
| {{?}}
| {{yes}}
| {{yes}}
| {{No}}
| {{?}}
| {{No}}
| {{No}}
|-
| {{Rh}} | [[Haskell (programming language)|Haskell]]
| {{?}}
| {{?}}
| {{?}}
| {{?}}
| {{?}}
| {{No}}
| {{?}}
| {{No}}
| {{No}}
|-
| {{Rh}} | [[Java (programming language)|Java]]
| {{yes}}
| {{no}}
| {{yes}}
| {{yes}} {{R|group=Note|available_java_7}}
| {{No}}
| {{No}}
| {{?}}
| {{No}}
| {{Some}} {{R|group=Note|properties_limited}}
|-
| {{Rh}} | [[International Components for Unicode|ICU]] Regex
| {{yes}}
| {{no}}
| {{yes}}
| {{No}}
| {{yes}}
| {{No}}
| {{No}}
| {{No}}
| {{yes}} {{R|group=Note|properties_all}}
|-
| {{Rh}} | [[Just Great Software|JGsoft]]
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{No}}
| {{yes}}
| {{?}}
| {{Some}} {{R|group=Note|properties_limited}}
|-
| {{Rh}} | [[.NET framework|.NET]]
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{No}}
| {{?}}
| {{No}}
| {{Some}} {{R|group=Note|properties_limited}}
|-
| {{Rh}} | [[OCaml]]
| {{no}}
| {{no}}
| {{no}}
| {{no}}
| {{no}}
| {{no}}
| {{?}}
| {{no}}
| {{no}}
|-
| {{Rh}} | [[OmniOutliner]] 3.6.2
| {{?}}
| {{?}}
| {{?}}
| {{?}}
| {{No}}
| {{No}}
| {{?}}
| {{No}}
| {{?}}
|-
| {{Rh}} | [[Perl Compatible Regular Expression|PCRE]]
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}} {{R|group=Note|available_pcre_70}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{No}}
| {{Some}} {{R|group=Note|unicode_optional}} {{R|group=Note|properties_limited}}
|-
| {{Rh}} | [[Perl]]
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}} {{R|group=Note|available_perl_595}}
| {{yes}}
| {{yes}}
| {{No}}
| {{No}}
| {{yes}} {{R|group=Note|properties_all}}
|-
| {{Rh}} | [[PHP]]
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{No}}
| {{No}}
| {{No}}
| {{No}}
|-
| {{Rh}} | [[Python (programming language)|Python]]
| {{yes}}
| {{yes}}
| {{No}}
| {{yes}}
| {{yes}}
| {{No}}
| {{Yes}}
| {{No}}
| {{No}}
|-
| {{Rh}} | [[Qt (toolkit)|Qt]]/QRegExp
| {{No}}
| {{No}}
| {{No}}
| {{No}}
| {{No}}
| {{No}}
| {{yes}}
| {{No}}
| {{No}}
|-
| {{Rh}} | [[re2]]
| {{yes}}
| {{No}}
| {{?}}
| {{yes}}
| {{No}}
| {{No}}
| {{yes}}
| {{No}}
| {{Some}} {{R|group=Note|properties_limited}}
|-
| {{Rh}} | [[Ruby (programming language)|Ruby]]
| {{yes}}
| {{No}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{No}}
| {{No}}
| {{Some}} {{R|group=Note|properties_limited}}
|-
| {{Rh}} | [[TRE (computing)|TRE]]
| {{yes}}
| {{No}}
| {{No}}
| {{No}}
| {{yes}}
| {{No}}
| {{No}}
| {{yes}}
| {{?}}
|-
| {{Rh}} | [[Vim (text editor)|Vim]] {{Latest preview release/Vim}}
| {{yes}}
| {{no}}
| {{yes}}
| {{no}}
| {{no}}
| {{No}}
| {{yes}}
| {{No}}
| {{no}}
|-
| {{Rh}} | RGX
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{yes}}
| {{no}}
| {{No}}
| {{No}}
| {{yes}}
|}
{{Reflist|group=Note|refs=
<ref name=directives_explanation>Also known as ''Flags modifiers'' or ''Option letters''. Example pattern: "(?i:test)"</ref>
<ref name=atomic_grouping_explanation>Also called ''Independent sub-expressions''</ref>
<ref name=named_groups_explanation>Similar to backreferences but with names instead of indices</ref>
<ref name=available_java_7>Available as of JDK7.</ref>
<ref name=available_pcre_70>Available as of PCRE 7.0 (as of PCRE 4.0 with Python-like syntax <code>(?P<name>...)</code>)</ref>
<ref name=available_perl_595>Available as of Perl 5.9.5</ref>
<ref name=unicode_optional>Requires optional Unicode support enabled.</ref>
<ref name=properties_limited>Supports only a subset of Unicode properties, not all of them.</ref>
<ref name=properties_all>Supports all Unicode properties, including non-binary properties.</ref>
}}
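Directives, named capture and comments from Part 2 can be sketched with Python's <code>re</code> module as well. The example assumes Python 3.6 or later, where the scoped form "(?i:...)" quoted in the notes is accepted; the sample strings and group names are hypothetical.

```python
import re

# Directive (inline flag): "(?i:...)" makes only the enclosed part
# case-insensitive; the trailing "ing" stays case-sensitive.
scoped_hit = re.fullmatch(r"(?i:test)ing", "TESTing") is not None
scoped_miss = re.fullmatch(r"(?i:test)ing", "TESTING") is not None

# Named capture with the Python-style "(?P<name>...)" syntax.
m = re.fullmatch(r"(?P<year>\d{4})-(?P<month>\d{2})", "2012-01")
year, month = m.group("year"), m.group("month")

# A named group can also be referenced later in the same pattern.
doubled_word = re.search(r"(?P<w>\b\w+\b) (?P=w)", "it is is fine").group("w")

# Comment inside a pattern: "(?#...)" is ignored by the engine.
commented = re.fullmatch(r"\d+(?#digits)px", "12px") is not None
```

Atomic groups and embedded code are not shown, since the table marks both as unavailable for Python.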


==API features==
{| class="wikitable sortable" style="text-align: center; font-size: 85%; width: auto; table-layout: fixed;"
|+ API feature comparison
|-
! style="width: 12em" |
! Native [[UTF-16]] support <!---{{r|group=Note|unicode_conversion}} undefined reference--->
! Native [[UTF-8]] support <!---{{r|group=Note|unicode_conversion}} undefined reference--->
! Non-linear input support
! Dot-matches-newline option
! Anchor-matches-newline option
|-
! {{Rh}} | [[Boost.Regex]]
| {{No}}
| {{No}}
| {{yes}}
| {{yes}}
| {{yes}}
|-
! {{Rh}} | [[GLib]]/GRegex
| {{No}}
| {{yes}} {{R|group=Note|unicode optional 1}}
| {{No}}
| {{yes}}
| {{yes}}
|-
! {{Rh}} | [[International Components for Unicode|ICU]] Regex
| {{yes}}
| {{No}}
| {{yes}} <!-- UText -->
| {{yes}}
| {{yes}} <!-- Probably means UREGEX_DOTALL or UREGEX_MULTILINE -->
|-
! {{Rh}} | [[Java (programming language)|Java]]
| {{yes}}
| {{No}}
| {{yes}} <!-- CharSequence -->
| {{yes}}
| {{yes}}
|-
! {{Rh}} | [[.NET framework|.NET]]
| {{No}} {{R|group=Note|broken}}
| {{No}}
| {{yes}}
| {{yes}}
| {{yes}}
|-
! {{Rh}} | [[Perl Compatible Regular Expression|PCRE]]
| {{No}}
| {{yes}} {{R|group=Note|unicode optional 1}}
| {{No}}
| {{yes}}
| {{yes}}
|-
! {{Rh}} | [[Qt (toolkit)|Qt]]/QRegExp
| {{yes}}
| {{No}}
| {{No}}
| {{No}}
| {{No}}
|-
! {{Rh}} | [[TRE (computing)|TRE]]
| {{No}}
| {{?}}
| {{yes}}
| {{yes}}
| {{yes}}
|-
! {{Rh}} | RGX
| {{No}}
| {{No}}
| {{?}}
| {{yes}}
| {{yes}}
|}
{{Reflist|group=Note|refs=
<ref name="unicode optional 1">Requires optional Unicode support enabled.</ref>
<ref name="broken">Implementation is incorrect - it treats UTF-16 code units as characters, so characters outside the BMP don't work properly. See, for example, this bug report [http://connect.microsoft.com/VisualStudio/feedback/details/357780/extend-regex-to-process-unicode-characters-not-utf-16-code-units].</ref>


<!---Unused references
<ref name=unicode_native>Native support means that conversion between UTF-16 and UTF-8 isn't required, the Unicode properties are supported, and the encoding type is always available (platform-dependent wchar_t doesn't count).</ref>
--->
}}
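The two newline options in the table, and the UTF-16 code-unit problem described in the note on .NET, can be illustrated with a short Python sketch. Python is not a row of this table, so the flag names <code>re.DOTALL</code> and <code>re.MULTILINE</code> are Python's own spelling of these options, and the sample text is an assumption made for the example.

```python
import re

text = "first line\nsecond line"

# Dot-matches-newline option: re.DOTALL lets "." cross line breaks.
dot_default = re.fullmatch(r"first.*line", text) is not None
dot_all = re.fullmatch(r"first.*line", text, re.DOTALL) is not None

# Anchor-matches-newline option: re.MULTILINE lets "^" and "$"
# match at every line boundary, not only at the ends of the input.
anchors_default = re.findall(r"^\w+", text)
anchors_multiline = re.findall(r"^\w+", text, re.MULTILINE)

# A character outside the BMP occupies two UTF-16 code units, which is
# why an engine that treats code units as characters mishandles it.
clef = "\U0001D11E"  # MUSICAL SYMBOL G CLEF
utf16_units = len(clef.encode("utf-16-le")) // 2
one_char = re.fullmatch(r".", clef) is not None
```

Python 3 strings are sequences of code points, so "." consumes the whole character here; an engine working on UTF-16 code units would see two separate items.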


==See also==
* [[List of regular expression software]]


==External links==
* [http://www.regular-expressions.info/refflavors.html Regular Expression Flavor Comparison]: detailed comparison of the most popular regular expression flavors
* [http://www.greenend.org.uk/rjk/2002/06/regexp.html Regexp Syntax Summary]


{{DEFAULTSORT:Comparison Of Regular Expression Engines}}
}}
[[Category:Pattern matching]]
[[Category:Software comparisons|Regular expression engines]]
[[Category:Regular expressions]]

Revision as of 16:59, 21 January 2012

Libraries

List of regular expression libraries
Official website Programming language Software license
Boost.Regex [Note 1] Boost C++ Libraries C++ Boost Software License
Boost.Xpressive Boost C++ Libraries C++ Boost Software License
CL-PPCRE Edi Weitz Common Lisp BSD
cppre Jeff Stuart C++ GPL
DEELX RegExLab C++ "free for personal use and commercial use"
FREJ [Note 2] Fuzzy Regular Expressions for Java Java LGPL
GLib/GRegex [Note 3] Marco Barisione C LGPL
GRETA Microsoft Research C++ ?
ICU International Components for Unicode C/C++/Java ICU license
Jakarta/Regexp The Apache Jakarta Project Java Apache License
JRegex JRegex Java BSD
Oniguruma Kosako C BSD
Pattwo Stevesoft Java (compatible with Java 1.0) LGPL
PCRE Philip Hazel C/C++[Note 4] BSD
Qt/QRegExp [2] C++ Qt GNU GPL v. 3.0 / Qt GNU LGPL v. 2.1 / Qt Commercial
regex - Henry Spencer's regular expression libraries ArgList C BSD
re2 Google Code C++ BSD
TRE [Note 2] Ville Laurikari C BSD
TPerlRegEx TPerlRegEx VCL Component Object Pascal MPLv1.1
TRegExpr RegExp Studio Object Pascal double licensed: Freeware or LGPL with static linking exception
RGX RGX C++ based component library P6R license
  1. ^ formerly called Regex++
  2. ^ a b one of fuzzy regular expression engines
  3. ^ included since version 2.13.0
  4. ^ C++ bindings were developed by Google and became officially part of PCRE in 2006

Languages

List of languages and frameworks coming with regular expression support
Language Official website Software license Remarks
.NET MSDN Proprietary
C++ since ISO/IEC 14882:2011
D D Boost Software License[Note 1]
Go Golang.org BSD-style license
Haskell Haskell.org BSD3 Not included in the language report; nor in GHC's Hierarchical Libraries
Java Java GNU General Public License REs are written as strings in source code (all backslashes must be doubled, hurting readability).
JavaScript/ECMAScript ? Limited, but REs are first-class citizens of the language with a specific /.../mod syntax.
Lua Lua.org MIT License Uses a simplified, limited dialect. Can be bound to a more powerful library, like PCRE or an alternative parser like LPeg.
Object Pascal (Free Pascal) www.freepascal.org LGPL with static linking exception Free Pascal 2.6+ ships with TRegExpr from Sorokin as well as with 2 other regular expression libraries. See http://wiki.lazarus.freepascal.org/Regexpr
Objective-C (Cocoa on iOS only) Apple Proprietary Currently only available on iOS 4+
OCaml Caml LGPL
Perl Perl.com Artistic License or the GNU General Public License Full, central part of the language.
PHP PHP.net PHP License Has two implementations; PCRE is both the faster and the more feature-rich of the two.
Python python.org Python Software Foundation License
Ruby ruby-doc.org GNU Library General Public License Ruby 1.8 and 1.9 use different engines; Ruby 1.9 integrates Oniguruma.
SAP ABAP SAP.com ?
Tcl 8.4 tcl.tk Tcl/Tk License (Permissive, similar to BSD)
ActionScript 3 ? ?

Language features

NOTE: An application that uses a regular expression library does not necessarily expose the library's full feature set; for example, GNU Grep uses PCRE but does not offer lookahead support, although PCRE does.

Part 1

Language feature comparison (part 1)
"+" quantifier Negated character classes Non-greedy quantifiers[Note 1] Shy groups[Note 2] Recursion Lookahead Lookbehind Backreferences[Note 3] >9 indexable captures
Boost.Regex Yes Yes Yes Yes Yes [Note 4] Yes Yes Yes Yes
Boost.Xpressive Yes Yes Yes Yes Yes [Note 5] Yes Yes Yes Yes
CL-PPCRE Yes Yes Yes Yes No Yes Yes Yes Yes
EmEditor Yes Yes Yes Yes No Yes Yes Yes No
FREJ No [Note 6] No Some [Note 6] Yes No No No Yes Yes
GLib/GRegex Yes ? Yes ? No ? ? ? ?
GNU Grep Yes Yes Yes Yes No Yes Yes Yes ?
Haskell Yes Yes Yes Yes No Yes Yes Yes Yes
Java Yes Yes Yes Yes No Yes Yes Yes Yes
ICU Regex Yes Yes Yes Yes No Yes Yes Yes Yes
JGsoft Yes Yes Yes Yes No Yes Yes Yes Yes
.NET Yes Yes Yes Yes No Yes Yes Yes Yes
OCaml Yes Yes No No No No No Yes No
OmniOutliner 3.6.2 Yes Yes Yes No No No No ? ?
PCRE Yes Yes Yes Yes Yes Yes Yes Yes Yes
Perl Yes Yes Yes Yes Yes Yes Yes Yes Yes
PHP Yes Yes Yes Yes Yes Yes Yes Yes Yes
Python Yes Yes Yes Yes No Yes Yes Yes Yes
Qt/QRegExp Yes Yes Yes Yes No Yes No Yes Yes
re2 Yes Yes Yes Yes No No No No Yes
Ruby Yes Yes Yes Yes Yes Yes Yes Yes Yes
TRE Yes Yes Yes Yes No No No Yes No
Vim Yes Yes Yes Yes No Yes Yes Yes No
RGX Yes Yes Yes Yes No Yes Yes Yes Yes
TRegExpr Yes ? Yes ? ? ? ? ? ?
  1. ^ Non-greedy quantifiers match as few characters as possible, rather than as many as possible (the default). Note that many older, pre-POSIX engines were non-greedy and had no greedy quantifiers at all.
  2. ^ Shy groups, also called non-capturing groups, cannot be referred to with backreferences; they are used to speed up matching when the group's content need not be accessed later.
  3. ^ Backreferences enable referring to previously matched groups in later parts of the regex and/or replacement string (where applicable). For instance, ([ab]+)\1 matches "abab" but not "abaab".
  4. ^ http://www.boost.org/doc/libs/1_47_0/libs/regex/doc/html/boost_regex/syntax/perl_syntax.html#boost_regex.syntax.perl_syntax.recursive_expressions
  5. ^ http://www.boost.org/doc/libs/1_47_0/doc/html/xpressive/user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.embedding_a_regex_by_reference
  6. ^ a b FREJ has no repetition quantifiers, but it has an "optional" element that behaves similarly to a simple "?" quantifier.

Part 2

Language feature comparison (part 2)
Directives [Note 1] Conditionals Atomic groups [Note 2] Named capture [Note 3] Comments Embedded code Partial matching[clarification needed] Fuzzy matching Unicode property support [3]
Boost.Regex Yes Yes Yes Yes Yes No Yes No Some [Note 4] [Note 5]
Boost.Xpressive Yes No Yes Yes Yes No Yes No No
CL-PPCRE Yes Yes Yes Yes Yes Yes ? No No
EmEditor Yes Yes ? ? Yes No Yes No ?
FREJ No No Yes Yes Yes No No Yes ?
GLib/GRegex Yes Yes Yes Yes Yes No Yes No Some [Note 4] [Note 5]
GNU Grep Yes Yes ? Yes Yes No ? No No
Haskell ? ? ? ? ? No ? No No
Java Yes No Yes Yes [Note 6] No No ? No Some [Note 5]
ICU Regex Yes No Yes No Yes No No No Yes [Note 7]
JGsoft Yes Yes Yes Yes Yes No Yes ? Some [Note 5]
.NET Yes Yes Yes Yes Yes No ? No Some [Note 5]
OCaml No No No No No No ? No No
OmniOutliner 3.6.2 ? ? ? ? No No ? No ?
PCRE Yes Yes Yes Yes [Note 8] Yes Yes Yes No Some [Note 4] [Note 5]
Perl Yes Yes Yes Yes [Note 9] Yes Yes No No Yes [Note 7]
PHP Yes Yes Yes Yes Yes No No No No
Python Yes Yes No Yes Yes No Yes No No
Qt/QRegExp No No No No No No Yes No No
re2 Yes No ? Yes No No Yes No Some [Note 5]
Ruby Yes No Yes Yes Yes Yes No No Some [Note 5]
TRE Yes No No No Yes No No Yes ?
Vim Yes No Yes No No No Yes No No
RGX Yes Yes Yes Yes Yes No No No Yes
  1. ^ Also known as Flags modifiers or Option letters. Example pattern: "(?i:test)"
  2. ^ Also called Independent sub-expressions
  3. ^ Similar to backreferences but with names instead of indices
  4. ^ a b c Requires optional Unicode support enabled.
  5. ^ a b c d e f g h Supports only a subset of Unicode properties, not all of them.
  6. ^ Available as of JDK7.
  7. ^ a b Supports all Unicode properties, including non-binary properties.
  8. ^ Available as of PCRE 7.0 (as of PCRE 4.0 with Python-like syntax (?P<name>...))
  9. ^ Available as of Perl 5.9.5

API features

API feature comparison
Native UTF-16 support Native UTF-8 support Non-linear input support Dot-matches-newline option Anchor-matches-newline option
Boost.Regex No No Yes Yes Yes
GLib/GRegex No Yes [Note 1] No Yes Yes
ICU Regex Yes No Yes Yes Yes
Java Yes No Yes Yes Yes
.NET No [Note 2] No Yes Yes Yes
PCRE No Yes [Note 1] No Yes Yes
Qt/QRegExp Yes No No No No
TRE No ? Yes Yes Yes
RGX No No ? Yes Yes
  1. ^ a b Requires optional Unicode support enabled.
  2. ^ Implementation is incorrect - it treats UTF-16 code units as characters, so characters outside the BMP don't work properly. See, for example, this bug report [1].

See also