Bug 2108 – regexp.d: The greedy dotstar isn't so greedy

Status: RESOLVED
Resolution: FIXED
Severity: normal
Priority: P2
Component: phobos
Product: D
Version: D2
Platform: x86
OS: All
Creation time: 2008-05-14T23:09:00Z
Last change time: 2015-06-09T01:14:37Z
Assigned to: dmitry.olsh
Creator: nyphbl8d

Comments

Comment #0 by nyphbl8d — 2008-05-14T23:09:53Z
As far as I'm aware, ".*" should be greedy by default and become non-greedy when changed to ".*?". As it stands now, both ".*" and ".*?" are non-greedy when it comes to std.regexp and I have found no way to make ".*" greedy, flags or otherwise. This can be seen by using "<packet>text</packet><packet>text</packet>" as the buffer to match against and "<packet.*/packet>" as the pattern. When I use this with std.regexp.search, it only matches the first opening and closing tag instead of the outer set. I just hope this isn't my lack of regex-fu coming back to haunt me.
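For reference, the expected behaviour described in this report can be sketched as follows. This is a minimal illustration only: it assumes the current std.regex module (not the std.regexp module the report was filed against) and its matchFirst function, and firstHit is a hypothetical helper introduced here, not part of Phobos.

```d
import std.regex;

// Hypothetical helper (not part of Phobos): returns the text of the
// first match of `pattern` in `input`, or null if there is no match.
string firstHit(string input, string pattern)
{
    auto m = matchFirst(input, regex(pattern));
    return m.empty ? null : m.hit;
}

void main()
{
    // Buffer and pattern taken from the report.
    auto input = "<packet>text</packet><packet>text</packet>";

    // Greedy ".*" should consume as much as possible, so the match
    // runs from the first opening tag to the last closing tag.
    assert(firstHit(input, "<packet.*/packet>") == input);

    // Non-greedy ".*?" should stop at the first closing tag.
    assert(firstHit(input, "<packet.*?/packet>") == "<packet>text</packet>");
}
```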
Comment #1 by dsimcha — 2009-10-18T07:44:51Z
*** Issue 2487 has been marked as a duplicate of this issue. ***
Comment #2 by Jesse.K.Phillips+D — 2010-05-06T14:30:40Z
This is also an issue in Windows with std.regex using DMD 2.043.

But I would like to add that it is always greedy prior to text. The first assert will fail since it was not non-greedy and the second is what it should be.

import std.regex;

void main() {
    assert(match("Hello there you silly person you.",
        regex(r"\b.+? you .+\w")).hit != "Hello there you silly");

    assert(match("Hello there you silly person you.",
        regex(r"\b.+? you .+\w")).hit == "there you silly person");
}
Comment #3 by dmitry.olsh — 2011-04-18T13:54:37Z
(In reply to comment #2)
> This is also an issue in Windows with std.regex using DMD 2.043
>
> But I would like to add that it is always greedy prior to text. The first
> assert will fail since it was not non-greedy and the second is what it should
> be.
>
> import std.regex;
>
> void main() {
>     assert(match("Hello there you silly person you.",
>         regex(r"\b.+? you .+\w")).hit != "Hello there you silly");
>
>     assert(match("Hello there you silly person you.",
>         regex(r"\b.+? you .+\w")).hit == "there you silly person");
> }

Actually it should be

assert(match("Hello there you silly person you.",
    regex(r"\b.+? you .+\w")).hit == "Hello there you silly person you");

Two points: \b also matches at the beginning of input (if the first char is \w), and the last .+ is greedy. Since the trailing '.' is certainly not a \w, the match has to end at the final "you", and we have what we have.

Also tested at:
http://www.regextester.com/
http://www.regular-expressions.info/javascriptexample.html
... etc.

P.S. The patch is coming ;)
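To make the two points above easier to see, here is a rough sketch that adds capture groups to the same pattern so each quantifier's share of the match is visible. It assumes the current std.regex API (matchFirst and indexed captures); the groups and the captureParts helper are illustrative additions, not part of the original test.

```d
import std.regex;

// Hypothetical helper: returns [whole match, lazy part, greedy part],
// or null if the pattern does not match.
string[] captureParts(string input)
{
    // Same pattern as in the comment, with the two ".+" pieces captured.
    auto m = matchFirst(input, regex(r"\b(.+?) you (.+)\w"));
    if (m.empty)
        return null;
    return [m.hit, m[1], m[2]];
}

void main()
{
    auto parts = captureParts("Hello there you silly person you.");

    // \b matches at the start of input, so the match begins at "Hello".
    assert(parts[0] == "Hello there you silly person you");

    // The lazy ".+?" stops at the first " you ".
    assert(parts[1] == "Hello there");

    // The greedy ".+" runs as far as it can while still leaving one \w
    // character for the end of the pattern; the final '.' is not a \w.
    assert(parts[2] == "silly person yo");
}
```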
Comment #4 by andrei — 2011-06-04T17:48:52Z
Reassigning to Dmitry.
Comment #5 by dmitry.olsh — 2011-06-05T00:09:28Z
I'd gladly close this issue, since it now works correctly in std.regex. But the report is filed against std.regexP. Should I close it?
Comment #6 by andrei — 2011-06-05T06:18:54Z
Yes. Please also update the changelog.dd file.
Comment #7 by dmitry.olsh — 2011-06-06T08:02:43Z