Bug 2108 – regexp.d: The greedy dotstar isn't so greedy
Status: RESOLVED
Resolution: FIXED
Severity: normal
Priority: P2
Component: phobos
Product: D
Version: D2
Platform: x86
OS: All
Creation time: 2008-05-14T23:09:00Z
Last change time: 2015-06-09T01:14:37Z
Assigned to: dmitry.olsh
Creator: nyphbl8d
Comments
Comment #0 by nyphbl8d — 2008-05-14T23:09:53Z
As far as I'm aware, ".*" should be greedy by default and become non-greedy when changed to ".*?". As it stands now, both ".*" and ".*?" are non-greedy when it comes to std.regexp and I have found no way to make ".*" greedy, flags or otherwise. This can be seen by using "<packet>text</packet><packet>text</packet>" as the buffer to match against and "<packet.*/packet>" as the pattern. When I use this with std.regexp.search, it only matches the first opening and closing tag instead of the outer set. I just hope this isn't my lack of regex-fu coming back to haunt me.
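For reference, the expected greedy and non-greedy results for this buffer can be sketched as follows. This is only an illustration under some assumptions: it uses matchFirst from the current std.regex rather than std.regexp.search, which the report is filed against, and firstHit is a hypothetical helper name introduced just for this example.

```d
import std.regex;

/// Hypothetical helper (not part of Phobos): returns the first match of
/// `pattern` in `input`, or an empty string when nothing matches.
string firstHit(string input, string pattern)
{
    auto m = matchFirst(input, regex(pattern));
    return m.empty ? "" : m.hit;
}

void main()
{
    enum buffer = "<packet>text</packet><packet>text</packet>";

    // Greedy ".*" should run to the LAST "/packet>", so the match
    // spans the outer pair of tags, i.e. the whole buffer.
    assert(firstHit(buffer, "<packet.*/packet>") == buffer);

    // Lazy ".*?" should stop at the FIRST "/packet>".
    assert(firstHit(buffer, "<packet.*?/packet>") == "<packet>text</packet>");
}
```

If both asserts pass, the engine treats ".*" as greedy and ".*?" as non-greedy; the behaviour described above corresponds to the first assert failing.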
Comment #1 by dsimcha — 2009-10-18T07:44:51Z
*** Issue 2487 has been marked as a duplicate of this issue. ***
Comment #2 by Jesse.K.Phillips+D — 2010-05-06T14:30:40Z
This is also an issue on Windows with std.regex using DMD 2.043.
I would like to add, however, that a quantifier is always greedy when it comes before literal text. The first assert below fails because .+? was not treated as non-greedy; the second shows what the match should be.
import std.regex;

void main() {
    assert(match("Hello there you silly person you.",
        regex(r"\b.+? you .+\w")).hit != "Hello there you silly");

    assert(match("Hello there you silly person you.",
        regex(r"\b.+? you .+\w")).hit == "there you silly person");
}
Comment #3 by dmitry.olsh — 2011-04-18T13:54:37Z
(In reply to comment #2)
> This is also an issue in Windows with std.regex using DMD 2.043
>
> But I would like to add that it is always greedy prior to text. The first
> assert will fail since it was not non-greedy and the second is what it should
> be.
>
> import std.regex;
>
> void main() {
> assert(match("Hello there you silly person you.",
> regex(r"\b.+? you .+\w")).hit != "Hello there you silly");
>
> assert(match("Hello there you silly person you.",
> regex(r"\b.+? you .+\w")).hit == "there you silly person");
> }
Actually it should be
assert(match("Hello there you silly person you.",
    regex(r"\b.+? you .+\w")).hit == "Hello there you silly person you");
Two points: \b also matches at the beginning of input (if the first char is \w), and the last .+ is greedy. Since '.' is certainly not a \w, the greedy .+ gives back only the final '.' and the match ends at the last "you".
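The two points can be checked with a short sketch. It assumes the current std.regex API (matchFirst and the pre, hit and post properties of its result); hitOf is a hypothetical helper name used only here, and the lazy variant of the pattern is added purely for contrast.

```d
import std.regex;

/// Hypothetical helper (not part of Phobos): first match of `pattern`
/// in `input`, or an empty string when nothing matches.
string hitOf(string input, string pattern)
{
    auto m = matchFirst(input, regex(pattern));
    return m.empty ? "" : m.hit;
}

void main()
{
    enum input = "Hello there you silly person you.";

    auto m = matchFirst(input, regex(r"\b.+? you .+\w"));
    assert(!m.empty);

    // \b matches at the very start of input because 'H' is a \w,
    // so nothing is skipped before the match.
    assert(m.pre.length == 0);

    // The trailing .+ is greedy: it grabs everything, then backtracks
    // just far enough for \w to match the final 'u'.
    assert(m.hit == "Hello there you silly person you");
    assert(m.post == ".");

    // For contrast, a lazy trailing .+? stops as early as possible.
    assert(hitOf(input, r"\b.+? you .+?\w") == "Hello there you si");
}
```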
Also tested at:
http://www.regextester.com/
http://www.regular-expressions.info/javascriptexample.html
... etc.
P.S. The patch is coming ;)
Comment #4 by andrei — 2011-06-04T17:48:52Z
Reassigning to Dmitry.
Comment #5 by dmitry.olsh — 2011-06-05T00:09:28Z
I'd gladly close this issue, since it now works correctly in std.regex. But the report is filed against std.regexP. Should I close it?