Bug 5674 – AssertError in std.regex

Status
RESOLVED
Resolution
WONTFIX
Severity
normal
Priority
P2
Component
phobos
Product
D
Version
D2
Platform
Other
OS
Mac OS X
Creation time
2011-03-01T08:14:00Z
Last change time
2012-02-24T12:04:19Z
Assigned to
nobody
Creator
doob

Attachments

ID | Filename | Summary | Content-Type | Size
939 | regex.d.patch | This patch fixes the problems with unmatched groups in a match. | text/plain | 3642

Comments

Comment #0 by doob — 2011-03-01T08:14:43Z
The following code results in an AssertError or a RangeError (I don't know whether the RangeError is expected behavior):

    import std.regex;
    import std.stdio;

    void main ()
    {
        auto m = "abc".match(`a(\w)b`);
        writeln(m.hit); // AssertError in regex.d:1795
        writeln(m.captures); // RangeError in regex.d:1719
    }

Can't "hit" just return an empty string and "captures" an empty range?
Comment #1 by magnus — 2011-03-31T07:09:07Z
I have similar problems with stuff like this:

    import std.stdio, std.regex;

    void main() {
        foreach (m; match("abc", "a|(x)")) {
            foreach (e; m.captures) {
                writeln(e);
            }
        }
    }

Here it prints out "a" and then I get a range violation. Whether or not m.captures[1] exists, iterating over m.captures should be possible, shouldn't it?

Also: checking whether m.captures[1] exists would be highly useful, to see what has matched. (Doing this by length wouldn't work in general, of course.)
Comment #2 by ricochet1k — 2011-04-06T11:41:18Z
After some debugging, it looks like Captures looks for the first unmatched group and stops there when computing the length of the captures, which I believe is the cause of the assert error. The second problem is that when a group is unmatched, startIdx and endIdx are stored as size_t.max, and when Captures.front/opIndex or RegexMatch.hit try to slice the input with those values, a range violation results. Most regex engines handle this by returning null if a group is unmatched. I'll try to submit a patch soon if I get it working.
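The sentinel problem described in this comment can be illustrated with a small standalone sketch. It is not the std.regex source or the attached patch: the `Group` struct and the `sliceGroup` helper are hypothetical names invented for illustration, and only the `size_t.max` sentinel and the "return null for an unmatched group" idea come from the comment.

```d
import std.stdio;

// Hypothetical stand-in for a capture group's stored offsets.
// An unmatched group keeps both offsets at size_t.max (assumption based on the comment).
struct Group
{
    size_t startIdx = size_t.max;
    size_t endIdx = size_t.max;
}

// Hypothetical helper: returns null for an unmatched group instead of
// slicing the input with size_t.max, which would cause a range violation.
string sliceGroup(string input, Group g)
{
    if (g.startIdx == size_t.max || g.endIdx == size_t.max)
        return null;
    return input[g.startIdx .. g.endIdx];
}

void main()
{
    auto input = "abc";
    writeln(sliceGroup(input, Group(0, 1)));     // matched group: prints "a"
    writeln(sliceGroup(input, Group()) is null); // unmatched group: prints "true"
}
```

Checking the sentinel before slicing lets a caller distinguish an unmatched group (null) from a group that matched an empty string (a non-null, zero-length slice).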
Comment #3 by ricochet1k — 2011-04-06T12:50:32Z
Created attachment 939 This patch fixes the problems with unmatched groups in a match.
Comment #4 by dmitry.olsh — 2011-04-20T04:04:53Z
(In reply to comment #3)
> Created an attachment (id=939) [details]
> This patch fixes the problems with unmatched groups in a match.

Actually, I'm working on fixing all of the issues in std.regex; see this pull request: https://github.com/D-Programming-Language/phobos/pull/22

There is a little problem with your patch. If the match is empty (there are such regexes) or there is no match, RegexMatch.hit still happily returns "". Maybe it's better to let it hit the assert on no match, just as before, to enforce checking of empty.
Comment #5 by dmitry.olsh — 2012-02-24T12:04:19Z
Things got a bit mixed up here, but the initial issue is a clean WONTFIX, as it works as designed: one should test RegexMatch for empty, just like any other range. The second issue here was fixed by pull 22 for the previous version of std.regex, and never existed in the new one.
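A minimal sketch of the emptiness check recommended above, using the pattern and input from comment #0. The `firstHit` helper is a hypothetical name introduced here for illustration; it only wraps the `match`, `empty`, and `hit` members already discussed in this thread.

```d
import std.regex;
import std.stdio;

// Hypothetical helper: returns the matched text, or null when nothing matched.
// It tests the RegexMatch for empty before touching hit, as with any other range.
string firstHit(string input, string pattern)
{
    auto m = match(input, regex(pattern));
    return m.empty ? null : m.hit;
}

void main()
{
    // `a(\w)b` does not match "abc", so hit must not be accessed.
    auto miss = firstHit("abc", `a(\w)b`);
    writeln(miss is null ? "no match" : miss);

    // `a(\w)c` matches the whole input.
    writeln(firstHit("abc", `a(\w)c`));
}
```

Returning null on a failed match is a choice made for this sketch only; the resolution above keeps the assert in hit so that callers are forced to check empty themselves.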