Bug 4627 – Ideas for std.regex.match usage syntax

Status
RESOLVED
Resolution
FIXED
Severity
enhancement
Priority
P2
Component
phobos
Product
D
Version
D2
Platform
All
OS
All
Creation time
2010-08-11T19:21:00Z
Last change time
2011-06-06T11:10:24Z
Assigned to
andrei
Creator
bearophile_hugs

Comments

Comment #0 by bearophile_hugs — 2010-08-11T19:21:18Z
Ideas for possible changes to the std.regex.match() user interface, mostly to shorten it, but also to make it simpler to use.

This is what you currently need to write to iterate over matches:

string text = "...";
foreach (m; match(text, regex(r"\d")).captures) { ... }

The regex() call there is useful because you can pass attributes like "g" as its second argument. Often I don't need attributes, though, and I would appreciate a shorter syntax (even if I don't need a built-in regex syntax as in Ruby and Perl). So match() could accept either an engine (regex) or a plain string as its second argument, the string form being for when attributes are not necessary:

foreach (m; match(text, r"\d").captures) { ... }

Another possible idea to shorten the syntax is to make the result of match() iterable (I don't know if this is possible or if it is a good idea). This also makes it simpler to use, since there is no need to know about 'captures':

foreach (m; match(text, r"\d")) { ... }
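For readers on a recent Phobos, the shorter form requested above can be sketched roughly as follows. This assumes matchAll is available (it was added to std.regex after this report was filed); digitsIn is a hypothetical helper name used only for illustration.

```d
import std.regex : matchAll;
import std.stdio : writeln;

// Hypothetical helper for illustration; not part of Phobos.
string[] digitsIn(string text)
{
    string[] found;
    // A plain string pattern is accepted directly; no regex() call is needed.
    foreach (m; matchAll(text, r"\d"))
        found ~= m.hit; // hit is the full text of the current match
    return found;
}

void main()
{
    writeln(digitsIn("a1b22c")); // prints ["1", "2", "2"]
}
```

matchAll always iterates over every match, so no "g" attribute is involved in this sketch.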
Comment #1 by dmitry.olsh — 2011-06-06T08:41:35Z
It works exactly like that, keeping in mind that captures is a range [full match, submatch0, submatch1, ...] for a given match, and that

foreach (m; match(text, r"\d")) { ... }

iterates over consecutive matches of the regex (if the "g" option is set; otherwise it's a single iteration).

Resolved?
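A minimal sketch of the captures layout described above, assuming the current std.regex API. The splitPair helper and the key=value input are made up for illustration.

```d
import std.regex : match, regex;
import std.stdio : writeln;

// Hypothetical helper for illustration; not part of Phobos.
string[] splitPair(string text)
{
    auto m = match(text, regex(r"(\w+)=(\w+)"));
    if (m.empty)
        return [];
    auto c = m.captures;
    // c[0] is the full match; c[1] and c[2] are the two submatches.
    return [c[0], c[1], c[2]];
}

void main()
{
    writeln(splitPair("key=value")); // prints ["key=value", "key", "value"]
}
```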
Comment #2 by bearophile_hugs — 2011-06-06T10:33:45Z
(In reply to comment #1)
> Resolved?

You have improved the D regular expressions a lot, it seems.

For me this program crashes at run time (DMD 2.053):

import std.stdio, std.regex;
void main() {
    foreach (m; match("125 155 ss25", r"\d+"))
        writeln(m);
}

If I use this line instead, it works:

writeln(m.toString());
Comment #3 by dmitry.olsh — 2011-06-06T10:43:32Z
Yeah, that's a very embarrassing bug related to writeln/formattedWrite. The reason is that toString seems to have lower priority than range formatting, and ranges that return elements of the same type as the range itself are unexpected in that formatting code. In essence, it's the same issue as this one:
http://d.puremagic.com/issues/show_bug.cgi?id=4604
So let's keep things where they belong, and if you have nothing further for this bug report, I think you should close it. And I think adding this simple example to issue 4604 won't hurt.
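One way to sidestep the formatting question entirely is to turn each match into a plain string before printing, so writeln never sees the match object itself. The sketch below assumes the current std.regex API; hits is a hypothetical helper name, not something from Phobos.

```d
import std.algorithm : map;
import std.array : array;
import std.regex : match, regex;
import std.stdio : writeln;

// Hypothetical helper for illustration; not part of Phobos.
string[] hits(string text, string pattern)
{
    // "g" makes match() continue past the first hit.
    // Each match is reduced to its matched text, a plain string.
    return match(text, regex(pattern, "g")).map!(m => m.hit).array;
}

void main()
{
    writeln(hits("125 155 ss25", r"\d+")); // prints ["125", "155", "25"]
}
```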
Comment #4 by bearophile_hugs — 2011-06-06T11:10:24Z
(In reply to comment #3)
> So let's keep things where they belong and if you have no further things for this
> bugzilla, I think you should close it.

Right. A benchmark for the regex:
http://shootout.alioth.debian.org/debian/program.php?test=regexdna&lang=gdc&id=4

> And I think adding this simple example in issue 4604 won't hurt.

Done.