Bug 15773 – D's treatment of whitespace in character classes in free-form regexes is not the same as Perl's
Status
RESOLVED
Resolution
FIXED
Severity
minor
Priority
P1
Component
phobos
Product
D
Version
D2
Platform
x86_64
OS
Linux
Creation time
2016-03-06T12:10:00Z
Last change time
2016-04-09T14:44:49Z
Assigned to
nobody
Creator
d20160306.20.mlaker
Comments
Comment #0 by d20160306.20.mlaker — 2016-03-06T12:10:08Z
In Perl, whitespace in a character class is always significant, even in /x (extended) mode:
msl@james:~$ perl -wE 'say "Matched" if "a b" =~ /[c d]/'
Matched
msl@james:~$ perl -wE 'say "Matched" if "a b" =~ /[c d]/x'
Matched
msl@james:~$
D's std.regex, by contrast, ignores whitespace inside a character class in "x" free-form mode:
msl@james:~$ rdmd --eval='auto rx = regex("[c d]", ""); "a b".matchFirst(rx).writeln'
[" "]
msl@james:~$ rdmd --eval='auto rx = regex("[c d]", "x"); "a b".matchFirst(rx).writeln'
[]
msl@james:~$ rdmd --eval='auto rx = ctRegex!("[c d]", ""); "a b".matchFirst(rx).writeln'
[" "]
msl@james:~$ rdmd --eval='auto rx = ctRegex!("[c d]", "x"); "a b".matchFirst(rx).writeln'
[]
msl@james:~$
I wasted an hour's debugging time because I didn't expect this difference: I thought whitespace would always be significant inside a character class. Perhaps other developers will have the same expectation that I did. I don't suggest that we change the behaviour of std.regex, because it would break too much existing code, but could we explicitly mention D's behaviour in the docs? Many thanks.
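A possible way to sidestep the difference, offered as a minimal sketch rather than documented guidance: write the space as the hex escape \x20, so the pattern contains no literal whitespace for free-form mode to skip. This assumes std.regex accepts \xXX escapes inside a character class; the variable names are illustrative only.

```d
import std.regex : matchFirst, regex;
import std.stdio : writeln;

void main()
{
    // \x20 is the hex escape for a space. Because the pattern holds no
    // literal whitespace, the "x" (free-form) flag has nothing to strip
    // from the character class. (Assumption: \xXX is accepted in a class.)
    auto plain = regex(r"[c\x20d]", "");
    auto freeForm = regex(r"[c\x20d]", "x");

    // Both regexes are expected to find the space in "a b".
    writeln("a b".matchFirst(plain));
    writeln("a b".matchFirst(freeForm));
}
```

Using \s in the class would also avoid literal whitespace, but it matches tabs and newlines as well, so it is not an exact equivalent.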
Comment #1 by d20160306.20.mlaker — 2016-03-06T12:19:37Z