Bug 395 – std.regexp incorrectly handles UTF text
Status: RESOLVED
Resolution: FIXED
Severity: major
Priority: P2
Component: dmd
Product: D
Version: D2
Platform: x86
OS: Windows
Creation time: 2006-10-02T21:23:00Z
Last change time: 2015-06-09T01:31:23Z
Assigned to: bugzilla
Creator: ddparnell
Comments
Comment #0
by ddparnell — 2006-10-02T21:23:55Z
It seems that the std.regexp module doesn't correctly handle non-ASCII text and wildcard matching.

import std.stdio;
import std.regexp;
import std.utf;

void test(char[] sample, char[] pat)
{
    int pos;
    validate(sample);
    validate(pat);
    writefln("sample = %s", cast(ubyte[])sample);
    pos = find(sample, pat);
    writefln("Where = %s %s", cast(ubyte[])pat, pos);
}

void main()
{
    test("\u3026a\u2021\u5004b\u4011", "a\u2021\u5004b"); // works
    test("\u3026a\u2021\u5004b\u4011", "a..b"); // fails
    test("1a23b4", "a23b"); // works
    test("1a23b4", "a..b"); // works
}
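For reference, the failing case asks "a..b" to match 'a', then two characters, then 'b'. In UTF-8, U+2021 and U+5004 each occupy three bytes, so a correct match has to treat '.' as one whole character rather than one byte. The sketch below states that expectation as assertions. It is an illustration only, and it assumes the later Phobos module std.regex (with matchFirst and regex) rather than the std.regexp module this report is about; the offset 3 is simply the UTF-8 length of the leading U+3026.

```d
import std.regex : matchFirst, regex;
import std.stdio : writefln;

void main()
{
    // Same sample and wildcard pattern as the failing call in the report.
    string sample = "\u3026a\u2021\u5004b\u4011";

    // Assumption: std.regex (not the std.regexp module from the report).
    auto m = matchFirst(sample, regex("a..b"));

    // Each '.' should consume one code point, so the match must succeed
    // and cover exactly the text that the literal pattern matches.
    assert(!m.empty);
    assert(m.hit == "a\u2021\u5004b");

    // The match starts after U+3026, which is three UTF-8 code units long.
    assert(m.pre.length == 3);

    writefln("matched at byte offset %s", m.pre.length);
}
```

The ASCII sample "1a23b4" cannot expose this kind of problem, because there every character is a single byte.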
Comment #1
by bugzilla — 2006-10-10T03:30:15Z
Fixed in DMD 0.169, but more UTF bugs probably remain.