Bug 8800 – Invalid UTF-8 sequences allowed in strings with 'c' postfix.
Status
RESOLVED
Resolution
WONTFIX
Severity
enhancement
Priority
P2
Component
dmd
Product
D
Version
D2
Platform
All
OS
All
Creation time
2012-10-11T05:56:00Z
Last change time
2012-10-11T06:28:34Z
Assigned to
nobody
Creator
aziz.koeksal
Comments
Comment #0 by aziz.koeksal — 2012-10-11T05:56:24Z
Consider this code:
auto s1 = "\x80"; // No error.
auto s2 = "\x80"c; // No error.
auto s3 = "\x80"w; // Error: invalid UTF-8 sequence
auto s4 = "\x80"d; // Error: invalid UTF-8 sequence
When the user explicitly appends the c-postfix, I think that, for consistency's sake, the string should be validated and invalid UTF-8 sequences should be rejected.
Comment #1 by bugzilla — 2012-10-11T06:08:41Z
I think this could become very annoying, as strings are often invalid UTF-8 sequences while they are being constructed.
Comment #2 by aziz.koeksal — 2012-10-11T06:28:34Z
I'm not sure how it would, because I'm only talking about string literals.
So code like this would still work, of course:
auto s = "valid utf-8"c;
s ~= "invalid utf-8: \x80";
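Since the issue was closed as WONTFIX, literals without a w or d postfix are not checked by the compiler. For anyone who wants the check anyway, here is a minimal sketch of doing it at run time with std.utf.validate from Phobos, which throws a UTFException on malformed input. The helper name isValidUtf8 is hypothetical and is not part of dmd or Phobos.

```d
import std.utf : validate, UTFException;

/// Hypothetical helper (not part of Phobos): returns true if `s`
/// contains only well-formed UTF-8, false otherwise.
bool isValidUtf8(const(char)[] s)
{
    try
    {
        validate(s); // throws UTFException on an invalid sequence
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

unittest
{
    string s = "valid utf-8"c;
    assert(isValidUtf8(s));

    // Appending a lone continuation byte compiles fine,
    // but the resulting string is no longer valid UTF-8.
    s ~= "invalid utf-8: \x80";
    assert(!isValidUtf8(s));
}
```

The unittest block can be exercised with something like dmd -unittest -main -run file.d. This only reports invalid data at run time; it does not change what the compiler accepts in a literal.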