Bug 8800 – Invalid UTF-8 sequences allowed in strings with 'c' postfix.

Status
RESOLVED
Resolution
WONTFIX
Severity
enhancement
Priority
P2
Component
dmd
Product
D
Version
D2
Platform
All
OS
All
Creation time
2012-10-11T05:56:00Z
Last change time
2012-10-11T06:28:34Z
Assigned to
nobody
Creator
aziz.koeksal

Comments

Comment #0 by aziz.koeksal — 2012-10-11T05:56:24Z
Consider this code:

    auto s1 = "\x80";  // No error.
    auto s2 = "\x80"c; // No error.
    auto s3 = "\x80"w; // Error: invalid UTF-8 sequence
    auto s4 = "\x80"d; // Error: invalid UTF-8 sequence

When the user explicitly appends the c-postfix, I think for consistency's sake, the string should be validated and invalid UTF-8 sequences should be rejected.
Comment #1 by bugzilla — 2012-10-11T06:08:41Z
I think this could become very annoying, as strings are often invalid UTF-8 sequences while they are being constructed.
Comment #2 by aziz.koeksal — 2012-10-11T06:28:34Z
I'm not sure how it would, because I'm only talking about string literals. So code like this would still work, of course:

    auto s = "valid utf-8"c;
    s ~= "invalid utf-8: \x80";
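
Since the request was closed as WONTFIX, a string built like the one in comment #2 can only be checked at run time. The following is a minimal sketch of such a check, assuming Phobos's `std.utf.validate` suits the purpose; it is an illustration and not something proposed in this report. The byte 0x80 is a UTF-8 continuation byte, so it cannot start or stand alone as a sequence.

```d
import std.exception : assertThrown;
import std.utf : UTFException, validate;

void main()
{
    // A c-postfix literal has type string (immutable(char)[]).
    string s = "valid utf-8"c;
    validate(s); // passes: the contents are well-formed UTF-8

    // Appending a literal that contains a lone 0x80 byte still compiles,
    // as described in comment #2.
    s ~= "invalid utf-8: \x80";

    // validate throws UTFException when the string is not well-formed.
    assertThrown!UTFException(validate(s));
}
```

Validation here is explicit and happens only where the caller asks for it, so intermediate states of a string under construction are not affected.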