Comment #0 by default_357-line — 2018-02-27T14:13:13Z
Hex literals let you declare strings that are invalid UTF-8. This violates the docs, as well as the type system.
"\xff" is an expression of type string. string is defined ( https://dlang.org/spec/arrays.html#strings ) to be in UTF-8 format. Furthermore, string is an array of char, and chars are defined to be UTF-8 codepoints. 0xFF is not a valid UTF-8 codepoint.
The docs state that hex strings do not perform UTF-8 checking. The docs accurately describe the code, but the code is wrong: it breaks the invariant of the type it returns.
Either the behavior of hex literals must be changed, or the definition of char must be changed. As it stands, the documentation and behavior are self-contradictory.
Maybe hex literals can be ubyte[]?
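A minimal reproduction (self-contained; std.utf.validate throws UTFException on malformed input):

import std.utf : validate, UTFException;
import std.exception : assertThrown;

void main()
{
    string s = "\xff";   // accepted by the compiler, typed as string
    assert(s.length == 1);
    assertThrown!UTFException(validate(s)); // 0xFF can never occur in UTF-8
}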
Comment #1 by default_357-line — 2018-02-27T14:18:52Z
Update: std.conv.hexString does not validate its return value either.
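For example, this compiles and yields a string that fails validation (a sketch using std.conv.hexString and std.utf.validate):

import std.conv : hexString;
import std.utf : validate, UTFException;
import std.exception : assertThrown;

void main()
{
    enum s = hexString!"ff";                 // typed as string, no UTF-8 check
    static assert(is(typeof(s) == string));
    assertThrown!UTFException(validate(s));
}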
Comment #2 by b2.temp — 2018-02-27T16:33:13Z
It doesn't have to. hexString isn't even designed to represent string literals; it can just as well be a memory dump that gets cast to ubyte[].
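To illustrate the memory-dump use (a sketch; std.string.representation just reinterprets the char array as raw bytes):

import std.conv : hexString;
import std.string : representation;

void main()
{
    // no text semantics intended here, just four raw bytes
    immutable(ubyte)[] dump = hexString!"deadbeef".representation;
    assert(dump.length == 4);
    assert(dump[0] == 0xde && dump[3] == 0xef);
}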
Comment #3 by default_357-line — 2018-02-27T17:01:48Z
It has to, because it returns string and string is defined to be UTF-8.
If it wants to return something that is not UTF-8, it should return ubyte[], and you should have to cast it to string explicitly.
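As a sketch of what that could look like (hexBytes is hypothetical, not part of Phobos):

import std.conv : hexString;
import std.string : representation;

// Hypothetical replacement that returns raw bytes instead of string.
template hexBytes(string hexData)
{
    enum immutable(ubyte)[] hexBytes = hexString!hexData.representation;
}

void main()
{
    enum bytes = hexBytes!"ff";   // fine: raw bytes make no UTF-8 claim
    auto s = cast(string) bytes;  // the unchecked step is now explicit
}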
Comment #4 by dfj1esp02 — 2018-02-28T08:54:04Z
I'd say the spec just specifies the encodings for strings, meaning a string can't be in something else like EBCDIC or cp1252. There was a debate on whether invalid UTF violates the type system, and an idea that invalid UTF could produce an exception, yield a replacement character, or simply be ignored.
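For what it's worth, both behaviors already exist in std.utf (a sketch; decode throws by default, byUTF substitutes U+FFFD):

import std.utf : decode, byUTF, UTFException;
import std.exception : assertThrown;
import std.array : array;

void main()
{
    string s = "\xff";

    // strict decoding: invalid UTF raises an exception
    size_t i = 0;
    assertThrown!UTFException(decode(s, i));

    // lenient decoding: invalid sequences become the replacement character
    auto repaired = s.byUTF!dchar.array;
    assert(repaired == "\uFFFD"d);
}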