Bug 5743 – readf cannot read wchar or dchar from UTF-8 stdin
Status
RESOLVED
Resolution
FIXED
Severity
normal
Priority
P2
Component
phobos
Product
D
Version
D2
Platform
Other
OS
Linux
Creation time
2011-03-16T13:20:54Z
Last change time
2020-03-21T03:56:36Z
Assigned to
No Owner
Creator
Ali Cehreli
Comments
Comment #0 by acehreli — 2011-03-16T13:20:54Z
I compiled the following program with dmd 2.052 on an Ubuntu 10.10 console.
It reads only the first code unit of the input instead of the whole character.
import std.stdio;

void main()
{
    wchar c; // Please note: same problem with dchar as well
    readf(" %s", &c);
    writeln(c);
}
For example, when the input is the character ö (encoded in UTF-8 as the two bytes 195 182), only the first code unit is read, and the output becomes the Unicode character whose code point equals the value of that code unit.
In a sense, the program reads a code unit and outputs it as a code point.
Thank you,
Ali
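The difference between a code unit and a code point described above can be illustrated without stdin. The following is only a sketch, not the Phobos fix: the helper names firstCodePoint and firstCodeUnitAsCodePoint are hypothetical, and the example assumes the source file is saved as UTF-8. It uses std.utf.decode to combine the two bytes of ö into one code point, and contrasts that with widening the first byte on its own, which is the behavior the report describes.

```d
import std.stdio : writeln;
import std.utf : decode;

/// Hypothetical helper: decodes the first full code point of a UTF-8 string.
dchar firstCodePoint(string s)
{
    size_t index = 0;   // decode advances this past the decoded code point
    return decode(s, index);
}

/// Hypothetical helper: mimics the reported behavior by widening
/// only the first code unit, with no UTF-8 decoding at all.
dchar firstCodeUnitAsCodePoint(string s)
{
    return s[0];        // implicit char -> dchar conversion
}

void main()
{
    string input = "ö"; // assumes this source file is saved as UTF-8

    // The two UTF-8 code units named in the report.
    assert(input.length == 2 && input[0] == 195 && input[1] == 182);

    // Proper decoding yields U+00F6.
    assert(firstCodePoint(input) == 0xF6);

    // Treating the first code unit as a code point yields U+00C3 instead.
    assert(firstCodeUnitAsCodePoint(input) == 0xC3);

    writeln(firstCodePoint(input));
    writeln(firstCodeUnitAsCodePoint(input));
}
```

The second helper shows why the wrong character appears: byte 195 is a valid code point value in its own right, so nothing fails visibly when it is printed as a character.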
Comment #1 by clugdbug — 2011-03-19T17:14:45Z
This is marked as 'regression'. What previous version did it work with?
Comment #2 by acehreli — 2011-03-19T17:49:29Z
"regression" turns out to be my mistake. I just went back more than a dozen dmd versions and saw that std.stdio.readf (or File.readf) is pretty new.
I've been using std.cstream.din, which used to work better than stdio.readf. Assuming that both used the same underlying format functions, I thought that this was a regression.