import std.stdio;
void main()
{
    string s = stdin.readln();
    write(s);
}
The code above should echo a Unicode (specifically Cyrillic) string to a Windows console with the code page set to 65001, but the string comes out empty. The same code works correctly when run under the Windows debugger windbg.exe, so hopefully it will be an easy fix.
Comment #1 by dfj1esp02 — 2014-06-27T14:59:07Z
It would probably be better to split this bug: either the program read an invalid UTF-8 string, or it couldn't write a valid UTF-8 string.
Comment #2 by dfj1esp02 — 2014-06-27T15:02:18Z
import std.stdio, std.utf;
void main()
{
    string s = stdin.readln();
    validate(s);
    write(s);
}
Check if validation passes.
Comment #3 by sum.proxy — 2014-06-27T15:27:51Z
I still see no output in the regular console, and no sign of an exception either. However, when I run it under windbg.exe it throws an exception (I can't tell which one exactly, since I couldn't figure out how to load debug symbols). It looks like a write problem to me.
Comment #4 by dfj1esp02 — 2014-06-27T18:56:55Z
Then try
write(cast(ubyte[])s);
Comment #5 by sum.proxy — 2014-06-28T07:12:22Z
This time it returned an empty array ([]).
Thanks.
Comment #6 by sum.proxy — 2014-07-03T08:02:07Z
I also tried it on a 32-bit Windows system and the behavior is the same: no output.
Comment #7 by dfj1esp02 — 2014-07-07T09:00:29Z
An empty array means no input rather than no output. Did it wait for the input? Do you compile it for console or GUI subsystem?
echo 000 | yourprogram.exe
Does this work?
Comment #8 by sum.proxy — 2014-07-07T09:38:18Z
Yes, it does wait for the input, but the output is empty. It's a console application, and sending the input through a pipe seems to work correctly.
Comment #9 by sum.proxy — 2014-08-15T11:34:25Z
Sorry, any feedback on this one?
Comment #10 by dlang-bugzilla — 2014-10-25T02:10:16Z
Try calling SetConsoleCP(65001) and SetConsoleOutputCP(65001).
Comment #11 by sum.proxy — 2014-10-25T10:41:22Z
I tried the new version of the compiler with the issue you referred to, but alas - no luck.
Please see https://issues.dlang.org/show_bug.cgi?id=1448#c12
SetConsoleCP(65001) and SetConsoleOutputCP(65001) didn't help either.
Thanks.
Comment #12 by dlang-bugzilla — 2014-10-25T13:51:34Z
Indeed.
Happens with both DMC and MSVC runtime.
Comment #13 by dlang-bugzilla — 2014-10-25T13:53:32Z
"scanf" misbehaves in the same way. Not a D bug, I think.
Comment #18 by sum.proxy
From what I know this program will work incorrectly for any non-ASCII Unicode input, which I have confirmed through simple tests.
scanf and strlen rely on '\0' to indicate string termination, but I don't think this goes well with unicode strings.
I believe the right way to do something similar (without buffer length) is this:
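#include <stdio.h>
#include <wchar.h>
#include <fcntl.h>
#include <io.h>
int main( void )
{
    wchar_t buf[1024];
    /* switch both streams to UTF-16 text mode (MSVC runtime only) */
    _setmode( _fileno( stdin ), _O_U16TEXT );
    _setmode( _fileno( stdout ), _O_U16TEXT );
    /* no field width: input longer than buf would overflow it */
    wscanf( L"%ls", buf );
    wprintf( L"%ls", buf );
    return 0;
}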
For further info please refer to http://www.siao2.com/2008/03/18/8306597.aspx and http://msdn.microsoft.com/en-us/library/tw4k6df8%28v=vs.120%29.aspx
HTH,
Thanks.
Comment #19 by dlang-bugzilla — 2014-10-26T00:35:23Z
(In reply to Sum Proxy from comment #18)
> scanf and strlen rely on '\0' to indicate string termination, but I don't
> think this goes well with unicode strings.
Not true. At least, not true with UTF-8, which is what we set the CP to.
> I believe the right way to do something similar (without buffer length) is
> this:
I would not say that's the "right" way. That's the way to read wchar_t text, but we need UTF-8 text.
Comment #20 by sum.proxy — 2014-10-28T11:32:14Z
I believe the problem is that the default internal representation of Unicode in Windows is UTF-16, which implies that some sort of conversion would be necessary here.
I haven't found a way to do it right yet.
Comment #21 by sum.proxy — 2014-10-28T12:28:55Z
Or perhaps "the right" way would be to stick to UTF-16, since it's the default for Unicode in Windows.
Comment #22 by sum.proxy — 2014-10-28T12:53:37Z
This actually works on my system:
///////////// test.d //////////////
import std.stdio;
import std.c.windows.windows;
extern(Windows) BOOL SetConsoleCP( UINT );
void main()
{
    // 1200 is the Windows code page identifier for UTF-16LE
    SetConsoleCP(1200);
    string s = stdin.readln();
    write(s);
}
///////////////////////////////////