Bug 8020 – std.stdio can't open UTF16 file names in Windows
Status
RESOLVED
Resolution
DUPLICATE
Severity
major
Priority
P2
Component
phobos
Product
D
Version
D2
Platform
All
OS
Windows
Creation time
2012-05-03T00:02:00Z
Last change time
2012-07-06T00:54:47Z
Assigned to
nobody
Creator
Oleg.Kuporosov
Comments
Comment #0 by Oleg.Kuporosov — 2012-05-03T00:02:15Z
File() and popen()/open() assume they receive only ASCII or UTF-8 file names.
Windows file systems use UTF-16 names, so in practice only ASCII names
are portable.
These APIs could probably also accept wstring to satisfy this enhancement.
Comment #1 by bugzilla — 2012-05-03T00:51:22Z
UTF-8 supports the full Unicode set, not just ASCII.
Comment #2 by Oleg.Kuporosov — 2012-05-03T04:54:32Z
The problem is that Windows doesn't support UTF-8. So a file created by some third-party app with a UTF-16 name will not match the UTF-8 name passed through std.stdio.
http://d.puremagic.com/issues/show_bug.cgi?id=7648 clearly shows that, though
I think it is not a bug, just an OS limitation.
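To make the mismatch concrete, here is a small illustrative sketch in Python (not Phobos code). It assumes the 8-bit file API interprets the bytes it receives as Windows-1252; the helper name and the sample file names are made up for the example, and the real code page depends on the system.

```python
# Illustration only, not std.stdio code: shows how a UTF-8 encoded name is
# misread when an API assumes a legacy 8-bit code page.
# Assumption: the code page is Windows-1252 (cp1252); real systems vary.

def name_seen_by_8bit_api(name: str, codepage: str = "cp1252") -> str:
    """Return the name an 8-bit API would see if it were handed UTF-8 bytes."""
    utf8_bytes = name.encode("utf-8")      # what a UTF-8 string holds in memory
    return utf8_bytes.decode(codepage)     # how a code-page API reads those bytes


if __name__ == "__main__":
    original = "caf\u00e9.txt"             # a name with one non-ASCII letter
    misread = name_seen_by_8bit_api(original)
    # The single accented letter is read back as two unrelated characters,
    # so the name no longer matches the one stored on disk.
    print(original == misread)
    # A pure ASCII name is unaffected by the misinterpretation.
    print(name_seen_by_8bit_api("plain.txt") == "plain.txt")
```

ASCII-only names pass through unchanged, which would explain why only ASCII names appear to work.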
Comment #3 by dmitry.olsh — 2012-05-03T07:33:42Z
I assumed it just transcodes UTF-8 into UTF-16 before trying to contact the OS on win32. Apparently that's not the case.
Comment #4 by Oleg.Kuporosov — 2012-05-04T06:05:24Z
Dmitry, we should not assume the name string is in UTF-8; it may also be in some other 8-bit code page supported by Windows, like 125x and so on.
Such encoding should be done by the application itself.
What I propose is to have File/open/popen( wstring, string mode ), which should
take care of UTF-16 names. Surprisingly, I found some references in the DMC includes to _wfopen, which takes wchar_t and should help exactly here.
Comment #5 by dmitry.olsh — 2012-05-04T07:48:07Z
(In reply to comment #4)
> Dmitry, we should not assume the name string is in UTF-8; it may also be in some
> other 8-bit code page supported by Windows, like 125x and so on.
> Such encoding should be done by the application itself.
Nope, char is a UTF-8 code unit, period. See TDPL, the language spec, etc.
Legacy one-byte encodings should be passed around as bytes/ubytes or whatever. BTW NTFS is UTF-16 (or a subset of it).
> What I propose is to have File/open/popen( wstring, string mode ), which should
> take care of UTF-16 names. Surprisingly, I found some references in the DMC includes to
> _wfopen, which takes wchar_t and should help exactly here.
Then someone just needs to rig the current std.file to call toUTF16/toUTFz (see std.utf) and forward the result to the right _wfopen on win32. UTF-16 has been the de facto standard in Windows for a looong time. This is all just embarrassing.
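The transcoding step suggested here can be sketched as follows, in Python for illustration only. `to_utf16z` is a hypothetical helper that mimics what a UTF-8 to UTF-16 conversion would hand to a wide-character function such as _wfopen: little-endian UTF-16 code units followed by a zero terminator. It is not the Phobos implementation.

```python
# Illustration only: converts a UTF-8 encoded file name into the
# zero-terminated UTF-16 (little-endian) buffer that wide-character
# Windows functions expect. The helper name is hypothetical.

def to_utf16z(name_utf8: bytes) -> bytes:
    """Transcode UTF-8 bytes to UTF-16LE and append a 16-bit zero terminator."""
    text = name_utf8.decode("utf-8")        # raises UnicodeDecodeError if invalid
    return text.encode("utf-16-le") + b"\x00\x00"


if __name__ == "__main__":
    buf = to_utf16z("caf\u00e9.txt".encode("utf-8"))
    # Each BMP character becomes one 16-bit code unit; the terminator adds one more.
    print(len(buf) // 2, "code units including the terminator")
```

Rejecting invalid input at the decode step follows the position above that char data is UTF-8: a name in a legacy code page would fail there instead of silently producing a wrong file name.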
Comment #6 by verylonglogin.reg — 2012-07-06T00:54:47Z
(In reply to comment #5)
> Then someone just needs to rig the current std.file to call toUTF16/toUTFz...
std.file works fine with non-ASCII strings. This is a std.stdio issue.
> ...and forward the result to the right _wfopen...
And std.file uses the plain WinAPI, not its buggy wrapper from the Digital Mars C runtime.
> ...This is all just embarrassing.
Yes, but std.stdio is even worse than you think (e.g. it can be 100x slower than direct C function calls as bearophile noted about rawWrite).
*** This issue has been marked as a duplicate of issue 7648 ***