Bug 23906 – Unicode file names are not properly handled
Status: NEW
Severity: normal
Priority: P1
Component: dmd
Product: D
Version: D2
Platform: All
OS: Windows
Creation time: 2023-05-08T15:49:17Z
Last change time: 2024-12-13T19:28:45Z
Assigned to: No Owner
Creator: Richard (Rikki) Andrew Cattermole
Moved to GitHub: dmd#20275
Comments
Comment #0
by alphaglosined — 2023-05-08T15:49:17Z
Unicode file names for D source code are apparently not handled correctly. This was reported by a non-Latin user.
https://forum.dlang.org/post/
[email protected]
Proposed steps to fix this: toWStringz should be converting from CP_UTF8 not CP_ACP (checked, this looks to be correct).
https://github.com/dlang/dmd/blob/be151e6d854c0df8af7ee88b6f380b6283ea824f/compiler/src/dmd/common/string.d#L136
As a counterproposal, I am suggesting the conversion of CreateProcessA to instead be CreateProcessW with the help of toWStringZ.
https://github.com/dlang/dmd/blob/master/compiler/src/dmd/link.d#L892
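The difference between converting from CP_UTF8 and from CP_ACP, as raised above, can be illustrated with a small Python sketch. This is not dmd code: the file name `привет.d` is hypothetical, and Windows-1251 is used only as a stand-in for CP_ACP, which in reality depends on the system locale.

```python
# Hypothetical non-Latin source file name; the compiler holds it as UTF-8 bytes.
name = "привет.d"
utf8_bytes = name.encode("utf-8")

# Decoding the bytes as UTF-8 (what a CP_UTF8 conversion does) round-trips.
as_utf8 = utf8_bytes.decode("utf-8")

# Decoding the same bytes with an 8-bit ANSI code page (assumed here to be
# Windows-1251) treats every byte as its own character and garbles the name.
as_acp = utf8_bytes.decode("cp1251")

print(as_utf8)
print(as_acp)
```

With the ANSI decoding, each two-byte Cyrillic sequence becomes two unrelated characters, so the resulting wide string no longer names the file on disk.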
Comment #1
by kinke — 2023-05-08T16:46:03Z
I've fixed this for LDC (AFAIK :D), by IIRC:
* Switching the main() C entry point on Windows to wmain(), so that it gets the cmdline params (source files, import dirs...) in UTF16 encoding, *not* the current 8-bit code page (CP_ACP). `_d_wrun_main` in druntime then converts those to proper UTF8 strings for _Dmain(). See:
https://github.com/dlang/dmd/blob/be151e6d854c0df8af7ee88b6f380b6283ea824f/compiler/src/dmd/mars.d#L872-L931
* Then redefining the `CodePage` enum in
https://github.com/dlang/dmd/blob/b87b011e0c91596b9722187192416a5a6534b16f/compiler/src/dmd/root/filename.d#L46
from `CP_ACP` to `CP_UTF8`.
That
https://github.com/dlang/dmd/blob/be151e6d854c0df8af7ee88b6f380b6283ea824f/compiler/src/dmd/common/string.d#L140
is new to me (and missed by LDC! - thx for the link) - it should definitely use `dmd.root.filename.CodePage` instead (is currently a *private* enum).
> suggesting the conversion of CreateProcessA to instead be CreateProcessW with the help of toWStringZ
Yes, all child process invocations on Windows should use the wide API.
Comment #2
by kinke — 2023-05-08T16:57:03Z
[Oh, switching to wmain() shouldn't be required; _d_run_main in druntime ignores the narrow cmdline args anyway and properly converts the UTF16 ones to UTF8 for _Dmain.]
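The conversion described in the comments above, where UTF16 command-line arguments are turned into UTF8 strings before they reach _Dmain, can be sketched roughly as follows. This is an illustrative Python sketch, not druntime's implementation; the helper name `utf16_args_to_utf8` and the sample argument are hypothetical.

```python
def utf16_args_to_utf8(wide_args):
    """Convert a list of UTF-16LE encoded arguments into UTF-8 byte strings."""
    return [arg.decode("utf-16-le").encode("utf-8") for arg in wide_args]

# Hypothetical argument in the form the wide Windows API delivers it (UTF-16LE).
wide = ["файл.d".encode("utf-16-le")]
narrow = utf16_args_to_utf8(wide)

print(narrow[0].decode("utf-8"))
```

Because the wide arguments never pass through the 8-bit code page, characters outside that code page survive the conversion.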
Comment #3
by robert.schadek — 2024-12-13T19:28:45Z
THIS ISSUE HAS BEEN MOVED TO GITHUB
https://github.com/dlang/dmd/issues/20275
DO NOT COMMENT HERE ANYMORE, NOBODY WILL SEE IT, THIS ISSUE HAS BEEN MOVED TO GITHUB