Bug 23508 – Unable to build Japanese-named source files

Status: NEW
Severity: normal
Priority: P3
Component: dmd
Product: D
Version: D2
Platform: x86
OS: Windows
Creation time: 2022-11-25T10:45:49Z
Last change time: 2024-12-13T19:25:56Z
Assigned to: No Owner
Creator: Marcelo Silva Nascimento Mancini
Moved to GitHub: dmd#20193

Comments

Comment #0 by msnmancini — 2022-11-25T10:45:49Z
わかもの.d ->
```
module わかもの;
```
app.d
```
import わかもの;
void main()
{
}
```
dmd -i source/app.d:
Error: unable to read module `πéÅπüïπééπü«`
ldc:
Error: unable to read module `わかもの`
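Note on the garbled name: the DMD message is consistent with the UTF-8 bytes of the module name being displayed in a single-byte code page. The sketch below is in Python rather than D so it runs on any platform, and it assumes code page 437 (a common Windows console code page); the report does not state which code page was actually active.

```python
# The module name as written in the source files.
name = "わかもの"

# Take the UTF-8 bytes of the name and read them back as CP437.
# CP437 is an assumption for this sketch, not something the report states.
garbled = name.encode("utf-8").decode("cp437")

print(garbled)  # πéÅπüïπééπü«

# The garbling is lossless: reversing it recovers the original name.
recovered = garbled.encode("cp437").decode("utf-8")
print(recovered == name)  # True
```

Under that assumption the garbled text matches the DMD error, which suggests that the bytes of the name were intact when the message was printed. It does not by itself explain why the file could not be read.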
Comment #1 by default_357-line — 2022-11-25T10:53:14Z
Works here. Are you sure you've saved app.d as UTF-8?
Comment #2 by msnmancini — 2022-11-25T10:55:08Z
Yes, are you on Windows?
Comment #3 by default_357-line — 2022-11-25T11:03:30Z
Nope, Linux. Maybe there's something weird with how Windows opens text files. But the spec says that the source file has to be (ASCII-7 or) UTF ( https://dlang.org/spec/intro.html#phases-of-compilation ). So if your file is that format, it should be expected to work. Sorry I can't test on Windows. Maybe that weird encoding is just a DMD output issue, and there's some other reason it can't find the file?
Comment #4 by msnmancini — 2022-11-25T11:07:27Z
This surely is a problem with Windows. I triple-checked, and both files are using UTF-8 encoding (I just checked in VS Code). I feel like this problem is related to the file-opening API.
Comment #5 by default_357-line — 2022-11-25T11:21:22Z
The relevant code seems to be compiler/src/dmd/common/string.d:122, which converts a D string (UTF-8) to the "system default Windows ANSI code page" (CP_ACP). This then gets passed to GetFileAttributesW to check that the file exists. (file_manager.d:51) Maybe throw a printf in there to see what it does?
Comment #6 by default_357-line — 2022-11-25T11:25:16Z
Hang the hell on. If the symbol name comes from the source, why the fuck does it use the *System ANSI code page* to *convert to UTF-16 from*?!
Comment #7 by default_357-line — 2022-11-25T11:32:51Z
Yeah this is definitely a DMD bug.

Data flow: Identifier (UTF) -> getFilename -> FileManager.lookForSourceFile -> FileName.exists -> extendedPathThen -> toWStringz treats string as default codepage, not UTF -> GetFileAttributesW (UTF treated as CA_ACP transcoded to UTF-16) -> error.

To reproduce, would be something like:
- have a Windows computer
- Settings, Region and Language, Administrative, Language for non-Unicode programs
- Select a language whose encoding is not compatible with UTF-8
- Compile file that imports module with non-ASCII UTF characters.
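The data flow above can be simulated without a Windows machine. This is a minimal sketch in Python, not the DMD code: `to_utf16_via_code_page` is a hypothetical stand-in for the narrow-to-wide conversion, and `cp1251` is an arbitrary example of an ANSI code page that is not compatible with UTF-8.

```python
def to_utf16_via_code_page(raw: bytes, code_page: str) -> bytes:
    """Hypothetical stand-in for a narrow-to-wide conversion:
    decode the bytes using the given code page, then encode as UTF-16LE."""
    return raw.decode(code_page, errors="replace").encode("utf-16-le")


module_file = "わかもの.d"
utf8_bytes = module_file.encode("utf-8")   # what the compiler holds internally
on_disk = module_file.encode("utf-16-le")  # the name as a wide-character API would see it

# Treating the UTF-8 bytes as an ANSI code page (assumed here: cp1251)
# yields a different wide string, so an existence check would miss the file.
wrong = to_utf16_via_code_page(utf8_bytes, "cp1251")

# Treating the same bytes as UTF-8 yields the name that is really on disk.
right = to_utf16_via_code_page(utf8_bytes, "utf-8")

print(wrong == on_disk)  # False
print(right == on_disk)  # True
```

The same bytes give a matching file name only when they are decoded as UTF-8, which is why the failure would depend on the "Language for non-Unicode programs" setting rather than on how the source files were saved.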
Comment #8 by default_357-line — 2022-11-25T11:35:42Z
CP_ACP*
Comment #9 by robert.schadek — 2024-12-13T19:25:56Z
THIS ISSUE HAS BEEN MOVED TO GITHUB: https://github.com/dlang/dmd/issues/20193. DO NOT COMMENT HERE ANYMORE, NOBODY WILL SEE IT.