Comment #0 by alphaglosined — 2022-06-12T13:51:53Z
The MSVC linker does not support Unicode characters in symbol names when creating import/export files.
This was not found earlier because of other blockers associated with DLLs.
For executables, we do have a test (runnable/testmodule.d), but I had to disable it for Windows to fix https://issues.dlang.org/show_bug.cgi?id=23177
Comment #1 by kinke — 2022-06-12T16:13:09Z
To be clear, we're talking about linker directives (cmdline option strings) embedded in COFF object files. LDC uses UTF-8 encoding for these (IIRC), and those work with the LLD linker but not with the MS linker. So I *guess* the MS linker expects some other encoding.
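To make the encoding guess concrete, here is a minimal Python sketch. It assumes an `/EXPORT:` directive for the symbol `_µ` (the name that shows up in the linker output later in this thread) and assumes, purely for illustration, that the consumer reads the bytes as Windows-1252. Which encoding the MS linker actually expects is not established here.

```python
# Hypothetical directive string; `_µ` is the symbol from the linker log in this thread.
directive = "/EXPORT:_µ"

# Bytes as an emitter using UTF-8 would write them.
utf8_bytes = directive.encode("utf-8")

# Bytes under an ANSI code page. cp1252 is an assumption, not a confirmed detail.
ansi_bytes = directive.encode("cp1252")

# What a cp1252 reader would see if it were handed the UTF-8 bytes.
misread = utf8_bytes.decode("cp1252")

print(utf8_bytes)   # b'/EXPORT:_\xc2\xb5'
print(ansi_bytes)   # b'/EXPORT:_\xb5'
print(misread)      # /EXPORT:_Âµ
```

The two byte sequences differ in length and content, so a reader that expects one encoding would end up with a different symbol name when given the other.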
Comment #2 by alphaglosined — 2022-06-12T16:40:00Z
After a bunch of hunting wrt. GetProcAddress, it seems Microsoft does not intend for exports to support anything other than ANSI. There are no A/W versions of this function, which, going by the usual WinAPI conventions, means that it only takes ANSI.
That brings us back to the fact that we will probably need to sanitize mangling so that it does not include Unicode, at least on Windows.
Comment #3 by dlang-bot — 2022-06-12T17:35:51Z
@rikkimax created dlang/dmd pull request #14207 "[DO NOT MERGE] Fix Issue 23179 - Unicode in symbol names in DLLs breaks MSVC linker" fixing this issue:
- Fix Issue 23179 - Unicode in symbol names in DLLs breaks MSVC linker
https://github.com/dlang/dmd/pull/14207
Comment #4 by alphaglosined — 2022-06-13T19:55:06Z
Created attachment 1854
Attempted fix as patch
After talking with kinke, we have decided to wait for this to appear in the wild before fixing.
I've attached my proposed fix as a patch, in case something happens to my fork with the branch containing it.
If you experience this, please do reply!
Comment #5 by bugzilla — 2023-01-26T07:53:34Z
There are other limitations on names we accept on Windows, such as the file names being insensitive to case. This has tripped up a handful of people, but people do accept it for what it is. It's not an onerous limitation.
If the Microsoft linker fails at Unicode characters, so be it. Turning them into hex makes the mangled names even uglier and longer. Demangling them also becomes another problem.
I suggest we just let Microsoft worry about this issue. They'll probably fix their linker eventually anyway. It's not worth us fixing it, then unfixing it when MS updates their linker.
So WONTFIX.
Comment #6 by alphaglosined — 2023-01-26T09:26:42Z
They won't eventually fix this.
It permeates the kernel and WinAPI as well.
It is an intentional limitation that occasionally becomes an issue on other platforms as well. Other languages like Rust use Punycode for encoding Unicode.
I picked hex for my implementation because it's easy to encode and also decode.
So making this WONTFIX not only prevents statically binding against C/C++ code, it also leaves people who have Unicode names in symbols with no option to compile their existing codebases as DLLs.
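As a rough illustration of why hex is easy to encode and decode, here is a minimal Python sketch of a reversible hex escape. The `$` marker, the function names, and the exact output format are all hypothetical; they are not taken from the attached patch or from any compiler's actual mangling scheme.

```python
ESCAPE = "$"  # hypothetical escape marker, chosen only for this sketch


def hex_escape_symbol(name: str) -> str:
    """Return an ASCII-only form of `name` by hex-escaping its UTF-8 bytes."""
    out = []
    for byte in name.encode("utf-8"):
        # Plain ASCII passes through, except the marker itself, which must be
        # escaped so that decoding stays unambiguous.
        if byte < 0x80 and byte != ord(ESCAPE):
            out.append(chr(byte))
        else:
            out.append(f"{ESCAPE}{byte:02X}")
    return "".join(out)


def hex_unescape_symbol(escaped: str) -> str:
    """Invert hex_escape_symbol, recovering the original Unicode name."""
    raw = bytearray()
    i = 0
    while i < len(escaped):
        if escaped[i] == ESCAPE:
            # The marker is always followed by exactly two hex digits.
            raw.append(int(escaped[i + 1:i + 3], 16))
            i += 3
        else:
            raw.append(ord(escaped[i]))
            i += 1
    return raw.decode("utf-8")


print(hex_escape_symbol("_µ"))         # _$C2$B5
print(hex_unescape_symbol("_$C2$B5"))  # _µ
```

The trade-off raised earlier in the thread still applies: each non-ASCII byte grows to three characters, so escaped names are longer, and a demangler has to undo the escape before it can do anything else.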
Comment #7 by alphaglosined — 2023-01-28T14:39:24Z
Okay, I may end up eating my words on this one.
I can't reproduce this on VS 2022.
But what I do get with dmd and ldc, rather than VC, is:
```
Creating library test.lib and object test.exp
test.exp : error LNK2001: unresolved external symbol _µ
Hint on symbols that are defined and could potentially match:
_µ
test.exe : fatal error LNK1120: 1 unresolved externals
```
So something isn't right. I will need to review this at some other point in time, and file a different bug report if I can figure out what is going on there.