This matter was tentatively identified by Rikki Cattermole, with help from several others in the community, during work on https://github.com/dlang/dmd/pull/12832/.
Under certain conditions that are not readily reproducible in a small test case, importing a module XYZ that was compiled separately with `-betterC` causes a linker error: the `__ModuleInfo` symbol that the XYZ module is supposed to define is missing.
One remedy is to define this inside XYZ:
extern(C) __gshared ModuleInfo _D3dmd7backend7ptrntab12__ModuleInfoZ;
The apparent reason is that importing a module in turn adds references to the ModuleInfo of every imported module; see the definition of the importedModules property in .object.ModuleInfo.
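To make the importedModules relationship concrete, here is a minimal sketch (not part of the original report) that walks every ModuleInfo registered with druntime and prints the modules each one references. It assumes a normal, non-betterC build; the helper name countModules and the empty static constructor are illustrative additions.

```d
import std.stdio : writeln;

// Illustrative only: an empty static constructor, so that this module
// itself is one of the modules that gets a ModuleInfo.
static this() {}

// Hypothetical helper: counts the ModuleInfo records known to druntime.
size_t countModules()
{
    size_t n;
    foreach (m; ModuleInfo) // ModuleInfo.opApply visits each registered module
        ++n;
    return n;
}

void main()
{
    foreach (m; ModuleInfo)
    {
        writeln(m.name);
        // Each entry is a reference to another module's ModuleInfo symbol.
        // These are the references that fail to link when the imported
        // module was built without a ModuleInfo.
        foreach (imp; m.importedModules)
        {
            if (imp !is null)
                writeln("    imports ", imp.name);
        }
    }
}
```

Each entry printed under a module corresponds to a symbol reference the linker has to resolve, which is why a missing __ModuleInfo in one imported module is enough to break the link.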
Comment #1 by ibuclaw — 2021-10-08T17:02:08Z
Is this really an issue? What use-case is there for compiling a D library with -betterC then using it from a D program?
Comment #2 by andrei — 2021-10-08T17:07:35Z
(In reply to Iain Buclaw from comment #1)
> Is this really an issue? What use-case is there for compiling a D library
> with -betterC then using it from a D program?
A use case is the dmd compiler itself (not sure why it's built this way).
Anyway, there are a variety of cases in which you want to build a library in D that could work with both C and D.
Comment #3 by alphaglosined — 2021-10-08T17:27:41Z
Comment #4 by pro.mathias.lang — 2021-10-21T13:53:41Z
Why does `mylib1` have a module ctor? Shouldn't `-betterC` reject it?
Comment #5 by ibuclaw — 2021-10-21T19:53:27Z
(In reply to Mathias LANG from comment #4)
> Why does `mylib1` have a module ctor? Shouldn't `-betterC` reject it?
I think someone had the ingenuity to make module ctors `pragma(crt_constructor)` in betterC mode.
Comment #6 by pro.mathias.lang — 2021-10-22T02:26:49Z
Shouldn't that be only for `shared static this`?
Comment #7 by bugzilla — 2022-11-30T07:42:03Z
> $ dmd -betterC -lib mylib1.d mylib2.d
This compiles mylib1.d and mylib2.d, and creates a library file mylib1.lib containing the object code from both.
> $ dmd -I. mylib1 myexe.d -main
This compiles mylib1.d and myexe.d together to form an executable named mylib1.exe. It fails to find anything from mylib2.d, because that wasn't given on the command line. This is not a compiler bug.
I think what you really meant was:
dmd -betterC mylib1.d mylib2.d
which creates mylib1.obj, into which the compiled versions of mylib1.d and mylib2.d are placed.
dmd -I. myexe.d mylib1.obj -main
which compiles myexe.d and links it to mylib1.obj, creating an executable named myexe.exe. Or at least it tries to, as it gives:
myexe.obj(myexe)
Error 42: Symbol Undefined __D6mylib212__ModuleInfoZ
because -betterC suppresses generating a ModuleInfo, while myexe.d expects it to be there. This is a compiler bug, or at least a compiler problem.
Comment #8 by bugzilla — 2022-11-30T08:14:07Z
A module generates a ModuleInfo if at least one of these is true:
1. it imports a module that generates ModuleInfo
2. it has a static constructor
3. it has a static destructor
4. it has a unit test declaration
ModuleInfo generation is disabled, however, if -betterC is on. This particular bug report is the result of having a static constructor.
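As an illustration of the triggers listed above, here is a small sketch of a module containing triggers 2 through 4. The variable name initialized is made up for the example, and the code assumes a regular (non-betterC) build.

```d
import core.stdc.stdio : printf;

// Hypothetical module-level state; thread-local by default in D.
int initialized;

// Trigger 2: a static constructor, run by druntime before main().
static this()
{
    initialized = 1;
}

// Trigger 3: a static destructor, run by druntime after main() returns.
static ~this()
{
    initialized = 0;
}

// Trigger 4: a unit test declaration (only run when built with -unittest).
unittest
{
    assert(initialized == 1);
}

// Trigger 1 would be an import of any module that itself has one of the
// declarations above.

void main()
{
    printf("initialized = %d\n", initialized);
}
```

All three declarations depend on druntime finding the module through its ModuleInfo, which is why suppressing the ModuleInfo under -betterC leaves them with nothing to run them.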
Comment #9 by bugzilla — 2022-11-30T08:25:59Z
Iain has the right idea. The solution is to, when in -betterC mode:
1. automatically annotate static constructors with:
pragma(crt_constructor) extern (C)
2. do the same for static destructors
3. not set `needmoduleinfo` for (1) and (2)
This will run the constructors and destructors using the C runtime library mechanism. The downside of this is the order of construction and destruction will be in the order the object files are seen by the linker, rather than a depth-first hierarchical order.
----
Mathias' suggestion is a good one. Give an error on `static this()`, and only work with `shared static this()`.
Comment #10 by bugzilla — 2022-11-30T09:06:25Z
That doesn't quite solve the problem. Will have to think about it some more.
Comment #11 by bugzilla — 2022-11-30T09:58:32Z
mylib1.d has a static constructor in it. When does construction happen?
In C code, the C runtime takes care of it, in the order they appear to the linker.
In D code, the D startup code takes care of it, *after* the C runtime does its initializations, in depth-first order.
The two are different, and are irreconcilable (though most static constructors probably don't care about the order, we can't really rely on that).
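The difference in timing can be observed with a sketch like the following, which records the order in which a C-runtime constructor, a `shared static this`, and a `static this` run in a regular D program. The names step, crtStep, sharedStep, tlsStep and crtInit are invented for this example.

```d
import core.stdc.stdio : printf;

// Hypothetical bookkeeping. __gshared places these in plain global storage,
// so they are usable before druntime has started.
__gshared int step;
__gshared int crtStep, sharedStep, tlsStep;

// Run by the C runtime startup code, before druntime is initialized.
pragma(crt_constructor)
extern (C) void crtInit()
{
    crtStep = ++step;
}

// Run by druntime, once per process.
shared static this()
{
    sharedStep = ++step;
}

// Run by druntime, once per thread (here: the main thread).
static this()
{
    tlsStep = ++step;
}

void main()
{
    printf("crt_constructor:    %d\n", crtStep);
    printf("shared static this: %d\n", sharedStep);
    printf("static this:        %d\n", tlsStep);
}
```

Within a single module the order is fixed as above; the irreconcilable part is the order across modules, where the C mechanism follows link order and druntime follows the import graph.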
myexe.d has no way to know that it is importing a betterC module, so it can't do the right thing with the construction.
So, I propose another solution. mylib1.d simply has to choose whether it wants to do C construction or D construction. C construction would be:
pragma(crt_constructor) extern (C) static this() { ... }
D construction would be:
static this() { ... }
myexe.d, upon seeing the D static constructor, is going to expect a ModuleInfo from mylib1.d. When the compiler compiles mylib1.d with -betterC and sees a D static constructor, it can create a ModuleInfo for that static constructor.
A programmer creating a betterC library for both betterC and D programs would use:
pragma(crt_constructor) extern (C) static this() { ... }
Comment #12 by alphaglosined — 2022-11-30T21:06:09Z
D module constructors of course shouldn't work in -betterC code. They can throw a warning as dead code and hence don't require ModuleInfo to be generated (this should be easy to resolve). That'll prevent surprises in the future.
However, the fundamental problem here is with pay-as-you-go runtime. You can't turn on ModuleInfo when you need it in -betterC to move you into pay-as-you-go area of the scale.
We can't turn on ModuleInfo generation right now, because DllImport is incomplete (and a good bit harder to implement). If we did turn it on right now, it would result in segfaults when D DLLs are used together with an executable that links the D runtime.
I am arguing that instead of fixing this bug, we solve the DllImport issues first and use code like this as test cases to verify that it is indeed fixed.
Comment #13 by bugzilla — 2022-12-01T03:32:57Z
In this case, ModuleInfo is how the D runtime runs static constructors. Programs compiled with betterC are meant to link only with the C runtime library, which knows nothing about ModuleInfo.
The problem here is writing a library that is compiled with betterC, and meant to be linked with either a betterC program or a D program.
Simply turning off ModuleInfo generation means the betterC library does not run its static constructors.
Since a D program that is importing betterC modules does not know whether they are betterC modules or not, it is each betterC module's responsibility to choose how to do its own static construction.
I.e. a betterC module should use the following to run its static construction:
pragma(crt_constructor) extern (C) void doMyStaticConstruction() { ... }
If a betterC module is to be linked only with a D main, it should do:
static this() { ... }
It should not do both, as then the static constructions will get run twice if it is linked with a D main.
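A minimal sketch of the first option, assuming a hypothetical library module that fills a lookup table at startup, might look like this. The names squares, mylibInit and mylibSquare are invented for illustration; the file is written so that it builds with or without -betterC.

```d
import core.stdc.stdio : printf;

// Hypothetical library state, in plain global storage (no TLS needed).
__gshared int[4] squares;

// Called by the C runtime startup code, so it runs whether the final
// program is a betterC program or a full D program.
pragma(crt_constructor)
extern (C) void mylibInit()
{
    foreach (i; 0 .. squares.length)
        squares[i] = cast(int)(i * i);
}

// Hypothetical library entry point that relies on the initialized table.
extern (C) int mylibSquare(size_t i)
{
    return squares[i];
}

// A C-style main keeps the example free of any dependency on druntime.
extern (C) int main()
{
    printf("%d\n", mylibSquare(3));
    return 0;
}
```

Because the initializer is an ordinary extern (C) function, no ModuleInfo is needed to find it, and it is run exactly once regardless of which kind of program links the library.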
The change to fix this bug report, then, is for betterC modules to generate a ModuleInfo if it has a `static this` constructor. And to add these instructions to the documentation.
The DLL export stuff is an orthogonal problem.
Comment #14 by alphaglosined — 2022-12-01T03:47:01Z
(In reply to Walter Bright from comment #13)
> The change to fix this bug report, then, is for betterC modules to generate
> a ModuleInfo if it has a `static this` constructor. And to add these
> instructions to the documentation.
This should not be automatic.
-betterC is a collection of switches all rolled in together. One of these is turning off of ModuleInfo generation. This is how it works in both LDC and GDC.
It needs to be opt-in via switches. Otherwise, behavior that wasn't expected may occur.
Comment #15 by dlang-bot — 2022-12-02T06:07:19Z
@WalterBright created dlang/dmd pull request #14665 "fix Issue 22367 - Modules compiled with -betterC never generate a Mod…" fixing this issue:
- fix Issue 22367 - Modules compiled with -betterC never generate a ModuleInfo
https://github.com/dlang/dmd/pull/14665
Comment #16 by bugzilla — 2022-12-02T06:09:53Z
Rather than turning ModuleInfo generation on or off directly, it is better to allow or disallow the triggers that cause the ModuleInfo to be generated.
For example, if a static constructor is written, and the ModuleInfo is suppressed, the program will link but the static constructor will never be run, causing the resulting program to not behave as expected.
Comment #17 by bugzilla — 2022-12-06T09:20:57Z
I looked further into this.
Essentially, betterC code cannot generate ModuleInfo, because ModuleInfo also generates a call to _d_so_registry in druntime.
Having a `static this` in a betterC module, or a betterC module importing a module with a `static this`, requires a ModuleInfo to guarantee the semantics. Simply turning off ModuleInfo generation will get the program to link, but will leave the static this code un-run, i.e. the code will not work.
Instead, betterC code must use pragma(crt_constructor) functions instead to perform static initializations. These functions will be called by the C runtime startup code.
To rely on pragma(crt_constructor) means fixing the reported problems with them, and making sure they are correctly defined. To that end, there is:
https://github.com/dlang/dmd/pull/14669
which is to be followed by going through druntime to replace `static this` with `pragma(crt_constructor)` wherever possible, such as with:
https://github.com/dlang/dmd/pull/14671
Comment #18 by robert.schadek — 2024-12-13T19:18:40Z