Bug 24590 – Illegal instruction with module constructors cycle and shared libphobos2 in _d_criticalenter2

Status
RESOLVED
Resolution
FIXED
Severity
enhancement
Priority
P1
Component
druntime
Product
D
Version
D2
Platform
x86_64
OS
Linux
Creation time
2024-06-08T11:45:31Z
Last change time
2024-06-12T00:43:08Z
Assigned to
No Owner
Creator
Andrei Horodniceanu

Comments

Comment #0 by a.horodniceanu — 2024-06-08T11:45:31Z
Take the two files:
----
module a;
import b;
shared static this(){}
----
----
module b;
import a;
shared static this(){}
----
compiled with `dmd a.d b.d -defaultlib=libphobos2.so -main -ofmain`. Running main produces:
----
$ ./main
object.Error@src/rt/minfo.d(372): Cyclic dependency between module constructors/destructors of b and a
 b* -> a* -> b*
----------------
??:? nothrow bool rt.minfo.ModuleGroup.sortCtors(immutable(char)[]).findDeps(ulong, ulong*) [0x7f240dc5a1e9]
??:? nothrow bool rt.minfo.ModuleGroup.sortCtors(immutable(char)[]).processMod(ulong) [0x7f240dc5a3fa]
??:? nothrow bool rt.minfo.ModuleGroup.sortCtors(immutable(char)[]).processMod(ulong) [0x7f240dc5a4eb]
??:? nothrow bool rt.minfo.ModuleGroup.sortCtors(immutable(char)[]).doSort(ulong, ref immutable(object.ModuleInfo)*[]) [0x7f240dc5a773]
??:? nothrow void rt.minfo.ModuleGroup.sortCtors(immutable(char)[]) [0x7f240dc59c89]
??:? void rt.minfo.ModuleGroup.sortCtors() [0x7f240dc5a838]
??:? int rt.minfo.rt_moduleCtor().__foreachbody1(ref rt.sections_elf_shared.DSO) [0x7f240dc5abe8]
??:? int rt.sections_elf_shared.DSO.opApply(scope int delegate(ref rt.sections_elf_shared.DSO)) [0x7f240dc5c557]
??:? rt_moduleCtor [0x7f240dc5abc8]
??:? rt_init [0x7f240dc524c9]
??:? void rt.dmain2._d_run_main2(char[][], ulong, extern (C) int function(char[][])*).runAll() [0x7f240dc52a6f]
??:? void rt.dmain2._d_run_main2(char[][], ulong, extern (C) int function(char[][])*).tryExec(scope void delegate()) [0x7f240dc52a0d]
??:? _d_run_main2 [0x7f240dc52976]
??:? _d_run_main [0x7f240dc5275f]
??:? main [0x55f47ddba1af]
??:? [0x7f240d5660cf]
??:? __libc_start_main [0x7f240d566188]
??:? _start [0x55f47ddba084]
Illegal instruction (core dumped)
----
The core dump shows:
----
(gdb) bt
#0 0x00007f240dc519dd in _d_criticalenter2 () from /usr/lib/dmd/2.108/lib64/libphobos2.so.0.108
#1 0x00007f240dc5f498 in rt.trace._staticDtor_L408_C1() () from /usr/lib/dmd/2.108/lib64/libphobos2.so.0.108
#2 0x00007f240dc603c1 in rt.trace.__moddtor() () from /usr/lib/dmd/2.108/lib64/libphobos2.so.0.108
#3 0x00007f240dc5aece in rt.minfo.runModuleFuncsRev!(rt.minfo.ModuleGroup.runTlsDtors().__lambda1).runModuleFuncsRev(const(immutable(object.ModuleInfo)*)[]) () from /usr/lib/dmd/2.108/lib64/libphobos2.so.0.108
#4 0x00007f240dc5a8b9 in rt.minfo.ModuleGroup.runTlsDtors() () from /usr/lib/dmd/2.108/lib64/libphobos2.so.0.108
#5 0x00007f240dc5d5c2 in _d_dso_registry () from /usr/lib/dmd/2.108/lib64/libphobos2.so.0.108
#6 0x00007f240dbe83b6 in ?? () from /usr/lib/dmd/2.108/lib64/libphobos2.so.0.108
#7 0x0000000000000001 in ?? ()
#8 0x00007f240dda99a0 in ?? () from /usr/lib/dmd/2.108/lib64/libphobos2.so.0.108
#9 0x00007f240ddd2fe0 in vtable for etc.linux.memoryerror.NullPointerError () from /usr/lib/dmd/2.108/lib64/libphobos2.so.0.108
#10 0x00007f240ddd4660 in ?? () from /usr/lib/dmd/2.108/lib64/libphobos2.so.0.108
#11 0x00007f240de640b0 in ?? ()
#12 0x00007f240de670e2 in _dl_call_fini () from /lib64/ld-linux-x86-64.so.2
#13 0x00007f240de6af9e in _dl_fini () from /lib64/ld-linux-x86-64.so.2
#14 0x00007f240d57e656 in ?? () from /usr/lib64/libc.so.6
#15 0x00007f240d57e790 in exit () from /usr/lib64/libc.so.6
#16 0x00007f240d5660d7 in ?? () from /usr/lib64/libc.so.6
#17 0x00007f240d566189 in __libc_start_main () from /usr/lib64/libc.so.6
#18 0x000055f47ddba085 in _start ()
----
And the surrounding assembly:
----
Dump of assembler code for function _d_criticalenter2:
   0x00007f240dc519a8 <+0>: push %rbp
   0x00007f240dc519a9 <+1>: mov %rsp,%rbp
   0x00007f240dc519ac <+4>: sub $0x30,%rsp
   0x00007f240dc519b0 <+8>: mov %rbx,-0x28(%rbp)
   0x00007f240dc519b4 <+12>: mov %r12,-0x20(%rbp)
   0x00007f240dc519b8 <+16>: mov %rdi,%r12
   0x00007f240dc519bb <+19>: cmpq $0x0,(%r12)
   0x00007f240dc519c0 <+24>: jne 0x7f240dc51a5e <_d_criticalenter2+182>
   0x00007f240dc519c6 <+30>: mov 0xff003(%rip),%rbx # 0x7f240dd509d0
   0x00007f240dc519cd <+37>: add $0x8,%rbx
   0x00007f240dc519d1 <+41>: mov %rbx,%rdi
   0x00007f240dc519d4 <+44>: call 0x7f240da893b0 <pthread_mutex_lock@plt>
   0x00007f240dc519d9 <+49>: test %eax,%eax
   0x00007f240dc519db <+51>: je 0x7f240dc519df <_d_criticalenter2+55>
=> 0x00007f240dc519dd <+53>: ud2
   0x00007f240dc519df <+55>: cmpq $0x0,(%r12)
   0x00007f240dc519e4 <+60>: jne 0x7f240dc51a45 <_d_criticalenter2+157>
   # <snip>
----
Note that the eax register holds the value 22, which is EINVAL; this would indicate that the mutex passed to pthread_mutex_lock is uninitialized. I say this, yet I don't know how the code got past the `test %eax,%eax`.

The code above fails with both dmd and ldc2 but not with gdc.
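For readers following the disassembly: the `je` at `<+51>` is taken only when `%eax` is zero, so with `%eax` holding 22 execution falls through to the `ud2` at `<+53>`, which is the instruction gdb flags. That shape is what one would expect from a checked lock whose failure branch is `assert(0)`, assuming the library was built in release mode where the compiler lowers `assert(0)` to a trap instruction. The sketch below only illustrates that pattern; `lockOrHalt` and `lockUnlockRoundTrip` are hypothetical names and this is not druntime's source.

```d
import core.sys.posix.pthread : pthread_mutex_init, pthread_mutex_lock,
    pthread_mutex_unlock, pthread_mutex_destroy;
import core.sys.posix.sys.types : pthread_mutex_t;

/// Hypothetical helper: lock `m` and halt the program if locking fails.
void lockOrHalt(pthread_mutex_t* m)
{
    const rc = pthread_mutex_lock(m); // return value arrives in %eax
    if (rc != 0)                      // test %eax,%eax ; je <continue>
        assert(0, "pthread_mutex_lock failed"); // fall-through: trap (e.g. ud2)
}

/// Hypothetical round trip on a properly initialized mutex.
/// Returns 0 on success, otherwise the first pthread error code seen.
int lockUnlockRoundTrip()
{
    pthread_mutex_t m;
    int rc = pthread_mutex_init(&m, null);
    if (rc != 0)
        return rc;
    lockOrHalt(&m); // initialized mutex: returns normally
    rc = pthread_mutex_unlock(&m);
    if (rc != 0)
        return rc;
    return pthread_mutex_destroy(&m);
}

void main()
{
    const rc = lockUnlockRoundTrip();
    assert(rc == 0);
}
```

The sketch deliberately never locks a destroyed mutex: POSIX leaves that undefined, and an `EINVAL` return such as the one reported here is only one possible outcome.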
Comment #1 by a.horodniceanu — 2024-06-09T03:35:42Z
> The code above fails with both dmd and ldc2 but not with gdc.

My bad, I forgot to pass `-shared-libphobos` when building with gdc. Without the flag I get:
----
object.Error@/var/tmp/portage/sys-devel/gcc-14.1.1_p20240518/work/gcc-14-20240518/libphobos/libdruntime/rt/minfo.d(372): Cyclic dependency between module constructors/destructors of b and a
 b* -> a* -> b*
----
And when passing the flag I get:
----
object.Error@/var/tmp/portage/sys-devel/gcc-14.1.1_p20240518/work/gcc-14-20240518/libphobos/libdruntime/rt/minfo.d(372): Cyclic dependency between module constructors/destructors of b and a
 b* -> a* -> b*
----------------
??:? _d_createTrace [0x7fafb9cfe3fb]
??:? _d_throw [0x7fafb9cf23b4]
??:? ???[0x7fafb9d02c08]
??:? ???[0x7fafb9d02b7d]
??:? ???[0x7fafb9d02d87]
??:? nothrow void rt.minfo.ModuleGroup.sortCtors(immutable(char)[]) [0x7fafb9d04317]
??:? ???[0x7fafb9d04427]
??:? int gcc.sections.elf.DSO.opApply(scope int delegate(ref gcc.sections.elf.DSO)) [0x7fafb9cf321a]
??:? rt_init [0x7fafb9cfe84a]
??:? ???[0x7fafb9cfe9ef]
??:? _d_run_main2 [0x7fafb9cfed1b]
??:? _d_run_main [0x7fafb9cfeefc]
/usr/lib/gcc/x86_64-pc-linux-gnu/14/include/d/core/internal/entrypoint.d:29 main [0x560cd1844306]
??:? ???[0x7fafb96592df]
??:? __libc_start_main [0x7fafb9659398]
??:? _start [0x560cd1844094]
??:? ???[0xffffffffffffffff]
----
Which is an improvement since it's not a crash.
Comment #2 by kinke — 2024-06-09T12:32:17Z
AFAICT, this happens because at program termination, each binary unregisters itself via a CRT destructor, incl. running the module dtors of that binary if druntime had been initialized via `rt_init()`. For libphobos2.so, this includes the `rt.trace` module dtor (whereas static druntime probably doesn't unless using tracing functionality), which uses an anonymous mutex, which depends on an initialized druntime (`_d_critical_init()` without `_d_critical_term()`). But `rt_init()` already cleans up via `_d_critical_term()` if initialization failed, e.g., due to module cycles here.

AFAICT, the problem is `rt.sections_elf_shared._isRuntimeInitialized`, which is set once during the `initSections()` call, but would need to be unset if module ctors later throw. We should probably get rid of that variable and use `rt.dmain2._initCount` instead.

This also crashes with `-defaultlib=libphobos2.so` only:
```
shared static this() { throw new Exception("oops"); }
void main() {}
```
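The ordering problem described above can be modelled in a few lines. This is a toy sketch, not druntime code: `ToyRuntime`, `initialize`, `unloadDso` and `criticalReady` are hypothetical names, `initCount` merely stands in for `rt.dmain2._initCount`, and the change that was eventually merged may be implemented differently. It assumes only what the comment states: a failed initialization tears the critical-section state down again, and the unload path must not run module dtors in that case.

```d
/// Toy model of the runtime state discussed above; NOT druntime code.
struct ToyRuntime
{
    int initCount;      // stands in for rt.dmain2._initCount
    bool criticalReady; // true between _d_critical_init() and _d_critical_term()

    /// Models rt_init(); `ctorsThrow` simulates a module cycle or a throwing ctor.
    /// Returns true if initialization succeeded.
    bool initialize(bool ctorsThrow)
    {
        criticalReady = true;      // _d_critical_init()
        if (ctorsThrow)
        {
            criticalReady = false; // _d_critical_term() after failed init
            return false;
        }
        ++initCount;
        return true;
    }

    /// Models the CRT destructor of a shared library at process exit.
    /// Returns true if module destructors were run.
    bool unloadDso()
    {
        if (initCount == 0)
            return false; // runtime never came up: skip module dtors
        // A dtor that takes the anonymous mutex is only safe from here on.
        assert(criticalReady, "module dtor would lock a torn-down mutex");
        return true;
    }
}

void main()
{
    ToyRuntime failed;
    const failedInit = failed.initialize(true);
    assert(!failedInit);
    const failedRan = failed.unloadDso();
    assert(!failedRan); // dtors skipped, no crash

    ToyRuntime good;
    const goodInit = good.initialize(false);
    assert(goodInit);
    const goodRan = good.unloadDso();
    assert(goodRan); // dtors run as usual
}
```

Keying the guard on a counter that is only incremented after module ctors succeed avoids having to remember to reset a separate flag on every failure path.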
Comment #3 by dlang-bot — 2024-06-11T12:26:19Z
@kinke created dlang/dmd pull request #16578 "Fix Bugzilla 24590 - Don't run module destructors if druntime failed to initialize" mentioning this issue:

- Add test case for Bugzilla 24590

https://github.com/dlang/dmd/pull/16578
Comment #4 by dlang-bot — 2024-06-12T00:43:08Z
dlang/dmd pull request #16578 "Fix Bugzilla 24590 - Don't run module destructors if druntime failed to initialize" was merged into master:

- 83b62e9633dbe8b789065cec7a7253cf8480e5e2 by Martin Kinkelin: Add test case for Bugzilla 24590
- 6f20e8f9da84c9498a3c865734b14bbae6d5213a by Martin Kinkelin: Fix Bugzilla 24590 - Don't run module destructors if druntime failed to initialize

https://github.com/dlang/dmd/pull/16578