Bug 22843 – Program hangs on full gc collect with --DRT-gcopt=fork:1 if run under valgrind/callgrind

Status
RESOLVED
Resolution
FIXED
Severity
normal
Priority
P1
Component
druntime
Product
D
Version
D2
Platform
x86_64
OS
Linux
Creation time
2022-03-03T19:40:59Z
Last change time
2022-03-28T11:33:58Z
Keywords
pull
Assigned to
No Owner
Creator
JR

Comments

Comment #0 by zorael — 2022-03-03T19:40:59Z
Manjaro/Arch x86_64, dmd 2.098.1, ldc 1.28.1.

When the new forking garbage collector is enabled, the program hangs on what I imagine is the first collection, if run under the valgrind/callgrind profiler.

```d
import std.string : succ;

extern(C) __gshared string[] rt_options =
    [ "gcopt=profile:1 fork:1 initReserve:0 minPoolSize:0" ];

void main()
{
    string[string] aa;
    string key = "a";

    foreach (immutable i; 0..1_000)
    {
        aa[key] = key;
        key = key.succ;
    }
}
```

```
$ dmd fork.d && valgrind --tool=callgrind ./fork
==799709== Callgrind, a call-graph generating cache profiler
==799709== Copyright (C) 2002-2017, and GNU GPL'd, by Josef Weidendorfer et al.
==799709== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==799709== Command: ./fork
==799709==
==799709== For interactive control, run 'callgrind_control -h'.
[hangs here, 100% cpu use until Ctrl+C]
```

```
$ callgrind_control -b | ddemangle
PID 799709: ./fork
sending command status internal to pid 799709

  Frame: Backtrace for Thread 1
   [ 0]  clock_gettime@@GLIBC_2.17 (32260048 x)
   [ 1]  nothrow ulong core.internal.gc.impl.conservative.gc.Gcx.fullcollect(bool, bool, bool) (1 x)
   [ 2]  nothrow void* core.internal.gc.impl.conservative.gc.Gcx.smallAlloc(ulong, ref ulong, uint, const(TypeInfo)) (2 x)
   [ 3]  nothrow void* core.internal.gc.impl.conservative.gc.ConservativeGC.runLocked!(core.internal.gc.impl.conservative.gc.ConservativeGC.mallocNoSync(ulong, uint, ref ulong, const(TypeInfo)), core.internal.gc.impl.conservative.gc.mallocTime, core.internal.gc.impl.conservative.gc.numMallocs, ulong, uint, ulong, const(TypeInfo)).runLocked(ref ulong, ref uint, ref ulong, ref const(TypeInfo)) (1 x)
   [ 4]  nothrow void* core.internal.gc.impl.conservative.gc.ConservativeGC.calloc(ulong, uint, const(TypeInfo)) (1 x)
   [ 5]  _THUNK16 (1 x)
   [ 6]  gc_calloc (1 x)
   [ 7]  ref rt.aaA.Impl rt.aaA.Impl.__ctor(scope const(TypeInfo_AssociativeArray), ulong) (1 x)
   [ 8]  _aaGetX (1 x)
   [ 9]  _aaGetY (1 x)
   [10]  _Dmain (1 x)
   [11]  void rt.dmain2._d_run_main2(char[][], ulong, extern (C) int function(char[][])*).runAll().__lambda2() (1 x)
   [12]  void rt.dmain2._d_run_main2(char[][], ulong, extern (C) int function(char[][])*).tryExec(scope void delegate()) (1 x)
   [13]  void rt.dmain2._d_run_main2(char[][], ulong, extern (C) int function(char[][])*).runAll() (1 x)
   [14]  void rt.dmain2._d_run_main2(char[][], ulong, extern (C) int function(char[][])*).tryExec(scope void delegate()) (1 x)
   [15]  _d_run_main2 (1 x)
   [16]  __alloca (1 x)
   [17]  _d_run_main2 (1 x)
   [18]  _d_run_main (1 x)
   [19]  __alloca (1 x)
   [20]  _d_run_main (1 x)
   [21]  main (1 x)
   [22]  (below main) (1 x)
   [23]  __libc_start_main@@GLIBC_2.34 (1 x)
   [24]  (below main) (1 x)
   [25]  0x000000000001d930
```

I could reproduce it with both dmd and ldc, though naturally the backtrace differs slightly between them. It does not happen with `--DRT-gcopt="fork:0"`.
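
For anyone who needs to profile under callgrind before a fix is available, the observation above suggests a workaround. The following is a minimal sketch of the same test program with the forking collector switched off through the embedded `rt_options` instead of the command line. It assumes that `fork:0` in `rt_options` behaves the same as `--DRT-gcopt="fork:0"`, which was the form actually tested in this report.

```d
import std.string : succ;

// Same options as the reproduction case, except fork:0 instead of fork:1.
// Assumption: the embedded option has the same effect as passing
// --DRT-gcopt="fork:0" at run time.
extern(C) __gshared string[] rt_options =
    [ "gcopt=profile:1 fork:0 initReserve:0 minPoolSize:0" ];

void main()
{
    string[string] aa;
    string key = "a";

    // Allocate enough associative array entries to trigger collections.
    foreach (immutable i; 0..1_000)
    {
        aa[key] = key;
        key = key.succ;
    }
}
```

Unless `rt_cmdline_enabled` has been set to false in the program, options given on the command line are still parsed, so `--DRT-gcopt=fork:1` can be passed to re-enable the forking collector for a single run.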
Comment #1 by dlang-bot — 2022-03-26T19:02:34Z
@huglovefan created dlang/druntime pull request #3788 "fix Issue 22843 - Program hangs on full gc collect with --DRT-gcopt=f…" fixing this issue:

- fix Issue 22843 - Program hangs on full gc collect with --DRT-gcopt=fork:1 if run under valgrind/callgrind

  the clone() call was using the `CLONE_CHILD_CLEARTID` flag without passing it a thread id pointer in the optional argument

  based on the comment `// child thread id not needed`, the flag was probably intended to disable any passing of a thread id
  which is already the default if neither of `CLONE_CHILD_CLEARTID` and `CLONE_CHILD_SETTID` are used, so just remove the flag

https://github.com/dlang/druntime/pull/3788
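
To illustrate the flag change described in the pull request, here is a small standalone sketch, not the druntime code itself. It calls `clone()` with only `SIGCHLD` in the flags, so no child thread id pointer is expected in the optional arguments. The names `childEntry` and `runChildWithClone` are hypothetical. The sketch assumes the `clone` binding in `core.sys.linux.sched` is available in the druntime in use, and that the stack grows downward, as it does on x86_64 Linux.

```d
import core.stdc.stdio : printf;
import core.stdc.stdlib : free, malloc;
import core.sys.linux.sched : clone;
import core.sys.posix.signal : SIGCHLD;
import core.sys.posix.sys.wait : waitpid, WEXITSTATUS, WIFEXITED;

// Hypothetical child entry point: returns the int that arg points to.
// The value returned here becomes the exit status of the child process.
extern (C) int childEntry(void* arg) nothrow @nogc
{
    return *cast(int*) arg;
}

// Hypothetical helper: runs childEntry in a cloned process and returns
// the child's exit status, or -1 if any step fails.
int runChildWithClone(int exitCode)
{
    enum stackSize = 64 * 1024;
    auto stack = cast(ubyte*) malloc(stackSize);
    if (stack is null)
        return -1;
    scope (exit) free(stack);

    // Only the termination signal is set. Neither CLONE_CHILD_CLEARTID nor
    // CLONE_CHILD_SETTID is used, so no thread id pointer has to be passed.
    const flags = SIGCHLD;

    int code = exitCode;
    // The stack grows downward here, so pass the top of the buffer.
    const pid = clone(&childEntry, stack + stackSize, flags, &code);
    if (pid == -1)
        return -1;

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

void main()
{
    printf("child exit status: %d\n", runChildWithClone(7));
}
```

Because `CLONE_VM` is not set, the child runs on a copy of the parent's address space, much like `fork()`, which is why a pointer to a local variable of the parent can be handed to it.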
Comment #2 by dlang-bot — 2022-03-28T11:33:58Z
dlang/druntime pull request #3788 "fix Issue 22843 - Program hangs on full gc collect with --DRT-gcopt=f…" was merged into master:

- f9e806cc906508199ba96b2d299f2301568a0a19 by human:
  fix Issue 22843 - Program hangs on full gc collect with --DRT-gcopt=fork:1 if run under valgrind/callgrind

  the clone() call was using the `CLONE_CHILD_CLEARTID` flag without passing it a thread id pointer in the optional argument

  based on the comment `// child thread id not needed`, the flag was probably intended to disable any passing of a thread id
  which is already the default if neither of `CLONE_CHILD_CLEARTID` and `CLONE_CHILD_SETTID` are used, so just remove the flag

- 56bf314d41166c61b72a80737d519440647d3dee by human:
  add test for issue 22843

- 5fc7da62cc5eab8b422079017dce0d928dd0de0b by human:
  install valgrind on cirrus to run the test for issue 22843

https://github.com/dlang/druntime/pull/3788