Bug 19978 – D sometimes just crashes on exit with daemon threads

Status
NEW
Severity
critical
Priority
P2
Component
druntime
Product
D
Version
D2
Platform
x86_64
OS
Linux
Creation time
2019-06-17T06:11:30Z
Last change time
2024-12-07T13:39:25Z
Keywords
pull
Assigned to
No Owner
Creator
FeepingCreature
Moved to GitHub: dmd#17385

Comments

Comment #0 by default_357-line — 2019-06-17T06:11:30Z
Consider the following code:

    void main() {
        import core.thread : Thread;
        with (new Thread({ })) {
            isDaemon = true;
            start;
        }
    }

On Linux, this builds and runs successfully, maybe 99% of the time. But if you run it in a loop:

    while true; do ./test || break; done

You will see that it segfaults after a few seconds.

I asked the forum for repros ( https://forum.dlang.org/post/[email protected] ), and got successful crashes on these systems:

- DMD 2.080.0, Linux 4.18.0-20 x86_64
- DMD 2.086.0, Linux 5.0.0-16-generic x86_64
- DMD 2.086.0, Linux hal9000 5.1.9-zen1-1-zen #1 ZEN SMP PREEMPT Tue Jun 11
- LDC2-1.11, Linux 4.14.111
- DMD64 2.086.0, macOS 10.13.6; Darwin Kernel Version 17.7.0; x86_64

and a failure to reproduce under Windows 10, dmd 2.082.0 and ldc2 1.12.0-beta2.

The crashes on macOS and with LDC effectively exclude backend issues and kernel bugs, and strongly hint at a druntime bug. See the forum thread for some backtraces.
Comment #1 by atila.neves — 2019-11-25T11:16:12Z
I copied the code, compiled it with dmd 2.089.0, and ran the shell script in the description. It's still running minutes later. No crashes to report on Arch Linux, kernel version 5.13.12-arch1-1.
Comment #2 by default_357-line — 2019-11-25T18:48:56Z
Still reproducible for me with dmd/druntime/phobos master on Linux 4.14.118 (NixOS). Surely not a kernel bug? Maybe Arch changed something. If you're bored, try in an Ubuntu VM?
Comment #3 by default_357-line — 2020-05-08T12:53:15Z
Okay, got it.

The problem is that D doesn't join a daemon thread when shutting down. As a result, the GC shutdown sequence unmaps the thread's memory right under it while the thread is still running.

Why do we do this anyway? (gc_term>os_mem_unmap) Why not let the OS handle the frees?
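
A minimal sketch of a user-side workaround consistent with the diagnosis above: if the program keeps a handle to the daemon thread, asks it to stop, and joins it before main returns, the thread is no longer running when the runtime tears down the GC. The `stopRequested` flag, the polling loop, and the `startWorker`/`stopWorker` helpers are hypothetical names introduced for illustration; this is not the fix proposed for druntime.

```d
import core.atomic : atomicLoad, atomicStore;
import core.thread : Thread;
import core.time : msecs;

// Hypothetical stop flag, shared between main and the worker.
shared bool stopRequested;

// Start a daemon thread that polls the stop flag instead of running forever.
Thread startWorker()
{
    auto worker = new Thread({
        while (!atomicLoad(stopRequested))
            Thread.sleep(1.msecs);
    });
    worker.isDaemon = true;
    worker.start();
    return worker;
}

// Ask the worker to finish and wait for it, so it is gone before
// runtime shutdown begins.
void stopWorker(Thread worker)
{
    atomicStore(stopRequested, true);
    worker.join();
}

void main()
{
    auto worker = startWorker();
    // ... the program's real work would go here ...
    stopWorker(worker);
}
```

This only helps for daemon threads that can be asked to stop; a thread blocked indefinitely in a system call would need a different way to be woken up.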
Comment #4 by razvan.nitu1305 — 2023-02-21T16:19:16Z
(In reply to FeepingCreature from comment #3)
> Okay, got it.
>
> The problem is with a daemon thread, D doesn't join it when shutting down.
> As a result, the GC shutdown sequence deletes the thread's memory right
> under it.
>
> Why do we do this anyway? (gc_term>os_mem_unmap) Why not let the OS handle
> the frees?

The problem is that we have (shared) static constructors/destructors to worry about. I haven't found any information about the behavior of daemon threads in the presence of module constructors; however, my expectation is that the behavior should be the same as with normal threads (although one could argue that this goes against the spirit of daemon threads: you create them and then you stop caring about them). If we just let the OS do the cleanup, we will not be able to call the static destructors.

I see 2 possible solutions to this:

1. Before tearing down the process, we just stop the daemon threads and call static destructors. Since the program is exiting, I don't see a problem, since the OS would stop them anyway.
2. Daemon threads do not call static destructors (in the spirit of not caring about them). I don't really like this since it deviates from the general rule.

Essentially, once you start tearing down the process, you cannot allow daemon threads to run if you hope for a graceful exit.
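
Option 1 above can be approximated from user code, as a sketch only. It assumes that shared static destructors run during runtime termination before the GC is torn down, and that it is acceptable to block on the daemon thread there. `shuttingDown`, `daemonWorker`, `startDaemonWorker` and `stopDaemonWorker` are hypothetical names; this is not the change made in the pull request.

```d
import core.atomic : atomicLoad, atomicStore;
import core.thread : Thread;
import core.time : msecs;

shared bool shuttingDown;      // hypothetical stop flag
__gshared Thread daemonWorker; // hypothetical handle to the daemon thread

// Start a daemon thread that exits once shuttingDown is set.
void startDaemonWorker()
{
    atomicStore(shuttingDown, false);
    daemonWorker = new Thread({
        while (!atomicLoad(shuttingDown))
            Thread.sleep(1.msecs);
    });
    daemonWorker.isDaemon = true;
    daemonWorker.start();
}

// Stop and join the daemon thread. Returns true if a thread was joined,
// false if there was nothing to stop (so calling it twice is harmless).
bool stopDaemonWorker()
{
    if (daemonWorker is null)
        return false;
    atomicStore(shuttingDown, true);
    daemonWorker.join();
    daemonWorker = null;
    return true;
}

// Assumption: this runs at process exit, before the GC unmaps its memory.
shared static ~this()
{
    stopDaemonWorker();
}

void main()
{
    startDaemonWorker();
    // main returns without joining; the module destructor stops the thread.
}
```

The design choice here is to make the stop step idempotent, so an explicit call during normal operation and the implicit call at exit do not conflict.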
Comment #5 by default_357-line — 2023-02-21T17:31:31Z
How do you run module dtors without stopping the threads anyways? I don't see how that'd ever be safe.
Comment #6 by dlang-bot — 2023-02-22T12:49:19Z
@RazvanN7 created dlang/dmd pull request #14907 "Fix Issue 19978 - D sometimes just crashes on exit with daemon threads" fixing this issue: - Fix Issue 19978 - D sometimes just crashes on exit with daemon threads https://github.com/dlang/dmd/pull/14907
Comment #7 by robert.schadek — 2024-12-07T13:39:25Z
THIS ISSUE HAS BEEN MOVED TO GITHUB https://github.com/dlang/dmd/issues/17385 DO NOT COMMENT HERE ANYMORE, NOBODY WILL SEE IT, THIS ISSUE HAS BEEN MOVED TO GITHUB