Bug 4307 – spawn()'ed thread doesn't terminate

Status
RESOLVED
Resolution
FIXED
Severity
major
Priority
P2
Component
druntime
Product
D
Version
D2
Platform
x86
OS
All
Creation time
2010-06-13T18:50:00Z
Last change time
2013-05-21T01:01:19Z
Keywords
patch
Assigned to
nobody
Creator
torhu

Attachments

ID | Filename | Summary | Content-Type | Size
797 | thread_d_detached_patch.txt | Patch for core.thread for linux with changes based off of dmd 2.050. | text/plain | 8312
798 | concurrency_d_detached_patch.txt | Patch for std.concurrency for linux with changes based off of dmd 2.050. | text/plain | 406
799 | thread_d_detached_patch.txt | Patch for core.thread for linux with changes based off of dmd 2.050. | text/plain | 8310

Comments

Comment #0 by torhu — 2010-06-13T18:50:11Z
Using DMD 2.047. This example hangs after printing '9'. From reading Andrei's book, my understanding is that the spawned thread should terminate automatically when its owner thread terminates. But that doesn't happen here.
---
import std.concurrency;
import std.stdio;

void f()
{
    for (;;)
    {
        int i = receiveOnly!int();
        writeln(i);
    }
}

void main()
{
    Tid tid = spawn(&f);
    foreach (int i; 0..10)
    {
        send(tid, i);
    }
}
---
Comment #1 by issues.dlang — 2010-11-01T01:13:14Z
I don't know if this is the same problem or not, but the most threads that I seem to be able to create in a program is 505 or 506.

import std.concurrency;
import std.stdio;

void main()
{
    int currThreads = 0;
    enum maxThreads = 6;
    size_t totalThreads = 0;

    auto recProc = (Tid tid) { writeln(++totalThreads); };

    for(size_t i = 0; i < 1_000; ++i)
    {
        if(currThreads < maxThreads)
            receiveTimeout(1, recProc);
        else
            receive(recProc);

        spawn(&threadFunc, thisTid);
    }

    while(currThreads > 0)
        receive(recProc);
}

void threadFunc(Tid parentTid)
{
    send(parentTid, thisTid);
}

This prints numbers up to either 505 or 506 and then throws an exception:

core.thread.ThreadException: Error creating thread
----------------
./d(void core.thread.Thread.start()) [0x80902b0]
./d(_D3std11concurrency34__T6_spawnTS3std11concurrency3TidZ6_spawnFbPFS3std11concurrency3TidZvS3std11concurrency3TidZS3std11concurrency3Tid+0x7f) [0x808c3c7]
./d(_D3std11concurrency33__T5spawnTS3std11concurrency3TidZ5spawnFPFS3std11concurrency3TidZvS3std11concurrency3TidZS3std11concurrency3Tid+0x10) [0x808c344]
./d(_Dmain+0x6b) [0x8087d0f]
./d(extern (C) int rt.dmain2.main(int, char**)) [0x8091a56]
./d(extern (C) int rt.dmain2.main(int, char**)) [0x80919b0]
./d(extern (C) int rt.dmain2.main(int, char**)) [0x8091a9a]
./d(extern (C) int rt.dmain2.main(int, char**)) [0x80919b0]
./d(main+0x96) [0x8091956]
/usr/lib32/libc.so.6(__libc_start_main+0xe6) [0xf756ec76]
./d() [0x8087bf1]

My best guess is that the threads aren't really terminating, so the OS is running out of threads.
Comment #2 by issues.dlang — 2010-11-01T02:12:09Z
If I put a print statement at the end of exec() in std.concurrency._spawn(), it appears to print out just fine every time that a spawned thread is supposed to be terminating, so whatever the problem is, I think that it pretty much has to be in core.Thread (or maybe dmd itself, depending on what's causing the issue) rather than std.concurrency.
Comment #3 by issues.dlang — 2010-11-01T02:39:58Z
I think what's happening is that pthread_join() never gets called on a thread started with spawn(), and IIRC, if pthread_join() never gets called on a thread, it never actually terminates. The only place that I see pthread_join() getting called is in Thread.join(), and that never gets called for threads started with spawn(). What we really want here, I think, is for threads which successfully terminate to just join to their parent thread themselves without the parent thread having to call join() on them, but I'm not sure that you can really do that with pthreads. Assuming that the parent thread has to join() on all threads created with pthread_create(), we're going to need to find a way to get the parent thread to call join() on its spawned threads. About all I can think of is to have a thread whose entire job is to create threads and make sure that they join. But it's been too long since I had to deal with pthreads for me to remember all of the details. In any case, I believe that the problem stems from the fact that the spawned threads are never actually joined.
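For readers less familiar with joinable threads, here is a minimal sketch of the join step this comment says is missing for spawn()'ed threads. With pthreads, a joinable thread that has finished running keeps its bookkeeping resources reserved until something joins it (or it is detached), which is consistent with the thread-limit symptom in comment #1. The function name runAndJoin is hypothetical and only for illustration; Thread, start() and join() are the core.thread API.

```d
import core.thread : Thread;
import std.stdio : writeln;

// Hypothetical helper: run one joinable thread and wait for it.
int runAndJoin()
{
    int result;

    // The delegate runs on the new thread and writes to the captured local.
    auto worker = new Thread({ result = 42; });
    worker.start();

    // join() blocks until the thread has finished. On POSIX this is where
    // pthread_join() is reached, which lets the OS release the thread.
    worker.join();

    return result;
}

void main()
{
    writeln(runAndJoin());
}
```

Nothing in the std.concurrency code discussed here makes the equivalent join() call for a spawned thread, which is the gap the comment describes.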
Comment #4 by issues.dlang — 2010-11-01T03:17:58Z
Actually, what I think needs to happen is for there to be a way to start threads as detached rather than joinable and have spawn() start detached threads rather than joinable threads.
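As a rough illustration of "detached rather than joinable" in the pthreads sense, the sketch below starts a raw POSIX thread with the PTHREAD_CREATE_DETACHED attribute through druntime's core.sys.posix.pthread bindings. It is not the attached patch and not how core.thread is implemented; the names finished, worker and runDetachedWorker are hypothetical, and the thread deliberately avoids the GC because a raw pthread is not registered with the D runtime. POSIX only.

```d
import core.atomic : atomicLoad, atomicStore;
import core.sys.posix.pthread;
import core.thread : Thread;
import core.time : msecs;
import std.stdio : writeln;

// Hypothetical completion flag, set by the detached thread.
shared bool finished;

// Thread entry point with C linkage, as pthread_create() expects.
extern (C) void* worker(void*) nothrow @nogc
{
    atomicStore(finished, true);
    return null;
}

// Hypothetical helper: returns true once the detached worker has run.
bool runDetachedWorker()
{
    pthread_attr_t attr;
    if (pthread_attr_init(&attr) != 0)
        return false;

    // Ask for a detached thread: its resources are released when it exits,
    // and it can never be passed to pthread_join().
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);

    pthread_t tid;
    immutable rc = pthread_create(&tid, &attr, &worker, null);
    pthread_attr_destroy(&attr);
    if (rc != 0)
        return false;

    // There is no join for a detached thread, so poll the flag instead.
    while (!atomicLoad(finished))
        Thread.sleep(1.msecs);

    return true;
}

void main()
{
    writeln(runDetachedWorker() ? "detached worker finished" : "could not start thread");
}
```

The trade-off is visible in the polling loop: once a thread is detached, the creator needs some other channel (here an atomic flag, in std.concurrency a message) to learn that it has finished.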
Comment #5 by issues.dlang — 2010-11-02T20:33:50Z
Actually, the program I posted earlier was buggy (that's what I get for simplifying code and not studying the simplified version enough, I guess). Here's the corrected version:

import std.concurrency;
import std.stdio;

void main()
{
    int currThreads = 0;
    enum maxThreads = 6;
    size_t totalThreads = 0;

    auto recProc = (Tid tid) { writeln(++totalThreads); --currThreads; };

    for(size_t i = 0; i < 1_000; ++i)
    {
        if(currThreads < maxThreads)
            receiveTimeout(1, recProc);
        else
            receive(recProc);

        spawn(&threadFunc, thisTid);
        ++currThreads;
    }

    while(currThreads > 0)
        receive(recProc);
}

void threadFunc(Tid parentTid)
{
    send(parentTid, thisTid);
}
Comment #6 by issues.dlang — 2010-11-02T21:06:34Z
Created attachment 797
Patch for core.thread for linux with changes based off of dmd 2.050.

Here's a patch for core.thread which appears to solve the problem on Linux. Essentially, I made it so that you can give start() a bool that controls whether the thread starts as joinable or as detached (detached as in pthread_detach(), not detached from the D runtime). I chose to call the parameter joinable to distinguish it from being detached from the D runtime. So, spawn() can call thread.start(false), and then the thread should terminate properly without needing to be joined. The changes are essentially Posix only, so if this problem also exists on Windows, then other changes may be required for Windows and/or Mac OS X, but I know essentially nothing about threads on either of those systems. So, I don't know if this patch is the best overall solution, but it does appear to fix the problem on Linux, so even if it's not enough of a fix, it should at least help solve the problem.
Comment #7 by issues.dlang — 2010-11-02T21:07:34Z
Created attachment 798
Patch for std.concurrency for linux with changes based off of dmd 2.050.

The patch for std.concurrency to go with the patch for core.thread.
Comment #8 by issues.dlang — 2010-11-02T22:30:04Z
Created attachment 799
Patch for core.thread for linux with changes based off of dmd 2.050.

Apparently, I made a couple of errors in my patch, so here's an updated version. Unfortunately, while these changes seem to fix all of the simple cases that I've tried, my more complicated programs still fail to terminate properly, so there appear to be issues beyond spawned threads being joinable instead of detached.
Comment #9 by issues.dlang — 2010-11-02T23:52:22Z
Ah, it looks like the problems with my more complex applications have to do with passing functions with incorrect signatures to receive(), so my patch does seem to do the trick (on Linux at least).
Comment #10 by issues.dlang — 2011-01-24T03:19:32Z
Actually, I'm going to bump this up to major, since spawn is pretty much useless in my experience without this being fixed. Also, I'm 99.99999999999999% sure it's a druntime bug, so I'm moving it to druntime (which I probably would have done before had I noticed that it was marked as dmd).
Comment #11 by sean — 2011-01-25T10:53:43Z
I don't think threads can start as detached, because the GC needs to interact with them. I'll give this another look though. I also fixed a major timing-related bug since 2.051 was released, which may help here as well.
Comment #12 by issues.dlang — 2011-01-25T11:32:30Z
I don't know exactly what the GC requires with regards to threads, but when I was talking about starting a thread as detached, I meant detached in the pthreads sense, not the GC sense, like core.thread generally talks about with functions like thread_detachThis(). Spawned threads obviously need to be attached to the GC. The problem is that they can't be required to be joined unless there's a thread somewhere which cleans them up. Spawned threads are essentially supposed to clean themselves up and go away when they terminate, and that essentially means that they need to be detached in the pthread sense, since the programmer isn't going to have the parent thread join them when they're done.
Comment #13 by sean — 2011-02-03T15:51:51Z
Fixed in the current revision. The threads exit with a MessageMismatch exception though, which seems like a QOI issue. They should probably exit with an OwnerTerminated message instead (which is the message that triggered the mismatch). I've just changed the behavior so LinkTerminated and OwnerTerminated messages are still thrown as-is instead of being considered a MessageMismatch.
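To make the described behavior concrete, here is a small sketch in which a worker treats OwnerTerminated as its shutdown signal. It assumes a Phobos version where receiveOnly() rethrows OwnerTerminated as this comment describes. The intermediate owner thread and the names worker, owner and runDemo are hypothetical, added only so the result can be observed from main.

```d
import std.concurrency;
import std.stdio : writeln;

// Hypothetical worker: prints ints until its owner thread is gone,
// then reports back to a third thread.
void worker(Tid reportTo)
{
    try
    {
        for (;;)
        {
            int i = receiveOnly!int();
            writeln(i);
        }
    }
    catch (OwnerTerminated)
    {
        // Thrown by receiveOnly() once the owner thread has terminated.
        send(reportTo, "worker saw OwnerTerminated");
    }
}

// Hypothetical owner: spawns the worker, sends ten ints, then returns,
// which ends the owner thread.
void owner(Tid reportTo)
{
    Tid tid = spawn(&worker, reportTo);
    foreach (int i; 0 .. 10)
        send(tid, i);
}

// Hypothetical helper: waits for the worker's report.
string runDemo()
{
    spawn(&owner, thisTid);
    return receiveOnly!string();
}

void main()
{
    writeln(runDemo());
}
```

Catching the exception is optional for termination itself. It is shown here because it gives the worker a place to clean up before its thread function returns.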
Comment #14 by issues.dlang — 2011-02-03T22:39:19Z
What about joining the spawned threads? From what I recall, there was no place that join was called on them, so if I understand correctly, they'll continue to exist until the program terminates. And I believe that there are a finite number of threads allowed at one time, so it would be a problem if spawned threads continued to exist after they're done executing. If spawned threads are joinable, they need to be joined or they'll never actually terminate (with pthreads anyway) - unless I'm misunderstanding something.
Comment #15 by issues.dlang — 2013-05-21T01:01:19Z
I believe that this has been long since fixed - certainly all of the programs here seem to work properly. If someone runs into similar problems, please open a new bug for them.