Bug 17037 – std.concurrency has random segfaults

Status
RESOLVED
Resolution
FIXED
Severity
major
Priority
P1
Component
phobos
Product
D
Version
D2
Platform
All
OS
All
Creation time
2016-12-28T04:17:15Z
Last change time
2021-10-19T13:29:55Z
Keywords
pull
Assigned to
No Owner
Creator
Seb

Comments

Comment #0 by greeenify — 2016-12-28T04:17:15Z
In the Phobos test suite, std.concurrency randomly segfaults from time to time:
timelimit -t 90 generated/linux/release/32/unittest/test_runner std.concurrency
generated/linux/release/32/unittest/libphobos2-ut.so(_D4core7runtime18runModuleUnitTestsUZ19unittestSegvHandlerUNbiPS4core3sys5posix6signal9siginfo_tPvZv+0x50)[0xf6b30d60]
[0xf77c3cc0]
generated/linux/release/32/unittest/libphobos2-ut.so(_D3std11concurrency8thisInfoFNbNcNdZS3std11concurrency10ThreadInfo+0x4a)[0xf5274d72]
generated/linux/release/32/unittest/libphobos2-ut.so(_D3std11concurrency12unregisterMeFZv+0x32)[0xf52759ea]
generated/linux/release/32/unittest/libphobos2-ut.so(_D3std11concurrency10ThreadInfo7cleanupMFZv+0x88)[0xf5275f40]
generated/linux/release/32/unittest/libphobos2-ut.so(_D3std11concurrency12_staticDtor1FZv+0x27)[0xf5274daf]
generated/linux/release/32/unittest/libphobos2-ut.so(_D3std11concurrency9__moddtorFZv+0x1f)[0xf529ee57]
generated/linux/release/32/unittest/libphobos2-ut.so(_D2rt5minfo74__T17runModuleFuncsRevS482rt5minfo11ModuleGroup11runTlsDtorsMFZ9__lambda1Z17runModuleFuncsRevMFAxPyS6object10ModuleInfoZv+0x51)[0xf6b71611]
generated/linux/release/32/unittest/libphobos2-ut.so(_D2rt5minfo11ModuleGroup11runTlsDtorsMFZv+0x26)[0xf6b70f06]
generated/linux/release/32/unittest/libphobos2-ut.so(_D2rt5minfo16rt_moduleTlsDtorUZ14__foreachbody1MFKS2rt19sections_elf_shared3DSOZi+0x2b)[0xf6b71343]
generated/linux/release/32/unittest/libphobos2-ut.so(_D2rt19sections_elf_shared3DSO14opApplyReverseFMDFKS2rt19sections_elf_shared3DSOZiZi+0x72)[0xf6b73142]
generated/linux/release/32/unittest/libphobos2-ut.so(rt_moduleTlsDtor+0x2c)[0xf6b7130c]
generated/linux/release/32/unittest/libphobos2-ut.so(thread_entryPoint+0x328)[0xf6b31a08]
/lib/libpthread.so.0(+0x6b0c)[0xf1d13b0c]
/lib/libc.so.6(clone+0x5e)[0xf1bf574e]
make[1]: *** [unittest/std/concurrency.run] Error 139
Comment #1 by safety0ff.bugz — 2016-12-28T04:27:50Z
I've only seen this in the stable branch, it is possible it was fixed in master by: https://github.com/dlang/phobos/pull/4191 ?
Comment #2 by greeenify — 2016-12-28T04:38:50Z
(In reply to safety0ff.bugz from comment #1)
> I've only seen this in the stable branch, it is possible it was fixed in
> master by: https://github.com/dlang/phobos/pull/4191 ?
Just spotted it here: https://github.com/dlang/phobos/pull/5001 (sorry for forgetting to include this exact link)
Comment #3 by safety0ff.bugz — 2016-12-29T20:06:03Z
Seems to be a race involving the global scheduler:
    __gshared Scheduler scheduler;
    @property ref ThreadInfo thisInfo() nothrow
    {
    1:  if ( scheduler is null )
    2:      return ThreadInfo.thisInfo;
    3:  return scheduler.thisInfo;
    }
If a thread sets scheduler to null after another thread has evaluated the condition on line 1 as false but has not yet run line 3, then the other thread calls scheduler.thisInfo on a null scheduler.
I'm not sure what the design of the global scheduler is with regard to concurrent access. I.e., I'm wondering whether all the `scheduler is null` checks should be changed to:
    auto lscheduler = atomicLoad(scheduler);
    if (lscheduler is null)
        return ...;
    lscheduler. ... //
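For illustration only, here is a minimal sketch of the "load once, then check and use the local copy" idea from the comment above. It is not the Phobos code and not necessarily what any pull request changed: DemoScheduler, demoScheduler, fallbackInfo, racyInfo and guardedInfo are hypothetical names invented for this example, and a plain int stands in for ThreadInfo.

```d
import core.atomic : atomicLoad;

// Hypothetical stand-in for std.concurrency's Scheduler; not the Phobos type.
final class DemoScheduler
{
    int info = 42;
}

// Same storage class as the global in the report: one instance for all threads.
__gshared DemoScheduler demoScheduler;

// Hypothetical fallback value used when no scheduler is installed (thread-local).
int fallbackInfo = -1;

// Racy shape: the global is read twice, so another thread can set it to
// null between the check and the use.
int racyInfo()
{
    if (demoScheduler is null)
        return fallbackInfo;
    return demoScheduler.info; // second read of the global; may be null by now
}

// Guarded shape: read the global exactly once, then test and use only the
// local reference.
int guardedInfo()
{
    auto local = atomicLoad(demoScheduler);
    if (local is null)
        return fallbackInfo;
    return local.info;
}
```

Under these assumptions, guardedInfo can no longer dereference a reference that was non-null at the check and null at the use, because both steps see the same local value. The sketch says nothing about whether the scheduler object itself remains valid after another thread has replaced or cleared the global; that depends on the intended design for concurrent access, which the comment leaves open.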
Comment #4 by safety0ff.bugz — 2016-12-29T21:49:05Z
Comment #5 by safety0ff.bugz — 2016-12-30T04:42:31Z
Comment #6 by bugzilla — 2019-12-14T10:02:12Z
In the last two months I haven't seen this in the results of the test suite. Is it still there?
Comment #7 by bugzilla — 2021-02-20T19:49:25Z
Running "make -f posix.mak -j3 style" on my local computer I got this today:
double free or corruption (out)
Error: program killed by signal 6
make: *** [posix.mak:651: std/concurrency.publictests] Fehler 1
Comment #8 by bugzilla — 2021-07-26T01:17:42Z
Still seeing it in FreeBSD running the Phobos test suite: FreeBSD 11.4 x64, DMD (bootstrap) Failing after 11m — Task Summary make[1]: *** [posix.mak:412: unittest/std/concurrency.run] Segmentation fault (core dumped)
Comment #9 by dlang-bot — 2021-10-16T10:35:14Z
@WalterWaldron updated dlang/phobos pull request #5004 "Fix issue 17037 - std.concurrency has random segfaults" fixing this issue: - Fix issue 17037 - std.concurrency has random segfaults https://github.com/dlang/phobos/pull/5004
Comment #10 by dlang-bot — 2021-10-19T13:29:55Z
dlang/phobos pull request #5004 "Fix issue 17037 - std.concurrency has random segfaults" was merged into master: - 2c6051da1023f535544de5f575c013803286f62c by WalterW: Fix issue 17037 - std.concurrency has random segfaults https://github.com/dlang/phobos/pull/5004