I don't have any more information than a failed auto-tester run in which this test didn't complete within a minute (it usually takes only a few ms).
https://auto-tester.puremagic.com/show-run.ghtml?projectid=14&runid=2129483&isPull=true
It was the release64 build that failed.
Commits:
dmd: 5a16fbbd9bcc65e52aabd517e6be8a77130cbc40
druntime: 0eade7404fa8bdea0d5088c3367eae7f7805ddce
phobos: 01eb06bb3897cd359d01a6c268785e5ee42789c0
Comment #1 by schveiguy — 2016-08-05T17:44:05Z
Created attachment 1606
Capture of log from failed test.
I'm pretty sure those logs go away. I've attached the log in any case.
Comment #2 by code — 2016-08-06T06:17:35Z
They do, thanks. There is not much information in the log, other than that it hung at the commit hashes.
Comment #3 by john.loughran.colvin — 2016-12-12T22:25:31Z
After a bunch of testing I've managed to reproduce this reliably, stop it, attach gdb and get a backtrace.
The hang happens here:
https://github.com/dlang/phobos/blob/19445fc71e8aabdbd42f0ad8a571a57601a5ff39/std/experimental/allocator/building_blocks/free_list.d#L1025
In the backtrace you'll see std.experimental.allocator.building_blocks.free_list.__unittestL1020_10. That's just a consequence of some accidental reformatting before I tested; the real line number is 1025, as in the link above.
#0 0x0000667f4afa810f in pthread_cond_wait@@GLIBC_2.3.2 () from /usr/lib/libpthread.so.0
#1 0x000000000042e8cd in core.sync.condition.Condition.wait() ()
#2 0x0000000000414b80 in std.concurrency.MessageBox.get!(void(bool) pure nothrow @nogc @safe delegate, void(std.concurrency.LinkTerminated) pure @nogc @safe function, void(std.concurrency.OwnerTerminated) pure @nogc @safe function, void(std.variant.VariantN!(32uL).VariantN) function).get(void(bool) pure nothrow @nogc @safe delegate, void(std.concurrency.LinkTerminated) pure @nogc @safe function, void(std.concurrency.OwnerTerminated) pure @nogc @safe function, void(std.variant.VariantN!(32uL).VariantN) function) ()
#3 0x0000000000414146 in std.concurrency.receiveOnly!(bool).receiveOnly() ()
#4 0x0000000000402faa in std.experimental.allocator.building_blocks.free_list.__unittestL1020_10() ()
#5 0x0000000000419eba in std.experimental.allocator.building_blocks.free_list.__modtest() ()
#6 0x000000000042c5a1 in core.runtime.runModuleUnitTests().__foreachbody2(object.ModuleInfo*) ()
#7 0x000000000041bb6c in object.ModuleInfo.opApply(scope int(object.ModuleInfo*) delegate).__lambda2(immutable(object.ModuleInfo*)) ()
#8 0x0000000000421fb3 in rt.minfo.moduleinfos_apply(scope int(immutable(object.ModuleInfo*)) delegate).__foreachbody2(ref rt.sections_elf_shared.DSO) ()
#9 0x00000000004221b5 in rt.sections_elf_shared.DSO.opApply(scope int(ref rt.sections_elf_shared.DSO) delegate) ()
#10 0x0000000000421f44 in rt.minfo.moduleinfos_apply(scope int(immutable(object.ModuleInfo*)) delegate) ()
#11 0x000000000041bb48 in object.ModuleInfo.opApply(scope int(object.ModuleInfo*) delegate) ()
#12 0x000000000042c493 in runModuleUnitTests ()
#13 0x000000000041eab3 in rt.dmain2._d_run_main(int, char**, extern(C) int(char[][]) function).runAll() ()
#14 0x000000000041ea51 in rt.dmain2._d_run_main(int, char**, extern(C) int(char[][]) function).tryExec(scope void() delegate) ()
#15 0x000000000041e9cb in _d_run_main ()
#16 0x0000000000419ff6 in main ()
#17 0x0000667f4a4fa291 in __libc_start_main () from /usr/lib/libc.so.6
#18 0x000000000040280a in _start ()
Comment #4 by john.loughran.colvin — 2016-12-12T22:34:55Z
To reproduce on linux x86_64:
% ../dmd/src/dmd -conf= -I../druntime/import -w -dip25 -m64 -O -release -main -unittest generated/linux/release/64/libphobos2.a -defaultlib= -debuglib= -L-ldl std/experimental/allocator/building_blocks/free_list.d
% seq 10000 | xargs -Iz ./free_list
Comment #5 by john.loughran.colvin — 2016-12-13T12:48:09Z
It seems that all the threads exit, but (in my tests) one message is either never sent or never received by the main thread, so it sits in receiveOnly!bool.
Comment #6 by safety0ff.bugz — 2016-12-22T10:37:30Z
SharedFreeList.allocate looks ABA prone:
A thread does:
    do
    {
        oldRoot = _root; // atomic load
        if (!oldRoot) return allocateFresh(bytes);
        next = oldRoot.next; // atomic load
    }
    while (!cas(&_root, oldRoot, next));
But the value of `next` could have changed between the load and the cas.
Comment #7 by safety0ff.bugz — 2016-12-22T10:46:36Z
(In reply to safety0ff.bugz from comment #6)
>
> But the value of `next` could have changed between the load and the cas.
I meant `oldRoot.next`. i.e. next != oldRoot.next after the cas succeeds.
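A minimal single-threaded sketch of the interleaving described above may make the problem concrete. It is not the Phobos code: `Node`, `fakeCas` and `simulateAba` are hypothetical names, and the "threads" are simulated by running their steps in a fixed order with a plain compare-and-swap stand-in instead of `core.atomic.cas`.

```d
import std.stdio : writeln;

struct Node { Node* next; }

// Hypothetical single-threaded stand-in for cas(): swaps only if *here
// still equals `expected`.
bool fakeCas(Node** here, Node* expected, Node* desired)
{
    if (*here !is expected) return false;
    *here = desired;
    return true;
}

// Replays one problematic interleaving; returns true if the stale CAS succeeds.
bool simulateAba()
{
    Node c, b, a;
    b.next = &c;
    a.next = &b;
    Node* root = &a;            // free list: a -> b -> c

    // "Thread 1" starts allocate(): loads the root and its successor, then stalls.
    Node* oldRoot = root;       // &a
    Node* next = oldRoot.next;  // &b

    // "Thread 2" runs meanwhile: pops a, pops b, then pushes a back.
    root = root.next;           // pop a, list: b -> c
    root = root.next;           // pop b (now in use by thread 2), list: c
    a.next = root;              // push a back, list: a -> c
    root = &a;

    // "Thread 1" resumes: the root is &a again, so the CAS succeeds,
    // but it installs the stale `next` (&b) instead of oldRoot.next (&c).
    bool ok = fakeCas(&root, oldRoot, next);
    return ok && next !is oldRoot.next && root is &b;
}

void main()
{
    writeln("stale CAS succeeded: ", simulateAba());
}
```

After the last step the root points at `b`, which thread 2 still considers allocated, so the same block can be handed out twice.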
Comment #8 by r.sagitario — 2016-12-22T17:17:38Z
> SharedFreeList.allocate looks ABA prone:
I agree. The actual pattern to use depends on the hardware, but x86 usually uses a modification counter modified in lock step.
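As a rough sketch of the modification-counter idea, the head can be treated as a (pointer, counter) pair that is compared and replaced as a unit. The names `Head`, `fakeCas2` and `counterDetectsAba` are hypothetical, and the example is single-threaded with a simulated interleaving; a real implementation would need a double-width atomic compare-and-swap on the pair.

```d
import std.stdio : writeln;

struct Node { Node* next; }

// Hypothetical versioned head: pointer plus a counter bumped on every update.
struct Head { Node* ptr; size_t count; }

// Single-threaded stand-in for a double-width CAS: compares pointer AND counter.
bool fakeCas2(Head* here, Head expected, Head desired)
{
    if (here.ptr !is expected.ptr || here.count != expected.count) return false;
    *here = desired;
    return true;
}

// Replays a pop/pop/push interleaving; returns true if the stale CAS is rejected.
bool counterDetectsAba()
{
    Node c, b, a;
    b.next = &c;
    a.next = &b;
    Head root = Head(&a, 0);    // free list: a -> b -> c, version 0

    // "Thread 1" takes a snapshot of the head and its successor, then stalls.
    Head old = root;
    Node* next = old.ptr.next;  // &b

    // "Thread 2" pops a, pops b, pushes a back; every update bumps the counter.
    root = Head(root.ptr.next, root.count + 1);  // pop a
    root = Head(root.ptr.next, root.count + 1);  // pop b
    a.next = root.ptr;                           // a -> c
    root = Head(&a, root.count + 1);             // push a, version 3

    // "Thread 1" resumes: the pointer matches, but the counter is 3, not 0,
    // so the swap is refused and the list stays a -> c.
    bool ok = fakeCas2(&root, old, Head(next, old.count + 1));
    return !ok && root.ptr is &a && root.ptr.next is &c;
}

void main()
{
    writeln("stale CAS rejected: ", counterDetectsAba());
}
```

In a real allocate loop the failed swap would simply cause a retry with a fresh snapshot.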
Comment #9 by safety0ff.bugz — 2016-12-22T18:12:19Z
(In reply to Rainer Schuetze from comment #8)
>
> I agree. The actual pattern to use depends on the hardware, but x86 usually
> uses a modification counter modified in lock step.
I'm just going to slap core.internal.spinlock on it for now.
Somebody else can improve it later.
I just don't want the autotester choking on unrelated changes.
There's also the issue on x86_64 that we can't use the upper bits of the pointer (because ParentAllocator could be GCAllocator), and not all x86_64 machines have cmpxchg16b.
AFAIK shared free lists aren't very good for high contention regardless.
Comment #10 by safety0ff.bugz — 2016-12-22T18:23:28Z