Bug 14476 – core.thread unit tests failing on FreeBSD 9+

Status
RESOLVED
Resolution
FIXED
Severity
major
Priority
P1
Component
druntime
Product
D
Version
D2
Platform
All
OS
FreeBSD
Creation time
2015-04-21T06:12:00Z
Last change time
2017-07-19T17:41:36Z
Keywords
pull
Assigned to
nobody
Creator
issues.dlang

Comments

Comment #0 by issues.dlang — 2015-04-21T06:12:32Z
I am consistently seeing this when I try to run druntime's unit tests on FreeBSD for either 2.067 or master (2.068 alpha):

0.000s PASS release64 object
0.000s PASS release64 core.atomic
0.008s PASS release64 core.bitop
0.000s PASS release64 core.checkedint
0.000s PASS release64 core.demangle
0.000s PASS release64 core.exception
0.000s PASS release64 core.math
0.000s PASS release64 core.memory
posix.mak:230: recipe for target 'obj/64/core/thread' failed
gmake: *** [obj/64/core/thread] Illegal instruction
gmake: *** Deleting file 'obj/64/core/thread'

The druntime unit tests for 2.066 run just fine, so whatever the problem is, either it was introduced in 2.067 or a new test that triggers it was introduced. I'm running the latest PC-BSD on x86_64 (so FreeBSD 10.1), and someone else in the newsgroup sees the same thing on their 9.1 i386 machine:

http://forum.dlang.org/post/[email protected]

The autotester is not hitting this problem, so clearly it doesn't exist on all FreeBSD systems. However, the autotester is apparently running FreeBSD 8.4 at the moment, which would imply that the problem only exists in FreeBSD 9+.

I narrowed it down to the last test in core.thread:

unittest
{
    auto thr = new Thread(function{}, 10).start();
    thr.join();
}

And if I remove the ", 10" from the constructor call, then it works - but then the druntime unit test build fails later:

Testing link
Testing load
Testing linkD
Testing linkDR
Testing loadDR
Testing host
Testing finalize
Testing link_linkdep
Makefile:28: recipe for target 'obj/freebsd/64/link_linkdep.done' failed
gmake[1]: *** [obj/freebsd/64/link_linkdep.done] Segmentation fault
gmake[1]: Leaving directory '/usr/home/jmdavis/Programming/github/druntime/test/shared'
posix.mak:242: recipe for target 'test/shared/.run' failed
gmake: *** [test/shared/.run] Error 2

I have no idea if it's a related problem or not, but if it isn't, then another problem was introduced in 2.067 which only exists on FreeBSD 9+.
But regardless, something about setting the stack size for threads isn't working properly on FreeBSD 9+.
Comment #1 by dbugz — 2015-04-21T07:52:16Z
I can confirm that that last unittest is the one causing the problem on i386 also, as commenting it out gets core.thread to pass and the tests to fail here instead:

Testing link
Testing load
Testing linkD
Testing linkDR
Testing loadDR
Testing host
gmake[1]: *** [obj/freebsd/32/host.done] Segmentation fault: 11 (core dumped)

This is with dmd/druntime HEAD on FreeBSD 9.1 stable from a couple years ago running in a VM.
Comment #2 by issues.dlang — 2015-04-26T11:34:59Z
It looks like it's this commit in druntime which broke things:

commit 5c96aca53bf63a9abc58fd45b59156e605c5fa3a
Author: Martin Nowak <[email protected]>
Date: Tue Jan 20 08:56:25 2015 +0100

    round thread stack size to pagesize and min stack size

The second failure with

Testing link_linkdep

is there before that commit, but that's the commit that introduces the failure in core.thread.
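The commit subject names two steps: round the requested size up to a whole number of pages, then enforce a minimum. A minimal sketch of that arithmetic follows. It is not druntime's actual code: `roundStackSize` is a hypothetical helper, and the page size (4096) and minimum stack size (2048) are assumed values chosen for illustration, not read from any particular system.

```d
import std.stdio : writeln;

// Hypothetical helper, NOT druntime's implementation: round a requested
// stack size up to a multiple of pageSize, then clamp to at least minStack.
size_t roundStackSize(size_t sz, size_t pageSize, size_t minStack)
{
    sz += pageSize - 1;   // push past the next page boundary
    sz -= sz % pageSize;  // drop the remainder to land on the boundary
    return sz < minStack ? minStack : sz;
}

void main()
{
    // Assumed values: 4096-byte pages and a 2048-byte minimum stack.
    writeln(roundStackSize(10, 4096, 2048));   // 4096
    writeln(roundStackSize(4097, 4096, 2048)); // 8192
}
```

Under these assumptions, the request for 10 bytes in the failing test becomes a stack of exactly one page.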
Comment #3 by issues.dlang — 2015-04-26T11:42:26Z
It looks like this was the pull request that introduced the change: https://github.com/D-Programming-Language/druntime/pull/1109
Comment #4 by code — 2015-04-26T14:01:41Z
(In reply to Jonathan M Davis from comment #2)
> It looks like it's this commit in druntime which broke things:

I hope you used https://github.com/CyberShadow/Digger to bisect this.
Comment #5 by code — 2015-04-26T14:14:32Z
(In reply to Jonathan M Davis from comment #2)
> The second failure with
>
> Testing link_linkdep

2.067.0 comes with shared library support for FreeBSD, not sure why they fail on 9.1. The ugly runtime linker bug is fixed in both 8.4 and 9.1.
http://svnweb.freebsd.org/base?view=revision&revision=226155
Comment #6 by issues.dlang — 2015-04-26T14:50:12Z
(In reply to Martin Nowak from comment #4)
> (In reply to Jonathan M Davis from comment #2)
> > It looks like it's this commit in druntime which broke things:
>
> I hope you used https://github.com/CyberShadow/Digger to bisect this.

I've never used Digger, and if I knew about it, I'd forgotten. But git-bisect was plenty, and given that that commit adds the test that fails and the code that it's testing, it's not exactly surprising. The harder part was figuring out what pull request the commit was associated with. But unfortunately, I don't know much about what the code is doing, which makes it harder for me to be helpful.

(In reply to Martin Nowak from comment #5)
> (In reply to Jonathan M Davis from comment #2)
> > The second failure with
> >
> > Testing link_linkdep
>
> 2.067.0 comes with shared library support for FreeBSD, not sure why they
> fail on 9.1. The ugly runtime linker bug is fixed in both 8.4 and 9.1.
> http://svnweb.freebsd.org/base?view=revision&revision=226155

Hmmm. I'm using 10.1, so I would _hope_ that it would be fine given that the older versions are, but then again, the code in core.thread seems to work fine on 8.4 and not 9.1 or 10.1. However, looking at Joakim's post, his 32-bit 9.1 VM is failing in a different place earlier in the tests if he comments out the failing core.thread test, so for that problem, 9.1 and 10.1 don't seem to be acting the same (though maybe it's a 32-bit vs 64-bit problem, since he's using 32-bit, whereas I'm using 64-bit).

Regardless, it's clear that 8.4 is not acting the same as later versions, so the version of FreeBSD seems to matter more than would be desirable. Maybe I should figure out a way to get a FreeBSD 10.1 setup available for Brad on the autotester so that we're not just testing on an older version - though if he wants the current machines to be supported, he'll have to update the current FreeBSD machines by July according to what freebsd.org says about the support lifecycle of 8.4.
But for better or worse, I'm now using FreeBSD 10.1 as my main OS, so I'm likely to start noticing some of these problems that have been getting past the autotester.
Comment #7 by code — 2015-05-02T18:21:34Z
This is a simple stack overflow. The unittest creates a thread with the minimum stack size of 4 KiB, and that thread crashes when calling pthread_attr_get_np.

https://github.com/freebsd/freebsd/blob/01e375543f2cca888435d33af45404f00296ca0c/lib/libthr/thread/thr_attr.c#L139

The failure happens in a syscall:

pthread_attr_get_np -> calloc -> syscall -> sbrk
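Given that diagnosis, one way to sidestep the overflow is to request a stack comfortably larger than a single page. The following is a minimal sketch under that assumption. `runsWithStack` is a hypothetical helper, and 64 KiB is an arbitrary illustrative size that is not taken from the merged fix.

```d
import core.thread : Thread;
import std.stdio : writeln;

// Hypothetical helper: start an empty thread with the requested stack
// size and wait for it. A stack overflow inside the thread would crash
// the whole process rather than make this function return false.
bool runsWithStack(size_t stackSize)
{
    auto thr = new Thread(function() {}, stackSize).start();
    thr.join();            // rethrows an exception thrown in the thread
    return !thr.isRunning; // the thread has finished once join returns
}

void main()
{
    // 64 KiB is an assumed, illustrative value well above one 4 KiB page.
    writeln(runsWithStack(64 * 1024));
}
```

The size passed to the `Thread` constructor is a request; the runtime may round it before handing it to the OS.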
Comment #8 by code — 2015-05-02T18:27:36Z
Comment #9 by dbugz — 2015-05-12T21:23:33Z
After running the tests some more on my 9.1 i386 VM, it turns out this test is not the issue. The FreeBSD-only test added in issue 13416 seems to be the real culprit, so I reopened that one instead.
Comment #10 by github-bugzilla — 2015-05-14T01:07:42Z
Commits pushed to master at https://github.com/D-Programming-Language/druntime

https://github.com/D-Programming-Language/druntime/commit/fc1013763ff485844b1fef4696498b677a44b4db
fix Issue 14476 - core.thread unit tests failing on FreeBSD 9+

- the newly created thread fails because of a stack overflow in a syscall to sbrk
  triggered by pthread_attr_get_np/calloc

https://github.com/D-Programming-Language/druntime/commit/8e5df2b76a1c3be5b6f18f4ba832a740649a6ca8
Merge pull request #1248 from MartinNowak/fix14476

fix Issue 14476 - core.thread unit tests failing on FreeBSD 9+
Comment #11 by github-bugzilla — 2015-06-17T21:02:41Z
Comment #12 by github-bugzilla — 2017-07-19T17:41:36Z
Commits pushed to dmd-cxx at https://github.com/dlang/druntime

https://github.com/dlang/druntime/commit/fc1013763ff485844b1fef4696498b677a44b4db
fix Issue 14476 - core.thread unit tests failing on FreeBSD 9+

https://github.com/dlang/druntime/commit/8e5df2b76a1c3be5b6f18f4ba832a740649a6ca8
Merge pull request #1248 from MartinNowak/fix14476