Bug 6443 – [GSoC] Catching exceptions in fibers broken on Windows/Linux x86_64
Status
RESOLVED
Resolution
FIXED
Severity
critical
Priority
P2
Component
druntime
Product
D
Version
D2
Platform
x86
OS
Windows
Creation time
2011-08-06T12:23:00Z
Last change time
2011-10-29T00:56:55Z
Assigned to
nobody
Creator
code
Comments
Comment #0 by code — 2011-08-06T12:23:46Z
The following program crashes on Windows for me, while working as expected on Linux and OS X:
---
import core.thread;
import std.stdio;

void main() {
    (new Fiber({
        try {
            throw new Exception("Foo!");
        } catch (Exception e) {
            stderr.writefln("Caught: %s", e);
        }
    })).call();
}
---
DMD/druntime from Git master, running on Windows Server 2008 R2 x86_64 (inside a VirtualBox VM, but that shouldn't matter). When building/debugging with Visual D, I get a stack overflow in release mode (somewhere inside KernelBase.dll), and an »Unhandled exception at 0x7547b9bc in ConsoleApp1.exe: 0xE0440001: 0xe0440001« in debug mode.
Since 0xE0440001 is STATUS_DIGITAL_MARS_D_EXCEPTION, the obvious guess would be that the fiber context switching code is somehow messing with SEH, so that the exception is never actually caught. Unfortunately, I don't know enough about the Win32 internals to be able to efficiently track this down.
Comment #1 by code — 2011-08-06T12:28:00Z
(the writefln() was only for demonstration purposes and is not needed in order to trigger the bug)
Comment #2 by code — 2011-08-10T21:39:58Z
I unfortunately stand corrected: The snippet crashes on Linux x86_64 as well:
---
#0 0x00007ffff7dec9d4 in _dl_sysdep_read_whole_file () from /lib/ld-linux-x86-64.so.2
#1 0x00007ffff7de68aa in _dl_load_cache_lookup () from /lib/ld-linux-x86-64.so.2
#2 0x00007ffff7de5ebe in _dl_map_object () from /lib/ld-linux-x86-64.so.2
#3 0x00007ffff7defa9b in dl_open_worker () from /lib/ld-linux-x86-64.so.2
#4 0x00007ffff7deb9e6 in _dl_catch_error () from /lib/ld-linux-x86-64.so.2
#5 0x00007ffff7def63a in _dl_open () from /lib/ld-linux-x86-64.so.2
#6 0x00007ffff74df600 in ?? () from /lib/libc.so.6
#7 0x00007ffff7deb9e6 in _dl_catch_error () from /lib/ld-linux-x86-64.so.2
#8 0x00007ffff74df69f in ?? () from /lib/libc.so.6
#9 0x00007ffff74df707 in __libc_dlopen_mode () from /lib/libc.so.6
#10 0x00007ffff74bbbb5 in ?? () from /lib/libc.so.6
#11 0x00007ffff79c5ff0 in pthread_once () from /lib/libpthread.so.0
#12 0x00007ffff74bbcd4 in backtrace () from /lib/libc.so.6
#13 0x0000000000419cf3 in core.runtime.defaultTraceHandler() ()
#14 0x0000000000000000 in ?? ()
---
From the above trace, this seems to be a problem in the default backtrace handler, and indeed, when I disable it by calling rt_setTraceHandler(0) (e.g. from GDB), the program works as expected (on Windows, it crashes regardless of whether the handler is enabled or not).
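For reference, the same experiment can probably be done from D code instead of GDB. The following is only a minimal sketch: it assumes that assigning null to Runtime.traceHandler in core.runtime has the same effect as the rt_setTraceHandler(0) call mentioned above, and the `caught` flag is a name introduced here purely for illustration. Going by the observations above, this would only sidestep the Linux x86_64 crash, not the Windows one.

```d
import core.runtime;
import core.thread;
import std.stdio;

void main()
{
    // Assumption: this is equivalent to rt_setTraceHandler(0), i.e. no
    // backtrace is generated when a Throwable is created.
    Runtime.traceHandler = null;

    bool caught = false; // illustrative flag, not part of the original test case

    auto fiber = new Fiber({
        try
        {
            throw new Exception("Foo!");
        }
        catch (Exception e)
        {
            // The exception should be caught inside the fiber.
            caught = true;
        }
    });
    fiber.call();

    // The fiber body should have run to completion after catching.
    assert(caught);
    assert(fiber.state == Fiber.State.TERM);
    stderr.writeln("Exception was caught inside the fiber.");
}
```

With the trace handler disabled, exceptions thrown anywhere in the program carry no stack trace, so this is a diagnostic aid rather than a fix.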
Comment #3 by code — 2011-08-10T22:03:39Z
*** Issue 6025 has been marked as a duplicate of this issue. ***
Comment #4 by sean — 2011-08-19T14:42:38Z
It sounds like the Fiber code isn't properly setting something that the backtrace code needs in order to know where the base of the stack is. It may take a while to figure this one out.
Comment #5 by code — 2011-10-29T00:56:55Z
Fixed in 2.056. Sean, do you really insist on not closing »your« bugs until the release is out? Immediately closing them after the fix is in would reduce the chance of bugs accidentally staying open, and help Walter compile the changelog.