Bug 6215 – LLVM-compiled DMD segfaults due to mem.c alignment issues

Status: RESOLVED
Resolution: FIXED
Severity: blocker
Priority: P2
Component: dmd
Product: D
Version: D2
Platform: Other
OS: Mac OS X
Creation time: 2011-06-26T06:28:00Z
Last change time: 2017-08-07T16:12:39Z
Assigned to: nobody
Creator: robert
See also: https://issues.dlang.org/show_bug.cgi?id=17726

Comments

Comment #0 by robert — 2011-06-26T06:28:18Z
As of XCode 4.2 (maybe 4.1, I skipped it), Apple has made gcc a symlink for llvm-gcc. When built using llvm-gcc, dmd segfaults in el.c:211:

    void test(){}
    void main(){}

    $ dmd test.d
    Segmentation fault
Comment #1 by doob — 2011-06-26T08:34:11Z
GCC is a symlink for LLVM-GCC with XCode 4.1 as well.
Comment #2 by robert — 2011-06-26T11:33:46Z
The following patch is a workaround; it seems something's going wrong with the elem recycling system:

----
diff --git a/src/backend/el.c b/src/backend/el.c
index f5fa66d..9cc34fc 100644
--- a/src/backend/el.c
+++ b/src/backend/el.c
@@ -195,6 +195,7 @@ elem *el_calloc()
     static elem ezero;
 
     elcount++;
+#if 0
     if (nextfree)
     {   e = nextfree;
         nextfree = e->E1;
@@ -209,6 +210,9 @@ elem *el_calloc()
     eprm_cnt++;
 #endif
     *e = ezero;         /* clear it */
+#else
+    e = (elem *)mem_fmalloc(sizeof(elem));
+#endif
 
 #ifdef DEBUG
     e->id = IDelem;
----

If you print e and *e, *e is NULL, hence the segfault when assigned to.
Comment #3 by sean — 2011-08-11T16:16:23Z
The first problem is in el_calloc():

    Program received signal EXC_BAD_ACCESS, Could not access memory.
    Reason: 13 at address: 0x00000000
    0x0009fc2f in el_calloc () at el.c:189
    189         *e = ezero;         /* clear it */
    (gdb) p e
    $1 = (elem *) 0xa4e7bc
    Current language: auto; currently c++
    (gdb)

I can't explain why this isn't working, but it's easily fixed by replacing the assignment with:

    memset(e, 0, sizeof(elem));

That gets us to the next error:

    Program received signal EXC_BAD_ACCESS, Could not access memory.
    Reason: 13 at address: 0x00000000
    0x000aa08d in evalu8 (e=0xa50988) at evalu8.c:625
    625         esave = *e;
    (gdb) p e
    $1 = (elem *) 0xa50988
    Current language: auto; currently c++
    (gdb)

Which is similarly fixed by replacing the assignment with:

    memcpy(&esave, e, sizeof(elem));

These two changes are enough for LLVM-DMD to build druntime. Given the two errors above, the problem seems to be with the default assignment operator LLVM generates for the elem struct. It's a very weird problem though.
Comment #4 by braddr — 2011-08-11T16:41:20Z
Would you take that info and try the same sort of code in a standalone test case? If struct assignment is indeed the problem, that's a pretty embarrassing llvm bug, imho, and clearly should be reported to either llvm directly or apple as the provider of that version of the compiler.
Comment #5 by code — 2011-08-12T02:25:33Z
(In reply to comment #3)
> Given the two errors above, the problem seems to be with the default assignment
> operator LLVM generates for the elem struct. It's a very weird problem though.

Note that at least the first error happens with both LLVM-GCC, which uses the GCC frontend, and Clang.
Comment #6 by code — 2011-08-12T04:24:54Z
The difference in the LLVM IR generated by Clang for the ezero change is only:

- call void @llvm.memcpy.p0i8.p0i8.i32(i8* %17, i8* getelementptr inbounds (%struct.elem* @_ZZ9el_callocvE5ezero, i32 0, i32 0), i32 80, i32 16, i1 false)
+ call void @llvm.memset.p0i8.i32(i8* %17, i8 0, i32 80, i32 1, i1 false)

Note that the second-to-last parameter to memcpy is the alignment (16 bytes), but GDB says that »(int)e % 16« is 8.
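The mismatch between the promised alignment and the actual pointer value can be checked with a little bit arithmetic. The following is only a sketch in C, not code from dmd: `is_aligned` and `misalignment` are hypothetical helper names, and the addresses mentioned in the note below are the two pointer values printed in the gdb sessions in comment #3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of dmd): remainder of addr modulo align.
 * align must be a power of two, so the remainder is just the low bits. */
static size_t misalignment(uintptr_t addr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size_t)(addr & (align - 1));
}

/* Hypothetical helper: nonzero if addr is a multiple of align. */
static int is_aligned(uintptr_t addr, size_t align)
{
    return misalignment(addr, align) == 0;
}
```

With these helpers, `misalignment(0xa50988, 16)` is 8 and `misalignment(0xa4e7bc, 16)` is 12, so neither pointer from comment #3 satisfies the 16-byte alignment that the memcpy call declares, although both are 4-byte aligned.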
Comment #7 by code — 2011-08-12T05:06:27Z
And indeed, __alignof__(*e) gives 16. Patching the allocator to 16-byte align everything is easy:

--- a/src/tk/mem.c
+++ b/src/tk/mem.c
@@ -758,7 +758,7 @@ void *mem_fmalloc(unsigned numbytes)
     if (sizeof(size_t) == 2)
         numbytes = (numbytes + 1) & ~1;         /* word align   */
     else
-        numbytes = (numbytes + 3) & ~3;         /* dword align  */
+        numbytes = (numbytes + 15) & ~15;
 
     /* This ugly flow-of-control is so that the most common case
        drops straight through.
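To illustrate why rounding the request size fixes the alignment of later allocations, here is a rough sketch in C. It assumes an allocator that hands out consecutive slices of one buffer whose start is itself 16-byte aligned; that is an assumption, since the patch shows only the rounding step and not the rest of `mem_fmalloc`. The names `round_up`, `toy_heap`, and `toy_alloc` are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of align (a power of two).
 * Same arithmetic as (numbytes + 15) & ~15 in the patch when align == 16. */
static size_t round_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Hypothetical toy allocator state: tracks offsets only, no real memory. */
typedef struct {
    size_t used;   /* bytes handed out so far */
    size_t align;  /* every request size is rounded up to this */
} toy_heap;

/* Returns the offset of the new allocation from the start of the buffer. */
static size_t toy_alloc(toy_heap *h, size_t numbytes)
{
    size_t offset = h->used;
    h->used += round_up(numbytes, h->align);
    return offset;
}
```

With 4-byte rounding, a 10-byte request followed by an 80-byte request places the second allocation at offset 12, which is not a multiple of 16. With 16-byte rounding the second allocation lands at offset 16, so every returned offset stays a multiple of 16.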
Comment #8 by code — 2011-08-12T06:17:46Z
A preliminary patch which 16-byte aligns allocations only when building with an LLVM backend is at: https://github.com/D-Programming-Language/dmd/pull/301.
Comment #9 by doob — 2011-08-12T06:24:59Z
Is this specific to Mac OS X or is it like this with LLVM in general?
Comment #10 by code — 2011-08-12T06:53:35Z
(In reply to comment #9)
> Is this specific to Mac OS X or is it like this with LLVM in general?

Happens on my Linux x86_64 box too.
Comment #11 by sean — 2011-08-12T09:08:45Z
Awesome. I figured it was an alignment mistake for the copy, but ran out of time to investigate. What an embarrassing bug for LLVM.
Comment #12 by code — 2011-08-12T13:50:59Z
(In reply to comment #11)
> Awesome. I figured it was an alignment mistake for the copy, but ran out of
> time to investigate. What an embarrassing bug for LLVM.

Just for clarity, let me note that this is definitely _not_ a bug in LLVM, it just happens with two compilers using LLVM as their backend.
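One way to see why the compilers are within their rights: a type's alignment is a guarantee the allocator has to honor, and a struct assignment may rely on it, while memset and memcpy through a plain pointer make no such assumption. The C11 sketch below uses a hypothetical `toy_elem` as a stand-in for dmd's `elem`, whose real layout is not shown in this report; only the 80-byte size and 16-byte alignment are taken from the memcpy call quoted in comment #6.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for elem: 80 bytes with a 16-byte alignment
 * requirement. The real struct's members are not known here. */
typedef struct toy_elem {
    _Alignas(16) unsigned char bytes[80];
} toy_elem;

/* Works for any address: memset does not assume any alignment. */
static void clear_bytewise(void *p)
{
    memset(p, 0, sizeof(toy_elem));
}

/* Only valid when p really is 16-byte aligned, because the compiler is
 * allowed to implement the assignment with aligned vector stores. */
static void clear_by_assignment(toy_elem *p)
{
    static const toy_elem zero = { {0} };
    assert(((uintptr_t)p & 15) == 0);  /* the precondition the allocator broke */
    *p = zero;
}
```

Calling `clear_by_assignment` on a pointer that is only 8-byte aligned would violate the precondition, which is the situation the 4-byte-aligning allocator created. `clear_bytewise` corresponds to the memset workaround from comment #3.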
Comment #13 by bugzilla — 2011-08-12T18:28:32Z