Bug 8596 – Indeterministic assertion failure in rehash

Status
RESOLVED
Resolution
FIXED
Severity
normal
Priority
P2
Component
dmd
Product
D
Version
D2
Platform
All
OS
All
Creation time
2012-08-28T12:28:00Z
Last change time
2014-08-06T16:57:38Z
Keywords
ice
Assigned to
nobody
Creator
timon.gehr

Attachments

ID | Filename | Summary | Content-Type | Size
1160 | valgrind.log | valgrind output | text/x-log | 17094

Comments

Comment #0 by timon.gehr — 2012-08-28T12:28:46Z
DMD 2.060
The following assertion failure occurs indeterministically:
dmd: ../ztc/aa.c:423: void AArray::rehash_x(aaA*, aaA**, size_t): Assertion `0' failed.
The frequency seems to increase as project size grows.
Comment #1 by deadalnix — 2012-09-02T08:03:16Z
It is becoming a serious problem for me.
Comment #2 by timon.gehr — 2012-09-02T08:03:54Z
(In reply to comment #1)
> It tend to become a serious problem to me.
+1.
Comment #3 by bugzilla — 2012-11-12T11:50:58Z
The assert happens when more than one item in the hash table has the same key. A hash table may only have one item per key. It would be helpful if you could run it under linux or osx, compile dmd with -g, and get a stack trace.
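The invariant described here can be sketched as follows. This is only an illustration, not dmd's aa.c: `Node`, `insertUnique`, and `rehash` are hypothetical names, and the real `rehash_x` works on `aaA` nodes whose layout is not shown in this report.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical chained hash-table node; not dmd's aaA layout.
struct Node {
    int key;
    int value;
    Node* next;
};

// Link `e` at the head of its bucket. Returns false if the key is already
// present, i.e. the "more than one item with the same key" condition.
bool insertUnique(std::vector<Node*>& buckets, Node* e) {
    std::size_t i = static_cast<std::size_t>(e->key) % buckets.size();
    for (Node* p = buckets[i]; p != nullptr; p = p->next) {
        if (p->key == e->key) return false;
    }
    e->next = buckets[i];
    buckets[i] = e;
    return true;
}

// Move every node from `oldBuckets` into a new table with `newSize` buckets.
// A duplicate key found here means the table was already inconsistent.
std::vector<Node*> rehash(std::vector<Node*>& oldBuckets, std::size_t newSize) {
    std::vector<Node*> fresh(newSize, nullptr);
    for (Node* head : oldBuckets) {
        while (head != nullptr) {
            Node* next = head->next;  // save before relinking
            bool ok = insertUnique(fresh, head);
            assert(ok && "duplicate key seen during rehash");
            (void)ok;                 // silence unused warning under NDEBUG
            head = next;
        }
    }
    return fresh;
}
```

In a sketch like this, normal insertion already rejects duplicates, so the assertion in `rehash` can only fire if nodes were altered after they were inserted. That is consistent with the later suspicion of memory corruption, though it does not prove it.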
Comment #4 by deadalnix — 2012-11-12T12:20:48Z
(In reply to comment #3)
> The assert happens when more than one item in the hash table has the same key.
> A hash table may only have one item per key.
>
> It would be helpful if you could run it under linux or osx, compile dmd with
> -g, and get a stack trace.
I can't trigger the error when running under gdb; gdb seems to influence whatever triggers it. The error also occurs when dmd is compiled with -g. Looks like a race condition.
Comment #5 by bugzilla — 2012-11-12T12:39:20Z
dmd isn't multithreaded, so it could not be a race condition. It does, however, sound like memory corruption. Can you run it under valgrind?
Comment #6 by deadalnix — 2012-11-12T12:45:50Z
(In reply to comment #5)
> dmd isn't multithreaded, so it could not be a race condition.
>
> It does, however, sound like memory corruption.
>
> Can you run it under valgrind?
Already tried, but I don't have enough RAM on my machine to do so. Consider that compiling the project requires more than 2.2 GB of RAM without valgrind.
Comment #7 by bugzilla — 2012-11-12T12:51:20Z
If you're running the 64 bit dmd, shouldn't it be able to use far more virtual memory (very slowly)?
Comment #8 by alex — 2012-11-12T12:54:31Z
Well, only if paging is enabled (i.e. there is a swap partition in use). I know I set my systems up without paging because I practically never need it. @deadalnix How much RAM does your system have?
Comment #9 by deadalnix — 2012-11-12T14:00:55Z
(In reply to comment #8)
> Well, only if paging is enabled (i.e. there is a swap partition in use). I know
> I set my systems up without paging because I practically never need it.
>
> @deadalnix How much RAM does your system have?
I have 4 GB. After what the system and a few running programs use, it is already swapping when dmd uses 2.2 GB.
Comment #10 by deadalnix — 2012-11-12T14:06:19Z
(In reply to comment #9)
> (In reply to comment #8)
> > Well, only if paging is enabled (i.e. there is a swap partition in use). I know
> > I set my systems up without paging because I practically never need it.
> >
> > @deadalnix How much RAM does your system have?
>
> I have 4Gb. Minus what the system uses plus some programs running, it is
> already swapping when dmd uses 2.2Gb.
OK, I ran valgrind in a terminal without any graphical interface. I hope I won't run out of swap, thanks to the memory I gained by disabling the graphical interface.
Comment #11 by deadalnix — 2012-11-12T15:48:04Z
Created attachment 1160 valgrind output
Comment #12 by bugzilla — 2012-11-12T16:56:17Z
(In reply to comment #11)
> Created an attachment (id=1160) [details]
> valgrind output
Fixed the valgrind reported issue. It's definitely a corruption bug in the aa.c code. Testing now. If you want to try it out:
==============================
diff aa.bak aa.c
72c72
< delete en;
---
> delete [] en;
79c79
< delete en;
---
> delete [] en;
==============================
I love valgrind.
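The patch swaps scalar `delete` for `delete []`. In C++, memory obtained with `new[]` must be released with `delete[]`; releasing it with plain `delete` is undefined behavior and can corrupt the heap. A minimal sketch, assuming `en` in aa.c was allocated with `new[]` (the diff suggests this but does not show the allocation); `Entry` and `allocateAndFree` are hypothetical names:

```cpp
#include <cstddef>

// Hypothetical element type that counts destructor calls.
struct Entry {
    static int destroyed;
    ~Entry() { ++destroyed; }
};
int Entry::destroyed = 0;

// Allocate an array with new[] and release it with the matching delete[].
void allocateAndFree(std::size_t n) {
    Entry* en = new Entry[n];
    // delete en;   // mismatched: undefined behavior for memory from new[]
    delete [] en;   // matched: runs all n destructors, frees the whole block
}
```

With the matching form, every element's destructor runs and the allocator gets back the pointer it handed out. With the mismatched form the outcome is unspecified by the language, which fits a failure that appears only some of the time.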
Comment #13 by bugzilla — 2012-11-12T17:09:08Z
https://github.com/D-Programming-Language/dmd/commit/80884506df7a020e879ba3adda5a98d0465e7164 I won't mark it fixed until you guys can verify, as I don't have your test code.
Comment #14 by deadalnix — 2012-11-12T17:21:20Z
(In reply to comment #13)
> https://github.com/D-Programming-Language/dmd/commit/80884506df7a020e879ba3adda5a98d0465e7164
>
> I won't mark it fixed until you guys can verify, as I don't have your test
> code.
Bad news: I just tested the patch, and 6 out of 10 compilations failed on this assert.
Comment #15 by bugzilla — 2012-11-12T17:30:24Z
(In reply to comment #14)
>
> Bad news : just tested the patch and got 6 out of 10 compilation failed on this
> assert.
What do you mean? Dmd won't compile?
Comment #16 by deadalnix — 2012-11-12T17:35:10Z
(In reply to comment #15)
> (In reply to comment #14)
> >
> > Bad news : just tested the patch and got 6 out of 10 compilation failed on this
> > assert.
>
> What do you mean? Dmd won't compile?
No, dmd compiles :D But my program failed to compile with the patched dmd 6 times out of 10. It means that the issue remains.
Comment #17 by bearophile_hugs — 2012-11-12T18:22:33Z
(In reply to comment #16)
> It means that the issue remains.
Then I suggest running it again in Valgrind :-) Maybe it spots other bugs to fix.
Comment #18 by deadalnix — 2012-11-13T02:01:24Z
(In reply to comment #17)
> (In reply to comment #16)
>
> > It means that the issue remains.
>
> Then I suggest to run it again in Valgrind :-) Maybe there are other bugs to
> fix it spots.
I did, but found nothing.
Comment #19 by bugzilla — 2012-11-13T13:32:12Z
Ok, the next step is to compile your app without -g. The reason is that hash tables are used in the dwarf debug generation, and I want to see whether that use is the problem or one of the others.
Comment #20 by deadalnix — 2012-11-14T04:40:08Z
(In reply to comment #19)
> Ok, the next step is to compile your app without -g. The reason is because hash
> tables are used in the dwarf debug generation, and I want to see if that one is
> the problem or other uses.
Tested with flags: -m64 -w -debug -unittest (removing the -gc flag I usually use). I "successfully" triggered the error as well.
Comment #21 by bugzilla — 2012-11-14T16:59:25Z
I've tried various schemes to induce it to fail, and done a code review. I can't find anything wrong. I need a test case.
Comment #22 by deadalnix — 2012-11-25T21:56:49Z
(In reply to comment #21)
> I've tried various schemes to induce it to fail, and done a code review. I
> can't find anything wrong. I need a test case.
Sorry for being unavailable the past few days, I was moving from France to the USA. I can trigger the error on a regular basis with the following codebase:
git clone git://github.com/deadalnix/SDC.git
cd SDC
git checkout aa_assert
make
The program is meant to be compiled on Linux. My computer is a dual-core Athlon with 4 GB of RAM. I can't come up with a simple test case that triggers the error, because it seems that size matters here.
Comment #23 by bugzilla — 2012-12-10T04:58:03Z
When I try your instructions, I get:
make
~/cbx/mars/dmd -ofbin/sdc src/sdc/*.d src/sdc/ast/*.d src/sdc/pass/*.d src/d/ast/*.d src/d/backend/*.d src/d/parser/*.d src/d/pass/*.d src/util/*.d src/etc/linux/*.d import/llvm/c/target.d -m64 -w -debug -gc -unittest -Iimport -L-L`llvm-config-3.1 --libdir` `llvm-config-3.1 --libs | sed 's/-l/-L-l/g'` -L-lstdc++ -L-ldl -L-lffi
/bin/sh: llvm-config-3.1: not found
/bin/sh: llvm-config-3.1: not found
src/sdc/compilererror.d(188): Warning: statement is not reachable
dmd: func.c:1200: virtual void FuncDeclaration::semantic3(Scope*): Assertion `type == f' failed.
Aborted
make: *** [bin/sdc] Error 134
which is not the error you reported.
Comment #24 by renezwanenburg — 2013-04-01T18:25:08Z
I have been running into this issue since today. I'm developing on a 64-bit Windows machine targeting 32 bit; the production machine is running 64-bit Debian. Using DMD 2.062 on both boxes.
The bug pops up at the strangest moments. I can confirm it's indeterministic, but so far the production box has never failed to build. That said, I don't build very often on production, especially if the build failed on the dev box ;). For the sake of completeness I've just tried compiling a version of the project on the production box which almost always fails on the dev box. So far it keeps succeeding.
On the Windows box I recently flipped the LARGE_ADDRESS_AWARE bit on dmd.exe to work around issue 6498; perhaps this has something to do with it?
As an indication of project size, the project is medium sized: vibe.d + 1937 lines, but it's very CTFE heavy: we're using vibe.d's diet template parser to generate 21 rather large web pages. Compilation requires a little over 2 GB of memory.
Is there any additional info I can provide?
Comment #25 by maxim — 2013-04-02T07:13:47Z
I think somebody should provide a (snapshot of) project to compile to investigate the problem, deadalnix's link is outdated and guessing based on valgrind output is not a very good idea.
Comment #26 by renezwanenburg — 2013-04-02T07:24:10Z
Yeah I was afraid of that :) I'd have to take it up with the PHB's, not sure what the answer will be. TBH I doubt I can make the code public. Not that there's anything of value in it, but you know how it is... I'll report back when I've got an answer.
Comment #27 by bugzilla — 2013-11-15T14:59:28Z
Without a reproducible test case, this issue is really dead in the water.
Comment #28 by deadalnix — 2013-11-15T15:03:35Z
(In reply to comment #27)
> Without a reproducible test case, this issue is really dead in the water.
I assume you know what indeterministic means. Sadly these are very hard issues to solve. I haven't seen it triggered for a long time now. It may be solved, or my code may have evolved so that it no longer triggers the error.
Comment #29 by yiannos_tgu — 2013-11-20T06:22:40Z
I also have the same errors: I run a quite large vibe.d app with dmd in DUB.
Assertion failed: (0), function rehash_x, file ztc/aa.c, line 423.
Error: DMD compile run failed with exit code -6
Full exception: object.Exception@source/dub/compilers/dmd.d(169): DMD compile run failed with exit code -6
----------------
5 dub 0x0000000103a12437 pure @safe bool std.exception.enforce!(bool).enforce(bool, lazy const(char)[], immutable(char)[], ulong) + 107
6 dub 0x00000001039c407f void dub.compilers.dmd.DmdCompiler.invoke(const(dub.compilers.compiler.BuildSettings), const(dub.compilers.compiler.BuildPlatform)) + 883
7 dub 0x00000001039cd1a9 void dub.generators.build.BuildGenerator.generateProject(dub.generators.generator.GeneratorSettings) + 3013
8 dub 0x000000010399ea6c void dub.dub.Dub.generateProject(immutable(char)[], dub.generators.generator.GeneratorSettings) + 160
9 dub 0x0000000103992f55 _Dmain + 6773
10 dub 0x0000000103a771b1 void rt.dmain2._d_run_main(int, char**, extern (C) int function(char[][])*).runAll().void __lambda1() + 33
11 dub 0x0000000103a770fd void rt.dmain2._d_run_main(int, char**, extern (C) int function(char[][])*).tryExec(scope void delegate()) + 45
12 dub 0x0000000103a7715d void rt.dmain2._d_run_main(int, char**, extern (C) int function(char[][])*).runAll() + 45
13 dub 0x0000000103a770fd void rt.dmain2._d_run_main(int, char**, extern (C) int function(char[][])*).tryExec(scope void delegate()) + 45
14 dub 0x0000000103a77079 _d_run_main + 449
15 dub 0x0000000103993162 main + 34
16 libdyld.dylib 0x00007fff8f0287e1 start + 0
17 ??? 0x0000000000000002 0x0 + 2
Comment #30 by yiannos_tgu — 2013-11-21T09:27:05Z
I don't know if that helps but I also get this with rdmd. I use dub --rdmd --v.
Undefined symbols for architecture x86_64:
"_D4vibe8internal4meta6traits12__ModuleInfoZ", referenced from:
_D4vibe8internal4meta3uda12__ModuleInfoZ in 3717175639-gigdive-develop.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
--- errorlevel 1
Error: Build command failed with exit code 1
Full exception: object.Exception@source/dub/generators/rdmd.d(92): Build command failed with exit code 1
----------------
5 dub 0x00000001093b2437 pure @safe bool std.exception.enforce!(bool).enforce(bool, lazy const(char)[], immutable(char)[], ulong) + 107
6 dub 0x000000010937220f void dub.generators.rdmd.RdmdGenerator.generateProject(dub.generators.generator.GeneratorSettings) + 2155
7 dub 0x000000010933ea6c void dub.dub.Dub.generateProject(immutable(char)[], dub.generators.generator.GeneratorSettings) + 160
8 dub 0x0000000109332f55 _Dmain + 6773
9 dub 0x00000001094171b1 void rt.dmain2._d_run_main(int, char**, extern (C) int function(char[][])*).runAll().void __lambda1() + 33
10 dub 0x00000001094170fd void rt.dmain2._d_run_main(int, char**, extern (C) int function(char[][])*).tryExec(scope void delegate()) + 45
11 dub 0x000000010941715d void rt.dmain2._d_run_main(int, char**, extern (C) int function(char[][])*).runAll() + 45
12 dub 0x00000001094170fd void rt.dmain2._d_run_main(int, char**, extern (C) int function(char[][])*).tryExec(scope void delegate()) + 45
13 dub 0x0000000109417079 _d_run_main + 449
14 dub 0x0000000109333162 main + 34
15 libdyld.dylib 0x00007fff8b3197e1 start + 0
16 ??? 0x0000000000000003 0x0 + 3
Comment #31 by hsteoh — 2014-08-06T14:30:46Z
Comment #32 by github-bugzilla — 2014-08-06T16:57:37Z