Bug 17523 – Sporadic ICEs with inline asm

Status
RESOLVED
Resolution
FIXED
Severity
regression
Priority
P1
Component
dmd
Product
D
Version
D2
Platform
x86_64
OS
Windows
Creation time
2017-06-19T11:17:00Z
Last change time
2017-07-03T21:01:53Z
Keywords
ice
Assigned to
nobody
Creator
dlang-bugzilla
See also
https://issues.dlang.org/show_bug.cgi?id=17522

Comments

Comment #0 by dlang-bugzilla — 2017-06-19T11:17:19Z
With a 64-bit dmd.exe (as built using win64.mak), some inline asm instructions can randomly cause an ICE. For example, consider the given file:

////// test.d /////
void fun()
{
    asm
    {
        fstp ST(0);
    }
}
///////////////////

Compiling the file as usual (dmd.exe -c test.d) can result in any of the following results:

63% chance: the file compiles successfully
25% chance: "Internal error: ddmd\backend\cod3.c 6869"
6% chance: "FLunde Internal error: ddmd\backend\cod3.c 5527"
6% chance: "FLunde Internal error: ddmd\backend\cod3.c 6757"

This affects building Druntime and Phobos - building them will likely fail with one of the above errors, or (less probably) succeed.

Introduced in https://github.com/dlang/dmd/pull/6379
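Percentages like the ones above can be gathered by compiling the same file many times and counting the distinct outcomes. The following is a minimal sketch of such a harness, not the script the reporter used. The names `classify`, `run_dmd` and `tally` are hypothetical, and the default command line is only an assumption taken from the "dmd.exe -c test.d" invocation in the description.

```python
import subprocess
from collections import Counter


def classify(status, output):
    """Reduce one compiler run to a short label for tallying."""
    if status == 0:
        return "success"
    # Keep the line that names the failing assert, if there is one.
    for line in output.splitlines():
        if "Internal error" in line:
            return line.strip()
    return "exit status %d" % status


def run_dmd(cmd=("dmd.exe", "-c", "test.d")):
    """Run the compiler once (assumed command line; adjust as needed)."""
    proc = subprocess.run(list(cmd), stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT, universal_newlines=True)
    return proc.returncode, proc.stdout


def tally(run_once, runs=200):
    """Call run_once() `runs` times and count each distinct outcome."""
    counts = Counter()
    for _ in range(runs):
        status, output = run_once()
        counts[classify(status, output)] += 1
    return counts

# Example use: print(tally(run_dmd).most_common())
```

Passing the runner in as a callable keeps the counting logic separate from the compiler invocation, so the same tally can be reused for a differently built dmd.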
Comment #1 by uplink.coder — 2017-06-20T15:55:24Z
My advice: revert the PR. It's not like iasm is a bottleneck in most cases.
Comment #2 by bugzilla — 2017-06-21T00:39:51Z
Cannot duplicate with the Win32 build of DMD.
Comment #3 by uplink.coder — 2017-06-21T00:50:27Z
(In reply to Walter Bright from comment #2)
> Cannot duplicate with the Win32 build of DMD.

Some iasm test. It fails on my Linux system sometimes. There is definitely something wrong. Maybe try compiling the code about 100-200 times?
Comment #4 by dlang-bugzilla — 2017-06-21T01:02:31Z
(In reply to Walter Bright from comment #2)
> Cannot duplicate with the Win32 build of DMD.

It does not reproduce with a 32-bit dmd.exe; you must make a Win64 build, as the issue description says.
Comment #5 by bugzilla — 2017-06-21T01:33:13Z
25% chance: "Internal error: ddmd\backend\cod3.c 6869"
6% chance: "FLunde Internal error: ddmd\backend\cod3.c 5527"
6% chance: "FLunde Internal error: ddmd\backend\cod3.c 6757"

These lines are nowhere near asserts in the current HEAD. Can you try with HEAD?
Comment #6 by bugzilla — 2017-06-21T01:36:35Z
(In reply to uplink.coder from comment #3)
> Some iasm test. It fails on my linux system sometimes.
> There is definitely something wrong.
> Maybe try compiling the code about 100-200 times ?

I tried a 64-bit dmd build on Linux, several thousand times. No errors.
Comment #7 by bugzilla — 2017-06-21T01:38:10Z
Ran it under valgrind on Linux. No errors.
Comment #8 by dlang-bugzilla — 2017-06-21T01:49:31Z
(In reply to Walter Bright from comment #5)
> 25% chance: "Internal error: ddmd\backend\cod3.c 6869"
> 6% chance: "FLunde Internal error: ddmd\backend\cod3.c 5527"
> 6% chance: "FLunde Internal error: ddmd\backend\cod3.c 6757"
>
> These lines are nowhere near asserts in the current HEAD. Can you try with
> HEAD?

At the time I couldn't due to issue 17522. Here are the line numbers with HEAD:

FLunde Internal error: ddmd\backend\cod3.c 5500
FLunde Internal error: ddmd\backend\cod3.c 6730
Internal error: ddmd\backend\cod3.c 6842
Comment #9 by bugzilla — 2017-06-21T01:50:52Z
Also tried memset'ing the stack variables opnd1 .. opnd4 to 0xFF at the close of asmSemantic(). No errors. There is no host-dependent code in iasm.d, so this is a bit mystifying.
Comment #10 by bugzilla — 2017-06-21T02:02:11Z
(In reply to Vladimir Panteleev from comment #8)
> At the time I couldn't due to issue 17522.

17522 has been fixed.

> Here are the line numbers with HEAD:
>
> FLunde Internal error: ddmd\backend\cod3.c 5500
> FLunde Internal error: ddmd\backend\cod3.c 6730
> Internal error: ddmd\backend\cod3.c 6842

These suggest a bad opcode in code.Iop. At the end of asmSemantic(), I suggest inserting:

    printf("asmcode: %02x\n", s.asmcode.Iop);

and at the beginning of pinholeopt() before L1:

    c->print();
Comment #11 by dlang-bugzilla — 2017-06-21T02:32:39Z
Here is the output with the changes you requested from 10 runs:

asmcode: e0
code 0000019B245D80B8: nxt=0000000000000000 op=0xE0 flg=40 FLunde
code 0000019B245D80F0: nxt=0000000000000000 op=0xC3
code 0000019B245D80F0: nxt=0000000000000000 op=0xC3
Internal error: ddmd\backend\cod3.c 6843
=== Exited with status 1 ===

asmcode: 40
code 00000277E13B8EE8: nxt=0000000000000000 op=0x40 flg=40
code 00000277E13B8F20: nxt=0000000000000000 op=0xC3
code 00000277E13B8F20: nxt=0000000000000000 op=0xC3
=== Exited with status 0 ===

asmcode: d0
code 000001E109C46C18: nxt=0000000000000000 op=0xD0 flg=40 rm=0xD8=3,3,0
code 000001E109C46C50: nxt=0000000000000000 op=0xC3
code 000001E109C46C50: nxt=0000000000000000 op=0xC3
=== Exited with status 0 ===

asmcode: 60
code 000002285BE98B38: nxt=0000000000000000 op=0x60 flg=40
code 000002285BE98B70: nxt=0000000000000000 op=0xC3
code 000002285BE98B70: nxt=0000000000000000 op=0xC3
=== Exited with status 0 ===

asmcode: c0
code 00000244E18AC278: nxt=0000000000000000 op=0xC0 flg=40 rm=0xD8=3,3,0 FLunde
code 00000244E18AC2B0: nxt=0000000000000000 op=0xC3
code 00000244E18AC2B0: nxt=0000000000000000 op=0xC3
Internal error: ddmd\backend\cod3.c 6843
=== Exited with status 1 ===

asmcode: 70
code 000002295B099D28: nxt=0000000000000000 op=0x70 flg=40 FLunde
FLunde Internal error: ddmd\backend\cod3.c 5501
=== Exited with status 1 ===

asmcode: 80
code 0000022F4FE57D38: nxt=0000000000000000 op=0x80 flg=40 rm=0xD8=3,3,0 FLunde
code 0000022F4FE57D70: nxt=0000000000000000 op=0xC3
code 0000022F4FE57D70: nxt=0000000000000000 op=0xC3
Internal error: ddmd\backend\cod3.c 6843
=== Exited with status 1 ===

asmcode: 50
code 000001E7F37DC868: nxt=0000000000000000 op=0x50 flg=40
code 000001E7F37DC8A0: nxt=0000000000000000 op=0xC3
code 000001E7F37DC8A0: nxt=0000000000000000 op=0xC3
=== Exited with status 0 ===

asmcode: 60
code 00000292EE418E48: nxt=0000000000000000 op=0x60 flg=40
code 00000292EE418E80: nxt=0000000000000000 op=0xC3
code 00000292EE418E80: nxt=0000000000000000 op=0xC3
=== Exited with status 0 ===

asmcode: 00
code 0000025538918CE8: nxt=0000000000000000 op=0x00 flg=40 rm=0xD8=3,3,0
code 0000025538918D20: nxt=0000000000000000 op=0xC3
code 0000025538918D20: nxt=0000000000000000 op=0xC3
=== Exited with status 0 ===

If you're having trouble getting win64.mak to work, the latest version of Digger should be able to build a 64-bit DMD on any Windows system with no other dependencies. Run "digger checkout", edit the source files under work/repo/dmd/src/ddmd, and run

    digger -c build.components.dmd.dmdModel=64 rebuild --model=32 --without=phobos --without=druntime --without=rdmd --without=phobos-includes

to build a 64-bit dmd.exe under work/result/bin.
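The ten runs can be reduced to (opcode, exit status) pairs to make the pattern easier to see. The sketch below copies only the "asmcode" and exit-status lines from the log above; `parse_runs` is a hypothetical helper name. It records observations about this particular log, not a diagnosis: FSTP ST(i) is encoded as DD D8+i, yet none of the recorded opcodes is 0xDD, and every one has a zero low nibble.

```python
import re

# "asmcode" and exit-status lines of the 10 runs, copied from the log.
LOG = """\
asmcode: e0
=== Exited with status 1 ===
asmcode: 40
=== Exited with status 0 ===
asmcode: d0
=== Exited with status 0 ===
asmcode: 60
=== Exited with status 0 ===
asmcode: c0
=== Exited with status 1 ===
asmcode: 70
=== Exited with status 1 ===
asmcode: 80
=== Exited with status 1 ===
asmcode: 50
=== Exited with status 0 ===
asmcode: 60
=== Exited with status 0 ===
asmcode: 00
=== Exited with status 0 ===
"""


def parse_runs(log):
    """Return one (opcode, exit_status) pair per run, in log order."""
    opcodes = re.findall(r"asmcode: ([0-9a-f]{2})", log)
    statuses = re.findall(r"Exited with status (\d+)", log)
    return [(int(op, 16), int(st)) for op, st in zip(opcodes, statuses)]


runs = parse_runs(LOG)
failed = sorted(op for op, st in runs if st != 0)
passed = sorted(op for op, st in runs if st == 0)

print("failed:", [hex(op) for op in failed])
print("passed:", [hex(op) for op in passed])
print("any run with 0xDD:", any(op == 0xDD for op, _ in runs))
print("low nibble always 0:", all(op & 0x0F == 0 for op, _ in runs))
```

Whether a run ends in an ICE appears to depend on which bogus opcode it got, which fits an opcode that varies from run to run.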
Comment #12 by bugzilla — 2017-06-21T03:31:15Z
Hmmm, the opcode should be DD:

    DD D8    fstp ST

Which suggests the problem is confined to iasm.d
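For reference, the second byte of that encoding is a ModRM byte, and splitting 0xD8 into its bit fields gives the "3,3,0" that the debug output prints as rm=0xD8=3,3,0. A minimal sketch of the arithmetic follows; `decode_modrm` and `fstp_st` are hypothetical helper names, not functions from dmd.

```python
def decode_modrm(byte):
    """Split an x86 ModRM byte into its (mod, reg, rm) bit fields."""
    mod = (byte >> 6) & 0b11    # top two bits
    reg = (byte >> 3) & 0b111   # middle three bits
    rm = byte & 0b111           # low three bits
    return mod, reg, rm


def fstp_st(i):
    """Encode FSTP ST(i) as the two bytes DD D8+i, for i in 0..7."""
    if not 0 <= i <= 7:
        raise ValueError("x87 stack index must be in 0..7")
    return bytes([0xDD, 0xD8 + i])


print(decode_modrm(0xD8))
print(fstp_st(0).hex())
```

In the log, the ModRM byte is correct (0xD8) in the runs that print it, so only the opcode byte is wrong.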
Comment #13 by bugzilla — 2017-06-21T03:36:42Z
So the way forward is to follow the flow backwards to where c.Iop is being set and print that. I'd start with line 1384:

    pc.Iop = opcode;

and insert:

    printf("test1: opcode = 0x%x\n", opcode);

Ain't debugging fun? :-)
Comment #14 by dlang-bugzilla — 2017-06-21T03:40:18Z
(In reply to Walter Bright from comment #13)
> Ain't debugging fun? :-)

Doing it through Bugzilla comments isn't very effective, though. Have you tried building a 64-bit DMD using win64.mak or Digger yet?
Comment #15 by dlang-bugzilla — 2017-06-21T04:02:32Z
(In reply to Walter Bright from comment #13)
> printf("test1: opcode = 0x%x\n", opcode);

After adding this printf, the bug can no longer be reproduced.

Since that pointed at a codegen bug in the host compiler, I tried building DMD with 2.074.1 (instead of Digger's default of 2.070.2 for Win64), and I can no longer reproduce the bug either.

I'm going to bisect what caused the behaviour change, just to ensure that the bug disappeared because of a codegen fix, and not because some random change made it no longer manifest for that test case.
Comment #16 by bugzilla — 2017-06-21T04:12:13Z
I haven't tried building Win64 dmd. I figured that would be time consuming :-)
Comment #17 by bugzilla — 2017-06-21T04:13:06Z
(In reply to Vladimir Panteleev from comment #15)
> Since that pointed at a codegen bug in the host compiler, I tried building
> DMD with 2.074.1 (instead of Digger's default of 2.070.2 for Win64), and I
> can no longer reproduce the bug either.
>
> I'm going to bisect what caused the behaviour change, just to ensure that
> the bug disappeared because of a codegen fix, and not because some random
> change made it no longer manifest for that test case.

That's a good plan. There's a lot that has changed from 70 to 74.
Comment #18 by dlang-bugzilla — 2017-06-28T21:50:33Z
(In reply to Vladimir Panteleev from comment #15)
> I'm going to bisect what caused the behaviour change, just to ensure that
> the bug disappeared because of a codegen fix, and not because some random
> change made it no longer manifest for that test case.

Done. Bisecting the version to bootstrap DMD really put Digger to the test, and I needed to make some upgrades :)

The bug was fixed in this PR: https://github.com/dlang/dmd/pull/5924

It does seem to be a fix in the backend, but it fixes an ICE, not resulting codegen. Does that make sense to you?
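Bisecting a sporadic failure has one extra wrinkle: a single clean compile does not show that a host compiler is good, because a bad one still succeeds some of the time. The sketch below is not how Digger implements bisection. It only illustrates the idea, assuming the 63% success rate from comment #0 and assuming that runs are independent; `runs_needed` and `first_good` are hypothetical names.

```python
import math


def runs_needed(p_success=0.63, max_false_good=0.001):
    """Smallest n such that p_success**n <= max_false_good.

    p_success is the chance that a buggy build still compiles the test
    file; n clean runs in a row then mislead with probability p_success**n.
    """
    return math.ceil(math.log(max_false_good) / math.log(p_success))


def first_good(candidates, is_good):
    """Binary-search an ordered list for the first entry where is_good holds.

    Assumes all entries before the answer are bad, all later ones are
    good, and the last entry is known to be good.
    """
    lo, hi = 0, len(candidates) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_good(candidates[mid]):
            hi = mid          # answer is at mid or earlier
        else:
            lo = mid + 1      # answer is after mid
    return candidates[lo]


print(runs_needed())
```

In practice `is_good` would build dmd with the candidate host compiler and report success only if `runs_needed()` consecutive compiles of the test file all pass.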
Comment #19 by bugzilla — 2017-07-03T21:01:53Z
> It does seem to be a fix in the backend, but it fixes an ICE, not resulting codegen. Does that make sense to you?

Yes, because the compiler itself was corrupted by the Win64 build.