Bug 4443 – Optimizer produces wrong code for || or && with struct arrays

Status
RESOLVED
Resolution
FIXED
Severity
critical
Priority
P2
Component
dmd
Product
D
Version
D2
Platform
Other
OS
Windows
Creation time
2010-07-10T01:27:00Z
Last change time
2015-06-09T05:11:43Z
Keywords
patch, wrong-code
Assigned to
nobody
Creator
r.sagitario
Blocks
3761

Comments

Comment #0 by r.sagitario — 2010-07-10T01:27:42Z
The code compiled from this source crashes when compiled with "dmd -O -release test.d": module test; struct TokenInfo { // seems important to have a "complicated" struct size (so a calculation is necessary) int type; int StartIndex; int EndIndex; } int GetTokenInfoAt(TokenInfo[] infoArray, int col, ref TokenInfo info) { for (int i = 0; i < infoArray.length; i++) { int start = infoArray[i].StartIndex; int end = infoArray[i].EndIndex; if (i == 0 && start > col) return -1; if (col >= start && col < end) { info = infoArray[i]; return i; } } if (infoArray.length > 0 && col == infoArray[$-1].EndIndex) { info = infoArray[$-1]; return infoArray.length-1; /* code generated for the 2 statements above: ov ECX,024h[ESP] // infoArray.ptr mov EBX,020h[ESP] // infoArray.length mov EDX,ECX lea ESI,[ECX*2][ECX] // uses pointer instead of length!!! mov EAX,EBX lea ESI,-0Ch[ESI*4][EDX] // crashes here! mov EDI,014h[ESP] movsd movsd movsd lea EAX,-1[EBX] */ } return -1; } int main() { TokenInfo[] arr = new TokenInfo[9]; arr[8].EndIndex = 11; TokenInfo info; return GetTokenInfoAt(arr, 11, info); } The crash goes away if - one of the conditions in the loop are removed - the struct size of TokenInfo is reduced to 8 bytes - dmd is executed without "-O" - dmd is executed without "-release" happens for all dmd version back to 2.032 and also in dmd 1.056
Comment #1 by clugdbug — 2010-07-12T07:58:26Z
Reduced test case. The for loop never runs, and gets optimized away, yet it still corrupts the code. The struct must be large enough that it doesn't fit in a register. The bug can be disabled in gloop.c, intronvars(), by commenting out line 2873, but I'm not yet sure what the root cause is. Probably failing to copy all of the data into the newly created variable. --- struct Struct4443 { int x; char[5] unused; } void foo4443(Struct4443[] arr, Struct4443 *dest) { for (int i = 0; false; ) { int junk = arr[i].x; } if (dest || arr[$-1].x) { *dest = arr[$-1]; } } void main() { Struct4443[1] a; Struct4443 info; foo4443(a, &info); }
Comment #2 by clugdbug — 2010-07-12T13:08:53Z
Further reduction shows that it is unrelated to for loops. Seems to require either || or && in the if statement. This bug existed in prehistoric times (tested on DMD0.140). -------------- struct Struct4443 { int x; char[5] unused; } void foo4443(Struct4443 *dest, Struct4443[] arr) { int junk = arr[$-1].x; if (dest || arr[$-1].x) { *dest = arr[$-1]; } } void main() { Struct4443[1] a; Struct4443 info; foo4443(&info, a); }
Comment #3 by clugdbug — 2010-08-03T21:56:32Z
This is a really subtle code generation bug. The buggy bit of code is below. In this section of code, the expression size is one register. In the bit of code at L13, it checks to see if the addition factor is already in a register. In this case, it is, BUT it's actually in a double register: the {ptr, length} pair is a ulong, and here it's been cast to uint to get the length. So isregvar() returns BOTH registers in regm. Then, at the end of this section of code, the register to use is selected with a call to findreg(). But it just returns "the first register" which is correct only if there is only one register. In this patch, I just use the LSW returned from isregvar. It would also be possible to change isregvar() to only return the LSW in the case where a cast to smaller type has occured, but I don't know what other code expects from isregvar. Not sure if the tysize() check is necessary. Maybe it could just be retregs = 1 << reg1; PATCH: cod2.c, cdorth(), line 362. (Does 1<<reg1 work if reg1 is 0?). L13: regm_t regm; if (e11->Eoper == OPvar && isregvar(e11,&regm,&reg1)) { + if (tysize[tybasic(e11->Ety)]<= REGSIZE) + retregs = 1 << reg1; // Only want the LSW + else retregs = regm; c1 = NULL; freenode(e11); } else c1 = codelem(e11,&retregs,FALSE); } rretregs = ALLREGS & ~retregs; c2 = scodelem(ebase,&rretregs,retregs,TRUE); { regm_t sregs = *pretregs & ~rretregs; if (!sregs) sregs = ALLREGS & ~rretregs; c3 = allocreg(&sregs,&reg,ty); } + // BUG: We should ensure there is only register in retregs reg1 = findreg(retregs); Also noticed that in cod4.c, cdmsw() line 2838, there's an incorrect comment. retregs &= mMSW; // want LSW only This should of course be // want MSW only
Comment #4 by clugdbug — 2010-08-04T00:07:00Z
(In reply to comment #3) > if (!sregs) > sregs = ALLREGS & ~rretregs; > c3 = allocreg(&sregs,&reg,ty); > } > + // BUG: We should ensure there is only register in retregs > reg1 = findreg(retregs); That should be: if (!sregs) sregs = ALLREGS & ~rretregs; c3 = allocreg(&sregs,&reg,ty); } + assert( (retregs & (retregs-1)) == 0); // Must be only one register reg1 = findreg(retregs); There are probably contrived cases where this bug could occur in C++ code, but I don't think it would ever occur in practice.
Comment #5 by bugzilla — 2010-08-05T14:22:53Z
Comment #6 by clugdbug — 2010-08-06T01:24:53Z
*** Issue 3761 has been marked as a duplicate of this issue. ***