Bug 9387 – Compiler switch -O changes behavior of correct code

Status
RESOLVED
Resolution
FIXED
Severity
regression
Priority
P2
Component
dmd
Product
D
Version
D2
Platform
x86_64
OS
All
Creation time
2013-01-24T07:56:00Z
Last change time
2013-01-31T00:52:53Z
Keywords
ice, wrong-code
Assigned to
nobody
Creator
stephan.schiffels

Attachments

ID    Filename      Summary                        Content-Type              Size
1182  brent_test.d  Source file with program that  application/octet-stream  2320

Comments

Comment #0 by stephan.schiffels — 2013-01-24T07:56:09Z
Created attachment 1182
Source file with program that

The attached program implements part of Brent's minimization algorithm for one-dimensional functions. The code is from Numerical Recipes, 3rd edition. I use dmd 2.061.

When I run the program with "rdmd brent_test.d", it runs fine and gives the correct result. When I run it with optimization, i.e. with "rdmd -O brent_test.d", it behaves differently: it enters an infinite loop and eventually throws the expected exception for too many iterations.

You can see that I placed a writefln() in line 45, which outputs the value of variable a. When you move this writefln statement just one line down, i.e. below the if-statement, the code runs fine, even with optimization.

A colleague of mine suggested that there might be a bug related to a large number of local variables. Maybe a limited number of registers causes the machine to spill values into memory and pull them back in the wrong way.

Appreciate help!
Stephan
Comment #1 by stephan.schiffels — 2013-01-24T08:06:00Z
During debugging, I actually looked at the value of every single local variable, and you can actually see how the value of some variables (for example "a") changes from one iteration to the next, without any assignment.
Comment #2 by stephan.schiffels — 2013-01-24T10:00:41Z
I just checked: The bug definitely was introduced with version 2.061! With dmd version 2.060, everything works fine, with and without the "-O" switch.
Comment #3 by bugzilla — 2013-01-25T00:30:21Z
I can't reproduce this with the latest dmd. I'll upload a new beta tomorrow you can try.
Comment #4 by stephan.schiffels — 2013-01-25T05:01:07Z
(In reply to comment #3)
> I can't reproduce this with the latest dmd. I'll upload a new beta tomorrow you
> can try.

What actually seems to be corrupted are the precompiled executables in the zip file on the web. We checked this for the OSX and the Linux versions. Both of these precompiled versions produce this bug. When we compile dmd from source, even for version 2.061 from the web, this bug does not occur.

Stephan
Comment #5 by stephan.schiffels — 2013-01-25T06:53:50Z
(In reply to comment #3)
> I can't reproduce this with the latest dmd. I'll upload a new beta tomorrow you
> can try.

Sorry to jump back and forth here. I have to correct my previous statement again: with the latest version of dmd/druntime/phobos (2.062 from git), this bug does occur! But only when you compile and run separately. When you use dmd -run, both versions, with and without -O, work fine. This is quite weird. So:

dmd -O brent_test.d
./brent_test

should produce a different outcome than

dmd brent_test.d
./brent_test

I will try to use bisect to find out when this bug was introduced.

Stephan
Comment #6 by bugzilla — 2013-01-25T10:37:02Z
When I compile and run separately, it works fine. You should also clarify whether you are using -m64 or not.
Comment #7 by stephan.schiffels — 2013-01-25T10:43:26Z
Right, I use the 64-bit model. And I tested this on OSX and on Linux, with the same outcome on both platforms. It's frustrating that you can't reproduce it. Thanks for responding quickly on this anyway. I will see what I can find out with bisect.

Stephan
Comment #8 by clugdbug — 2013-01-28T01:01:42Z
BTW you might be interested in std.numeric.findRoot, which is the root-finding-by-bracketing algorithm (in contrast to "Brent's algorithm" which is minima-finding-by-bracketing). In terms of number of calls, I believe it beats all published algorithms (in some cases, by an order of magnitude). I should really publish it. I did some work on the minima problem as well, and put it into Tango, but it isn't in Phobos. The code is very old now, dating from a time where there were many compiler limitations, and it could use a review.
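For readers who have not used it, here is a minimal sketch of calling std.numeric.findRoot on a bracketing interval. The function f and the interval [0, 2] are hypothetical and are not taken from the attached program; the only requirement assumed here is that f has opposite signs at the two endpoints.

```d
import std.math : abs;
import std.numeric : findRoot;
import std.stdio : writefln;

// Hypothetical example function: f(x) = x^2 - 2, with a root at sqrt(2).
double f(double x)
{
    return x * x - 2.0;
}

// Finds the root of f inside [0, 2]; f(0) < 0 and f(2) > 0,
// so the interval brackets the root as findRoot requires.
double sqrt2ByBracketing()
{
    return findRoot((double x) => f(x), 0.0, 2.0);
}

void main()
{
    immutable root = sqrt2ByBracketing();
    // The residual should be tiny if the root was found.
    assert(abs(f(root)) < 1e-9);
    writefln("root = %.12f", root);
}
```

Note that this finds a root, not a minimum, so it is not a drop-in replacement for the Brent minimizer in the attachment.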
Comment #9 by clugdbug — 2013-01-28T01:03:34Z
...and I can reproduce your bug.
Comment #10 by clugdbug — 2013-01-28T01:13:35Z
I think there is an uninitialized variable in there. When I compile with -O, if I run the same executable multiple times, sometimes it passes, sometimes it fails.
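Because the failure is intermittent, a single run proves little. The following is a rough sketch of a helper that runs the same executable several times and counts the runs that exit with a nonzero status. The helper name countFailures and the run count of 20 are assumptions made for illustration; std.process.execute is the only library call relied on, and it requires a Phobos version that ships it.

```d
import std.process : execute;
import std.stdio : writefln, writeln;

// Runs `command` `runs` times and returns how many runs ended with a
// nonzero exit status (an uncaught exception or a crash both count).
int countFailures(string[] command, int runs)
{
    int failures = 0;
    foreach (i; 0 .. runs)
    {
        auto result = execute(command);
        if (result.status != 0)
            ++failures;
    }
    return failures;
}

void main(string[] args)
{
    if (args.length < 2)
    {
        writeln("usage: repeat_run <executable> [args...]");
        return;
    }
    enum runs = 20; // arbitrary choice for this sketch
    immutable failures = countFailures(args[1 .. $], runs);
    writefln("%s of %s runs failed", failures, runs);
}
```

This sketch has no timeout, so a run that hangs forever will block it; it only helps when the failure shows up as a nonzero exit status.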
Comment #11 by stephan.schiffels — 2013-01-28T06:57:56Z
Hi Don, glad to hear that you can reproduce the bug! I tested initializing all variables by hand, and the bug still occurs. Thanks for the suggestion to use std.numeric. Looks very useful! The Numerical Recipes Code style is worse than horrible! All those 1-letter variables...
Comment #12 by clugdbug — 2013-01-29T15:13:51Z
Here is a more reduced test case (still enormous). Without -O, it returns on the first pass through the loop. With -O, one of two things happens:
(a) it hits the assert(0) on the first pass through the loop; or
(b) it generates an alignment hardware exception.
It looks as though it is an issue with misalignment of SSE registers. Removing the assert(0) causes an ICE.
---
import std.math : abs;

void minimize() {
    double a,b,d=0.0,etemp,fu,fv,fw,fx;
    double p;
    double q,r,tol1,tol2,u,v,w,x,xm;
    double e=0.0;
    double ax,bx,cx,fa,fb,fc;
    double tol;
    ax = 2.8541;
    bx = 3;
    cx = 3.0458;
    fa = 0.145898;
    fb = 0;
    fc = 0.381966;
    tol = 3.0e-8;
    a= ax;
    b= cx;
    v = bx;
    w = bx;
    x = bx;
    fx = 0;
    fv = fx;
    fw = fx;
    a = 2.97871347812973974456;
    b = 3.0458;
    v =2.9442711606;
    w =2.9787134781;
    x = 3;
    fx= 0;
    fv = 0.00310570354087098691;
    fw = 0.00045311601333306815;
    e =-0.0557288394;
    d = -0.0212865219;
    for (int iter=0;iter<1;iter++) {
        xm=0.5*(a+b);
        tol1=tol*abs(x);
        tol2=2.0*(tol1);
        if (abs(x-xm) <= (tol2-0.5*(b-a))) {
            return;
        }
        if (abs(e) > tol1) {
            r=(x-w)*(fx-fv);
            q=(x-v)*(fx-fw);
            p=(x-v)*q-(x-w)*r;
            q=2.0*(q-r);
            if (q > 0.0) p = -p;
            q=abs(q);
            etemp=e;
            e=d;
            if (abs(p) >= abs(0.5*q*etemp) || q < p) {
                d= b-x;
            } else {
                d=p/q;
                u=x+d;
                if (u-a < tol2 || b-u < tol2)
                    d = xm - x;
            }
        } else {
            d= (e=(x >= xm ? a-x : b-x));
        }
        u= (abs(d) >= tol1) ? x+d : x+3.0e-8;
        if (u < 3.01)
            return;
        else
            assert(0); // FAILS HERE
        fu = (u-3.0)*(u-3.0);
        if (fu <= fx) {
            assert(0);
        }
    }
}

void main() { minimize(); }
Comment #13 by clugdbug — 2013-01-29T23:51:04Z
A reduced test case for the ICE:

import std.math : abs;

void bug9387()
{
    double x = 3;
    double r = (x-2.1)*0.1;
    double q = (x-2.1)*0.1 - r;
    double p = (x-2.1)*q - (x-2.1)*r;
    if (q > 0.0) p = -p;
    if (abs(p) >= q ) { }
}
---
dmd -O -m64 bug.d
Internal error: backend/cgcod.c 769
Comment #14 by clugdbug — 2013-01-30T00:20:15Z
ICE, further reduced:
--------------
void bug9387a(double x) { }

void ice9387()
{
    double x = 0.3;
    double r = x*0.1;
    double q = x*0.1 + r;
    double p = x*0.1 + r*0.2;
    if ( q ) p = -p;
    bug9387a(p);
}
Comment #15 by clugdbug — 2013-01-30T03:58:21Z
And a reduction for the wrong-code case. This sometimes segfaults but usually hangs. Looks like the saved RBX register gets trampled:

double brent(double x) { return x; }

void wrong9387()
{
    for (int iter=0; iter<1; iter++) {
        double v =2.94;
        if (brent(v)<= 2.9) {
            return;
        }
        double w = 2.97;
        double r = (0.2-w) * 0.1;
        double q = (0.2-v) * 0.1 - r;
        double p = 0.7*q - (0.2-v)*0.3;
        if (q > 0.0) p = -p;
        q = brent(q);
        double d = p-q;
        if (2.94 + d) w = v -v;
        brent(w);
    }
}

void main() { wrong9387(); }
Comment #16 by github-bugzilla — 2013-01-30T14:40:23Z
Commit pushed to dmd-1.x at https://github.com/D-Programming-Language/dmd

https://github.com/D-Programming-Language/dmd/commit/bfa5d0f0ba80c7ff6e0d67806714763584666fb2
fix Issue 9387 - Compiler switch -O changes behavior of correct code
Comment #17 by bugzilla — 2013-01-30T14:42:56Z
https://github.com/D-Programming-Language/dmd/pull/1584 Thanks, Don, for the minimizations which made it easy for me to find the problem. It was not a regression, although it looked like one. The bug is nasty and I'm glad to get it fixed.
Comment #18 by stephan.schiffels — 2013-01-30T15:29:45Z
Don and Walter, thanks for reducing the code and fixing the bug, all on a very short timescale! This is going to be a very important fix for me. Using the optimization switch is critical for me. Stephan
Comment #19 by github-bugzilla — 2013-01-30T17:18:24Z
Commits pushed to master at https://github.com/D-Programming-Language/dmd

https://github.com/D-Programming-Language/dmd/commit/06d991f039eab23561398aea4ea764ea49a6dea4
fix Issue 9387 - Compiler switch -O changes behavior of correct code

https://github.com/D-Programming-Language/dmd/commit/9f3ab3f0b4713bd12a3ada71ca783bab1edae663
Merge pull request #1584 from WalterBright/b45
fix Issue 9387 - Compiler switch -O changes behavior of correct code
Comment #20 by clugdbug — 2013-01-31T00:52:53Z
> Don and Walter, thanks for reducing the code and fixing the bug, all on a very short timescale!

Thanks. Optimizer bugs get top priority, and this was one of the worst bugs of all time. I found test cases where the executable was wrong, yet still produced correct results in 90% of runs. I don't think I've ever seen a bug that was so difficult to reduce.