Bug 6189 – [64bit] optimizer: register content destroyed in function prolog
Status
RESOLVED
Resolution
FIXED
Severity
critical
Priority
P2
Component
dmd
Product
D
Version
D2
Platform
x86_64
OS
All
Creation time
2011-06-21T05:52:00Z
Last change time
2015-06-09T05:11:35Z
Keywords
wrong-code
Assigned to
nobody
Creator
code
Comments
Comment #0 by code — 2011-06-21T05:52:01Z
struct FPoint {
    float x, y;
}
void constructBezier(FPoint p0, FPoint p1, FPoint p2, ref FPoint[3] quad) {
    quad[0] = p0;
    quad[1] = FPoint(p1.x, p1.y);
    quad[$-1] = p2;
}
void main() {
    auto p0 = FPoint(0, 0);
    auto p1 = FPoint(1, 1);
    auto p2 = FPoint(2, 2);
    // avoid inline of call
    FPoint[3] quad;
    auto f = &constructBezier;
    f(p0, p1, p2, quad);
    assert(quad == [p0, p1, p2]);
}
---
This code fails when compiled with optimization.
The issue is that the quad variable is assigned to a register within the function.
In the function prolog, quad is moved from its parameter register to the target register while another parameter still resides in that register.
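The clobbering described here can be modeled outside the compiler. The following is only a minimal C sketch, not dmd code: the register numbering, the function names and the incoming values are all hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Register numbering is hypothetical and only meaningful inside this sketch. */
enum { RDI, RSI, RDX, RCX, NREGS };

/* Models the faulty prolog move: copy a parameter from the register it
   arrived in to the register the allocator chose, without checking whether
   the destination still holds another incoming parameter. */
void naive_prolog_move(uint64_t regs[NREGS], int from, int to)
{
    assert(from >= 0 && from < NREGS && to >= 0 && to < NREGS);
    regs[to] = regs[from];
}

/* Returns 1 if the parameter that arrived in `other` is still intact after
   the move, 0 if it was overwritten. */
int other_param_survives(int from, int to, int other)
{
    uint64_t regs[NREGS] = { 10, 20, 30, 40 }; /* arbitrary incoming values */
    uint64_t before = regs[other];
    naive_prolog_move(regs, from, to);
    return regs[other] == before;
}
```

In this model, other_param_survives(RDI, RDX, RDX) reports that the value arriving in RDX is lost, while picking a destination that holds no parameter, such as RCX, leaves it intact.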
Comment #1 by code — 2011-06-21T11:08:30Z
*** Issue 6042 has been marked as a duplicate of this issue. ***
Comment #2 by code — 2011-08-29T09:56:49Z
I've dissected this bug further.
Chain of infection:
- p1 (passed in RDX) is marked as not being a register candidate
because it is used in an OPrelconst (probably p1.x/p1.y)
- => Symbol for p1 doesn't get a live range
- => blcodgen doesn't mark regcon.used for RDX because the parameter isn't marked
alive in the entry block
    if (s->Sclass & SCfastpar &&
        regcon.params & mask[s->Spreg] &&
        vec_testbit(dfoidx,s->Srange))
    {
        regcon.used |= mask[s->Spreg];
    }
- => cgreg_assign for quad (passed in RDI) figures DX is a neat register to
assign quad to
- => nobody is responsible for saving fastpars, and the function prolog emits
a mov RDX, RDI before RDX is saved
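To illustrate the chain above, here is a rough C model of the masking step. It assumes a simplified parameter record; Param, used_mask and looks_free are hypothetical names for this sketch and do not exist in the backend, which works on Symbol, regcon and mask[] instead.

```c
#include <assert.h>
#include <stdint.h>

/* Register numbering is hypothetical and only meaningful inside this sketch. */
enum { RDI, RSI, RDX, RCX, NREGS };

/* Simplified stand-in for a fastpar symbol (assumption, not the real Symbol). */
typedef struct {
    int preg;          /* register the parameter arrives in */
    int live_at_entry; /* 0 models "symbol got no live range" */
} Param;

/* Follows the shape of the quoted check: a parameter register is only
   recorded as used when the parameter is live in the entry block. */
uint32_t used_mask(const Param *params, int n)
{
    uint32_t used = 0;
    for (int i = 0; i < n; i++) {
        assert(params[i].preg >= 0 && params[i].preg < NREGS);
        if (params[i].live_at_entry)
            used |= 1u << params[i].preg;
    }
    return used;
}

/* Returns 1 if a register allocator consulting `used` would see `reg` as free. */
int looks_free(uint32_t used, int reg)
{
    assert(reg >= 0 && reg < NREGS);
    return (used & (1u << reg)) == 0;
}
```

With quad live in RDI and p1 arriving in RDX without a live range, RDX looks free in this model even though it still holds p1, which is the precondition for the bad mov.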
Two things are involved that work suboptimally under the 64-bit ABI conventions.
I. The current way of marking a fastpar register as used effectively prevents cgreg_assign from leaving the parameter in that register.
II. With the 32-bit LinkD ABI there was only one register parameter, so moving it in the function prolog couldn't conflict with other parameters.
Both can be improved, but even then they won't guarantee a proper fix for this bug.
Comment #3 by code — 2011-08-29T10:02:49Z
That is, you cannot have working prolog code without temporary storage if parameter register locations and function register locations cross each other, e.g. swap(RDI, RSI).
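The swap case can be checked with a small C model. This is only a sketch under assumed names (Move, apply_in_order and swap_with_temp are hypothetical), with a scratch register standing in for whatever temporary storage the prolog would use.

```c
#include <assert.h>
#include <stdint.h>

/* Register numbering is hypothetical and only meaningful inside this sketch. */
enum { RDI, RSI, RDX, RCX, NREGS };

typedef struct { int src, dst; } Move;

/* Executes the moves one after another, as a prolog without temporary
   storage would emit them. */
void apply_in_order(uint64_t regs[NREGS], const Move *moves, int n)
{
    for (int i = 0; i < n; i++) {
        assert(moves[i].src >= 0 && moves[i].src < NREGS);
        assert(moves[i].dst >= 0 && moves[i].dst < NREGS);
        regs[moves[i].dst] = regs[moves[i].src];
    }
}

/* The same exchange routed through a scratch location. */
void swap_with_temp(uint64_t regs[NREGS], int a, int b, int tmp)
{
    assert(tmp != a && tmp != b);
    regs[tmp] = regs[a];
    regs[a] = regs[b];
    regs[b] = regs[tmp];
}
```

Running the two plain moves RDI to RSI and RSI to RDI in either order leaves both registers with the same value, so one parameter is always lost; only the variant with a temporary exchanges them.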
Comment #4 by code — 2011-08-29T10:31:41Z
Rough sketch of improvements.
I. cgreg_assign/cgreg_benefit (cgreg.c)
When doing the register benefit calculation, add block weights for the fastpar register if the symbol is still contained in it. Decrease the benefit by 1 for other registers.
II. prolog (cod3.c)
Strictly sort the parameter moves in the following order:
register to stack, register to register, stack to register.
Keep track of used registers and add an assertion that a move to a register
does not conflict.
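Point II could look roughly like the following C sketch. It is not the cod3.c implementation: Move, MoveKind and schedule_moves are hypothetical names, each register is assumed to hold at most one incoming parameter, and a return flag takes the place of the proposed assertion so the conflicting case can be observed instead of aborting.

```c
#include <assert.h>
#include <stdint.h>

/* Register numbering is hypothetical and only meaningful inside this sketch. */
enum { RDI, RSI, RDX, RCX, NREGS };

/* Move kinds, numbered in the proposed emission order. */
typedef enum { REG_TO_STACK, REG_TO_REG, STACK_TO_REG } MoveKind;

typedef struct {
    MoveKind kind;
    int src; /* register, or stack slot for STACK_TO_REG */
    int dst; /* register, or stack slot for REG_TO_STACK */
} Move;

/* Writes the moves to `out` grouped by kind and returns 1 if no move
   overwrites a register that still holds an unmoved parameter, else 0. */
int schedule_moves(const Move *in, int n, Move *out)
{
    uint32_t pending = 0; /* registers still holding an unmoved parameter */
    int ok = 1, k = 0;

    for (int i = 0; i < n; i++) {
        if (in[i].kind != STACK_TO_REG) {
            assert(in[i].src >= 0 && in[i].src < NREGS);
            pending |= 1u << in[i].src;
        }
    }
    for (int kind = REG_TO_STACK; kind <= STACK_TO_REG; kind++) {
        for (int i = 0; i < n; i++) {
            if ((int)in[i].kind != kind)
                continue;
            if (kind != STACK_TO_REG)
                pending &= ~(1u << in[i].src); /* source is read by this move */
            if (kind != REG_TO_STACK) {
                assert(in[i].dst >= 0 && in[i].dst < NREGS);
                if (pending & (1u << in[i].dst))
                    ok = 0; /* the proposed assertion would fire here */
            }
            out[k++] = in[i];
        }
    }
    return ok;
}
```

For the case from this report (p1 spilled from RDX, quad moved from RDI to RDX) the ordering alone resolves the conflict, because the spill is emitted first. A true register swap is still flagged, since it needs temporary storage.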
Comment #5 by code — 2011-11-22T12:53:46Z
This test case no longer reproduces the bug since xmmregs
are now used for floating point.
With fpxmmregs disabled it still reproduces the bug.
Comment #6 by code — 2012-01-13T04:43:54Z
struct Point(T)
{
    T x, y;
}
alias Point!int IPoint;
alias Point!float FPoint;
void calcCoeffs(uint half, IPoint pos, ref FPoint[2] pts, uint=0)
{
    pos.x &= ~(half - 1);
    pos.y &= ~(half - 1);
    immutable float xo = pos.x;
    immutable float yo = pos.y;
    pts[0].x -= xo;
    pts[0].y -= yo;
    pts[1].x -= xo;
    pts[1].y -= yo;
}
void main()
{
    auto pos = IPoint(2, 2);
    FPoint[2] pts;
    pts[0] = pts[1] = FPoint(3, 3);
    auto f = &calcCoeffs;
    f(2, pos, pts);
    assert(pts[0].x == 1);
    assert(pts[0].y == 1);
    assert(pts[1].x == 1);
    assert(pts[1].y == 1);
}
----
This one happens with xmmregs too.
My reduction in comment 9 was valid only for D2, this one is valid for D1 as well (again compile with -m64 -O):
---------------------------
struct IPoint {
    int x, y;
}
void bug6189(uint half, IPoint pos, float[4] *pts, uint unused) {
    pos.y += half;
    float xo = pos.x;
    float yo = pos.y;
    (*pts)[0] = xo;
    (*pts)[1] = yo;
    (*pts)[2] = xo;
}
void main()
{
    auto pos = IPoint(2, 2);
    float[4] pts;
    pts[0] = pts[1] = pts[2] = pts[3] = 0;
    bug6189(0, pos, &pts, 0);
    assert(pts[0] == 2);
}
Comment #11 by leandro.lucarella — 2012-05-22T11:14:28Z
(In reply to comment #9)
> Reduced test case (compile with -m64 -O):
>
> void bug6189(int half, int[2] pos, float[3] *pts, int unused)
> {
>     pos[0] += half;
>
>     (*pts)[0] = pos[0];
>     (*pts)[1] = pos[1];
>     (*pts)[2] = half;
> }
>
> void main()
> {
>     int[2] pos = [2,2];
>     float[3] pts = [0.0, 0.0, 0.0];
>     bug6189(0, pos, &pts, 0);
>     assert(pts[0] == 2);
> }
This is working on latest dmd2 (42d8967) and d1 (4351a58), but the testcase in comment 8 still fails on both.
Comment #12 by leandro.lucarella — 2012-05-22T11:17:37Z
(In reply to comment #10)
> My reduction in comment 9 was valid only for D2, this one is valid for D1 as
> well (again compile with -m64 -O):
> ---------------------------
> struct IPoint {
>     int x, y;
> }
>
> void bug6189(uint half, IPoint pos, float[4] *pts, uint unused) {
>     pos.y += half;
>     float xo = pos.x;
>     float yo = pos.y;
>
>     (*pts)[0] = xo;
>     (*pts)[1] = yo;
>     (*pts)[2] = xo;
> }
>
> void main()
> {
>     auto pos = IPoint(2, 2);
>     float[4] pts;
>     pts[0] = pts[1] = pts[2] = pts[3] = 0;
>     bug6189(0, pos, &pts, 0);
>
>     assert(pts[0] == 2);
> }
OK, this one fails too in latest D1 and D2, but interestingly enough, it works with -O -inline (in both D1 and D2)!
Comment #13 by code — 2012-05-22T12:00:35Z
>OK, this one fails too in latest D1 and D2, but interestingly enough, it works with -O -inline (in both D1 and D2)!
This bug depends completely on register allocation and defies logic or intuitive understanding.
I need to revisit my patch so it merges again.
Comment #14 by github-bugzilla — 2012-05-22T23:42:39Z