Bug 2905 – [PATCH] Faster +-*/ involving a floating-point literal

Status
RESOLVED
Resolution
FIXED
Severity
enhancement
Priority
P2
Component
dmd
Product
D
Version
D2
Platform
x86
OS
Windows
Creation time
2009-04-27T07:56:00Z
Last change time
2015-06-09T01:26:26Z
Keywords
patch, performance
Assigned to
bugzilla
Creator
clugdbug

Attachments

ID   Filename                 Summary                    Content-Type  Size
342  floatliteralpatch.patch  Patch against DMD2.029     text/plain    3896
412  test.d                   test program for fp patch  text/plain    939

Comments

Comment #0 by clugdbug — 2009-04-27T07:56:16Z
The front-end converts a floating-point literal into an 80-bit real value if it is either (1) specified with an L suffix, or (2) involved in an operation with a 'real'. It does this _even if the number could be represented as a double without loss of precision_. Unfortunately, the x87 can only do fused load-and-multiply, load-and-add, etc. with float and double operands, so mixed-precision operations are a bit slower than they could be.

So, something like "x + 3.0" (where x is real) becomes:

    static real THREE=3.0;
    fld real ptr THREE;
    faddp ST(1), ST;

and with this patch it becomes:

    static double THREE=3.0;
    fadd ST, double ptr THREE;

The patch is only applied in the case of +,-,*,/, so that something like printf("%Lg", 2.0L); will continue to work correctly. The patch also includes the patch for 2888; they are very closely related.
Comment #1 by clugdbug — 2009-04-27T07:57:01Z
Created attachment 342 Patch against DMD2.029 Includes patch for 2888 as well.
Comment #2 by bugzilla — 2009-07-03T23:34:45Z
Created attachment 412 test program for fp patch
Comment #3 by bugzilla — 2009-07-03T23:36:52Z
The patch is incomplete, compile the test program with -O and it fails with an internal error. Haven't investigated why.
Comment #4 by clugdbug — 2009-07-06T04:17:32Z
(In reply to comment #3)
> The patch is incomplete, compile the test program with -O and it fails with an
> internal error. Haven't investigated why.

OK, I'm on it.
Comment #5 by clugdbug — 2009-07-06T04:54:46Z
This is a minimal case which ICEs with the patch. Order of terms in the expression is important.

void foo(real z) {}

void main (){
    real F = 1;
    foo( 1 + (F*3*2.1) );
}
Comment #6 by clugdbug — 2009-07-10T07:37:23Z
Revised patch against DMD2.031, which fixes the problem, and optimises a larger number of cases. This patch significantly improves performance of floating-point operations involving reals.
----
Index: cg87.c
===================================================================
--- cg87.c (revision 192)
+++ cg87.c (working copy)
@@ -914,7 +914,8 @@
     case X(OPadd, TYdouble, TYdouble):
     case X(OPadd, TYdouble_alias, TYdouble_alias):
     case X(OPadd, TYldouble, TYldouble):
-//    case X(OPadd, TYldouble, TYdouble):
+    case X(OPadd, TYldouble, TYdouble):
+    case X(OPadd, TYdouble, TYldouble):
     case X(OPadd, TYifloat, TYifloat):
     case X(OPadd, TYidouble, TYidouble):
     case X(OPadd, TYildouble, TYildouble):
@@ -925,8 +926,8 @@
     case X(OPmin, TYdouble, TYdouble):
     case X(OPmin, TYdouble_alias, TYdouble_alias):
     case X(OPmin, TYldouble, TYldouble):
-//    case X(OPmin, TYldouble, TYdouble):
-//    case X(OPmin, TYdouble, TYldouble):
+    case X(OPmin, TYldouble, TYdouble):
+    case X(OPmin, TYdouble, TYldouble):
     case X(OPmin, TYifloat, TYifloat):
     case X(OPmin, TYidouble, TYidouble):
     case X(OPmin, TYildouble, TYildouble):
@@ -937,7 +938,8 @@
     case X(OPmul, TYdouble, TYdouble):
     case X(OPmul, TYdouble_alias, TYdouble_alias):
     case X(OPmul, TYldouble, TYldouble):
-//    case X(OPmul, TYldouble, TYdouble):
+    case X(OPmul, TYldouble, TYdouble):
+    case X(OPmul, TYdouble, TYldouble):
     case X(OPmul, TYifloat, TYifloat):
     case X(OPmul, TYidouble, TYidouble):
     case X(OPmul, TYildouble, TYildouble):
@@ -954,8 +956,8 @@
     case X(OPdiv, TYdouble, TYdouble):
     case X(OPdiv, TYdouble_alias, TYdouble_alias):
     case X(OPdiv, TYldouble, TYldouble):
-//    case X(OPdiv, TYldouble, TYdouble):
-//    case X(OPdiv, TYdouble, TYldouble):
+    case X(OPdiv, TYldouble, TYdouble):
+    case X(OPdiv, TYdouble, TYldouble):
     case X(OPdiv, TYifloat, TYifloat):
     case X(OPdiv, TYidouble, TYidouble):
     case X(OPdiv, TYildouble, TYildouble):
@@ -1364,7 +1366,12 @@
 #undef X

     e2oper = e2->Eoper;
-    if (e1->Eoper == OPconst ||
+    // Move double-sized operand into the second position if there's a chance it will allow
+    // combining a load with an operation (DMD Bugzilla 2905)
+    if ( ((tybasic(e1->Ety)==TYdouble)
+          && ((e1->Eoper==OPvar) || (e1->Eoper==OPconst))
+          && (tybasic(e2->Ety) != TYdouble))
+        || (e1->Eoper == OPconst ) ||
         (e1->Eoper == OPvar &&
          ((e1->Ety & (mTYconst | mTYimmutable) && !OTleaf(e2oper)) ||
           (e2oper == OPd_f &&
Index: el.c
===================================================================
--- el.c (revision 192)
+++ el.c (working copy)
@@ -2052,7 +2052,6 @@
  * operations, since then it could change the type (eg, in the function call
  * printf("%La", 2.0L); the 2.0 must stay as a long double).
  */
-#if 0
 void shrinkLongDoubleConstantIfPossible(elem *e)
 {
     if (e->Eoper == OPconst && e->Ety == TYldouble)
@@ -2072,7 +2071,6 @@
         }
     }
 }
-#endif


 /*************************
@@ -2115,7 +2113,7 @@
              */
             break;
         }
-#if 0
+
         case OPdiv:
         case OPadd:
         case OPmin:
@@ -2125,7 +2123,6 @@
             if (tyreal(e->Ety))
                 shrinkLongDoubleConstantIfPossible(e->E2);
             // fall through...
-#endif
         default:
             if (OTbinary(op))
             {
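
The new clause added to the operand-ordering test in cg87.c can be read in isolation as a small predicate. The sketch below restates only that clause in standalone C. `Ty`, `Kind`, `Operand` and `prefer_swap` are hypothetical stand-ins for the back end's `elem`, `tybasic(e->Ety)`, `OPvar` and `OPconst`, and the remaining `||` alternatives of the real condition are deliberately left out.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the back end's types. */
typedef enum { TY_FLOAT, TY_DOUBLE, TY_LDOUBLE } Ty;
typedef enum { K_CONST, K_VAR, K_OTHER } Kind;

typedef struct {
    Ty   ty;    /* models tybasic(e->Ety) */
    Kind kind;  /* models e->Eoper being OPconst, OPvar, or anything else */
} Operand;

/* Models only the clause added by the patch: the first operand is a
   double-sized leaf (variable or constant) and the second operand is
   not a double, so moving the first operand into second position may
   let the load be combined with the arithmetic instruction. */
static bool prefer_swap(const Operand *e1, const Operand *e2)
{
    bool e1_is_double_leaf =
        e1->ty == TY_DOUBLE && (e1->kind == K_VAR || e1->kind == K_CONST);
    return e1_is_double_leaf && e2->ty != TY_DOUBLE;
}
```

For example, a double constant paired with a real subexpression satisfies the predicate, while two double variables do not, since neither position offers an advantage there.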
Comment #7 by bugzilla — 2009-10-06T02:16:07Z
Fixed dmd 1.048 and 2.033