Bug 360 – Compile-time floating-point calculations are sometimes inconsistent
Status
RESOLVED
Resolution
INVALID
Severity
normal
Priority
P2
Component
dmd
Product
D
Version
D1 (retired)
Platform
x86
OS
Windows
Creation time
2006-09-21T17:46:00Z
Last change time
2014-02-15T13:20:08Z
Assigned to
bugzilla
Creator
digitalmars-com
Comments
Comment #0 by digitalmars-com — 2006-09-21T17:46:47Z
The following code should print false before it exits.
import std.stdio;
void main() {
    const float STEP_SIZE = 0.2f;
    float j = 0.0f;
    while (j <= (1.0f / STEP_SIZE)) {
        j += 1.0f;
        writefln(j <= (1.0f / STEP_SIZE));
    }
}
This problem does not occur when:
1. the code is optimized
2. STEP_SIZE is not a const
3. STEP_SIZE is a real
Comment #1 by bugzilla — 2006-09-21T18:31:35Z
The example is mixing up 3 different precisions - 32, 64, and 80 bit. Each involves different rounding of unrepresentable numbers like 0.2. In this case, the 1.0f/STEP_SIZE is calculated at different precisions based on how things are compiled. Constant folding, for example, is done at compile time and done at max precision even if the variables involved are floats.
The D language allows this; the guiding principle is that algorithms should be designed not to fail if precision is increased.
Not a bug.
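As an illustration of that guiding principle, here is a minimal sketch (not from the original report) that restructures the loop so the folded value of 1.0f/STEP_SIZE cannot change the number of iterations: the loop is driven by an integer counter instead of a comparison between accumulated floats.
import std.stdio;
void main() {
    const float STEP_SIZE = 0.2f;
    // Round the reciprocal to the nearest integer once, then count with ints;
    // extra precision in the folded constant can no longer change the loop bound.
    int steps = cast(int)(1.0f / STEP_SIZE + 0.5f);
    float j = 0.0f;
    for (int i = 0; i <= steps; i++) {
        j += 1.0f;
        writefln(j);
    }
}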
Comment #2 by digitalmars-com — 2006-09-21T18:46:52Z
*** Bug 361 has been marked as a duplicate of this bug. ***
Comment #3 by digitalmars-com — 2006-09-21T18:51:23Z
Why are the expressions in the while and writefln statements calculated at different precisions?
Wouldn't the constant folding be done the same for both?
Comment #4 by bugzilla — 2006-09-21T20:17:49Z
while (j <= (1.0f/STEP_SIZE)) is at double precision,
writefln((j += 1.0f) <= (1.0f/STEP_SIZE)) is at real precision.
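For illustration, here is a small sketch (hypothetical variable names; it assumes writefln accepts printf-style format strings) that stores 1.0/0.2 at each of the three precisions. A non-const variable keeps the divisions out of compile-time constant folding.
import std.stdio;
void main() {
    float step = 0.2f;        // non-const, so the divisions are not constant folded
    float  f = 1.0f / step;   // result rounded to float when stored
    double d = 1.0  / step;   // result rounded to double when stored
    real   r = 1.0L / step;   // result kept at real (80-bit) precision
    writefln("%.20g", f);
    writefln("%.20g", d);
    writefln("%.20g", r);
}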
Comment #5 by clugdbug — 2006-09-22T02:25:54Z
(In reply to comment #4)
> while (j <= (1.0f/STEP_SIZE)) is at double precision,
> writefln((j += 1.0f) <= (1.0f/STEP_SIZE)) is at real precision.
I don't understand where the double precision comes from. Since all the values are floats, the only precisions that make sense are float and real.
Really, 0.2f should not be the same number as 0.2. When you put the 'f' suffix on, surely you're asking the compiler to truncate the precision. It can be expanded to real precision later without problems. Currently, there's no way to get a low-precision constant at compile time.
(In fact, you should be able to write real a = 0.2 - 0.2f; to get the truncation error).
Here's how I think it should work:
const float A = 0.2; // infinitely accurate 0.2, but type inference on A should return a float.
const float B = 0.2f; // a 32-bit approximation to 0.2
const real C = 0.2; // infinitely accurate 0.2
const real D = 0.2f; // a 32-bit approximation to 0.2, but type inference will give an 80-bit quantity.
Comment #6 by aldacron — 2006-09-22T05:30:18Z
Walter Bright wrote:
> [email protected] wrote:
>> ------- Comment #5 from [email protected] 2006-09-22 02:25 -------
>> (In reply to comment #4)
>>> while (j <= (1.0f/STEP_SIZE)) is at double precision,
>>> writefln((j += 1.0f) <= (1.0f/STEP_SIZE)) is at real precision.
>> I don't understand where the double precision comes from. Since all
>> the values
>> are floats, the only precisions that make sense are float and reals.
>
> The compiler is allowed to evaluate intermediate results at a greater
> precision than that of the operands.
>
>> Really, 0.2f should not be the same number as 0.2.
>
> 0.2 is not representable exactly, the only question is how much
> precision is there in the representation.
>
>> When you put the 'f' suffix
>> on, surely you're asking the compiler to truncate the precision.
>
> Not in D. The 'f' suffix only indicates the type.
And therefore, it only matters in implicit type deduction, and in
function overloading. As I discuss below, I'm not sure that it's
necessary even there.
In many cases, it's clearly a programmer error. For example in
real BAD = 0.2f;
where the f has absolutely no effect.
> The compiler may
> maintain internally as much precision as possible, for purposes of
> constant folding. Committing the actual precision of the result is done
> as late as possible.
>
>> It can be
>> expanded to real precision later without problems. Currently, there's
>> no way to
>> get a low-precision constant at compile time.
>
> You can by putting the constant into a static, non-const variable. Then
> it cannot be constant folded.
Actually, in this case you still want it to be constant folded.
>
>> (In fact, you should be able to write real a = 0.2 - 0.2f; to get the
>> truncation error).
>
> Not in D, where the compiler is allowed to evaluate using as much
> precision as possible for purposes of constant folding. The vast
> majority of calculations benefit from delaying rounding as long as
> possible, hence D's bias towards using as much precision as possible.
>
> The way to write robust floating point calculations in D is to ensure
> that increasing the precision of the calculations will not break the
> result.
>
> Early versions of Java insisted that rounding to precision of floating
> point intermediate results always happened. While this ensured
> consistency of results, it mostly resulted in consistently getting
> inferior and wrong answers.
I agree. But it seems that D is currently in a halfway house on this
issue. Somehow, 'double' is privileged, and I don't think it has any
right to be.
const XXX = 0.123456789123456789123456789f;
const YYY = 1 * XXX;
const ZZZ = 1.0 * XXX;
auto xxx = XXX;
auto yyy = YYY;
auto zzz = ZZZ;
// now xxx and yyy are floats, but zzz is a double.
Multiplying by '1.0' causes a float constant to be promoted to double.
real a = xxx;
real b = zzz;
real c = XXX;
Now a, b, and c all have different values.
Whereas the same operation at runtime causes it to be promoted to real.
Is there any reason why implicit type deduction on a floating point
constant doesn't always default to real? After all, you're saying "I
don't particularly care what type this is" -- why not default to maximum
accuracy?
Concrete example:
real a = sqrt(1.1);
This only gives a double precision result. You have to write
real a = sqrt(1.1L);
instead.
It's easier to do the wrong thing, than the right thing.
IMHO, unless you specifically take other steps, implicit type deduction
should always default to the maximum accuracy the machine could do.
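A runnable version of the sqrt example, as a sketch (it assumes the double and real overloads of std.math.sqrt that the example relies on):
import std.math;
import std.stdio;
void main() {
    real a = sqrt(1.1);        // 1.1 is a double literal: the double overload is chosen
    real b = sqrt(1.1L);       // 1.1L is a real literal: full real precision
    writefln("%.20g", a - b);  // non-zero if the two calls really differ in precision
}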
Comment #7 by smjg — 2006-09-22T15:06:07Z
(In reply to comment #5)
> const float A = 0.2; // infinitely accurate 0.2, but type inference on A
> should return a float.
>
> const float B = 0.2f; // a 32-bit approximation to 0.2
> const real C = 0.2; // infinitely accurate 0.2
> const real D = 0.2f; // a 32-bit approximation to 0.2, but type inference will
> give an 80-bit quantity.
I agree. Only I'm not sure about A. If you want it to be "infinitely accurate", then why would you declare it to be a float? It appears to me to be a means by which a float can hold more precision than it really can. On the other hand, D should definitely generate a 32-bit approximation to 0.2. By using the 'f' suffix, this is exactly what the programmer asked for.
Comment #8 by digitalmars-com — 2006-09-22T18:50:22Z
To summarize: ---
The compiler is allowed to evaluate intermediate results at a greater
precision than that of the operands. The literal type suffix (like 'f')
only indicates the type. The compiler may maintain internally as much
precision as possible, for purposes of constant folding. Committing the
actual precision of the result is done as late as possible.
For a low-precision constant, put the value into a static, non-const
variable. Since this is not really a constant, it cannot be constant
folded and is therefore not affected by a possible compile-time increase
in precision. However, if it is mixed with a higher precision at runtime,
an increase in precision will still occur.
The way to write robust floating point calculations in D is to ensure
that increasing the precision of the calculations will not break the
result.
--- end of summary
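A concrete sketch of the static, non-const workaround just summarized, applied to the code from comment #0 (per that report, the problem does not occur when STEP_SIZE is not a const):
import std.stdio;
void main() {
    static float STEP_SIZE = 0.2f;   // static and non-const: not constant folded
    float j = 0.0f;
    while (j <= (1.0f / STEP_SIZE)) {
        j += 1.0f;
        // per the original report, the final comparison now prints false
        writefln(j <= (1.0f / STEP_SIZE));
    }
}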
This is the explanation I was looking for. It was already clear that at
runtime D evaluates intermediate results at high precision, but the
compile-time behavior (namely using a const) is different from the
runtime behavior (using a static), and I don't think that is clearly
explained in the documentation.
Would you please add this information to the D documentation? Perhaps an
addition to the Floating Point page
(http://www.digitalmars.com/d/float.html). Of course, if any of the
above is incorrect, please change as necessary.
A follow-on question would be: How does one create a low-precision
constant that is ensured to actually stay constant? A static won't do,
since a static is really non-const, and a programming error could change
the value.
Thanks,
Bradley
Comment #9 by aldacron — 2006-09-23T09:30:21Z
Walter Bright wrote:
> Don Clugston wrote:
>> Walter Bright wrote:
>>> Not in D. The 'f' suffix only indicates the type.
>>
>> And therefore, it only matters in implicit type deduction, and in
>> function overloading. As I discuss below, I'm not sure that it's
>> necessary even there.
>> In many cases, it's clearly a programmer error. For example in
>> real BAD = 0.2f;
>> where the f has absolutely no effect.
>
> It may come about as a result of source code generation, though, so I'd
> be reluctant to make it an error.
>
>
>>> You can by putting the constant into a static, non-const variable.
>>> Then it cannot be constant folded.
>>
>> Actually, in this case you still want it to be constant folded.
>
> A static variable's value can change, so it can't be constant folded. To
> have it participate in constant folding, it needs to be declared as const.
But if it's const, then it's not float precision! I want both!
>> I agree. But it seems that D is currently in a halfway house on this
>> issue. Somehow, 'double' is privileged, and I don't think it has any
>> right to be.
>>
>> const XXX = 0.123456789123456789123456789f;
>> const YYY = 1 * XXX;
>> const ZZZ = 1.0 * XXX;
>>
>> auto xxx = XXX;
>> auto yyy = YYY;
>> auto zzz = ZZZ;
>>
>> // now xxx and yyy are floats, but zzz is a double.
>> Multiplying by '1.0' causes a float constant to be promoted to double.
>
> That's because 1.0 is a double. A double*float => double.
>
>> real a = xxx;
>> real b = zzz;
>> real c = XXX;
>>
>> Now a, b, and c all have different values.
>>
>> Whereas the same operation at runtime causes it to be promoted to real.
>>
>> Is there any reason why implicit type deduction on a floating point
>> constant doesn't always default to real? After all, you're saying "I
>> don't particularly care what type this is" -- why not default to
>> maximum accuracy?
>>
>> Concrete example:
>>
>> real a = sqrt(1.1);
>>
>> This only gives a double precision result. You have to write
>> real a = sqrt(1.1L);
>> instead.
>> It's easier to do the wrong thing, than the right thing.
>>
>> IMHO, unless you specifically take other steps, implicit type
>> deduction should always default to the maximum accuracy the machine
>> could do.
>
> It is a good idea, but it isn't that way for these reasons:
>
> 1) It's the way C, C++, and Fortran work. Changing the promotion rules
> would mean that, when translating solid, reliable libraries from those
> languages to D, one would have to be very, very careful.
That's very important. Still, those languages don't have implicit type
deduction. Also, none of those languages guarantee accuracy of
decimal->binary conversions, so there's always some error in decimal
constants. Incidentally, I recently read that GCC uses something like
160 bits for constant folding, so it's always going to give results that
are different to those on other compilers.
Why doesn't D behave like C with respect to 'f' suffixes?
(Ie, do the conversion, then truncate it to float precision).
Actually, I can't imagine many cases where you'd actually want a 'float'
constant instead of a 'real' one.
> 2) Float and double are expected to be implemented in hardware. Longer
> precisions are often not available. I wanted to make it practical for a
> D implementation on those machines to provide a software long precision
> floating point type, rather than just making real==double. Such a type
> would be very slow compared with double.
Interesting. I thought that 'real' was supposed to be the highest
accuracy fast floating point type, and would therefore be either 64, 80,
or 128 bits. So it could also be a double-double?
For me, the huge benefit of the 'real' type is that it guarantees that
optimisation won't change the results. In C, using doubles, it's quite
unpredictable when a temporary will be 80 bits, and when it will be 64
bits. In D, if you stick to real, you're guaranteed that nothing weird
will happen. I'd hate to lose that.
> 3) Real, even in hardware, is significantly slower than double. Doing
> constant folding at max precision at compile time won't affect runtime
> performance, so it is 'free'.
In this case, the initial issue remains: in order to write code which
maintains accuracy regardless of machine precision, it is sometimes
necessary to specify the precision that should be used for constants.
The original code was an example where weird things happened because
that wasn't respected.
Comment #10 by aldacron — 2006-09-24T13:51:36Z
Walter Bright wrote:
> Don Clugston wrote:
>> Walter Bright wrote:
>>> A static variable's value can change, so it can't be constant folded.
>>> To have it participate in constant folding, it needs to be declared
>>> as const.
>> But if it's const, then it's not float precision! I want both!
>
> You can always use hex float constants. I know they're not pretty, but
> the point of them is to be able to specify exact floating point bit
> patterns. There are no rounding errors with them.
>>> 1) It's the way C, C++, and Fortran work. Changing the promotion
>>> rules would mean that, when translating solid, reliable libraries
>>> from those languages to D, one would have to be very, very careful.
>>
>> That's very important. Still, those languages don't have implicit type
>> deduction. Also, none of those languages guarantee accuracy of
>> decimal->binary conversions, so there's always some error in decimal
>> constants. Incidentally, I recently read that GCC uses something like
>> 160 bits for constant folding, so it's always going to give results
>> that are different to those on other compilers.
>>
>> Why doesn't D behave like C with respect to 'f' suffixes?
>> (Ie, do the conversion, then truncate it to float precision).
>> Actually, I can't imagine many cases where you'd actually want a
>> 'float' constant instead of a 'real' one.
>
> A float constant would be desirable to keep the calculation all floats
> for speed reasons. I can't think of many reasons one would want reduced
> precision.
Me, too. In fact I've seen a lot of code where ignorant programmers were
adding 'f' to the end of every floating point constant. It could be that
the number of cases where you actually care about the precision is so
small that hex constants are adequate.
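For the cases where the exact value does matter, here is a minimal sketch of the hex-float approach Walter mentions (the constant name is hypothetical; 0x1.99999Ap-3 is the single-precision rounding of 0.2). A hex literal names one exact binary value, so there is no decimal-to-binary rounding step, and widening it to double or real later does not change the value it denotes.
import std.stdio;
void main() {
    const float FIFTH = 0x1.99999Ap-3f;   // exactly the float nearest to 0.2
    writefln("%.20g", cast(real)FIFTH);   // shows the stored value (about 0.200000003)
}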
>>> 2) Float and double are expected to be implemented in hardware.
>>> Longer precisions are often not available. I wanted to make it
>>> practical for a D implementation on those machines to provide a
>>> software long precision floating point type, rather than just making
>>> real==double. Such a type would be very slow compared with double.
>>
>> Interesting. I thought that 'real' was supposed to be the highest
>> accuracy fast floating point type, and would therefore be either 64,
>> 80, or 128 bits. So it could also be a double-double?
>> For me, the huge benefit of the 'real' type is that it guarantees that
>> optimisation won't change the results. In C, using doubles, it's quite
>> unpredictable when a temporary will be 80 bits, and when it will be 64
>> bits. In D, if you stick to real, you're guaranteed that nothing weird
>> will happen. I'd hate to lose that.
>
> I don't see how one would lose that if real were done in software.
>
>>> 3) Real, even in hardware, is significantly slower than double. Doing
>>> constant folding at max precision at compile time won't affect
>>> runtime performance, so it is 'free'.
>>
>> In this case, the initial issue remains: in order to write code which
>> maintains accuracy regardless of machine precision, it is sometimes
>> necessary to specify the precision that should be used for constants.
>> The original code was an example where weird things happened because
>> that wasn't respected.
>
> Weird things always happen with floating point. It's just a matter of
> where one chooses the seams to show (you pointed out where seams show in
> C with temporary precision). I've seen a lot of cases where people were
> surprised that 0.2f (or similar) was even rounded off, and got caught by
> the roundoff error.
>
> I used to work in mechanical engineering where a lot of numerical
> calculations were done. Accumulating roundoff errors were a huge
> problem, and a lot (most?) engineers didn't understand it. They were
> using calculators for long chains of calculation, and rounding off after
> each step instead of carrying the full calculator precision. They were
> mystified by getting answers at the end that were way off.
>
> It's my experience with that (and also in college where we were taught
> to never round off anything but the final answer) that led to the D
> design decision to internally carry around consts in full precision,
> regardless of type.
>
> Deliberately reduced precision is something that only experts would
> want, and only for special cases. So it's reasonable that that would be
> harder to do (i.e. using hex float constants).
OK, you've convinced me. It needs to be better documented, though.
> P.S. I also did some digital electronic design work long ago. The
> cardinal rule there was that since TTL devices got faster all the time,
> and old slower TTL parts became unavailable, one designed so that
> swapping in a faster chip would not cause the failure of the system.
> Hence the rule that increasing the precision of a calculation should not
> cause the program to fail <g>.
I think it would be useful to specify more precisely what happens in
constant folding. Eg, mention that all constant folding will be done in
IEEE round-to-nearest, ties-to-even.
In the longer term, I've been wondering if the precision for real
constants even needs to be the same as for the 'real' type. I can see
some distinct benefits that would come if the precision of literals was
defined to always be IEEE quadruple precision. Of course they'd always
be rounded to 64 or 80-bit reals when the time came for them to actually
be used.
Looking at the spec for the forthcoming IEEE 754R standard, and the
state of SSE3 on AMD-64, it seems that Intel/AMD could very easily add a
quadruple precision type (they already have sixteen 128-bit registers, two
64-bit mantissa units, and the quadruple exponent is the same as for x87.
So I don't think it would require much silicon, and it would mean they
could emulate the x87 stuff entirely on SSE). Some forward-compatibility
things to consider in DMD 2.0; ignore for now.
Comment #11 by aldacron — 2006-09-25T01:30:16Z
Walter Bright wrote:
> Don Clugston wrote:
>> Walter Bright wrote:
>> OK, you've convinced me. It needs to be better documented, though.
>
> I agree with you and Bradley Smith on that.
>
>>> P.S. I also did some digital electronic design work long ago. The
>>> cardinal rule there was that since TTL devices got faster all the
>>> time, and old slower TTL parts became unavailable, one designed so
>>> that swapping in a faster chip would not cause the failure of the
>>> system. Hence the rule that increasing the precision of a calculation
>>> should not cause the program to fail <g>.
>>
>> I think it would be useful to specify more precisely what happens in
>> constant folding. Eg, mention that all constant folding will be done
>> in IEEE round-to-nearest, ties-to-even.
>
> Yes.
>
>> In the longer term, I've been wondering if the precision for real
>> constants even needs to be the same as for the 'real' type. I can see
>> some distinct benefits that would come if the precision of literals
>> was defined to always be IEEE quadruple precision. Of course they'd
>> always be rounded to 64 or 80-bit reals when the time came for them to
>> actually be used.
>
> I agree.
One consequence of that would be in the name mangling for floating point
constants in templates. Currently it's 20 hex characters, which only
makes sense for a system with 80-bit reals; it might be better to make it
32 hex characters, even if the extra 12 are all '0'.
>
>> Looking at the spec for the forthcoming IEEE 754R standard, and the
>> state of SSE3 on AMD-64, it seems that Intel/AMD could very easily add
>> a quadruple precision type (they already have 16 128 bit registers,
>> two 64 bit mantissa units, and the quadruple exponent is the same as
>> for x87. So I don't think it would require much silicon, and it would
>> mean they could emulate the x87 stuff entirely on SSE). Some
>> forward-compatibility things to consider in DMD 2.0; ignore for now.
>
> I was disappointed in the AMD-64 because it didn't do 128 bit floats, in
> fact, it relegated 80 bit floats to a backwater in the instruction set.
> Few computer people seem to understand the value in high precision
> floating point.
Intel seems to be better than AMD in this regard. Intel added an 82 bit
floating point type to the Itanium so that it could do 80-bit hypot()
without overflow (in fact, Itanium seems to have by far the best
floating point support that I've seen); AMD's 3DNow! didn't even support
subnormals, infinity, or NaN.
Comment #12 by sean — 2006-09-25T11:05:29Z
Don Clugston wrote:
> Walter Bright wrote:
>>
>> I was disappointed in the AMD-64 because it didn't do 128 bit floats,
>> in fact, it relegated 80 bit floats to a backwater in the instruction
>> set. Few computer people seem to understand the value in high
>> precision floating point.
>
> Intel seems to be better than AMD in this regard. Intel added an 82 bit
> floating point type to the Itanium so that it could do 80-bit hypot()
> without overflow (in fact, Itanium seems to have by far the best
> floating point support that I've seen); AMD's 3DNow! didn't even support
> subnormals, infinity, or NaN.
I think AMD simply set its sights on the game industry as the
battleground, which seems to be supported by the presence of forums on
LAN parties and system modding (http://forums.amd.com/). This stands in
contrast with Intel, which has an entire set of forums for software
development (http://softwareforums.intel.com/). I decided to ask
whether AMD has another location for software development discussion. I
have no idea whether science-minded software companies or developers
communicate to AMD that they'd like improved floating-point support, but
a bit more couldn't hurt.
Sean