Bug 4077 – Bugs caused by bitwise operator precedence

Status
RESOLVED
Resolution
FIXED
Severity
enhancement
Priority
P2
Component
dmd
Product
D
Version
D2
Platform
All
OS
All
Creation time
2010-04-10T16:06:00Z
Last change time
2010-07-24T16:25:03Z
Keywords
patch
Assigned to
nobody
Creator
bearophile_hugs

Attachments

ID | Filename | Summary | Content-Type | Size
669 | patch4077.patch | Patch against svn 552, D2 | text/plain | 4228

Comments

Comment #0 by bearophile_hugs — 2010-04-10T16:06:32Z
This isn't a bug report, and it's not exactly an enhancement request yet. It's a report that a problem exists, but I don't know a solution yet. I think it's useful to have this in Bugzilla, to keep in mind that this problem exists in D.

This report was prompted by a bug made by Adam D. Ruppe, but similar bugs have happened in my code too in the past:
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=108772
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=108781
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=108783

The precedence of bitwise operators is low, which makes them error-prone. It's a part of C/C++/D that causes frequent bugs in programs (the workaround is to add extra parentheses when you use bitwise operators). At the moment I don't see a simple way to remove this source of bugs from the D2 language.

This class of bugs is so common that GCC developers have felt the need to improve the situation. When you switch on the warnings, GCC warns you about a few possible similar errors, suggesting that you add parentheses to remove some ambiguity. A small example in C:

    #include "stdio.h"
    #include "stdlib.h"

    int main() {
        int a = atoi("10");
        int b = atoi("20");
        int c = atoi("30");

        printf("%u\n", a|b <= c);
        return 0;
    }

If you compile it with GCC 4.4.1:

    gcc -Wall test.c -o test
    test.c: In function 'main':
    test.c:9: warning: suggest parentheses around comparison in operand of '|'

You always use -Wall (and other warnings) when you write C code, so here gcc is able to catch such bugs. This class of warnings can be added to the D compiler too.
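To make the effect of the precedence rule concrete, here is a minimal C sketch using the same values as the example above. The function names are invented for illustration and are not part of the report; the point is only that `a|b <= c` is parsed as `a | (b <= c)`.

```c
#include <assert.h>

/* The expression exactly as written in the report: '<=' binds tighter
   than '|', so this computes a | (b <= c). */
int as_written(int a, int b, int c) {
    return a | b <= c;
}

/* What was probably intended: OR the values first, then compare. */
int as_intended(int a, int b, int c) {
    return (a | b) <= c;
}

/* With the report's values: 20 <= 30 is 1, and 10 | 1 is 11,
   whereas 10 | 20 is 30, and 30 <= 30 is 1. */
void check_report_values(void) {
    assert(as_written(10, 20, 30) == 11);
    assert(as_intended(10, 20, 30) == 1);
}
```

Both forms compile without complaint unless warnings are enabled, which is why the mistake tends to go unnoticed.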
Comment #1 by braddr — 2010-04-10T17:20:27Z
Care to quantify 'frequent'? Just because something can cause a bug doesn't make it a disaster. I can't recall ever making a bit wise precedence error myself. Of course, that too isn't proof of anything.
Comment #2 by bearophile_hugs — 2010-04-10T19:54:25Z
>Care to quantify 'frequent'?<

I'd like to, but finding hard statistical data about bugs is hard. Often you just have to use your programming experience and memory of past mistakes. I have programming experience, and for the last few years I have been writing down all my bugs. You can ask the GCC developers what kind of statistical data they used when they decided to recently introduce that warning into gcc. I think they have no reliable statistical data. But they are usually smart people, so you can't just ignore their example.

>Just because something can cause a bug doesn't make it a disaster.<

Just because something can't cause disasters but just bugs doesn't justify ignoring it. And sometimes silent bugs like this one actually cause disasters.

>I can't recall ever making a bit wise precedence error myself. Of course, that too isn't proof of anything.<

I have made several similar bugs. Later I took up the habit of always putting parentheses around shift and bitwise ops when they are combined with other things. That post on the D newsgroup shows that Adam Ruppe too has made this bug once.

See -Wparentheses here:
http://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html

It says several interesting things. It also says:

>Warn if parentheses are omitted in certain contexts, such as when there is an assignment in a context where a truth value is expected, or when operators are nested whose precedence people often get confused about.<

They say "often get confused about". That warning switch also warns against probably wrong code like this (this is another common source of bugs that's missing in Python):

    if (a)
      if (b)
        foo ();
    else
      bar ();
Comment #3 by destructionator — 2010-04-10T20:32:33Z
Yeah, when it bit me today, I wasn't thinking about it at all. The code looked like this:

    assert( a|b <= max);

I meant (a|b) <= max, but the code ended up being a|(b <= max), which was fairly useless.

I don't think bitwise being lower than comparison is useful, but we have the difficulty here of maintaining C compatibility. The best fix we can get, if one is really needed*, is to call it an error to have a bitwise operation next to anything that trumps it, unless parentheses are present.

The error brings instant attention to the trouble spot, and adding explicit parens is no big trouble - I, and surely many others, usually do this by habit anyway - so I'd be happy with this solution.

* (this is the only time I can recall being bitten by this in all my years of writing C and friends, so it really isn't a big deal to me)
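As a rough illustration of why the unparenthesized assert is "fairly useless", the following C sketch compares the two readings. The function names and the sample values are hypothetical; only the shape of the expression comes from the comment above.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the assert from the comment: parsed as a | (b <= max). */
bool check_as_written(unsigned a, unsigned b, unsigned max) {
    return a | b <= max;
}

/* The check that was meant: the combined bits must not exceed max. */
bool check_as_intended(unsigned a, unsigned b, unsigned max) {
    return (a | b) <= max;
}

void demo_range_check(void) {
    /* 0x10 | 0x0F is 0x1F, which exceeds the limit 0x10 ... */
    assert(!check_as_intended(0x10, 0x0F, 0x10));
    /* ... yet the form as written is true, because a is nonzero. */
    assert(check_as_written(0x10, 0x0F, 0x10));
    /* It stays true even when b alone is already over the limit. */
    assert(check_as_written(0x10, 0xFF, 0x10));
}
```

In other words, whenever `a` is nonzero the check as written can never fail, so it silently stops guarding anything.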
Comment #4 by bearophile_hugs — 2010-04-11T04:34:32Z
Thank you for your comments. Requiring parentheses is one of the few solutions I can see.

>* (this is the only time I can recall being bitten by this in all my years
> of writing C and friends, so it really isn't a big deal to me)

My experience shows that it's easy to forget bugs, because they are seen as something negative, so I suggest you write them down :-)
Comment #5 by smjg — 2010-04-11T10:15:59Z
(In reply to comment #3)
> Yeah, when it bit me today, I wasn't thinking about it at all. The
> code looked like this:
>
> assert( a|b <= max);
>
> I meant (a|b) <= max, but the code ended up being a|(b <= max),
> which was fairly useless.
>
> I don't think bitwise being lower than comparison is useful, but we
> have the difficulty here of maintaining C compatibility. The best
> fix we can get, if one is really needed*, is to call it an error to
> have a bitwise operation next to anything that trumps it, unless
> parentheses are present.

The precedence of bitwise operators is indeed counter-intuitive. Presumably there's a reason C defined them this way.

It seems that, in most programming languages, operator precedence is a total ordering. The way to avoid problems like this is to change it into a partial ordering.

At the moment, the precedence graph from && down to shifts looks like this:

    shift
      .
     cmp
      .
      &
      .
      ^
      .
      |
      .
     &&

These changes in the grammar:

    AndAndExpression:
        OrExpression
        AndAndExpression && OrExpression
        CmpExpression
        AndAndExpression && CmpExpression

    AndExpression:
        ShiftExpression & ShiftExpression
        AndExpression & ShiftExpression

would change it to

     shift
     .   .
    &     .
    .     .
    ^    cmp
    .     .
    |     .
     .   .
      &&

By changing all occurrences of ShiftExpression in the definition of AndExpression to something else, you can make the path divide higher up the precedence chain.

This way, the bitwise operators would retain their precedence relative to each other, but any attempt to mix them with comparison operators will cause an error.
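The partial ordering can be modelled, very roughly, as reachability in the proposed graph: one operator class may appear as an unparenthesized operand of another only if there is a downward path between them. The C sketch below is an assumption-laden model of the graph alone (the enum, table and function names are invented here); it is not how the dmd parser or the later patch is written, and it does not model associativity within a single class.

```c
#include <assert.h>
#include <stdbool.h>

/* Operator classes from the graph, highest precedence first. */
enum op_class { OP_SHIFT, OP_CMP, OP_AND, OP_XOR, OP_OR, OP_ANDAND, OP_COUNT };

/* Direct edges of the proposed graph: edge[x][y] is true when class x
   sits directly above class y. */
static const bool edge[OP_COUNT][OP_COUNT] = {
    [OP_SHIFT] = { [OP_CMP] = true, [OP_AND] = true },
    [OP_AND]   = { [OP_XOR] = true },
    [OP_XOR]   = { [OP_OR] = true },
    [OP_OR]    = { [OP_ANDAND] = true },
    [OP_CMP]   = { [OP_ANDAND] = true },
};

/* True when 'inner' may appear as an unparenthesized operand of 'outer',
   i.e. when the graph has a downward path from inner to outer.
   The graph is acyclic, so the recursion terminates. */
bool binds_tighter(enum op_class inner, enum op_class outer) {
    if (edge[inner][outer])
        return true;
    for (int mid = 0; mid < OP_COUNT; mid++)
        if (edge[inner][mid] && binds_tighter((enum op_class)mid, outer))
            return true;
    return false;
}

/* Two classes that are unordered in both directions cannot be mixed
   without parentheses; under the proposal that is a compile error. */
bool needs_parentheses(enum op_class x, enum op_class y) {
    return x != y && !binds_tighter(x, y) && !binds_tighter(y, x);
}
```

Under this model `needs_parentheses(OP_OR, OP_CMP)` is true, which corresponds to rejecting `a | b <= c`, while `needs_parentheses(OP_AND, OP_OR)` is false, so `a & b | c` keeps its usual meaning.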
Comment #6 by schveiguy — 2010-04-12T06:09:47Z
(In reply to comment #1)
> Care to quantify 'frequent'? Just because something can cause a bug doesn't
> make it a disaster. I can't recall ever making a bit wise precedence error
> myself. Of course, that too isn't proof of anything.

I run into this all the time. It makes me absolutely paranoid about bitops to where I sometimes write things like:

    if((a | b))

or

    a = (b | c);

Before I realize the extra parens don't do much :)

If you write routines that parse protocols or use bitfield flags, you will run into this bug. I always wondered why bitwise operators were lower in precedence than comparison, but you just learn to accept it (and judiciously use parentheses around such things).

If D could make strides to help solve this problem, I think it would be great. Probably not earth shattering, but just another feather in the cap.

When someone writes something like:

    if(a | b == c)

I'd say it's always an error. Not even almost always, but always. If D could flag this as such, it would be a good thing.

I strongly feel, however, that bitwise ops should simply have a higher precedence than comparison, since the current behavior is always an error. You will not find any C code that looks like this on purpose. I don't see any reason to keep the current interpretation regardless.
Comment #7 by destructionator — 2010-04-12T06:42:57Z
(In reply to comment #4)
> My experience shows that it's easy to forget bugs, because they are seen as
> something negative, so I suggest you to write them down :-)

Aye, probably true. I think another reason is that I usually put the parentheses around it all the time - probably one of those things I started doing a long time ago after being hit by the bug, then over the years did out of habit without remembering specifically why I started in the first place.

Requiring parentheses or changing the precedence would be nice in any case. There's no cost I can see (outside of implementing it in the compiler, of course), and even a small benefit is better than none.
Comment #8 by dfj1esp02 — 2010-04-14T10:28:27Z
An academic example of use is to NOT short-circuit evaluation of operands.
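A small C sketch of that use, with a hypothetical `probe` helper that counts its calls: because `|` binds more loosely than `==`, two comparisons can be joined with `|` without parentheses, and both sides are always evaluated, unlike with `||`.

```c
#include <assert.h>

static int calls = 0;

/* Hypothetical helper with a visible side effect: it counts its calls. */
static int probe(int value) {
    calls++;
    return value;
}

/* '||' stops after the first true comparison, so only one call happens. */
int calls_with_logical_or(void) {
    calls = 0;
    int either = (probe(1) == 1 || probe(2) == 2);
    (void)either;
    return calls;
}

/* Parsed as (probe(1) == 1) | (probe(2) == 2): both operands are always
   evaluated, in an unspecified order. */
int calls_with_bitwise_or(void) {
    calls = 0;
    int both = (probe(1) == 1 | probe(2) == 2);
    (void)both;
    return calls;
}
```

With explicit parentheses around each comparison the same effect is available, so this use does not by itself depend on the low precedence.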
Comment #9 by clugdbug — 2010-05-07T02:01:48Z
(In reply to comment #5)
> The precedence of bitwise operators is indeed counter-intuitive. Presumably
> there's a reason C defined them this way.

Yes, there is -- backwards compatibility with the B language!!!

Dennis Ritchie says (http://cm.bell-labs.com/cm/cs/who/dmr/chist.html):

--------------------
At the suggestion of Alan Snyder, I introduced the && and || operators to make the mechanism [[short circuit evaluation]] more explicit. Their tardy introduction explains an infelicity of C's precedence rules. In B one writes

    if (a==b & c) ...

to check whether a equals b and c is non-zero; in such a conditional expression it is better that & have lower precedence than ==. In converting from B to C, one wants to replace & by && in such a statement; to make the conversion less painful, we decided to keep the precedence of the & operator the same relative to ==, and merely split the precedence of && slightly from &. Today, it seems that it would have been preferable to move the relative precedences of & and ==, and thereby simplify a common C idiom: to test a masked value against another value, one must write

    if ((a&mask) == b) ...

where the inner parentheses are required but easily forgotten.
--------------------

So C did it for an unbelievably silly reason (there was hardly any B code in existence). Note that Ritchie says it is "easily forgotten". We should definitely fix this ridiculous precedence. (IMHO it was very sloppy that ANSI C didn't make (a&b == c) an error).
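The masked-value idiom from the quote can be sketched in C as follows. The function names and the sample values are made up for illustration; the two expressions are the idiom with and without the inner parentheses.

```c
#include <assert.h>

/* The idiom without the inner parentheses: '==' binds tighter than '&',
   so this computes value & (mask == expected). */
unsigned masked_test_as_written(unsigned value, unsigned mask, unsigned expected) {
    return value & mask == expected;
}

/* The idiom as Ritchie writes it: mask first, then compare. */
unsigned masked_test_as_intended(unsigned value, unsigned mask, unsigned expected) {
    return (value & mask) == expected;
}

void check_masked_test(void) {
    /* The low nibble of 0x35 is 0x05, so the intended test succeeds ... */
    assert(masked_test_as_intended(0x35, 0x0F, 0x05) == 1);
    /* ... but 0x0F == 0x05 is 0, and 0x35 & 0 is 0. */
    assert(masked_test_as_written(0x35, 0x0F, 0x05) == 0);
}
```

Here the forgotten parentheses turn a test that should succeed into one that fails, with no diagnostic unless warnings such as GCC's -Wparentheses are enabled.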
Comment #10 by clugdbug — 2010-06-21T14:48:40Z
Created attachment 669
Patch against svn 552, D2

This patch implements Stewart Gordon's proposal. Quite simple, since it is just the parser. I'm not sure if there's a better way of doing it, but it still only affects a small number of lines. Most of this patch involves creating nice error messages when ambiguities occur.

I have NOT dealt with the code in the 'global.params.Dversion == 1' block inside parseAndExp(). I don't know if it's current; in any case it's completely different to the code in D1. Possibly this needs to change as well, for code inside version(D1) blocks.
Comment #11 by clugdbug — 2010-06-21T14:49:58Z
Note that with this patch in place, I found 6 bugs in Phobos and 1 in druntime.
Comment #12 by bugzilla — 2010-07-24T10:46:47Z
Comment #13 by bearophile_hugs — 2010-07-24T15:30:48Z
Thank you very much to Stewart Gordon, Don and Walter. One more down.
Comment #14 by leandro.lucarella — 2010-07-24T16:25:03Z
Don't forget to update the specs! :)