Bug 1596 – op*Assign should return void

Status
RESOLVED
Resolution
WONTFIX
Severity
enhancement
Priority
P2
Component
dmd
Product
D
Version
D2
Platform
All
OS
All
Creation time
2007-10-19T17:07:21Z
Last change time
2019-10-24T08:27:14Z
Assigned to
No Owner
Creator
Andrei Alexandrescu

Comments

Comment #0 by andrei — 2007-10-19T17:07:21Z
D's assignment operators opAddAssign, opMulAssign etc. have taken part of C++'s behavior in the area. In order to behave similarly to a primitive type, a user-defined operator typically returns *this as an rvalue (in C++ they'd usually return an lvalue). There are a few problems with this approach: 1. Extra busywork done for no good reason. The code: a += b; translates to: typeof(a) __unused = void; a.opAddAssign(b, __unused); So there is one extra memcpy performed for no reason (the copy is done on the caller site, which has no idea whether the result of the operation will be used or not). This problem will be exacerbated when copy construction semantics will be introduced. 2. Extra work for the programmer, and relying on convention instead of language rules. Experience with C++ shows that user-defined implementations of assignment operators always return *this, and that there is absolutely no reason to return anything else. Fix: The rule belongs to the language. Require all op*Assign to return void. If a value is required on the caller side, pass the left-hand side of the operation.
Comment #1 by aldacron — 2007-10-23T10:35:37Z
[email protected] wrote: > http://d.puremagic.com/issues/show_bug.cgi?id=1596 > > Summary: op*Assign should return void > Product: D > Version: unspecified > Platform: All > OS/Version: All > Status: NEW > Severity: enhancement > Priority: P2 > Component: DMD > AssignedTo: [email protected] > ReportedBy: [email protected] > > > D's assignment operators opAddAssign, opMulAssign etc. have taken part of C++'s > behavior in the area. In order to behave similarly to a primitive type, a > user-defined operator typically returns *this as an rvalue (in C++ they'd > usually return an lvalue). > > There are a few problems with this approach: > > 1. Extra busywork done for no good reason. The code: > > a += b; > > translates to: > > typeof(a) __unused = void; > a.opAddAssign(b, __unused); > > So there is one extra memcpy performed for no reason (the copy is done on the > caller site, which has no idea whether the result of the operation will be used > or not). This problem will be exacerbated when copy construction semantics will > be introduced. > > 2. Extra work for the programmer, and relying on convention instead of language > rules. > > Experience with C++ shows that user-defined implementations of assignment > operators always return *this, and that there is absolutely no reason to return > anything else. > > Fix: > > The rule belongs to the language. Require all op*Assign to return void. If a > value is required on the caller side, pass the left-hand side of the operation. I think this is right. But it's worth mentioning that this would break some of my code, since I'm using the return type of opXXAssign() operations in expression templates. However, since D expression templates don't work for comparison operators (because opCmp() returns an int), changing this is likely to force us to a better expression template solution.
Comment #2 by yebblies — 2011-07-04T21:53:37Z
Is this still valid? I'd imaging opXXXAssign would return ref typeof(this) these days.
Comment #3 by issues.dlang — 2011-07-04T23:36:02Z
It's valid in that Andrei seems to be requesting that op*Assign be _required_ to return ref typeof(this), whereas currently, you can make it return whatever you want.
Comment #4 by yebblies — 2011-07-05T00:09:50Z
(In reply to comment #3) > It's valid in that Andrei seems to be requesting that op*Assign be _required_ > to return ref typeof(this), whereas currently, you can make it return whatever > you want. Well, reason 1 doesn't apply to returning ref typeof(this), and one valid use I can think of for returning something else would be expression templates. As it's four years old, I'd close this if it was my report, but I'll leave it to Andrei to make a decision on this.
Comment #5 by schveiguy — 2011-07-05T05:23:14Z
D should not be forcing behavior on normal functions, which operators are. They are special in that if defined in a specific way, they will be used when using operators, but other than that, they are ordinary functions. If you want to force behavior, make the operators keywords (like C++). Otherwise, you have special rules for ordinary symbols. Lowering is supposed to work by simply rewriting code, not by requiring certain signatures. This relates to Andrei's later bug regarding opEquals - the compiler currently requires a certain signature for opEquals, and it simply gets in the way of writing efficient or usable code. Besides, I think Andrei's original point is that it was impossible to do the most efficient thing (return the lvalue of this), which is no longer the case. So this was an attempt to solve that problem using a language rule instead of adding ref returns. I think this bug is very obsolete.
Comment #6 by clugdbug — 2011-07-05T08:38:38Z
(In reply to comment #5) > D should not be forcing behavior on normal functions, which operators are. > They are special in that if defined in a specific way, they will be used when > using operators, but other than that, they are ordinary functions. If you want > to force behavior, make the operators keywords (like C++). Otherwise, you have > special rules for ordinary symbols. Lowering is supposed to work by simply > rewriting code, not by requiring certain signatures. ??? The existing language doesn't obey these rules, which you've apparently just made up. > This relates to Andrei's later bug regarding opEquals - the compiler currently > requires a certain signature for opEquals, and it simply gets in the way of > writing efficient or usable code. Actually the problem with opEquals is that there is NO option which works properly. Giving you more bad options wouldn't help. > Besides, I think Andrei's original point is that it was impossible to do the > most efficient thing (return the lvalue of this), which is no longer the case. > So this was an attempt to solve that problem using a language rule instead of > adding ref returns. No. The point is this: There is only ONE correct choice for the return value: it must be some form of 'return this'. The compiler is better equipped to work out the most efficient way to do it, than the programmer is. Giving freedom to choose the return value does only two things: (1) it gives you an opportunity to make a mistake. (2) it adds extra noise in the code. The one legitimate use of a return value is in expression templates, but as already mentioned, they don't work with opCmp.
Comment #7 by schveiguy — 2011-07-05T09:02:07Z
(In reply to comment #6) > (In reply to comment #5) > > D should not be forcing behavior on normal functions, which operators are. > > They are special in that if defined in a specific way, they will be used when > > using operators, but other than that, they are ordinary functions. If you want > > to force behavior, make the operators keywords (like C++). Otherwise, you have > > special rules for ordinary symbols. Lowering is supposed to work by simply > > rewriting code, not by requiring certain signatures. > > ??? > The existing language doesn't obey these rules, which you've apparently just > made up. What rules? There should be no "rules" of what an ordinary function should be or should not return. I believe the current compiler works this way, at least in terms of opAddAssign. When the compiler sees a += b, it should rewrite as a.opAddAssign(b) (or whatever the equivalent template should be). Why does it matter what opAddAssign returns? If it returns a C, who cares? There is nothing special about opAddAssign except that it's the vehicle to implement += via lowering. > > This relates to Andrei's later bug regarding opEquals - the compiler currently > > requires a certain signature for opEquals, and it simply gets in the way of > > writing efficient or usable code. > > Actually the problem with opEquals is that there is NO option which works > properly. Giving you more bad options wouldn't help. Of course there is: struct T { bool opEquals(T other) {...} } The issue with opEquals (and this is different from opAddAssign) is the compiler will use opEquals to populate the xequals function pointer in the TypeInfo, if it is the right signature. The error in the current compiler is that opEquals does not even compile unless you use a useless signature. This "feature" was added to fix something else, but I think it was an error to add it. I should be able to do: bool opEquals(int other) if I so wish, and it just won't populate the xequals function. It's a normal symbol, and should be a normal function. opEquals is not a special token, it's a normal symbol. > > Besides, I think Andrei's original point is that it was impossible to do the > > most efficient thing (return the lvalue of this), which is no longer the case. > > So this was an attempt to solve that problem using a language rule instead of > > adding ref returns. > > No. The point is this: There is only ONE correct choice for the return value: > it must be some form of 'return this'. The compiler is better equipped to work > out the most efficient way to do it, than the programmer is. opAddAssign probably should return this. But it could also return void. Or it could return int. It shouldn't matter to the compiler. > Giving freedom to choose the return value does only two things: > (1) it gives you an opportunity to make a mistake. > (2) it adds extra noise in the code. (3) it allows some future use we do not foresee, people can be inventive. (4) it makes the compiler simpler and more consistent. BTW, what is the extra noise in the code? The specification of the return value? I find this a weak argument. What if you want to return by ref or by value? Don't get me wrong, I think in most cases, it should return this by reference. But there could be cases where returning something else would be advantageous. For example, what if you have a struct parameterized on constancy, and you want to return a different const form? There are some crazy amazing things I've seen in phobos and other code that nobody thought of before, and some of it is based on unintuitive uses of operators.
Comment #8 by clugdbug — 2011-07-05T13:22:33Z
(In reply to comment #7) > (In reply to comment #6) > > (In reply to comment #5) > > > D should not be forcing behavior on normal functions, which operators are. > > > They are special in that if defined in a specific way, they will be used when > > > using operators, but other than that, they are ordinary functions. If you want > > > to force behavior, make the operators keywords (like C++). Otherwise, you have > > > special rules for ordinary symbols. Lowering is supposed to work by simply > > > rewriting code, not by requiring certain signatures. > > > > ??? > > The existing language doesn't obey these rules, which you've apparently just > > made up. > > What rules? There should be no "rules" of what an ordinary function should be > or should not return. I believe the current compiler works this way, at least > in terms of opAddAssign. The rules you made up about lowering not involving signatures. Almost every case where the compiler recognizes particular names, it requires a particular signature. The existing opAddAssign is one of the few cases where the requirements are loose. > When the compiler sees a += b, it should rewrite as a.opAddAssign(b) (or > whatever the equivalent template should be). Why does it matter what > opAddAssign returns? If it returns a C, who cares? There is nothing special > about opAddAssign except that it's the vehicle to implement += via lowering. Basically, the more you can enforce semantics of operators, the more useful they are. In C++, the STL is based entirely on agreed semantics of operators. > > > > This relates to Andrei's later bug regarding opEquals - the compiler currently > > > requires a certain signature for opEquals, and it simply gets in the way of > > > writing efficient or usable code. > > > > Actually the problem with opEquals is that there is NO option which works > > properly. Giving you more bad options wouldn't help. > > Of course there is: > > struct T > { > bool opEquals(T other) {...} > } That doesn't work with const(T). > > > Besides, I think Andrei's original point is that it was impossible to do the > > > most efficient thing (return the lvalue of this), which is no longer the case. > > > So this was an attempt to solve that problem using a language rule instead of > > > adding ref returns. > > > > No. The point is this: There is only ONE correct choice for the return value: > > it must be some form of 'return this'. The compiler is better equipped to work > > out the most efficient way to do it, than the programmer is. > > opAddAssign probably should return this. But it could also return void. Or it > could return int. It shouldn't matter to the compiler. > > > Giving freedom to choose the return value does only two things: > > (1) it gives you an opportunity to make a mistake. > > (2) it adds extra noise in the code. > > (3) it allows some future use we do not foresee, people can be inventive. > (4) it makes the compiler simpler and more consistent. > > BTW, what is the extra noise in the code? The specification of the return > value? I find this a weak argument. What if you want to return by ref or by > value? Don't get me wrong, I think in most cases, it should return this by > reference. But there could be cases where returning something else would be > advantageous. For example, what if you have a struct parameterized on > constancy, and you want to return a different const form? There are some crazy amazing things I've seen in phobos and other code that nobody thought of > before, and some of it is based on unintuitive uses of operators. (4) is untrue. (3) was discussed by Andrei. Although it sounds plausible, and it's the reason it existed in C++, experience in C++ has shown that this is NOT true. It has been thoroughly explored, and it is not a useful feature. So (3) isn't true either.
Comment #9 by smjg — 2011-07-08T15:27:21Z
(In reply to comment #0) > Experience with C++ shows that user-defined implementations of > assignment operators always return *this, and that there is > absolutely no reason to return anything else. What's C++ to do with anything? This is D. You can't just blindly apply principles learned from one language to another without understanding how the principles depend on characteristics of the language. In this instance, from what I can make out it is because C++ isn't garbage collected, so you're bound to want to implement such operations by modifying the object in-place rather than creating a new one. Arithmetical classes (which should be immutable, so that they behave as values) are an example of something that can benefit from the latter in a garbage collected language such as D, if only the small change I describe below is made. > The rule belongs to the language. Require all op*Assign to return void. If a > value is required on the caller side, pass the left-hand side of the operation. I agree that the design of op*Assign is flawed, but disagree with this change. Here's how it should work IMO: - op*Assign (including plain opAssign) may return either void or typeof(this). - If it's void, the AssignExpression calls this op*Assign and then returns its lvalue. - If it's typeof(this), the AssignExpression calls this op*Assign, sets the lvalue to what op*Assign returns and then returns it.
Comment #10 by issues.dlang — 2011-07-08T16:07:37Z
>- If it's typeof(this), the AssignExpression calls this op*Assign, sets the >lvalue to what op*Assign returns and then returns it. That would be a _bad_ idea. On some level, it defeats the purpose of op*Assign in the first place. There are cases where it is more efficient to do the operation and assign in one operation. Your suggestion basically relegates to op*Assign to being another opBinary except that it allows for the compiler to replace a op= b with a = a op b. Assignment _needs_ to be part of the implementation of op*Assign. That's part of the point. Yes, the return value of op*Assign should be the same as the value as the lvalue, but getting the return value and then doing the assignment makes op*Assign pretty much pointless. If that were the proper way to do it, we wouldn't have op*Assign in the first place. We'd just create op= from opBinary and opAssign. And that's _not_ what we want.
Comment #11 by smjg — 2011-07-08T16:42:24Z
I'm not sure you get the point. It's to give the class designer the choice between in-place modification and creating a new object as the implementation of op=. OK, so a op= b ought to be automatically equivalent to a = a op b in the absence of an opOpAssign. So the rule would be to define opOpAssign iff you want it to modify in-place. But this still doesn't cover cases where whether the modification is in-place isn't a compile-time constant. Look at how ~= works for instance.
Comment #12 by razvan.nitu1305 — 2019-10-24T08:27:14Z
I'm going to close this since changing op*Assign to make it mandatory to return a specific type would break a lot of code.