Bug 3356 – Make pure functions require immutable parameters

Status
RESOLVED
Resolution
FIXED
Severity
enhancement
Priority
P2
Component
dmd
Product
D
Version
D2
Platform
All
OS
All
Creation time
2009-10-01T07:39:00Z
Last change time
2015-06-09T05:13:44Z
Keywords
spec
Assigned to
nobody
Creator
dfj1esp02

Comments

Comment #0 by dfj1esp02 — 2009-10-01T07:39:29Z
Illustration: --- int foo(const int[] bar) pure { return bar[1]; } void goo() { int[2] a; a[1]=1; foo(a); a[1]=2; foo(a); } --- 1. This doesn't affect functions with value type parameters. 2. When a function takes a reference type parameter, the chanses are slim, that the return value doesn't depend on the referenced data. So the referenced data must be immutable. 3. This doesn't require complex flow analysis, only a formal function signature check. 4. (??) Replace immutability of explicit pointer type with constness, since even if the referenced data is immutable, the code doesn't know, where the immutable data ends and can access subsequent possibly mutating data. This will instantly make any function, taking a pointer, impure. This should not apply to objects and byref data.
Comment #1 by clugdbug — 2009-10-01T08:09:24Z
(In reply to comment #0) > 2. When a function takes a reference type parameter, the chanses are slim, that > the return value doesn't depend on the referenced data. Yes. > So the referenced data must be immutable. That conclusion does not follow. I don't think you're seeing all of the benefits of 'pure'. Consider foo(a) + foo(a). This can be changed into 2*foo(a), even though a is not immutable. It is true that in the case where all parameters are immutable, additional optimisations (such as caching) can be performed. But there's more to pure than that. > 4. (??) Replace immutability of explicit pointer type with constness, since > even if the referenced data is immutable, the code doesn't know, where the > immutable data ends and can access subsequent possibly mutating data. This will > instantly make any function, taking a pointer, impure. This should not apply to > objects and byref data. That's a memory integrity issue, not a purity issue. That could only happen in an unsafe module. You are asking for a feature to be removed from the language, but I'm not really sure why.
Comment #2 by dfj1esp02 — 2009-10-02T04:51:21Z
(In reply to comment #1) > (In reply to comment #0) > > 2. When a function takes a reference type parameter, the chanses are slim, that > > the return value doesn't depend on the referenced data. > > Yes. > > > So the referenced data must be immutable. > > That conclusion does not follow. I don't think you're seeing all of the > benefits of 'pure'. > Consider foo(a) + foo(a). This can be changed into 2*foo(a), even though a is > not immutable. You participated in discussion of bug 3057, where I described my view of the problem. The question is *how* can you benefit from advertised pureness of in fact impure functions? > It is true that in the case where all parameters are immutable, additional > optimisations (such as caching) can be performed. But there's more to pure than > that. But what you did with foo(a)+foo(a) if not caching? You effectively cached the result of function without any requirement for immutability. > > 4. (??) Replace immutability of explicit pointer type with constness, since > > even if the referenced data is immutable, the code doesn't know, where the > > immutable data ends and can access subsequent possibly mutating data. This will > > instantly make any function, taking a pointer, impure. This should not apply to > > objects and byref data. > > That's a memory integrity issue, not a purity issue. That could only happen in > an unsafe module. > > You are asking for a feature to be removed from the language, but I'm not > really sure why. I'm asking to remove a bug from the language, because I think it's incorrect to allow marking impure functions as pure (and later "benefit" from this). I'm just proposing an easy solution to the problem.
Comment #3 by clugdbug — 2009-10-02T08:03:51Z
(In reply to comment #2) > (In reply to comment #1) > > (In reply to comment #0) > > > 2. When a function takes a reference type parameter, the chanses are slim, that > > > the return value doesn't depend on the referenced data. > > > > Yes. > > > > > So the referenced data must be immutable. > > > > That conclusion does not follow. I don't think you're seeing all of the > > benefits of 'pure'. > > Consider foo(a) + foo(a). This can be changed into 2*foo(a), even though a is > > not immutable. > > You participated in discussion of bug 3057, where I described my view of the > problem. The question is *how* can you benefit from advertised pureness of in > fact impure functions? If you pass it the same *values*, you get the same results. > > It is true that in the case where all parameters are immutable, additional > > optimisations (such as caching) can be performed. But there's more to pure than > > that. > > But what you did with foo(a)+foo(a) if not caching? You effectively cached the > result of function without any requirement for immutability. You don't need immutability in this case. foo(a) cannot change a, because foo is pure. There is nothing else in the expression which uses a. Therefore a does not change during the expression. (I'm assuming a is not shared). Therefore both calls to foo(a) have the same parameters, and the transformation foo(a) + foo(a) ---> 2*foo(a) is legal. As soon as you have an assignment, or a call to an impure function, you can't do this any more. But if a is immutable, you have a much stronger guarantee. > I'm asking to remove a bug from the language, because I think it's incorrect to > allow marking impure functions as pure (and later "benefit" from this). Caching can only occur if all parameters passed to the function are the same. If the parameters are references, you need to ensure the value being referenced has not changed. Which is trivial if it's an immutable reference, but that's not the _only_ case where it is true. This is just an argument that const references _could_ be allowable in pure functions. But after all that, the spec currently says that parameters have to be implicitly immutable! So you can just change this to an accepts-invalid bug. ...
Comment #4 by dfj1esp02 — 2009-10-05T04:19:45Z
It's hard to check actual immutability of reference type without formal immutability. This is the very reason why formal immutability exists.
Comment #5 by dfj1esp02 — 2009-10-05T04:28:04Z
Well, I must agree, your scheme works if programmer bothered to mark shared data properly.
Comment #6 by clugdbug — 2009-10-05T04:57:26Z
(In reply to comment #5) > Well, I must agree, your scheme works if programmer bothered to mark shared > data properly. It's still guaranteed that pure functions cannot read or write 'shared' or '__gshared' data members. So we know that it cannot be affected by other threads, so we only have to worry about other things which happen in that same expression. Note that you can just look at the parameter list and distinguish levels of purity: if there are any non-immutable reference parameters, it's not cachable. I guess the question is whether that additional complexity is justified. BTW, in the latest DMD, toLower(), toUpper() now take const(char)[] parameters, instead of immutable. Under your scheme, they could not be marked as pure.
Comment #7 by dfj1esp02 — 2009-10-05T06:19:38Z
For me this compiles: ---- __gshared char[] str; char fun(in char[] s) pure { return s[0]; } int main() { char a=fun(str); return 0; } ---- __gshared is not a type modifier. Sharing feature is young, it's still not clear how well it suits real code, recently its behavior has changed to imply synchronization, we don't know whether it will be widely adopted, so I don't think It's a good idea to rely on it so hardly.
Comment #8 by dfj1esp02 — 2009-10-05T06:22:45Z
> BTW, in the latest DMD, toLower(), toUpper() now take const(char)[] parameters, > instead of immutable. Under your scheme, they could not be marked as pure. Well then string functions from bug 3057 can be marked pure.
Comment #9 by dfj1esp02 — 2009-10-05T06:41:16Z
> Note that you can just look at the parameter list and distinguish levels of > purity: if there are any non-immutable reference parameters, it's not cachable. > I guess the question is whether that additional complexity is justified. Yes, two flavors of pureness can be introduces, it should be documented, what optimizations can be applied to them in relation to threading model. But isn't it a lot of work to specify and implement two flavors of pureness at the same time? At this time optimization for non-immutable arguments should be turned off until it's implemented correctly. Does dmd perform any pure optimizations at this time?
Comment #10 by clugdbug — 2009-10-05T07:11:21Z
(In reply to comment #7) > For me this compiles: > ---- > __gshared char[] str; > > char fun(in char[] s) pure > { > return s[0]; > } > > int main() > { > char a=fun(str); > return 0; > } > ---- > > __gshared is not a type modifier. Ouch. That definitely shouldn't compile. Obviously 'pure' isn't doing any parameter checking at all. The parameter isn't even const!! But yes, I agree there's a problem there. It's not going to work if __gshared parameters are passed to pure functions. Note that __gshared isn't allowed in SafeD, but still, this is starting to look ugly. > Sharing feature is young, it's still not clear how well it suits real code, > recently its behavior has changed to imply synchronization, we don't know > whether it will be widely adopted, so I don't think It's a good idea to rely on it so hardly. Yes. Although the only long as 'shared' remains transitive, it'd be OK. (In reply to comment #9) > > Note that you can just look at the parameter list and distinguish levels of > > purity: if there are any non-immutable reference parameters, it's not cachable. > > I guess the question is whether that additional complexity is justified. > > Yes, two flavors of pureness can be introduces, it should be documented, what > optimizations can be applied to them in relation to threading model. But isn't > it a lot of work to specify and implement two flavors of pureness at the same > time? No, because you can get it trivially from the function signature. The big problem is that from the user's point of view, the language is too complicated already. Although, this really only affects optimisation, so users don't really need to know about it. At this time optimization for non-immutable arguments should be turned > off until it's implemented correctly. Does dmd perform any pure optimizations > at this time? Almost none.
Comment #11 by dfj1esp02 — 2009-10-05T07:54:07Z
> Ouch. That definitely shouldn't compile. Obviously 'pure' isn't doing any > parameter checking at all. The parameter isn't even const!! It's const. I use 'in' instead of const. > Note that __gshared isn't allowed in > SafeD, but still, this is starting to look ugly. __gshared is in fact a hack for those not ready to migrate to shared, but still want to use normal global variables with no effect on their types. > > Yes, two flavors of pureness can be introduces, it should be documented, what > > optimizations can be applied to them in relation to threading model. But isn't > > it a lot of work to specify and implement two flavors of pureness at the same > > time? > No, because you can get it trivially from the function signature. You meant, from the argument list? It doesn't really matter, whether the parameter is const or immutable, matters only actual constness, so if caller passes immutable argument as const to a pure function, it's the same as if parameter was declared immutable, so the level of purity actually depends on what arguments are passed to the function and thus changes from call to call as caller passes arguments with different constness. So it's actually the caller who decides pureness of the call and how to optimize it, the good news is caller knows the constness of perameters it passes. Immutability may be required only when callee wants to do immutable optimizations and call immutable methods.
Comment #12 by clugdbug — 2010-11-08T19:10:03Z
With the relaxed purity rules in DMD2.050, the compiler now internally distinguishes const pure functions (which was the old meaning of pure) from immutably pure functions (this issue) and also allows weakly pure. I don't know whether to mark this as FIXED or WONTFIX. But since the compiler internally keeps track of immutability pure, I'm marking as FIXED.