Bug 8803 – map.filter.array run map delegate an incorrect number of time.

Status
RESOLVED
Resolution
WONTFIX
Severity
normal
Priority
P2
Component
phobos
Product
D
Version
D2
Platform
All
OS
All
Creation time
2012-10-11T14:12:00Z
Last change time
2012-10-17T09:27:27Z
Assigned to
nobody
Creator
deadalnix

Comments

Comment #0 by deadalnix — 2012-10-11T14:12:44Z
See sample code : import std.algorithm; import std.range; void main() { uint i; uint[] iarray; iarray.length = 10; iarray.map!(a => i++).filter!(a => a > 0).array(); import std.stdio; writeln(i); } This program output 19. In other terms, map delegate is run twice on every element on which filter's delegate returns true.
Comment #1 by peter.alexander.au — 2012-10-16T09:51:00Z
I'm assuming the bug here is that map's ddoc says it caches front, but it clearly isn't. I've updated the ddoc to reflect this. https://github.com/D-Programming-Language/phobos/pull/876
Comment #2 by deadalnix — 2012-10-16T10:17:43Z
(In reply to comment #1) > I'm assuming the bug here is that map's ddoc says it caches front, but it > clearly isn't. I've updated the ddoc to reflect this. > > https://github.com/D-Programming-Language/phobos/pull/876 I don't think fixing the ddoc is the fix. Being unable to do .map.filter defeat the whole point of map and filter.
Comment #3 by peter.alexander.au — 2012-10-16T13:59:38Z
(In reply to comment #2) > Being unable to do .map.filter defeat the whole point of map and filter. Sorry, I'm not following. Why are you saying you can't do a .map.filter? You can, it just calls the map function more than once per element. What exactly is the problem?
Comment #4 by deadalnix — 2012-10-16T14:20:32Z
(In reply to comment #3) > (In reply to comment #2) > > Being unable to do .map.filter defeat the whole point of map and filter. > > Sorry, I'm not following. Why are you saying you can't do a .map.filter? You > can, it just calls the map function more than once per element. > > What exactly is the problem? The problem is that the delegate get executed an impredictable number of time. Which make side effect extremely hard to handle (I ended up using map.array.filter most of the time in my own codebase) in terms of side-effect or performance. Additionally, the problem is dependent of the inner implementation of both map and filter, which should stay unknown for the user.
Comment #5 by peter.alexander.au — 2012-10-16T14:31:13Z
(In reply to comment #4) > The problem is that the delegate get executed an impredictable number of time. > Which make side effect extremely hard to handle (I ended up using > map.array.filter most of the time in my own codebase) in terms of side-effect > or performance. > > Additionally, the problem is dependent of the inner implementation of both map > and filter, which should stay unknown for the user. I think the real problem here is that your mapping function has side effects. The undocumented intention of map is that the mapping function is pure. Using it to produce side-effects on the function call is not the intended use of map (just like having side effects in the comparison function in a sort would be a bad idea). Other than the incorrect documentation, I don't think there is a bug here.
Comment #6 by alex — 2012-10-16T16:03:15Z
I think it would be a good idea to update std.algorithm's docs to reflect that some first-class functions are expected to be pure. Can anyone send a pull request for that?
Comment #7 by deadalnix — 2012-10-17T02:47:53Z
(In reply to comment #5) > (In reply to comment #4) > > The problem is that the delegate get executed an impredictable number of time. > > Which make side effect extremely hard to handle (I ended up using > > map.array.filter most of the time in my own codebase) in terms of side-effect > > or performance. > > > > Additionally, the problem is dependent of the inner implementation of both map > > and filter, which should stay unknown for the user. > > I think the real problem here is that your mapping function has side effects. > The undocumented intention of map is that the mapping function is pure. Using > it to produce side-effects on the function call is not the intended use of map > (just like having side effects in the comparison function in a sort would be a > bad idea). > > Other than the incorrect documentation, I don't think there is a bug here. Well, I do think many valid uses include impure functions, even in sort. For instance for benchmarking purpose, for educational purpose, for caching. More importantly, pure function isn't enough. A pure function can return 2 different objects. Even if the object's content will be the same, its identity will not, which is a problem in many cases (the example above is simplified, in my case that was the issue).
Comment #8 by peter.alexander.au — 2012-10-17T04:54:17Z
(In reply to comment #7) > Well, I do think many valid uses include impure functions, even in sort. For > instance for benchmarking purpose, for educational purpose, for caching. > > More importantly, pure function isn't enough. A pure function can return 2 > different objects. Even if the object's content will be the same, its identity > will not, which is a problem in many cases (the example above is simplified, in > my case that was the issue). I think this issue boils down to this: you are expecting something from map, which is does not advertise to do (any more). There is no bug. You could change this to an enhancement request, because you are asking for an additional feature. Note that if you do add caching for 'front', you still have the same problem if you try to use indexing. e.g. auto xs = [1, 2, 3]; auto m = xs.map!(fun)(); auto ref a = m[1], b = m[1]; The only way this could guarantee to apply fun once is if it caches the entire range, which is totally impractical.
Comment #9 by deadalnix — 2012-10-17T05:24:12Z
(In reply to comment #8) > (In reply to comment #7) > > Well, I do think many valid uses include impure functions, even in sort. For > > instance for benchmarking purpose, for educational purpose, for caching. > > > > More importantly, pure function isn't enough. A pure function can return 2 > > different objects. Even if the object's content will be the same, its identity > > will not, which is a problem in many cases (the example above is simplified, in > > my case that was the issue). > > I think this issue boils down to this: you are expecting something from map, > which is does not advertise to do (any more). There is no bug. > > You could change this to an enhancement request, because you are asking for an > additional feature. > That wasn't an additional feature. Using such method, it is easy to solve the entire bug database within the day. The spec was made that way for good reason : this is how map works in every single language that have map and allow side effects. This violate the least principle and expose filter and map internals. > Note that if you do add caching for 'front', you still have the same problem if > you try to use indexing. e.g. > > auto xs = [1, 2, 3]; > auto m = xs.map!(fun)(); > auto ref a = m[1], b = m[1]; > > The only way this could guarantee to apply fun once is if it caches the entire > range, which is totally impractical. This can be solved by caching only 0 .. index . Or by caching within filter and not map. Or eventually by stating that this is not doable for indices. Most languages don't allow index on map or create a temporary array automatically.
Comment #10 by nyphbl8d — 2012-10-17T06:33:27Z
Changing the documentation to match does not necessarily make the code correct. In this case it means that the documentation is now broken as well.
Comment #11 by andrei — 2012-10-17T09:26:02Z
There are clear advantages and disadvantages of both approaches, and reasonable people may disagree on the choice. I will close this; formerly map did cache its current element, to puzzlement by its users. I changed the behavior a long time ago, to much less puzzlement.
Comment #12 by deadalnix — 2012-10-17T09:27:27Z
(In reply to comment #11) > There are clear advantages and disadvantages of both approaches, and reasonable > people may disagree on the choice. I will close this; formerly map did cache > its current element, to puzzlement by its users. I changed the behavior a long > time ago, to much less puzzlement. Then filter should call .front twice.