← Back to index | Original Bugzilla link

Bug 10670 – std.algorithm.reduce: no-seed initialization wrong design

Status: RESOLVED
Resolution: WONTFIX
Severity: normal
Priority: P2
Component: phobos
Product: D
Version: D2
Platform: All
OS: All
Creation time: 2013-07-19T03:07:10Z
Last change time: 2019-12-13T09:45:52Z
Assigned to: monarchdodra
Creator: monarchdodra

Comments

Comment #0 by monarchdodra — 2013-07-19T03:07:10Z

reduce has a "no-seed" implementation provided: "The one-argument version $(D reduce!fun(range)) works similarly, but it uses the first element of the range as the seed (the range must be non-empty)." I question this design as invalid: The first element is never passed through the transform function. There is no reason it be given any "special treatment". Here is a trivial example of why the design is (IMO) wrong: //---- import std.stdio, std.algorithm, std.range; void main(string[] args) { auto arr = [1, 1]; auto sumDoubles1 = reduce!"a + 2*b"(0, arr); auto sumDoubles2 = reduce!"a + 2*b"(arr); writeln(sumDoubles1); //4 writeln(sumDoubles2); //3 (!) } //---- I really can't see any universe where you could justify that return value... The correct seed value should be "Seed seed = fun(Seed.init, r.front)": This means the first element is correctly "processed". Of course, doing this really just boils down to seed being Seed.init. An added "bonus" is that it makes simple "isIterable" types much more efficient, as they don't have to check "isInitialized" on every iteration. -------- Unsure if this is "ER", "wrong-code" or what. Not sure if changing the behavior is acceptable (IMO, it should be). I was re-writing reduce already, so please provide your feedback on the issue so that I can take it into account :)

Comment #1 by peterneubauer2 — 2013-07-19T07:34:43Z

float.init is nan, so this change would require all (float[]).reduce!"a+b" calls to be rewritten with an explicit seed. I imagine that this is a relatively common use case, while I've never had to reduce!"a+2b" anything. Users of the latter case should get the burden of the explicit seed.

Comment #2 by hsteoh — 2013-07-19T08:07:58Z

I've run into this issue before. I agree that if no seed value is provided, it should use ElementType!range.init, NOT range.front. In the case of floats, well... I'd argue that using the seedless variety of reduce on a float range is already a bug, so it should be fixed anyway. Alternatively, since float.init is always NaN, it makes no sense to use the seedless variety of reduce, so we could just static assert(false) if isFloatingPointType!(ElementType!range). This will break existing code but at least it will point out why the code is breaking, instead of silently changing whatever the current behaviour is.

Comment #3 by monarchdodra — 2013-07-22T03:21:52Z

I've done the dev. Here is my preliminary experience with it: For starters, there *is* some convenience in using "front as seed" for calls such as: //---- double[] a = [ 3, 4, 7, 11, 3, 2, 5 ]; // Compute minimum and maximum in one pass auto r = reduce!(min, max)(a); //---- Here, even a straight up "0.0" is a bad seed. The code needs to be fixed as: //---- double[] a = [ 3, 4, 7, 11, 3, 2, 5 ]; // Compute minimum and maximum in one pass auto r = reduce!(min, max)(tuple(double.max, double.min), a); //---- Not very convenient. But still, nothing surmountable. But at the same time, if you look at the existing unittests, you can see some strange things such as: //---- // two funs auto r2 = reduce!("a + b", "a - b")(tuple(0.0, 0.0), a); writeln(r2); assert(r2[0] == 7 && r2[1] == -7); auto r3 = reduce!("a + b", "a - b")(a); assert(r3[0] == 7 && r3[1] == -1); //---- I don't know what that was trying to prove, but it shows that no-seed can do some really strange things. The "assert not float (or char for that matter)" was very efficient in catching float seeds, and catched almost all of the "wrong usage" cases. The only one that it did not catch was the above case. But such code doesn't make sense to me. I think "silent breakage" would really be minimal.

Comment #4 by bearophile_hugs — 2013-07-22T04:55:04Z

(In reply to comment #2) > In the case of floats, well... I'd argue that using the seedless variety of > reduce on a float range is already a bug, I don't think it's a bug, this is Python: >>> a = [2.0, 3.0, 4.0] >>> reduce(lambda x, y: x * y, a) 24.0 I have a ton of D code that relies on such behavour of D reduce. I suggest to just swap the seed and sequence arguments of reduce, to support UFCS chains, and leave the rest of reduce as it is.

Comment #5 by monarchdodra — 2013-07-22T22:59:37Z

(In reply to comment #4) > (In reply to comment #2) > > > In the case of floats, well... I'd argue that using the seedless variety of > > reduce on a float range is already a bug, > > I don't think it's a bug, this is Python: > > >>> a = [2.0, 3.0, 4.0] > >>> reduce(lambda x, y: x * y, a) > 24.0 > > I have a ton of D code that relies on such behavour of D reduce. You could be right, and I've seen use cases where it makes a lot of sense, eg: reduce!`a ~ " " ~ b`(["hello", "world"]); > I suggest to just swap the seed and sequence arguments of reduce, to support > UFCS chains, and leave the rest of reduce as it is. I've tried in http://d.puremagic.com/issues/show_bug.cgi?id=8755, but the conclusion is that it is not possible without breaking code. And when I say "breaking code", I mean "still compiles but silently generates different results". That, and no migration plan was found. Short of finding a new name for a new function, I have not been able to achieve this goad.

Comment #6 by bearophile_hugs — 2013-07-22T23:25:14Z

(In reply to comment #5) > I've tried in http://d.puremagic.com/issues/show_bug.cgi?id=8755, but the > conclusion is that it is not possible without breaking code. And when I say > "breaking code", I mean "still compiles but silently generates different > results". That, and no migration plan was found. Short of finding a new name > for a new function, I have not been able to achieve this goad. I understand. I suggest to introduce a new function named like your "fold" and slowly deprecate "reduce".

Comment #7 by monarchdodra — 2014-02-21T13:58:38Z

(In reply to comment #6) > I understand. I suggest to introduce a new function named like your "fold" and > slowly deprecate "reduce". Done. https://github.com/D-Programming-Language/phobos/pull/1955 "fold" will use T.init, and "fold1" will use the front explicity: [1, 1].fold!"a + 2 * b".fold() => 4 [1, 1].fold!"a + 2 * b".fold1() => 3

Comment #8 by greensunny12 — 2018-02-10T22:13:34Z

fold has been introduced as deprecating reduce wasn't an option. I don't think we can modify or deprecate the behavior of reduce, so I'm inclined to close this as WONTFIX.