Reduce cannot have a seed applied elegantly when used with UFCS due to the argument order being range, seed. A newly named version with the argument order reversed would allow the use of the full range of reduce's capabilities with UFCS arguments.
Suggestion:
auto fold(range, seed) {}
to allow:
[1,2,3,4,5].fold!((a, b) => a + b * b)(0).writeln;
With the possibility of a longer chain in place of the array argument, a normal use with UFCS. At present only a few uses such as sum, max and min are usable with the unseeded form, limiting the usefulness of reduce.
Giving it a default of "a + b" would also improve its use as summing is probably the most common use case.
Comment #1 by bearophile_hugs — 2012-10-04T09:39:15Z
(In reply to comment #0)
> auto fold(range, seed) {}
Good.
> Giving it a default of "a + b" would also improve its use as summing is
> probably the most common use case.
This is not a good idea. Invisible defaults are magic, and magic is bad.
It's better to introduce an optimized sum() function, as in Haskell:
Prelude> sum [1,2,3]
6
And Python:
>>> sum([1,2,3])
6
A sum() function needs to work with fixed sized arrays too (like reduce/fold), and it needs a specialization for short fixed-sized arrays, so this code:
int[4] a;
auto b = sum(a);
gets compiled about as:
int[4] a;
auto b = a[0] + a[1] + a[2] + a[3];
Comment #2 by adamsibson — 2012-10-04T10:03:04Z
> This is not a good idea. Invisible defaults are magic, and magic is bad.
Why is it any different to sort being "a < b" by default? Should we require that sort is always sort!"a < b"?
Comment #3 by bearophile_hugs — 2012-10-04T12:26:41Z
(In reply to comment #2)
> Why is it any different to sort being "a < b" by default? Should we require
> that sort is always sort!"a < b"?
It's different because it's widely accepted that "just sorting" a sequence returns it ordered from the min value. In Python, Haskell, Ruby and several other languages "just sort" has such definite meaning.
But I don't know of any language where reduce/fold has a default function that sums.
When people not expert of D look at code like this, they understand its meaning:
auto data = [3, 2, 1];
data.sort();
writeln(data.sum());
But when they see this, they can't know/see what this reduce is doing:
auto data = [3, 2, 1];
writeln(data.reduce());
Summing items of an iterable is a very common operation, and having a specialized function (as in Python, Haskell, and other languages, even in Fortran) with a short clear name as sum() is good.
Comment #4 by verylonglogin.reg — 2013-03-22T00:00:19Z
*** Issue 9687 has been marked as a duplicate of this issue. ***
Comment #8 by bearophile_hugs — 2014-01-25T02:20:53Z
(In reply to comment #7)
> Maybe we could rename it to flipArgs to make it useable?
Beside changing the order of reduce arguments you can also shorten the name of that flipping function. But "flipArgs" seems a nice name for a function that works on 2, 3, ... arguments.
Comment #9 by monarchdodra — 2014-02-19T14:00:05Z
(In reply to comment #7)
> Haskell has a function call flip for this.
> http://hackage.haskell.org/package/base-4.6.0.1/docs/Prelude.html#v:flip
>
> It turns a function taking (a, b) into one taking (b, a) instead.
>
> In phobos this is called binaryReverseArgs.
> http://dlang.org/phobos/std_functional.html#.binaryReverseArgs
>
> So one can use
> rng.binaryReverseArgs!(reduce!fun)(seed);
>
> Maybe we could rename it to flipArgs to make it useable?
Yeah... but who would actually use that in our code? I don't think it's an acceptable solution.
After having thought and worked on this for about a year, I *think* the only solution that wouldn't silently break code, is a rename. That or drop the issue as "won't fix".
I suggest we use "accumulate". It's the perfect synonym.
Can we agree to go with this solution?
Comment #10 by code — 2014-02-19T14:05:12Z
How about "fold" instead, that's the classical functional programming name for that. Also there is foldl and foldr in Haskell.
Comment #11 by monarchdodra — 2014-02-19T14:53:47Z
(In reply to comment #10)
> How about "fold" instead, that's the classical functional programming name for
> that. Also there is foldl and foldr in Haskell.
Took a quick look at the doc, and I like what I see. I'm convinced with fold. It also has documented variants, which we could also implement.
Comment #12 by monarchdodra — 2014-02-21T13:56:43Z
(In reply to comment #10)
> How about "fold" instead, that's the classical functional programming name for
> that. Also there is foldl and foldr in Haskell.
"Introduce fold to algorithm":
https://github.com/D-Programming-Language/phobos/pull/1955
Comment #13 by monarchdodra — 2014-03-20T12:32:49Z
Comment #14 by bearophile_hugs — 2014-03-20T18:28:41Z
>Furthermore, it also improves usability by making the seeds passed by parameter pack, instead of forcing the use of a tuple.<
OK. (Despite in a modern language tuples should be built-in, so using them should be natural, common, and syntactically very cheap. In Python/Haskell/Scala code you don't see functions that refrain from accepting a tuple).
>Finally, it allows using only 1 seed, in which case, the same seed is replicated and is used for all the functions.<
This is from the unittests:
// Compute sum and sum of squares in one pass.
// This can be used to compute get the average and standard deviation.
// A single seed (0.0) is passed, but it is optional
// if the range is not empty.
r = a.fold!("a + b", "a + b * b")(0.0);
assert(approxEqual(r[0], 35)); // sum
assert(approxEqual(r[1], 233)); // sum of squares
This is ambiguous, it seems that "a + b" has a seed while "a + b * b" doesn't have a seed. So in my opinion if you give N function then you need to give 0 seeds, or one N-tuple, or N seeds. So I don't like this.
>Oh yeah, also, I made it so that when no seed is given, it is an Error to use an empty range. This is the only case of deviation, but I think having nothrow justifies it.<
I am not sure this is a good idea. Throwing when you give no seed is probably acceptable. But I am not sure.
> "iterables" are not supported anymore.
I don't understand what this means.
The ddocs of fold say:
>Note: $(D fold) replaces $(D reduce): It retains the same functionality and
behavior, but uses an updated and more convenient interface.<
So you retain the same functionality or you don't.
If by "iterables" you mean that fold doesn't accept opApply-based iterables then I am against this change, I have plenty of code that opApply-based and I sometimes use reduce on them.
Comment #15 by daniel350 — 2014-03-20T20:02:07Z
Why is reduce (sorry, fold) allowing multiple function arguments in the first place?
If you want to compose functions to avoid another O(n) iteration, then you should compose the reduce function to return a tuple yourself.
That way it is clear what the code is doing, instead of this magic N-tuple special case return type.
Comment #16 by monarchdodra — 2014-03-21T01:44:56Z
(In reply to comment #15)
> Why is reduce (sorry, fold) allowing multiple function arguments in the first
> place?
>
> If you want to compose functions to avoid another O(n) iteration, then you
> should compose the reduce function to return a tuple yourself.
>
> That way it is clear what the code is doing, instead of this magic N-tuple
> special case return type.
You shouldn't have to need to look at what the code is doing. It's a library.
We accept multiple function arguments because:
auto minmax = myRange.reduce!(min, max)();
Is incredibly straight forward and convenient.
That said, the design doesn't actually prevent you from doing it as you are asking for:
//-----
auto first = tuple(myRange.front, myRange.front);
myRange.popFront();
auto minmax = reduce!((a, b) => tuple(min(a[0], b), max(a[1], b)))(first, myRange);
writeln(minmax);
//----
It works, but you'll have a tough time selling it to me.
The only argument in favor of this approach, is if you need the return type to be different from the tuple type. But as I said, both approaches can co-exist, so why hold out?
Comment #17 by monarchdodra — 2014-03-21T01:51:41Z
(In reply to comment #14)
> >Furthermore, it also improves usability by making the seeds passed by parameter pack, instead of forcing the use of a tuple.<
>
> OK. (Despite in a modern language tuples should be built-in, so using them
> should be natural, common, and syntactically very cheap. In
> Python/Haskell/Scala code you don't see functions that refrain from accepting a
> tuple).
The current most spread phobos style is to accept arguments via vararg, and to return tuples when you need to return several args, so I tried to stick to that.
I'm not sure it is possible to accept either a Tuple (that "auto expands" later), and varargs, without potentially creating some ambiguity.
> >Finally, it allows using only 1 seed, in which case, the same seed is replicated and is used for all the functions.
>
> This is from the unittests:
>
> // Compute sum and sum of squares in one pass.
> // This can be used to compute get the average and standard deviation.
> // A single seed (0.0) is passed, but it is optional
> // if the range is not empty.
> r = a.fold!("a + b", "a + b * b")(0.0);
> assert(approxEqual(r[0], 35)); // sum
> assert(approxEqual(r[1], 233)); // sum of squares
>
> This is ambiguous, it seems that "a + b" has a seed while "a + b * b" doesn't
> have a seed. So in my opinion if you give N function then you need to give 0
> seeds, or one N-tuple, or N seeds. So I don't like this.
You think? It made sense to me. I'll have to ask for more input on this. That said, if we turn it down, then it would be possible to make `Tuple` and `args...` co-exist as input argument style. EG:
r = a.fold!("a + b", "a + b * b")(0.0, 0.0); //OK!
r = a.fold!("a + b", "a + b * b")(tuple(0.0, 0.0)); //OK! too!
The second one is more verbose, but I see its justified if your seed is already in the form of a tuple.
So I think you sold me on it.
> >Oh yeah, also, I made it so that when no seed is given, it is an Error to use an empty range. This is the only case of deviation, but I think having nothrow justifies it.
>
> I am not sure this is a good idea. Throwing when you give no seed is probably
> acceptable. But I am not sure.
It's a deviation, but I think it's justified. It makes the code nothrow, and quite frankly, accessign an empty range is an Error, so end of story.
The only argument I'd accept in its favor is stability with reduce, but if we could redesign, I'd never accept throwing an exception in such a case.
> > "iterables" are not supported anymore.
>
> If by "iterables" you mean that fold doesn't accept opApply-based iterables
> then I am against this change, I have plenty of code that opApply-based and I
> sometimes use reduce on them.
OK. I can work them back in.
Comment #18 by bearophile_hugs — 2014-03-21T03:46:15Z
(In reply to comment #17)
> it would be possible to make `Tuple` and
> `args...` co-exist as input argument style. EG:
>
> r = a.fold!("a + b", "a + b * b")(0.0, 0.0); //OK!
> r = a.fold!("a + b", "a + b * b")(tuple(0.0, 0.0)); //OK! too!
>
> The second one is more verbose, but I see its justified if your seed is already
> in the form of a tuple.
What I don't like is to give only 1 single scalar argument if you have N functions and then implicitly multiply the single seed N times. Similar implicit behaviours look handy, but they make the code less clear, and later usually they find some way to bite your rump.
On the other hand I think that it's uncommon to give more than one function to reduce/fold. So this whole sub-feature is could even be chopped away, simplifying fold.
How many real usages of multi-function reduce to you have in Phobos?
Comment #19 by bearophile_hugs — 2014-03-21T04:03:46Z
See also Issue 10670
Comment #20 by monarchdodra — 2014-03-21T05:07:58Z
(In reply to comment #18)
> What I don't like is to give only 1 single scalar argument if you have N
> functions and then implicitly multiply the single seed N times. Similar
> implicit behaviours look handy, but they make the code less clear, and later
> usually they find some way to bite your rump.
I've already agreed to that point, and will be removing the functionality. Furthermore, I doubt it provides any real feature: We never call reduce with so many functions that this would be justified.
> On the other hand I think that it's uncommon to give more than one function to
> reduce/fold. So this whole sub-feature is could even be chopped away,
> simplifying fold.
When you say "sub-feature", are you talking about about the "1-seed" thing, or having multiple functions/
If it's multiple functions, then I have to say that it's an opt-in feature, and it works anyways. It's not actually complicated to implement, and it buys us nothing to remove it. Why want to chop it? I've used it before. It's nice.
If it's the single seed for multiple functions, yeah.
> How many real usages of multi-function reduce to you have in Phobos?
I'm not sure it's a fair reference, because Phobos doesn't "do" anything, it just provides function. I'd be willing to bet there is a fair amount of it being used out there.
Comment #21 by bearophile_hugs — 2014-03-21T05:32:04Z
(In reply to comment #20)
> I'm not sure it's a fair reference, because Phobos doesn't "do" anything, it
> just provides function. I'd be willing to bet there is a fair amount of it
> being used out there.
OK. One usage case I can think of is to compute the sum and count of an input range, to compute its average :-)