Comment #0 by bearophile_hugs — 2014-12-19T14:16:00Z
This shows how apparently equivalent range-based code can have a different computational complexity:
import std.stdio : writeln;
import std.algorithm : map;
import std.array : join; // join lives in std.array, not std.algorithm

uint count1, count2;

const(int)[] foo1(in int[] data, in int i, in int max) {
    count1++;
    if (i < max) {
        typeof(return) result;
        foreach (immutable n; data)
            result ~= foo1(data, i + 1, max);
        return result;
    } else {
        return data;
    }
}

const(int)[] foo2(in int[] data, in int i, in int max) {
    count2++;
    if (i < max) {
        return data.map!(n => foo2(data, i + 1, max)).join;
    } else {
        return data;
    }
}

void main() {
    const r1 = foo1([1, 2, 3, 4, 5], 1, 7);
    writeln(count1); // 19531
    const r2 = foo2([1, 2, 3, 4, 5], 1, 7);
    writeln(count2); // 1111111
    assert(r1 == r2); // Both variants produce the same result.
}
I think such performance traps should not be present when using ranges. The difference comes from join traversing the mapped range twice (once to compute the total length for pre-allocation, once to copy the elements), so each call spawns 10 recursive calls instead of 5: the totals are (10^7 - 1) / 9 = 1111111 versus (5^7 - 1) / 4 = 19531.
The right complexity is restored using:
return data.map!(n => foo2(data, i + 1, max)).cache.joiner.array;
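For illustration, a minimal self-contained sketch of the fixed variant (foo3 and count3 are names invented here, not from the report): cache evaluates each element of the lazy MapResult exactly once, joiner flattens it lazily, and array builds the result in a single pass, so nothing is recomputed.

import std.algorithm : cache, joiner, map;
import std.array : array;
import std.stdio : writeln;

uint count3;

const(int)[] foo3(in int[] data, in int i, in int max) {
    count3++;
    if (i < max) {
        // cache evaluates each lambda result once; joiner is lazy;
        // array appends in a single pass, so no second traversal happens.
        return data.map!(n => foo3(data, i + 1, max)).cache.joiner.array;
    } else {
        return data;
    }
}

void main() {
    foo3([1, 2, 3, 4, 5], 1, 7);
    writeln(count3); // should match foo1's count of 19531
}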
Comment #1 by bearophile_hugs — 2014-12-19T14:16:43Z
By the way, with the following modification both variants become equivalent:
auto stripSave(RT)(auto ref RT rng) {
    static struct R {
        RT r;
        @disable void save();
        alias r this;
    }
    return R(rng);
}

const(int)[] foo2(in int[] data, in int i, in int max) {
    count2++;
    if (i < max) {
        return data.map!(n => foo2(data, i + 1, max)).stripSave.join;
    } else {
        return data;
    }
}
This turns the MapResult into a non-forward range, so join no longer computes everything twice.
The question is: should MapResult have .save here in the first place?
Or maybe we need an API such as stripSave to explicitly remove capabilities from ranges? It is obvious to the programmer that recalculating the range's values is costly in this case, so they could simply remove .save to stop Phobos from using it.
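A quick compile-time check (a sketch, assuming the stripSave definition above is in scope) of why this works: the wrapper still satisfies isInputRange through alias this, but the disabled save makes isForwardRange fail, so join falls back to a single pass.

import std.algorithm : map;
import std.range.primitives : isForwardRange, isInputRange;

unittest {
    auto m = [1, 2, 3].map!(n => n * 2);
    static assert(isForwardRange!(typeof(m)));  // MapResult exposes save
    auto s = m.stripSave;
    static assert(isInputRange!(typeof(s)));    // still iterable via alias this
    static assert(!isForwardRange!(typeof(s))); // save is @disable'd
}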
Comment #3 by safety0ff.bugz — 2014-12-20T17:00:57Z
IMO the problem is that join assumes that traversing the range of ranges is cheap.
All the optimization does is avoid reallocation costs while building the array of results (by computing the total length first, then allocating an appropriately sized array for the results).
As bearophile has shown, this cheapness cannot be assumed in the general case.
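A simplified model of that fast path (not Phobos's actual join code) makes the cost explicit: one pass over a saved copy of the forward range to count elements, a second pass to copy, so every lazily computed element runs twice.

import std.range.primitives : ElementType, isForwardRange, isInputRange;
import std.traits : Unqual;

auto joinModel(RoR)(RoR ror)
    if (isForwardRange!RoR && isInputRange!(ElementType!RoR))
{
    alias E = Unqual!(ElementType!(ElementType!RoR));
    // Pass 1: walk a saved copy just to count elements for pre-allocation.
    size_t total;
    foreach (r; ror.save)
        foreach (e; r)
            ++total;
    // Pass 2: walk again to copy. A lazy range such as MapResult recomputes
    // every element here, doubling the work of the first pass.
    auto result = new E[](total);
    size_t i;
    foreach (r; ror)
        foreach (e; r)
            result[i++] = e;
    return result;
}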
I ran into an issue where std.parallelism assumed that opIndex was as cheap as popFront, but for my specific range it was easier to compute the next front from the previous one than to find the element at an arbitrary index.
If these kinds of optimizations are important, perhaps there should be some "performance hint" tags for range operations, e.g. popFront on an array range is cheap, whereas popFront on a MapResult may cost arbitrarily much.
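A hypothetical range with exactly that shape (invented here for illustration, not from the report): each front is one step from the previous one, so popFront is O(1), but opIndex has to re-iterate from the seed, making random access O(index).

struct Iterated(alias fun, T) {
    T seed;
    enum bool empty = false;               // infinite range, for simplicity
    @property T front() { return seed; }
    void popFront() { seed = fun(seed); }  // one cheap step from the previous front
    // Random access recomputes from the seed: O(index) work per call.
    T opIndex(size_t index) {
        T v = seed;
        foreach (_; 0 .. index)
            v = fun(v);
        return v;
    }
}

unittest {
    auto r = Iterated!(x => x * 2 + 1, int)(1); // 1, 3, 7, 15, ...
    assert(r[3] == 15); // recomputed from the seed on every call
}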
Comment #4 by peter.alexander.au — 2015-01-02T17:32:43Z
https://github.com/D-Programming-Language/phobos/pull/2837
For now, I'm limiting the optimization to arrays only. Any other range is conservatively assumed to have arbitrarily expensive iteration, so the normal appender path is used for those.
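A sketch of that dispatch (not the actual pull-request code): take the length pre-pass only when both the outer and inner ranges are arrays, where a second traversal is known to be free, and otherwise visit each element exactly once through an appender.

import std.array : appender;
import std.range.primitives : ElementType, isInputRange;
import std.traits : isArray, Unqual;

auto joinSketch(RoR)(RoR ror)
    if (isInputRange!RoR && isInputRange!(ElementType!RoR))
{
    alias E = Unqual!(ElementType!(ElementType!RoR));
    static if (isArray!RoR && isArray!(ElementType!RoR)) {
        // Arrays are free to traverse twice: compute the exact length, then copy.
        size_t total;
        foreach (r; ror)
            total += r.length;
        auto result = new E[](total);
        size_t i;
        foreach (r; ror)
            foreach (e; r)
                result[i++] = e;
        return result;
    } else {
        // Arbitrary ranges: iteration may be expensive, so touch each
        // element once and let the appender handle reallocation.
        auto app = appender!(E[])();
        foreach (r; ror)
            foreach (e; r)
                app.put(e);
        return app.data;
    }
}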
Comment #5 by github-bugzilla — 2015-01-08T00:27:03Z