Bug 8405 – Create overload for joiner which is random access for random access ranges

Status
RESOLVED
Resolution
WONTFIX
Severity
enhancement
Priority
P2
Component
phobos
Product
D
Version
D2
Platform
All
OS
All
Creation time
2012-07-21T13:25:41Z
Last change time
2018-02-03T15:00:33Z
Keywords
bootcamp
Assigned to
No Owner
Creator
Jonathan M Davis

Comments

Comment #0 by issues.dlang — 2012-07-21T13:25:41Z
std.algorithm.joiner returns a lazy range which is never random access. I believe that at least in the case where joiner doesn't take a separator, it should be possible to make it random access as long as all of the ranges passed in are random access and have length. It might also be possible to do the same thing with the one that takes a separator, but that could get a bit messier. Certainly, if you want the elements to be properly movable (e.g. for sort), then each separator in the range would then have to actually be a separate range as soon as the range is created. But someone was asking in D.Learn why they couldn't sort a range returned by joiner, and after some thought, I do think that we can make that possible, though given that at this point, joiner's result doesn't really affect the original, and it _would_ if it were sorted, so maybe this would be too big a change, but it's certainly something to consider.
Comment #1 by peter.alexander.au — 2014-01-26T09:28:24Z
How do you make opIndex O(1) for a joiner range? Surely you have to count up the lengths of all the sub ranges until you hit the desired index? That's O(n).
Comment #2 by bearophile_hugs — 2014-11-09T22:29:28Z
Some use cases. Given a 2D array, I'd like to sort or shuffle its items seen as sequential: import std.stdio: writeln; import std.algorithm: sort; import std.range: joiner, chain; import std.random: randomShuffle; void sortAll(int[][] m) { m.joiner.sort(); // error } void shuffleAll(int[][] m) { m.joiner.randomShuffle; // error } void main() { auto m = [[10, 2], [30, 4]]; m.writeln; chain(m[0], m[1]).sort(); // OK m.sortAll; m.writeln; m.shuffleAll; m.writeln; } Is it possible and reasonable for "joiner" to return a random access range in such cases (where the input is a random access range of random access ranges, this isn't an uncommon use case because built-in arrays are very common in D code)?
Comment #3 by bearophile_hugs — 2014-11-09T22:32:29Z
(In reply to Peter Alexander from comment #1) > How do you make opIndex O(1) for a joiner range? Surely you have to count up > the lengths of all the sub ranges until you hit the desired index? That's > O(n). In the common use case I've shown, where you have an array of array, all array lengths are known. If joiner caches internally the lengths of all the rows, the successive accesses are O(1). This requires some heap memory...
Comment #4 by john.loughran.colvin — 2014-11-10T09:22:58Z
(In reply to bearophile_hugs from comment #3) > (In reply to Peter Alexander from comment #1) > > How do you make opIndex O(1) for a joiner range? Surely you have to count up > > the lengths of all the sub ranges until you hit the desired index? That's > > O(n). > > In the common use case I've shown, where you have an array of array, all > array lengths are known. If joiner caches internally the lengths of all the > rows, the successive accesses are O(1). This requires some heap memory... Unless there is an exploitable pattern in the lengths (rectangular being the easiest case), given a NxM range of ranges, it's still O(N). Perhaps what we need is a generalisation of length for Ranges of Ranges (of Ranges ...), called `dimensions` (or `dims`, or `shape`, or whatever seems best), which, for an >=n-dimensional RoR, provides n lengths, either as a static array (fixed dimensions) or a dynamic array (runtime number of dimensions). Given this information, joiner could easily provide O(1) random-access. In general, phobos is quite weak for ranges of ranges. A lot more could be done if we had `dimensions` info available where appropriate.
Comment #5 by bearophile_hugs — 2014-11-10T09:50:31Z
(In reply to John Colvin from comment #4) > Unless there is an exploitable pattern in the lengths (rectangular being the > easiest case), given a NxM range of ranges, it's still O(N). If you store the row lengths into an array, and you perform a cumulative of this array, you can perform a binary search on it, to access in O(ln n). At run-time it can also spot the common case of rectangular matrix, and use a O(1) code path for that case. > Perhaps what we need is a generalisation of length for Ranges of Ranges (of > Ranges ...), called `dimensions` (or `dims`, or `shape`, or whatever seems > best), This is Issue 13685 (and it's an easy structure to implement).
Comment #6 by greensunny12 — 2018-02-03T15:00:33Z
> If joiner caches internally the lengths of all the rows, the successive accesses are O(1). This requires some heap memory... Even assuming we could convince Andrei that creating joinerRandomAccess for this is justified. This would still never work because joiner removes empty elements: --- auto j = [null, new int(1), null, new int(2), null]; j.writeln// [7F49B04CD010, 7F49B04CF018] j[1] = 5; // which one? ---- Which would - if supported - would require serious work and allocation from joiner. The other option would be sth. like joinerRandomAccessAssumeNonEmpty Also as a user can always do `r.take(n).front` - that's as efficient as joiner could ever get. I'm going to close this no as WONTFIX, because doing > O(n) work for an opIndex in unacceptable and joinerRandomAccessAssumeNonEmpty ain't happening in the standard library.