Bug 5076 – std.algorithm.sorted / schwartzSorted

Status
ASSIGNED
Severity
enhancement
Priority
P4
Component
phobos
Product
D
Version
D2
Platform
All
OS
All
Creation time
2010-10-18T19:05:59Z
Last change time
2024-12-01T16:13:33Z
Keywords
bootcamp, patch
Assigned to
Andrei Alexandrescu
Creator
bearophile_hugs
Moved to GitHub: phobos#9890

Comments

Comment #0 by bearophile_hugs — 2010-10-18T19:05:59Z
I propose adding two small functions to Phobos, named sorted() and schwartzSorted(). Their purpose is to copy the input items, sort the copy, and return it. The -ed versions are more functional: they don't modify the input data, so they may work with an immutable input sequence too. They can be used as expressions instead of statements, so in theory they allow code like this (currently this doesn't work, see bug 5074):

    auto foo(immutable(int)[] data) {
        return map!q{ -a }(sorted(data));
    }

Instead of:

    auto foo(immutable(int)[] data) {
        int[] arr = data.dup;
        sort(arr);
        return map!q{ -a }(arr);
    }

sorted()/schwartzSorted() may be seen as less efficient than sort()/schwartzSort() because they copy the input data, but there are many situations (like script-like D code) where the programmer wants short and quick code, and knows the number of input items will be low enough not to cause a memory shortage. Python too has both list.sort() and the sorted() built-in.

A possible simple implementation, for DMD 2.049 (code not tested much):

    import std.algorithm: SwapStrategy, schwartzSort, sort;
    import std.range: isRandomAccessRange, hasLength;
    import std.array: array;

    auto sorted(alias less = "a < b",
                SwapStrategy ss = SwapStrategy.unstable,
                Range)(Range r) {
        auto auxr = array(r);
        sort!(less, ss)(auxr);
        return auxr;
    }

    auto schwartzSorted(alias transform,
                        alias less = "a < b",
                        SwapStrategy ss = SwapStrategy.unstable,
                        Range)(Range r)
        if (isRandomAccessRange!(Range) && hasLength!(Range)) {
        auto auxr = array(r);
        schwartzSort!(transform, less, ss)(auxr);
        return auxr;
    }

    import std.typecons: Tuple;
    import std.stdio: writeln;

    alias Tuple!(int, "x", int, "y") P;

    void main() {
        P[] data = [P(1,4), P(2,3), P(3,1), P(4,0)];
        writeln(data);
        writeln(schwartzSorted!((x) { return x.y; })(data));
        writeln(data);
        writeln(sorted!q{ a.y < b.y }(data));
        writeln(data);
    }
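For readers on a current compiler, the copy-then-sort idea can be sketched as follows. This is only an illustration under stated assumptions, not the implementation proposed above: the helper is hypothetical (Phobos has no `sorted` function), it is restricted to built-in arrays, and it uses `.dup` so that the copy has mutable elements even when the input is immutable. It assumes an element type without mutable indirections, such as `int`.

```d
import std.algorithm.sorting : sort;

/// Hypothetical helper (not part of Phobos): returns a sorted, mutable
/// copy of `data` and leaves the input untouched.
/// Assumes T has no mutable indirections, so that .dup yields T[].
T[] sorted(alias less = "a < b", T)(const(T)[] data)
{
    T[] copy = data.dup;   // mutable duplicate of the input
    sort!less(copy);       // in-place sort of the copy only
    return copy;
}

void main()
{
    immutable(int)[] data = [3, 1, 2];

    auto ascending = sorted(data);
    auto descending = sorted!"a > b"(data);

    assert(ascending == [1, 2, 3]);
    assert(descending == [3, 2, 1]);
    assert(data == [3, 1, 2]);   // the input is unchanged
}
```

Because the function returns a value, it can be used inside an expression, which is the main point of the request.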
Comment #1 by peter.alexander.au — 2010-10-19T00:20:20Z
This actually seems to be a common pattern. By "this", I mean:

    auto foo(T value) {
        T copy = value.dup;
        modify(copy);
        return copy;
    }

"modify" here could be sort, reverse, schwartzSort, partition etc., which would give you sorted, reversed, schwartzSorted and partitioned. It's also the same as defining op+ in terms of op+=.

I have no idea what you would call foo though :-(

    arr2 = transformed!(sort)(arr1);
    arr2 = mutated!(sort)(arr1);
    arr2 = modified!(sort)(arr1);
    arr2 = copyModify!(sort)(arr1);
    ???
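The general pattern described in this comment can be sketched for built-in arrays as below. The name `copyModify` is one of the candidate names floated above and is purely hypothetical; the sketch assumes an element type without mutable indirections, and passes each in-place algorithm wrapped in a lambda.

```d
import std.algorithm.mutation : reverse;
import std.algorithm.sorting : sort;

/// Hypothetical helper: duplicate the input, apply an in-place
/// algorithm to the duplicate, and return the duplicate.
T[] copyModify(alias modify, T)(const(T)[] input)
{
    T[] copy = input.dup;
    modify(copy);          // any return value of `modify` is ignored
    return copy;
}

void main()
{
    int[] a = [3, 1, 2];

    auto s = copyModify!(x => sort(x))(a);
    auto r = copyModify!(x => reverse(x))(a);

    assert(s == [1, 2, 3]);
    assert(r == [2, 1, 3]);
    assert(a == [3, 1, 2]);   // the original array is untouched
}
```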
Comment #2 by andrei — 2010-10-19T06:22:08Z
One issue is that there's no standardized "create a copy" function for ranges.
Comment #3 by peter.alexander.au — 2010-10-19T09:15:12Z
(In reply to comment #2)
> One issue is that there's no standardized "create a copy" function for ranges.

    auto copy = array(input);

?

Ideally you'd be able to specify what container you want to copy into, but that should do as a default.
Comment #4 by andrei — 2010-10-19T09:42:37Z
(In reply to comment #3)
> (In reply to comment #2)
> > One issue is that there's no standardized "create a copy" function for ranges.
>
> auto copy = array(input);
>
> ?
>
> Ideally you'd be able to specify what container you want to copy into, but that
> should do as a default.

array creates an array from anything. We should have a way to say "duplicate and preserve type".

Probably the best idea is to define an optional property ".dup" for ranges. Then arrays implement it automatically, and other ranges may define it as they see fit.
Comment #5 by peter.alexander.au — 2010-10-19T11:15:30Z
(In reply to comment #4)
> array creates an array from anything. We should have a way to say "duplicate
> and preserve type".
>
> Probably the best idea is to define an optional property ".dup" for ranges.
> Then arrays implement it automatically, and other ranges may define it as they
> find fit.

What if you don't want to preserve type? E.g. you have a list, but you want to get a sorted array of the elements in that list?

Why not take the best of both worlds, allowing the user to specify the new container type, but have it default to the original type, something like:

    auto transformed(alias xform, OutputRange = InputRange, InputRange)(InputRange range) { ... }
Comment #6 by bearophile_hugs — 2010-10-19T13:32:04Z
(In reply to comment #4)
> array creates an array from anything. We should have a way to say "duplicate
> and preserve type".

After thinking about your words for some time I have understood your point. So after your change, to get the semantics I am looking for, I'll need to write slightly longer code:

    sorted(array(some_linked_list))

What if the input collection is not sortable? Like:

    sorted(some_hash_set)

In that case I presume the compilation will fail, and I'll have to use something like:

    sorted(array(some_hash_set))

Or even:

    sorted(toList(some_hash_set))

(Where toList() is similar to array() but produces some kind of list out of an iterable.)

A problem with using this:

    sorted(array(some_linked_list))

is that array() is supposed to create an array duplicate of the collection, and then sorted() is supposed to create a second, useless copy of it.
Comment #7 by bearophile_hugs — 2010-10-19T13:33:57Z
(In reply to comment #6)

Another problem is that after your change this code will fail to compile even if some_linked_list is immutable:

    sorted(some_linked_list)
Comment #8 by bearophile_hugs — 2010-10-20T04:04:21Z
See another case:
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D.learn&article_id=22381

This is supposed to not work:

    sort(map(...))

This is supposed to work:

    sorted(map(...))
Comment #9 by peter.alexander.au — 2010-10-20T08:49:43Z
(In reply to comment #8)
> See another case:
> http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D.learn&article_id=22381
> This is supposed to not work:
> sort(map(...))
> This is supposed to work:
> sorted(map(...))

Just implement sorted et al. something like this:

    auto sorted(Output = ElementType!InputRange[], InputRange)(InputRange range)
    {
        Output output = Output(range); // copy range into new container
        sort(output);
        return output;
    }

I don't know if you can construct built-in arrays like that, but you should be able to, and you can always specialise for it if necessary.

This allows the input range to be whatever it likes (including Map), and also gives you the choice of the output range.
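A minimal sketch of the `sorted(map(...))` case with current Phobos follows. It is an assumption-laden simplification of the idea above: the hypothetical `sorted` always materialises the input into a built-in array with `std.array.array`, and the configurable `Output` container is left out.

```d
import std.algorithm.iteration : map;
import std.algorithm.sorting : sort;
import std.array : array;
import std.range.primitives : isInputRange;

/// Hypothetical helper: materialise any finite input range into a new
/// array, sort that array, and return it.
auto sorted(Range)(Range r)
if (isInputRange!Range)
{
    auto copy = r.array;   // eager copy into a fresh array
    sort(copy);
    return copy;
}

void main()
{
    auto data = [3, 1, 2];

    // The lazy map result is copied into an array before sorting.
    auto s = sorted(data.map!(x => x * 10));

    assert(s == [10, 20, 30]);
    assert(data == [3, 1, 2]);   // the source array is not modified
}
```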
Comment #10 by bearophile_hugs — 2011-01-24T03:04:20Z
In Python sort() is in-place. To help programmers remember this, sort() returns None (like void in D):

    >>> a = [10, 30, 5]
    >>> sorted(a)
    [5, 10, 30]
    >>> a
    [10, 30, 5]
    >>> a.sort()
    >>> a
    [5, 10, 30]

But in DMD 2.051 std.algorithm.sort() returns the sequence sorted in-place. Once sorted/schwartzSorted are present, I suggest letting std.algorithm.sort() return void, as in Python. sorted/schwartzSorted may be tagged with @nodiscard from bug 5464 to further help programmers remember their semantics.
Comment #11 by bearophile_hugs — 2011-05-18T18:21:42Z
See also issue 6035
Comment #12 by bearophile_hugs — 2011-10-28T17:33:16Z
A use case for sorted(). I have to create a function foo() with an int[] argument. Unless foo() is performance-critical, the usual API requirements ask for its arguments to be constant (in), to make the program less bug-prone. Inside foo() I need to sort a copy of the items, and after that I don't need to modify this array, so I'd like this array copy to be const too. This is an implementation that currently works:

    import std.algorithm, std.exception;

    void foo(in int[] unsortedData) {
        int[] tmpData_ = unsortedData.dup;
        tmpData_.sort();
        const(int[]) data = assumeUnique(tmpData_);
        // Use array 'data' here.
    }

    void main() {}

assumeUnique is not safe, and the tmpData_ name is still present in the scope (although assumeUnique has set its length to zero, which improves the situation a little).

With a pure sorted(), the code becomes cleaner and safer:

    import std.algorithm;

    void foo(in int[] unsortedData) pure {
        const(int[]) data = sorted(unsortedData);
        // Use array 'data' here.
    }

    void main() {}
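Without a dedicated sorted(), one way to get a const sorted copy in a single expression with current Phobos is to chain `.dup`, `sort()` and the `release` method of the `SortedRange` that sort returns. The sketch below is only an illustration of that chain: the function name `sumOfTwoSmallest` is made up, and it assumes the input holds at least two elements.

```d
import std.algorithm.sorting : sort;

/// Hypothetical example function: needs a sorted view of its input
/// but must not modify the caller's data.
/// Assumes unsortedData.length >= 2.
int sumOfTwoSmallest(in int[] unsortedData)
{
    // dup makes a mutable copy, sort() orders it in place, and
    // release() hands back the underlying array, stored here as const.
    const int[] data = unsortedData.dup.sort().release;
    return data[0] + data[1];
}

void main()
{
    immutable int[] input = [5, 1, 4, 2];

    assert(sumOfTwoSmallest(input) == 3);
    assert(input == [5, 1, 4, 2]);   // the caller's data is unchanged
}
```

No temporary mutable name stays in scope and assumeUnique is not needed, at the cost of a longer expression than sorted(unsortedData).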
Comment #13 by andrej.mitrovich — 2012-02-07T12:20:02Z
I run into this issue all the time, in particular when doing script-based programming. E.g.:

    string[string] classes;
    foreach (name; classes.keys.sorted) { }  // ng

Having to do this is a chore:

    auto keys = classes.keys;
    sort(keys);
    foreach (name; keys) { }

This works ok for my purposes (yada yada about constraints):

    T sorted(T)(T t)
    {
        T result = t;
        sort(result);
        return result;
    }

Why do we have replace and replaceInPlace, whereas we have sort which sorts in place implicitly?

"findSkip" is another function that I sometimes use, but I hate how it hides the fact that it modifies its arguments. "find" returns a range, but "findSkip" returns a bool and *modifies* your argument. It's not at all obvious from the call site.
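For the associative-array case specifically, a short sketch with current Phobos: `keys` returns a newly allocated array, so sorting it in place inside the foreach header does not touch the associative array itself. The variable names and sample data are illustrative only. Note also that in the helper quoted above, `T result = t;` copies only the slice when T is a built-in array, so that helper sorts the caller's elements in place; this is harmless for `classes.keys`, which is already a fresh copy.

```d
import std.algorithm.sorting : sort;

void main()
{
    string[string] classes = ["Zebra": "z", "Apple": "a", "Mango": "m"];

    string[] seen;
    // classes.keys allocates a new array; sort() orders that array
    // in place and returns a range the loop can iterate over.
    foreach (name; classes.keys.sort())
        seen ~= name;

    assert(seen == ["Apple", "Mango", "Zebra"]);
    assert(classes.length == 3);   // the associative array is unchanged
}
```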
Comment #14 by robert.schadek — 2024-12-01T16:13:33Z
THIS ISSUE HAS BEEN MOVED TO GITHUB https://github.com/dlang/phobos/issues/9890 DO NOT COMMENT HERE ANYMORE, NOBODY WILL SEE IT, THIS ISSUE HAS BEEN MOVED TO GITHUB