Bug 6192 – std.algorithm.sort performance

Status
RESOLVED
Resolution
FIXED
Severity
enhancement
Priority
P2
Component
phobos
Product
D
Version
D2
Platform
All
OS
All
Creation time
2011-06-21T15:51:00Z
Last change time
2016-10-01T14:22:59Z
Keywords
performance
Assigned to
nobody
Creator
bearophile_hugs

Attachments

IDFilenameSummaryContent-TypeSize
1004sort_bench.dsort benchapplication/octet-stream5458

Comments

Comment #0 by bearophile_hugs — 2011-06-21T15:51:37Z
Created attachment 1004 sort bench The small benchmark program shows the performance of std.algorithm.sort compared to a different one (it doesn't accept a key sort function, etc). One output example: sort-sort2 benchmarks (milliseconds), N=6000000: Random distribution: sort: 4131 sort2: 2635 Already sorted arrays: sort: 1979 sort2: 787 Inverted order arrays: sort: 2105 sort2: 1374 Few random doubles appended to the sorted arrays: sort: 2890 sort2: 1551 See also bug 5077
Comment #1 by dmitry.olsh — 2011-06-26T08:35:32Z
Results for me on upcomming 2.054 Phobos: sort-sort2 benchmarks (milliseconds), N=6000000: Random distribution: sort: 1040 sort2: 1012 Already sorted arrays: sort: 357 sort2: 312 Inverted order arrays: sort: 361 sort2: 576 Few random doubles appended to the sorted arrays: sort: 3657 sort2: 482 compiled with: dmd -O -inline -release sort_bench.d
Comment #2 by bearophile_hugs — 2011-07-06T10:42:20Z
Was this update in the sort code caused by this enhancement request, or are they unrelated? Timings with DMD 2.054beta, a different CPU: sort-sort2 benchmarks (milliseconds), N=6000000: Random distribution: sort: 1892 sort2: 1131 Already sorted arrays: sort: 748 sort2: 376 Inverted order arrays: sort: 1048 sort2: 656 Few random doubles appended to the sorted arrays: sort: 3085 sort2: 734 It seems the random case is improved a lot, the already sorted case is improved, the inverted order is about the same, and the few random values added case is slower than before.
Comment #3 by kennytm — 2011-07-06T11:00:55Z
(In reply to comment #2) > Was this update in the sort code caused by this enhancement request, or are > they unrelated? > > Timings with DMD 2.054beta, a different CPU: > > > sort-sort2 benchmarks (milliseconds), N=6000000: > Random distribution: > sort: 1892 > sort2: 1131 > Already sorted arrays: > sort: 748 > sort2: 376 > Inverted order arrays: > sort: 1048 > sort2: 656 > Few random doubles appended to the sorted arrays: > sort: 3085 > sort2: 734 > > > It seems the random case is improved a lot, the already sorted case is > improved, the inverted order is about the same, and the few random values added > case is slower than before. https://github.com/D-Programming-Language/phobos/pull/74
Comment #4 by bearophile_hugs — 2011-07-11T04:58:06Z
Thank you for your work. The new timings (after pull 74): sort-sort2 benchmarks (milliseconds), N=6000000: Random distribution: sort: 1261 sort2: 1201 Already sorted arrays: sort: 300 sort2: 441 Inverted order arrays: sort: 315 sort2: 729 Few random doubles appended to the sorted arrays: sort: 9962 sort2: 632 The last case is a common use case in Python code. Python programmers sometimes append unsorted items to a sorted list, and once in a while they sort it. Because the Timsort used in Python (and Java) is able to face this case very well, this is an efficient strategy to keep a rarely updated list sorted in Python. I don't know how much common this pattern is in real D code (but experience shows that often real-world data is already partially sorted).
Comment #5 by andrei — 2014-02-17T15:01:46Z
Just pasting this for future reference: dual-pivot quicksort http://iaroslavski.narod.ru/quicksort/DualPivotQuicksort.pdf Seems to be used in Java 8's sort for primitive types according to http://vkostyukov.ru/posts/dual-pivot-binary-search/
Comment #6 by andrei — 2016-01-12T03:22:19Z
Comment #7 by andrei — 2016-01-12T03:27:23Z
(In reply to Andrei Alexandrescu from comment #6) > https://github.com/D-Programming-Language/phobos/pull/3922 With this PR and the command line: dmd -O -inline -release -run sort-benchmark.d 6000000 the output is: sort-sort2 benchmarks (milliseconds), N=6000000: Random distribution: sort: 696 sort2: 680 Already sorted arrays: sort: 122 sort2: 113 Inverted order arrays: sort: 132 sort2: 224 Few random doubles appended to the sorted arrays: sort: 244 sort2: 234
Comment #8 by andrei — 2016-01-12T06:06:15Z
After a few more tweaks: dmd -O -inline -release -run sort-benchmark.d 6000000 sort-sort2 benchmarks (milliseconds), N=6000000: Random distribution: sort: 702 sort2: 689 Already sorted arrays: sort: 108 sort2: 115 Inverted order arrays: sort: 154 sort2: 215 Few random doubles appended to the sorted arrays: sort: 247 sort2: 251
Comment #9 by andrei — 2016-10-01T05:07:51Z
Comment #10 by andrei — 2016-10-01T14:22:59Z