Comment #0 by bearophile_hugs — 2013-11-16T13:28:06Z
This is a low-priority enhancement request.
From my tests I've seen that std.algorithm.group applied to a string can be sped up more than twice by instead applying it to an immutable(ubyte)[] obtained with std.string.representation.
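A minimal sketch of the idea (the string literal and run counts here are just an illustration, not from the attached test):

```d
import std.algorithm.iteration : group;
import std.array : array;
import std.string : representation;

void main()
{
    string s = "aabbbc";
    // representation reinterprets the string as immutable(ubyte)[],
    // so group runs over raw UTF-8 bytes with no per-element decoding.
    auto runs = s.representation.group.array;
    assert(runs.length == 3);
    assert(runs[0][0] == 'a' && runs[0][1] == 2);
    assert(runs[1][0] == 'b' && runs[1][1] == 3);
    assert(runs[2][0] == 'c' && runs[2][1] == 1);
}
```

Note that this only produces the same runs as grouping code points when the repeated characters are ASCII; multi-byte code points would be grouped byte-by-byte.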
Dmitry Olshansky has suggested some ideas that can improve the performance of std.algorithm.group applied on strings:
As to group, it has to find runs of identical items. It can be sped up for Unicode by taking two simple tricks into account:
- you don't need to decode - just identify the size of the current dchar (stride) and see how many repetitions of it follow;
- special-case when the current (w)char is ASCII (or BMP for UTF-16) so as to speed up counting (comparing 1 char vs. a variable-length slice of 1-4 chars; ditto with wchar).
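The two tricks above can be sketched as a hypothetical helper (leadingRun is not a Phobos function, just an illustration of the stride-plus-ASCII-fast-path approach):

```d
import std.utf : stride;

// Count how many times the leading code point of `s` repeats at the
// front, without ever decoding it to a dchar.
size_t leadingRun(string s)
{
    if (s.length == 0) return 0;
    immutable n = stride(s, 0);   // byte length of the first code point
    size_t count = 1;
    size_t i = n;
    if (n == 1)
    {
        // ASCII fast path: each code point is one byte, so compare chars.
        while (i < s.length && s[i] == s[0]) { ++count; ++i; }
        return count;
    }
    // General case: compare fixed-size undecoded slices.
    while (i + n <= s.length && s[i .. i + n] == s[0 .. n])
    {
        ++count;
        i += n;
    }
    return count;
}

void main()
{
    assert(leadingRun("aaab") == 3);
    assert(leadingRun("ééx") == 2);  // 'é' is a 2-byte code point in UTF-8
    assert(leadingRun("") == 0);
}
```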
Comment #1 by jack — 2017-01-27T20:42:21Z
Created attachment 1635
test.d
Currently group does not auto-decode. I have attached a test case which shows a huge performance pessimization: the immutable(ubyte)[] version is almost 2x slower than the string version.
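The contents of test.d are not reproduced here, but a benchmark comparing the two variants might look roughly like this (the input data and iteration counts are made up for illustration):

```d
import std.algorithm.iteration : group;
import std.datetime.stopwatch : benchmark;
import std.range : walkLength;
import std.stdio : writeln;
import std.string : representation;

void main()
{
    // Build some repetitive ASCII input so both variants produce
    // the same runs.
    string data;
    foreach (_; 0 .. 10_000) data ~= "aabbccddeeff";

    auto results = benchmark!(
        () => data.group.walkLength,                 // "original": on string
        () => data.representation.group.walkLength,  // "new": on raw bytes
    )(100);

    writeln("original ", results[0]);
    writeln("new      ", results[1]);
}
```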
Comment #2 by uplink.coder — 2017-01-27T23:55:59Z
Jack, I do not see anything of the kind.
The performance difference is within 2%, well within the fluctuations caused by the GC.
Comment #3 by jack — 2017-01-28T01:28:04Z
(In reply to Stefan Koch from comment #2)
> Jack, I do not see anything of the kind.
> The performance difference is within 2%, well within the fluctuations
> caused by the GC.
Hmm, dmd does not show this performance problem, but ldc does:
$ dmd -O -inline -release test.d && ./test
original 5 secs, 355 ms, 103 μs, and 7 hnsecs
new 5 secs, 70 ms, 858 μs, and 6 hnsecs
$ ldc2 -O5 -release test.d && ./test
original 576 ms, 524 μs, and 6 hnsecs
new 992 ms, 676 μs, and 6 hnsecs
Odd.
Comment #4 by robert.schadek — 2024-12-01T16:19:07Z