← Back to index | Original Bugzilla link

Bug 17485 – bringToFront and/or upperBound slow

Status: NEW
Severity: enhancement
Priority: P4
Component: phobos
Product: D
Version: D2
Platform: x86
OS: Mac OS X
Creation time: 2017-06-09T19:04:30Z
Last change time: 2024-12-01T16:30:20Z
Keywords: performance
Assigned to: No Owner
Creator: Steven Schveighoffer

Comments

Comment #0 by schveiguy — 2017-06-09T19:04:30Z

See this thread: https://forum.dlang.org/post/[email protected] Not any idea why it doesn't go fast, or his other issue with -boundscheck=off on ldc, but tinkering with the code, I couldn't get much except: 1. It's not the interaction of SortedRange with bringToFront 2. It's not an issue with using functors instead of lambdas 3. It's not an issue with the initial setup, as that should take nowhere near the 10 seconds I found it takes (consistent with his testing). 4. It *may* be an issue with the comparison, but I switched away from using ==, and didn't get any better performance. Ali's message in the thread may give a clue as to why std::rotate is so good and something we should try and dissect to copy it's speedups: https://stackoverflow.com/questions/21160875/why-is-stdrotate-so-fast

Comment #1 by hsteoh — 2017-06-09T23:30:03Z

From an initial cursory glance: bringToFrontImpl relies only on forward range primitives to do its job. Also, it allows you to transcribe ranges with different element types across each other (as long as the element type is implicitly convertible to the target element type). While this is all great for generality, it sucks for performance when you're dealing with, e.g., built-in arrays. I think there should definitely be a specialization for built-in arrays at the very least, if not for slice-assignable random-access ranges in general, when the element types are identical. Definitely should consider using memmove() in these cases to take advantage of years' worth of work C library authors have put into memmove().

Comment #2 by hsteoh — 2017-06-09T23:48:53Z

More specifically, if (1) both ranges are built-in arrays, (2) they have the same element type, and (3) the back range is a single element, we can optimize this way: If front and back are disjoint: (front.ptr > back.ptr && front.ptr + front.length < back.ptr) ------- ElementType!InputRange e1 = back[0]; back[0] = front[$-1]; memmove(front.ptr+1, front.ptr, front.length-1); front[0] = e1; ------- If back is part of front (i.e., front.ptr < back.ptr < front.ptr + front.length): -------- ElementType!InputRange e1 = back[0]; memmove(front.ptr+1, front.ptr, back.ptr - front.ptr); front[0] = e1; --------

Comment #3 by schveiguy — 2017-06-10T11:08:50Z

Indeed, memmove fixes the specific use case: https://forums.dlang.org/post/[email protected] However, I think we should do a full examination of everything C++ does to optimize rotate and include it all at once, instead of waiting for people to find more cases where std::rotate outperforms D.

Comment #4 by robert.schadek — 2024-12-01T16:30:20Z

THIS ISSUE HAS BEEN MOVED TO GITHUB https://github.com/dlang/phobos/issues/9716 DO NOT COMMENT HERE ANYMORE, NOBODY WILL SEE IT, THIS ISSUE HAS BEEN MOVED TO GITHUB