Bug 19617 – [REG 2.085a] Much slower GC compared to 2.084

Status
RESOLVED
Resolution
INVALID
Severity
regression
Priority
P1
Component
druntime
Product
D
Version
D2
Platform
All
OS
All
Creation time
2019-01-26T11:03:57Z
Last change time
2019-03-13T09:18:55Z
Assigned to
No Owner
Creator
Basile-z

Comments

Comment #0 by b2.temp — 2019-01-26T11:03:57Z
I wanted to test something and took this code from the dlang tour:

---
// Hey come on, just get the whole army!
import std.algorithm : canFind, map, filter, sort, uniq, joiner, chunkBy, splitter;
import std.array : array, empty;
import std.range : zip;
import std.stdio : writeln;
import std.string : format;
import std.datetime.stopwatch;

void main()
{
    StopWatch sw;
    sw.start;
    string text = q{This tour will give you an overview of this powerful and expressive systems programming language which compiles directly to efficient, *native* machine code.};

    alias pred = c => canFind(" ,.\n", c);
    auto words = text.splitter!pred
        .filter!(a => !a.empty);

    auto wordCharCounts = words
        .map!"a.count";

    zip(wordCharCounts, words)
        .array()
        .sort()
        .uniq()
        .chunkBy!(a => a[0])
        .map!(chunk => format("%d -> %s",
            chunk[0],
            chunk[1]
                .map!(a => a[1])
                .joiner(", ")))
        .joiner("\n")
        .writeln();
    writeln(sw.peek);
}
---

- dmd args: -O -release -inline -boundscheck=off
- exe generated with DMD 2.084 takes on average 110 µs
- the one with DMD ~master (aa0c2062499419cc933f9bbf94cf88ec3244e2f9) takes on average 145 µs.

Note that the same difference is observed without any DMD arg at all.
Comment #1 by b2.temp — 2019-01-27T09:22:30Z
So it's likely the GC that's slower, not the generated code. If you add a GC.disable; before starting the stopwatch, the timings are similar using 2.084 and ~master.
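
The check described above could be sketched roughly as follows. This is only an illustration: `timeWithGcDisabled` is a hypothetical helper and the array-appending workload is an assumption standing in for the tour code. It suppresses automatic collections around the timed region only.

```d
import core.memory : GC;
import core.time : Duration;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writeln;

// Hypothetical helper: times `work` with automatic GC collections suppressed.
Duration timeWithGcDisabled(scope void delegate() work)
{
    GC.disable();             // the runtime may still collect if it runs out of memory
    scope (exit) GC.enable(); // re-enable even if `work` throws

    auto sw = StopWatch(AutoStart.yes);
    work();
    return sw.peek;
}

void main()
{
    // Illustrative allocation-heavy workload (an assumption, not the tour code).
    auto elapsed = timeWithGcDisabled({
        int[] data;
        foreach (i; 0 .. 100_000)
            data ~= i;
    });
    writeln(elapsed);
}
```

If the timings from two compiler releases converge with this wrapper but diverge without it, the difference is more likely in collection behaviour than in code generation.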
Comment #2 by r.sagitario — 2019-01-28T22:45:09Z
There is little difference on my system between 2.083 and git HEAD, maybe 110 µs and 120 µs, respectively. These numbers are way too small for a benchmark, as every unrelated detail can make a big difference. If I make the text longer so that the time is about 1 ms (still very small), the absolute difference is about the same, if at all noticeable in the noise. The most likely related change in the GC is that intermediate allocation sizes are now supported, too (these were powers of 2 only so far). This can lead to less memory being used for the same allocations (up to about 30%), but it also changes when collections are run. Appending to arrays might also reallocate a bit more often.
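
One way to get past single-run noise at this scale is to repeat the workload many times and look at the total and the mean. A minimal sketch using `benchmark` from `std.datetime.stopwatch`; the `workload` function and the run count are assumptions chosen for illustration, not the tour code.

```d
import std.datetime.stopwatch : benchmark;
import std.stdio : writeln;

// Illustrative allocation-heavy workload (an assumption, not the tour code).
void workload()
{
    int[] data;
    foreach (i; 0 .. 1_000)
        data ~= i;
}

void main()
{
    enum runs = 10_000;
    // benchmark returns one total Duration per function passed in.
    auto totals = benchmark!workload(runs);
    writeln("total: ", totals[0], ", mean per run: ", totals[0] / runs);
}
```

Because the runs share one heap, the total also includes whatever collections are triggered along the way, which is the effect under discussion here.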
Comment #3 by b2.temp — 2019-01-31T11:31:04Z
I didn't want to blame you or discredit your recent work on the GC, Rainer. I'll try a bigger benchmark and close the issue if it appears that I have overreacted to a micro-benchmark.
Comment #4 by r.sagitario — 2019-02-01T07:31:14Z
No problem. I also noticed a couple of changes in the druntime benchmarks, some for the better, some for the worse. Those mostly boiled down to how many garbage collections are run, which changes if memory is used differently.
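
To compare how many collections a workload triggers under two runtimes, something like the following sketch might help. It assumes a druntime recent enough to provide `GC.profileStats` (the releases discussed in this thread may not have it); `collectionsDuring` is a hypothetical helper and the allocation loop is only an illustration.

```d
import core.memory : GC;
import std.stdio : writeln;

// Hypothetical helper: number of collections the GC ran while `work` executed.
size_t collectionsDuring(scope void delegate() work)
{
    immutable before = GC.profileStats().numCollections;
    work();
    return GC.profileStats().numCollections - before;
}

void main()
{
    // Illustrative workload: many short-lived 1 KiB buffers (an assumption).
    auto n = collectionsDuring({
        foreach (i; 0 .. 100_000)
        {
            auto buf = new ubyte[](1024);
            buf[0] = 1;
        }
    });
    writeln("collections: ", n);
}
```

A different count for the same workload across two releases would point at a change in when collections are run rather than in how long each one takes.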