Bug 17294 – Incorrect -profile=gc data

Status
NEW
Severity
normal
Priority
P3
Component
druntime
Product
D
Version
D2
Platform
All
OS
All
Creation time
2017-04-03T16:47:48Z
Last change time
2024-12-07T13:37:19Z
Assigned to
No Owner
Creator
Mihails Strasuns
See also
https://issues.dlang.org/show_bug.cgi?id=15481, https://issues.dlang.org/show_bug.cgi?id=16280
Moved to GitHub: dmd#17158

Comments

Comment #0 by mihails.strasuns.contractor — 2017-04-03T16:47:48Z
The existing implementation of -profile=gc is somewhat naive in the sense that it assumes every relevant function call results in one direct, immediate allocation of exactly the amount of data requested. Its output can differ a lot from real GC stats. A simple example:

====
void main ( )
{
    void[] buffer;
    buffer.length = 20;
    buffer.length = 60;
    buffer.length = 10;
    buffer ~= "abcd".dup;
}
====

The currently reported trace looks like this:

    60    1    void[]    D main    ./sample.d:7
    20    1    void[]    D main    ./sample.d:6
    10    1    void[]    D main    ./sample.d:8
     4    1    void[]    D main    ./sample.d:9

This is wrong for a variety of reasons:

1) The runtime will allocate more data than was requested (32 and 64 bytes for the first two length assignments).
2) The third length assignment shrinks the array and thus will not result in any allocation, despite being reported in the log.
3) The last append will re-allocate the array and will thus allocate more than just the 4 bytes for "abcd".

There are other similar issues, which all come from the fact that `-profile=gc` does not actually track real GC allocations. One idea for fixing that without major changes to the runtime API is to rely on `synchronized` + `GC.stats`:

```
extern (C) void[] _d_arraysetlengthTTrace(string file, int line,
    string funcname, const TypeInfo ti, size_t newlength, void[]* p)
{
    import core.memory;

    // Serialize all traced calls so that the difference between the two
    // GC.stats() snapshots can only come from this call.
    synchronized (global_rt_lock)
    {
        auto oldstats = GC.stats();
        auto result = _d_arraysetlengthT(ti, newlength, p);
        auto newstats = GC.stats();

        // Record only real heap growth; shrinking or in-place resizing
        // is not reported as an allocation.
        if (newstats.usedSize > oldstats.usedSize)
        {
            accumulate(file, line, funcname, ti.toString(),
                newstats.usedSize - oldstats.usedSize);
        }

        return result;
    }
}
```

This gives perfect precision of reported allocations, but this simple solution comes at the cost of considerably changing the scheduling of multi-threaded programs built with `-profile=gc`. I would be interested to hear if there are any other ideas to fix the problem.
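The mismatch described in comment #0 can be observed from user code without touching druntime. The following is only a minimal sketch, not part of the proposed patch: the helper name `usedDelta` is made up for illustration, the program is assumed to be single-threaded (so no lock is needed), and it relies on `GC.stats().usedSize` being updated immediately, which the `core.memory` documentation does not strictly guarantee.

```d
import core.memory : GC;
import std.stdio : writefln;

/// Hypothetical helper: runs `op` and returns by how many bytes
/// GC.stats().usedSize grew (0 if it did not grow).
size_t usedDelta(scope void delegate() op)
{
    const before = GC.stats().usedSize;
    op();
    const after = GC.stats().usedSize;
    return after > before ? after - before : 0;
}

void main()
{
    void[] buffer;

    // Same operations as in the sample from comment #0.
    writefln("length = 20 : +%s bytes", usedDelta({ buffer.length = 20; }));
    writefln("length = 60 : +%s bytes", usedDelta({ buffer.length = 60; }));
    writefln("length = 10 : +%s bytes", usedDelta({ buffer.length = 10; }));
    writefln("~= \"abcd\"   : +%s bytes", usedDelta({ buffer ~= "abcd".dup; }));
}
```

Comparing the printed deltas with the `-profile=gc` log for the same program shows where the reported sizes and the real heap growth diverge. The exact numbers depend on the GC implementation and druntime version.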
Comment #1 by mihails.strasuns.contractor — 2017-04-06T11:36:46Z
Comment #2 by mihails.strasuns.contractor — 2017-04-06T11:37:45Z
Comment #3 by leandro.lucarella — 2017-04-20T10:12:54Z
I have the feeling that knowing what was requested (and not only what was actually reserved) could also be useful, although I agree that if we need to pick between the two, the real allocation makes more sense. So maybe a mode that tracks the requested sizes instead of the really allocated memory could be useful. Also, when shrinking the memory it makes little sense to report it as an allocation, but that would be extremely hard to track (or would add a lot of overhead) in the "requested size" mode.
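One way to picture the suggestion in comment #3 is a per-call-site record that keeps both numbers side by side. This is a rough sketch only: `SiteStats`, `table` and `record` are hypothetical names, not druntime's actual profiling structures, and the byte counts in `main` are the ones quoted in comment #0 (20 requested / 32 reserved, 60 requested / 64 reserved, and a shrink to 10 that reserves nothing).

```d
import std.stdio : writefln;

/// Hypothetical per-call-site record; not druntime's real profile entry.
struct SiteStats
{
    ulong count;     // number of traced calls at this site
    ulong requested; // bytes the caller asked for
    ulong reserved;  // bytes by which the GC heap actually grew
}

/// Hypothetical accumulation table keyed by "file:line" (thread-local here).
SiteStats[string] table;

/// Adds one traced call to the record for `site`.
void record(string site, size_t requestedBytes, size_t reservedBytes)
{
    auto entry = site in table;
    if (entry is null)
    {
        table[site] = SiteStats.init;
        entry = site in table;
    }
    entry.count += 1;
    entry.requested += requestedBytes;
    entry.reserved += reservedBytes;
}

void main()
{
    // Figures taken from comment #0.
    record("./sample.d:6", 20, 32);
    record("./sample.d:7", 60, 64);
    record("./sample.d:8", 10, 0); // shrink: requested size known, nothing reserved

    foreach (site, stats; table)
        writefln("%s: calls=%s requested=%s reserved=%s",
            site, stats.count, stats.requested, stats.reserved);
}
```

Whether a shrink should count towards `requested` at all is exactly the open question raised above. The sketch simply stores whatever the caller passes in.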
Comment #4 by robert.schadek — 2024-12-07T13:37:19Z
THIS ISSUE HAS BEEN MOVED TO GITHUB: https://github.com/dlang/dmd/issues/17158. DO NOT COMMENT HERE ANYMORE, NOBODY WILL SEE IT.