Bug 4487 – 16-byte structs require 32 bytes each when allocated individually on the heap
Status: RESOLVED
Resolution: FIXED
Severity: enhancement
Priority: P2
Component: dmd
Product: D
Version: D2
Platform: x86
OS: Windows
Creation time: 2010-07-19T12:46:00Z
Last change time: 2012-12-20T16:48:22Z
Keywords: performance
Assigned to: nobody
Creator: bearophile_hugs
Comments
Comment #0 by bearophile_hugs — 2010-07-19T12:46:34Z
(This is my first bug report with 'major' severity, because this is quite an important bug.)
This follows a report by Steven Schveighoffer. This program allocates a linked list of 10 million structs on the heap (the number is set high only to make the measurements more reliable).
The presence of GC.disable() doesn't change the total memory allocated, but it reduces the run time considerably. On 32-bit Windows, once the list has been allocated, the program has allocated about 326 MB, which means:
326_200_000 bytes / 10_000_000 ~= 32.62 bytes each Foo
This is not acceptable in a serious "system language" (also because 16-byte structs are quite common in my 32-bit code).
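import core.memory: GC;

struct Foo {
    Foo* next;
    ubyte[12] arr;
    this(Foo* ptr) { this.next = ptr; }
}
static assert(Foo.sizeof == 16); // 4-byte pointer + 12 bytes: holds on 32-bit targets

void main() {
    GC.disable(); // no collections while the list is built
    enum n = 10_000_000;
    Foo* lh; // list head
    foreach (i; 0 .. n)
        lh = new Foo(lh); // one separate heap allocation per node
    GC.enable();
}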
Maybe this bug can be fixed by introducing a specific allocator function for single structs, one that doesn't treat them as arrays of length 1 (which need 1 byte of padding to hold the information used for appends).
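To make the arithmetic behind the 32-byte figure explicit, here is a minimal sketch of how 1 byte of padding can double the cost of a 16-byte struct. It assumes the allocator rounds small requests up to power-of-two size classes starting at 16 bytes; that is an assumption about the GC of this era, not something stated in this report, and `binSize` is a hypothetical helper, not a druntime function.

```d
import std.stdio : writeln;

// Hypothetical helper. Assumption: small allocations are rounded up to
// power-of-two size classes, with 16 bytes as the smallest class.
size_t binSize(size_t request) {
    size_t bin = 16;
    while (bin < request)
        bin *= 2;
    return bin;
}

void main() {
    enum pad = 1; // the extra byte reserved for array append information
    writeln(binSize(16));       // prints 16: the struct alone fits the smallest class
    writeln(binSize(16 + pad)); // prints 32: one byte of padding spills into the next class
}
```

Under that assumption, 10 million nodes at 32 bytes each come to about 320 MB, which is close to the 326 MB measured above.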
Comment #1 by schveiguy — 2010-07-19T13:10:48Z
DMD is the main culprit here, not druntime. And this is not a bug, it's an enhancement. DMD functions exactly as designed.
DMD is the one generating the code to call the arrayNew function with length 1. Druntime cannot tell the difference between someone actually allocating an array of length 1 and someone allocating a single struct.
With the "Appendable" bit I just added for druntime, this could be alleviated if the compiler would call a separate function for struct allocators.
As a workaround, you can pre-allocate a large block of nodes, which will only have one byte of pad per block allocated.
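The workaround can be sketched roughly as follows. `FooPool`, `chunkSize` and `listLength` are hypothetical names chosen for illustration, and the chunk size is arbitrary; this is one possible shape for the idea, not code from druntime or from this report.

```d
struct Foo {
    Foo* next;
    ubyte[12] arr;
    this(Foo* ptr) { this.next = ptr; }
}

// Hypothetical helper: hands out nodes carved from large GC-allocated
// arrays, so any per-block padding is paid once per chunk, not once per node.
struct FooPool {
    enum chunkSize = 4096; // nodes per chunk; arbitrary choice
    Foo[] chunk;           // chunk currently being carved up
    size_t used;           // number of nodes already handed out from it

    Foo* make(Foo* next) {
        if (used == chunk.length) {
            chunk = new Foo[](chunkSize); // one allocation for many nodes
            used = 0;
        }
        Foo* p = &chunk[used++];
        p.next = next;
        return p;
    }
}

// Walks the list and counts its nodes.
size_t listLength(const(Foo)* head) {
    size_t n;
    for (; head !is null; head = head.next)
        ++n;
    return n;
}

void main() {
    enum n = 1_000_000; // smaller than the report's 10 million to keep the run short
    FooPool pool;
    Foo* lh;
    foreach (i; 0 .. n)
        lh = pool.make(lh);
    assert(listLength(lh) == n);
}
```

The trade-off is that a whole chunk stays alive for as long as any node inside it is reachable, so this suits lists that are built and discarded as a unit better than lists whose nodes are freed one at a time.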
Comment #2 by andrej.mitrovich — 2012-12-20T14:14:57Z
I can only see around 160MB used now. bear, please verify and close if it's fixed, thanks.
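One way to check a per-node figure like this directly, rather than inferring it from the total memory of the process, is to ask the GC how large the block behind a single allocation is. The sketch below uses `GC.sizeOf` from `core.memory`; `blockSizeOfSingleFoo` is a hypothetical helper name. The value printed depends on the compiler and druntime version and on the target, and on 64-bit targets `Foo.sizeof` is 24 rather than 16, so the original static assert is left out here.

```d
import core.memory : GC;
import std.stdio : writefln;

struct Foo {
    Foo* next;
    ubyte[12] arr;
    this(Foo* ptr) { this.next = ptr; }
}

// Hypothetical helper: size in bytes of the GC block backing one
// individually allocated Foo.
size_t blockSizeOfSingleFoo() {
    auto p = new Foo(null);
    return GC.sizeOf(p);
}

void main() {
    writefln("Foo.sizeof = %s, GC block size = %s",
             Foo.sizeof, blockSizeOfSingleFoo());
}
```

If the block size equals `Foo.sizeof` on a 32-bit build, single structs no longer pay for the padding byte.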
Comment #3 by bearophile_hugs — 2012-12-20T16:48:22Z
(In reply to comment #2)
> I can only see around 160MB used now, bear please verify and close if it's
> fixed, thanks.
It was fixed some time ago. Closed.