Comment #0 by bearophile_hugs — 2013-08-15T11:33:49Z
Currently the D GC allocates arrays aligned to 16 bytes, which suits them for use with XMM registers:
auto a1 = new double2[128];
But I think the D GC should also return this a2 aligned to 32 bytes, as needed for efficient code that uses YMM registers, which are 256 bits wide:
auto a2 = new double4[64];
Eventually the D GC should return this a3 aligned to 64 bytes, for efficient code that uses ZMM registers (Intel Xeon Phi), which are 512 bits wide:
auto a3 = new double8[32];
Comment #1 by turkeyman — 2013-08-15T20:48:42Z
Yes, double4 should intrinsically be align(32), just like float4/double2 are intrinsically align(16). Likewise, align(64) for ZMM regs.
The GC should respect the explicit alignment of any type. If it doesn't, then that is another bug.
Comment #2 by turkeyman — 2013-08-15T20:51:55Z
For clarity, as a simple compiler rule, all __vector() types should be intrinsically aligned to their .sizeof.
This is correct on all architectures I know of.
There is the occasional architecture that might not mind a smaller alignment, but I think it's still valuable to enforce the alignment on those (rare) platforms for portability (structure consistency across platforms), especially since those platforms are often tested less thoroughly.
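A minimal sketch for checking this rule on a given compiler and target. The helper names alignedToSize and report are hypothetical, introduced only for illustration. The sketch prints what the compiler reports rather than asserting it, since the result may differ between compilers and targets:

```d
import core.simd;
import std.stdio : writefln;

// Hypothetical helper: true when a type's alignment equals its size.
enum bool alignedToSize(T) = T.alignof == T.sizeof;

// Hypothetical helper: prints the size and alignment the compiler reports.
void report(T)()
{
    writefln("%s: sizeof=%s alignof=%s alignedToSize=%s",
             T.stringof, T.sizeof, T.alignof, alignedToSize!T);
}

void main()
{
    // core.simd only declares the vector types the target supports,
    // so each use is guarded with static if.
    static if (is(float4))  report!float4();
    static if (is(double2)) report!double2();
    static if (is(double4)) report!double4();
}
```

The 256-bit types are typically only declared when the compiler targets AVX (for example via a CPU selection flag), so the double4 line may print nothing on a default build.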
Comment #3 by schveiguy — 2020-08-09T12:45:34Z
This came up again: https://forum.dlang.org/post/[email protected]
I have an idea that might work: when allocating an array of items with alignment greater than 16 bytes, just offset the first element when calculating the size.
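The same over-allocate-and-offset idea can be sketched at user level, outside the runtime. This is only an illustration under stated assumptions, not the change proposed for druntime: Block and allocAligned are hypothetical names, and the padding is computed by hand on top of core.memory's GC.calloc.

```d
import core.memory : GC;
import std.stdio : writeln;

// Hypothetical over-aligned element type used for illustration.
align(32) struct Block { double[4] values; }

// Hypothetical helper: over-allocate, then start the slice at the first
// address inside the block that satisfies T.alignof.
T[] allocAligned(T)(size_t n)
{
    enum size_t a = T.alignof;                    // alignment is a power of two
    void* raw = GC.calloc(n * T.sizeof + a - 1);  // zeroed block with padding
    size_t addr = (cast(size_t) raw + a - 1) & ~(a - 1); // round up to a
    return (cast(T*) addr)[0 .. n];
}

void main()
{
    auto blocks = allocAligned!Block(64);
    assert(cast(size_t) blocks.ptr % Block.alignof == 0);
    writeln("allocated ", blocks.length, " aligned elements");
}
```

Two caveats with this sketch: the elements are zero-filled rather than set to T.init, and appending to the returned slice goes back through the normal GC array path, so the alignment guarantee only holds for the original slice.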
Comment #4 by dlang-bot — 2020-08-10T02:43:33Z
@schveiguy created dlang/druntime pull request #3192 "fix issue 10826 -- make sure large arrays obey 32-byte or greater alignment" fixing this issue:
- fix issue 10826 -- make sure 32-byte aligned types (such as
__vector(ubyte[32]) ) are aligned to 32-bytes when put into large
arrays.
https://github.com/dlang/druntime/pull/3192
Comment #5 by kinke — 2022-10-28T19:04:44Z
Raising the importance to critical. Greater-than-natural alignments are respected by LDC pretty much everywhere AFAIK (stack and globals), except for druntime's GC, and they are used for optimizations. It's pretty embarrassing that people cannot safely GC-allocate arrays of vectors > 128 bit without potentially hitting segfaults (incl. @safe code obviously).
And it's obviously not limited to vectors or arrays, but applies to all GC allocations of types with alignment > 16.
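A small sketch for observing this with a non-vector type, assuming a hypothetical struct Block declared with align(32). It prints the remainder of each GC-allocated address modulo the required alignment. A non-zero value means the allocation is under-aligned, and the result may vary from run to run:

```d
import std.stdio : writeln;

// Hypothetical over-aligned type; no SIMD involved.
align(32) struct Block { double[4] values; }
static assert(Block.alignof == 32);

void main()
{
    auto single = new Block;     // single GC allocation
    auto many = new Block[64];   // GC-allocated array

    // 0 means the address satisfies the alignment the type asks for.
    writeln("single   % 32 = ", cast(size_t) single % Block.alignof);
    writeln("many.ptr % 32 = ", cast(size_t) many.ptr % Block.alignof);
}
```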
Comment #6 by robert.schadek — 2024-12-07T13:32:53Z