Bug 23013 – generate optimized SIMD register assignment

Comment #0 by dkorpel — 2022-04-13T20:20:13Z

See: https://github.com/dlang/dmd/pull/13977#issuecomment-1098199644

Consider:
```
import core.simd;

double2 set0(double2 x, double* a)
{
    x[0] = *a;
    return x;
}

double2 set1(double2 x, double* a)
{
    x[1] = *a;
    return x;
}
```

GDC generates this optimized code:
```
set0:
        movlpd  xmm0, QWORD PTR [rdi]
        ret
set1:
        movhpd  xmm0, QWORD PTR [rdi]
        ret
```

But DMD -O still does a roundtrip to stack memory:
```
        assume  CS:.text.set1
        push    RBP
        mov     RBP,RSP
        sub     RSP,010h
        movapd  -010h[RBP],XMM0
        movsd   XMM1,[RDI]
        movsd   -8[RBP],XMM1
        movapd  XMM0,-010h[RBP]
        mov     RSP,RBP
        pop     RBP
        ret
```

In dmd.backend.cod1.getlvalue, vector variables are prevented from being in a register because the backend doesn't generate the correct assignment instructions yet. For example, it would use movsd for `x[0] = *a`, which clears the upper 64 bits of the XMM0 register and accidentally sets `x[1] = 0` (see issue 21673 and issue 23009). When this is fixed, SIMD code gen can be improved by allowing vector variables to be put in registers again.
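A change that lets vector variables live in registers again should probably come with a check that writing one lane leaves the other lane alone, since that is exactly what a stray movsd load would break. The following is only a sketch of such a check, not an existing test from the dmd test suite. It assumes an x86-64 target where `core.simd` vectors are available, and it reads lanes through the `.array` property so that it does not depend on vector indexing in the test itself.

```d
import core.simd;

double2 set0(double2 x, double* a) { x[0] = *a; return x; }
double2 set1(double2 x, double* a) { x[1] = *a; return x; }

void main()
{
    double2 v = [1.0, 2.0];
    double a = 5.0;

    // Writing lane 0 must keep lane 1 (a movsd load from memory would zero it).
    double2 r0 = set0(v, &a);
    assert(r0.array[0] == 5.0);
    assert(r0.array[1] == 2.0);

    // Writing lane 1 must keep lane 0.
    double2 r1 = set1(v, &a);
    assert(r1.array[0] == 1.0);
    assert(r1.array[1] == 5.0);
}
```

Compiling this with and without -O would exercise both code paths. The lane values 1.0, 2.0 and 5.0 are arbitrary, chosen only so that a zeroed lane is distinguishable from a preserved one.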

Comment #1 by robert.schadek — 2024-12-13T19:22:16Z

THIS ISSUE HAS BEEN MOVED TO GITHUB https://github.com/dlang/dmd/issues/18098 DO NOT COMMENT HERE ANYMORE, NOBODY WILL SEE IT, THIS ISSUE HAS BEEN MOVED TO GITHUB
