The trouble comes about with the following:
struct S { int x; double d; }
S foo(int x, double d) { return S(x, d); }
The SROA optimization replaces the temp struct created by S(x, d) with two variables. The two variables get combined using OPpair for the return value in registers RAX and XMM0.
cdpair() can't handle this, so the code generator asks for register pair AX,DX, intending to fix up the result later in the BCretexp section of outblkexitcode(). cdpair() obligingly asks loaddata() to load `d` into DX.
This winds up calling loadea() with MOV opcode 0x8B, which works for GP registers only. Meanwhile, d is passed in XMM0. getlvalue() sets the EA bits for XMM0, but because the opcode is still 0x8B, those bits are read as a GP register and the following code gets generated:
8B D0 mov RDX,RAX
instead of:
mov RDX,XMM0
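To see why the bytes come out as a GPR-to-GPR move, it may help to decode the ModRM byte D0 by hand. The sketch below is only an illustration and not code from the backend; `ModRM`, `decodeModRM` and `gprNames` are hypothetical names made up for this example. It shows that the r/m field holds register number 0, which opcode 0x8B reads as RAX even though the EA was set up for XMM0.

```d
// Hypothetical helper, not part of the dmd backend:
// the three fields of an x86 ModRM byte.
struct ModRM
{
    ubyte mod; // addressing mode; 3 means register-direct
    ubyte reg; // register operand (the destination for opcode 0x8B)
    ubyte rm;  // register-or-memory operand (the source for opcode 0x8B)
}

// Split a ModRM byte into mod (bits 7..6), reg (bits 5..3) and rm (bits 2..0).
ModRM decodeModRM(ubyte b)
{
    return ModRM(cast(ubyte)(b >> 6),
                 cast(ubyte)((b >> 3) & 7),
                 cast(ubyte)(b & 7));
}

// Names of GP registers 0..7 as opcode 0x8B interprets the register numbers.
immutable string[8] gprNames =
    ["RAX", "RCX", "RDX", "RBX", "RSP", "RBP", "RSI", "RDI"];

// 0xD0 = 11 010 000 in binary.
enum m = decodeModRM(0xD0);
static assert(m.mod == 3);              // register-direct operand
static assert(gprNames[m.reg] == "RDX"); // destination, as requested
static assert(gprNames[m.rm] == "RAX");  // register 0: XMM0 was meant, RAX is what 0x8B sees
```

The register number alone does not say which register file it refers to; the opcode decides that. So the EA bits for XMM0 are correct as far as they go, and the mismatch comes entirely from pairing them with a GP-only opcode.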
A proper fix is to have outblkexitcode() handle the mTYxmmgpr and mTYgprxmm kludge cases directly instead of using OPpair.
In the meantime, I'm disabling SROA when it would result in a mixed XMM/GPR pair.
Comment #1 by robert.schadek — 2024-12-13T19:18:53Z