Bug 23048 – [SIMD][CODEGEN] Inline XMM.LODUPD leads to wrong SIMD content

Status
RESOLVED
Resolution
INVALID
Severity
critical
Priority
P1
Component
dmd
Product
D
Version
D2
Platform
x86_64
OS
All
Creation time
2022-04-23T14:07:59Z
Last change time
2022-04-24T08:51:29Z
Keywords
backend, SIMD, wrong-code
Assigned to
No Owner
Creator
ponce

Attachments

ID    Filename  Summary      Content-Type  Size
1848  main.d    main source  text/plain    668

Comments

Comment #0 by aliloko — 2022-04-23T14:07:59Z
Created attachment 1848: main source

Using DMD v2.100.0-beta.1-dirty, consider the following program:

----------- main.d -------------
import core.stdc.stdio;
import core.simd;

double2 _mm_loadr_pd (const(double)* mem_addr)
{
    double2 a = *cast(double2*)(mem_addr);
    double2 r;
    r.ptr[0] = a.array[1];
    r.ptr[1] = a.array[0];
    return r;
}

unittest
{
    align(16) double[2] A = [56.0, -74.0];
    double2 R = _mm_loadr_pd(A.ptr);
}

double2 _mm_loadu_pd (const(double)* mem_addr)
{
    return cast(double2) __simd(XMM.LODUPD, *mem_addr);
}
unittest
{
    double[2] A = [56.0, -75.0];
    double2 R = _mm_loadu_pd(A.ptr);
    printf("%f %f\n", R[0], R[1]);
    double[2] correct = [56.0, -75.0];
    assert(R.array == correct);
}

void main()
{
}
--------------------------------

To reproduce:

$ dmd -m64 -inline -O main.d -unittest
$ main.exe

This outputs:

56.000000 -74.000000
main.d(29): [unittest] unittest failure
1/1 modules FAILED unittests

instead of the expected:

56.000000 -75.000000
1 modules passed unittests

Notes:
- -O, -inline, and -unittest are all necessary.
- _mm_loadu_pd is inlined into the unittest.
- The first unittest is necessary; what seems to happen is that an earlier variable or register is reused.
Comment #1 by bugzilla — 2022-04-24T04:54:30Z
A smaller test with -O -unittest:

import core.simd;

unittest
{
    align(16) double[2] A = [56.0, -74.0];
}

unittest
{
    double[2] A = [56.0, -75.0];
    double2 R = cast(double2) __simd(XMM.LODUPD, *A.ptr);
    assert(R.array == A);
}

void main()
{
}
Comment #2 by bugzilla — 2022-04-24T06:28:08Z
The problem is with the lines:

    double[2] A = [56.0, -75.0];
    double2 R = cast(double2) __simd(XMM.LODUPD, *A.ptr);

LODUPD (actually MOVUPD) reads two doubles. The code passes it a double lvalue. The optimizer replaces the double with a reference to 56.0. The second double the LODUPD reads is whatever is after the 56.0.

This problem can be fixed with a cast to double2 so the optimizer knows it's a 16 byte operation:

    double2 R = cast(double2) __simd(XMM.LODUPD, *cast(double2*)A.ptr);

I'm not really sure what to do about this as __simd does not do type checking on its arguments, which is why it's @system code. I'll leave it open for now.
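For reference, here is a minimal sketch of how the cast suggested above might be applied to the reporter's wrapper function. The name loadUnaligned2 is hypothetical, chosen only for illustration. The sketch assumes DMD on x86_64, where core.simd's __simd intrinsic is available, and it has not been checked against every optimization flag combination mentioned in this report.

```d
import core.simd;

// Hypothetical wrapper name; mirrors the reporter's _mm_loadu_pd.
// Marked @system because the pointer cast is unchecked.
double2 loadUnaligned2(const(double)* mem_addr) @system
{
    // Cast the pointer to double2* before dereferencing, so the operand
    // handed to __simd is a 16-byte vector rather than a single double.
    return cast(double2) __simd(XMM.LODUPD, *cast(double2*) mem_addr);
}

unittest
{
    double[2] A = [56.0, -75.0];
    double2 R = loadUnaligned2(A.ptr);
    // Both lanes should now come from A, per the explanation above.
    assert(R.array == A);
}

void main()
{
}
```

The caller must still guarantee that mem_addr points to at least two readable doubles, since __simd does not check its arguments.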
Comment #3 by bugzilla — 2022-04-24T07:54:56Z
Comment #4 by bugzilla — 2022-04-24T07:57:54Z
Decided to add this as an example in the documentation, so it can be closed.
Comment #5 by aliloko — 2022-04-24T08:51:29Z
Fine for me, I wasn't sure about my code anyway.