Bug 17854 – Suboptimal code generated with constants and SSE

Status
NEW
Severity
enhancement
Priority
P4
Component
dmd
Product
D
Version
D2
Platform
x86_64
OS
All
Creation time
2017-09-24T04:54:39Z
Last change time
2024-12-13T18:54:40Z
Keywords
performance, SIMD
Assigned to
No Owner
Creator
basile-z
Moved to GitHub: dmd#19321

Comments

Comment #0 by b2.temp — 2017-09-24T04:54:39Z
In both cases, compiled with -O -release.

A. FP constant
==============

int test(float i)
{
    return cast(int)(2.0 * i * i);
}

disassembles to:

000000000045B640h  push rbp
000000000045B641h  mov rbp, rsp
000000000045B644h  sub rsp, 10h
000000000045B648h  movss xmm3, xmm0
000000000045B64Ch  cvtss2sd xmm1, xmm0
000000000045B650h  mov rax, 4000000000000000h
000000000045B65Ah  mov qword ptr [rbp-10h], rax
000000000045B65Eh  movsd xmm2, qword ptr [rbp-10h]
000000000045B663h  mulsd xmm1, xmm2
000000000045B667h  cvtss2sd xmm4, xmm0
000000000045B66Bh  mulsd xmm1, xmm4
000000000045B66Fh  cvttsd2si eax, xmm1
000000000045B673h  mov rsp, rbp
000000000045B676h  pop rbp
000000000045B677h  ret

B. Integer constant
===================

int test(float i)
{
    return cast(int)(2 * i * i);
}

disassembles to:

000000000045B640h  push rbp
000000000045B641h  mov rbp, rsp
000000000045B644h  sub rsp, 10h
000000000045B648h  movss xmm2, xmm0
000000000045B64Ch  mov eax, 40000000h
000000000045B651h  mov dword ptr [rbp-10h], eax
000000000045B654h  movss xmm1, dword ptr [rbp-10h]
000000000045B659h  mulss xmm0, xmm1
000000000045B65Dh  mulss xmm0, xmm2
000000000045B661h  cvttss2si eax, xmm0
000000000045B665h  mov rsp, rbp
000000000045B668h  pop rbp
000000000045B669h  ret

Case A could clearly be compiled like case B, since the fractional part of the constant is zero.
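For readers comparing the two listings, the sketch below may help show where the difference comes from at the source level. It assumes only D's usual arithmetic conversions: `2.0` is a `double` literal, so the products in case A are typed as `double` (matching the `cvtss2sd`/`mulsd` sequence), while `2` is converted to `float` in case B (matching `mulss`). The names `testA`, `testB` and `testC` are hypothetical, and `testC`, with a `float` literal, is an extra variant that does not appear in the report.

```d
import std.stdio : writeln;

// Case A from the report: 2.0 is a double literal, so both products are double.
int testA(float i) { return cast(int)(2.0 * i * i); }

// Case B from the report: 2 is an int literal, converted to float.
int testB(float i) { return cast(int)(2 * i * i); }

// Hypothetical variant (not in the report): an explicit float literal.
int testC(float i) { return cast(int)(2.0f * i * i); }

void main()
{
    // The static types of the products explain the mulsd vs. mulss difference.
    static assert(is(typeof(2.0 * 1.0f) == double));
    static assert(is(typeof(2 * 1.0f) == float));
    static assert(is(typeof(2.0f * 1.0f) == float));

    writeln(testA(3.0f), " ", testB(3.0f), " ", testC(3.0f)); // prints "18 18 18"
}
```

If the single-precision sequence is what is wanted, writing the constant as `2.0f` or `2` in the source is probably the simplest way to get it, independent of what the optimizer does.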
Comment #1 by b2.temp — 2017-09-24T04:59:25Z
The way the multiplication is done is terrible anyway. LDC generates:

cvtss2sd xmm0, xmm0
movapd xmm1, xmm0
addsd xmm1, xmm1
mulsd xmm1, xmm0
cvttsd2si eax, xmm1
ret
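The LDC output quoted above widens the argument once and replaces the multiplication by 2.0 with an addition (`addsd xmm1, xmm1`). A rough source-level sketch of that shape is given below. The function names `testAdd` and `testMul` are hypothetical, and the mapping of statements to instructions in the comments is an assumption about what an optimizer could emit, not a guarantee.

```d
import std.stdio : writeln;

// Mirrors the shape of the quoted LDC output: widen once, double by adding, multiply.
int testAdd(float i)
{
    double d = i;                 // corresponds to cvtss2sd
    double twice = d + d;         // corresponds to addsd (same value as 2.0 * d)
    return cast(int)(twice * d);  // corresponds to mulsd, then cvttsd2si
}

// The original expression from the report, for comparison.
int testMul(float i) { return cast(int)(2.0 * i * i); }

void main()
{
    // Doubling by addition gives the same value as multiplying by 2.0.
    foreach (x; [0.0f, 0.5f, 1.5f, 3.0f, -7.25f, 100.0f])
        assert(testAdd(x) == testMul(x));

    writeln(testAdd(1.5f)); // prints 4 (4.5 truncated toward zero)
}
```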
Comment #2 by robert.schadek — 2024-12-13T18:54:40Z
THIS ISSUE HAS BEEN MOVED TO GITHUB https://github.com/dlang/dmd/issues/19321 DO NOT COMMENT HERE ANYMORE, NOBODY WILL SEE IT, THIS ISSUE HAS BEEN MOVED TO GITHUB