Bug 19663 – On x86_64 the fabs intrinsic should use SSE

Status
RESOLVED
Resolution
FIXED
Severity
enhancement
Priority
P1
Component
dmd
Product
D
Version
D2
Platform
All
OS
All
Creation time
2019-02-09T15:00:57Z
Last change time
2020-07-24T11:12:41Z
Keywords
performance, pull
Assigned to
No Owner
Creator
Basile-z

Comments

Comment #0 by b2.temp — 2019-02-09T15:00:57Z
Currently on x86_64 the dmd backend implements fabs with the x87 FPU instruction of the same name, FABS. But since `float` and `double` parameters are passed in SSE registers, as defined by the ABI, they have to travel from these SSE registers to GP registers and only then to FPU registers. Depending on what is done with the resulting absolute value, it then goes back to a GP register (all of this to clear one bit!), and back again to an SSE register if the function has to return the value, etc.
It would be wiser to use an SSE logical AND with a mask. This would be done only for the `float` and `double` types. Several options exist:
1. generate the mask and ANDPS/ANDPD
2. ANDPS/ANDPD on a constant mask (LDC2 does that, btw)
3. left shift and right shift by one
Forum discussion: https://forum.dlang.org/post/[email protected]
Reference for the possible solutions: https://stackoverflow.com/questions/32408665/fastest-way-to-compute-absolute-value-using-sse
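For illustration, the bit operation behind the mask-based options can be sketched in portable scalar C. This is only a sketch under stated assumptions: the helper names `fabs_mask_f64` and `fabs_mask_f32` are hypothetical, the code assumes IEEE 754 binary64/binary32 layouts, and it is not the code dmd or LDC2 actually emits. It shows the same AND with a constant mask that ANDPD/ANDPS would apply to the value while it stays in an SSE register.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>
#include <string.h>

/* Assumption: IEEE 754 layouts, sign bit is the most significant bit. */
static_assert(sizeof(double) == sizeof(uint64_t), "expects a 64-bit double");
static_assert(sizeof(float) == sizeof(uint32_t), "expects a 32-bit float");

/* Hypothetical helper: absolute value of a double by clearing bit 63.
   The constant is the mask an ANDPD-based sequence would use. */
double fabs_mask_f64(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);          /* reinterpret the bits safely */
    bits &= UINT64_C(0x7FFFFFFFFFFFFFFF);    /* clear only the sign bit */
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Hypothetical helper: same idea for float, clearing bit 31 (ANDPS mask). */
float fabs_mask_f32(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    bits &= UINT32_C(0x7FFFFFFF);
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

Because only the sign bit changes, the exponent and mantissa are untouched, so no round trip through GP or FPU registers is needed when the AND is done directly on the SSE register.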
Comment #1 by bugzilla — 2020-07-21T07:59:54Z
Comment #2 by b2.temp — 2020-07-21T08:41:38Z
BTW option 3 doesn't work.
Comment #3 by dlang-bot — 2020-07-24T08:28:12Z
@WalterBright created dlang/dmd pull request #11449 "fix Issue 19663 - On x86_64 the fabs intrinsic should use SSE" fixing this issue:
- fix Issue 19663 - On x86_64 the fabs intrinsic should use SSE
https://github.com/dlang/dmd/pull/11449
Comment #4 by dlang-bot — 2020-07-24T11:12:41Z
dlang/dmd pull request #11449 "fix Issue 19663 - On x86_64 the fabs intrinsic should use SSE" was merged into master:
- 549ee99a87bb398cf7f01fda92793261b28f9066 by Walter Bright: fix Issue 19663 - On x86_64 the fabs intrinsic should use SSE
https://github.com/dlang/dmd/pull/11449