Bug 8047 – important opcodes missing from core/simd.d

Status: REOPENED
Severity: major
Priority: P2
Component: druntime
Product: D
Version: D2
Platform: x86_64
OS: All
Creation time: 2012-05-05T00:31:58Z
Last change time: 2024-12-07T13:31:59Z
Keywords: bootcamp, SIMD
Assigned to: No Owner
Creator: Sean Cavanaugh
Moved to GitHub: dmd#17117

Comments

Comment #0 by WorksOnMyMachine — 2012-05-05T00:31:58Z
There are a number of opcodes that are missing, but some are far more critical than others. They are listed here more or less in order of importance, most important first:

missing store instructions (and some loads): STOSS STOSD STOAPS STOAPD STOD STOQ (there are a few others scattered in the enum table)

movemask (critical for doing branching tests against SIMD registers): MOVMSKPD MOVMSKPS

missing comparisons: CMPPS CMPPD CMPSD CMPSS

missing conversions: CVTPS2PI CVTSD2SI CVTSI2SD CVTSI2SS CVTSS2SI CVTTPD2PI CVTTPS2PI CVTTSD2SI CVTTSS2SI
Comment #1 by Marco.Leise — 2013-11-22T16:43:26Z
Some mnemonics like PMOVMSKB cannot even be expressed with the interface that is offered. It returns a 32-bit word consisting of only the high bit of every byte in the MMX or SSE register. Since I've tried other workarounds, up to inline asm and hard-coding hex values, and nothing worked, I've set this bug to 'major'. The inline asm workaround usually ends in this:

Internal error: backend/cgcod.c 1561

But that's not what this bug report is about. I'm just stating that there are more SIMD bugs lurking under the surface.
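For readers unfamiliar with the instruction, the behaviour described above can be sketched as a scalar reference model. This is only an illustration of what PMOVMSKB computes for a 16-byte SSE register, not a workaround for the missing compiler support; `moveMaskBytes` is a hypothetical name and is not part of core.simd.

```d
// Scalar reference model of PMOVMSKB for a 16-byte SSE register.
// NOTE: moveMaskBytes is a hypothetical helper, not a core.simd API.
int moveMaskBytes(const ubyte[16] bytes)
{
    int mask = 0;
    foreach (i, b; bytes)
    {
        // Bit i of the result is the most significant bit of byte i.
        mask |= ((b >> 7) & 1) << cast(int) i;
    }
    return mask;
}
```

With a 16-byte input only the low 16 bits of the result can be set; for an 8-byte MMX register the same idea yields 8 bits. The point of the bug is that the real instruction delivers this value in a general-purpose register, which the current interface has no way to express.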
Comment #2 by john.loughran.colvin — 2014-12-10T16:58:20Z
Also missing is PCMPGT[SDQ].

Can they just be added to the druntime file or are compiler modifications necessary?
Comment #3 by code — 2014-12-10T19:12:08Z
(In reply to John Colvin from comment #2)
> Also missing is PCMPGT[SDQ]
>
> Can they just be added to the druntime file or are compiler modifications
> necessary?

Looks like most can simply be added; one just has to add the correct opcode. But PCMPGTQ is already there and works for me on 2.066.1.
https://github.com/D-Programming-Language/druntime/blob/109a604a08c7592687a9b482ac2a8bb8ded80ccc/src/core/simd.d#L3633
Comment #4 by Marco.Leise — 2015-10-03T19:33:40Z
//PMOVMSKB = 0x660FD7,
has been commented out in core.simd. We may as well comment out all instructions returning non-XMM values until this is resolved. The ones I could find so far are:

COMISD COMISS CVTSD2SI CVTSS2SI CVTTPD2PI CVTTPS2PI CVTTSD2SI CVTTSS2SI MASKMOVDQU MASKMOVQ MOVMSKPD MOVMSKPS PCMPESTRI PCMPISTRI PMOVMSKB PTEST UCOMISS UCOMISD

CRC32, POPCNT and LZCNT don't belong in the XMM enum. They were introduced side-by-side with SSE4.2, but don't work on XMM registers, and the latter two have their own separate CPUID flags.
Comment #5 by bugzilla — 2016-11-22T01:04:30Z
These have been in core.simd for a while.
Comment #6 by Marco.Leise — 2016-11-22T08:36:34Z
(In reply to Walter Bright from comment #5)
> These have been in core.simd for a while.

While that is true for the original bug description, the hard issue is not the missing enum values themselves, but the lack of support for them, namely returning something other than SIMD vectors, as I outlined in comment #1 and comment #4 above.

The XMM enum is still rather messy if you look at it from some distance:

There are some non-SSE opcodes in it, as noted in their comment (e.g. POPCNT and LZCNT have nothing to do with SSE). They should be handled in core.bitop instead, IMHO.

Some non-working opcodes are rightfully commented out until this bug is resolved (e.g. PMOVMSKB).

Other non-working opcodes are NOT commented out (e.g. MOVMSKPD from the original description; see comment #4 for a list).

AMD's SSE4a seems to have an undecided fate, with its opcodes commented out in their entirety. This may be considered a separate bug, but then again, whoever works on this bug will probably look at them as well.

The ddoc for XMM still says: "XMM opcodes that conform to the following: opcode xmm1,xmm2/mem and do not have side effects (i.e. do not write to memory)." This description doesn't apply to e.g. CRC32 or PREFETCH.

DMD + core.simd still need some work to move SIMD support out of the proof-of-concept phase. Admittedly I haven't run any tests since 2015, so if any of the above is in good shape now, shame on me. :)
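To illustrate the suggestion that the bit-counting operations belong with core.bitop rather than the XMM enum, here is a minimal sketch using the scalar `popcnt` and `bsr` functions from core.bitop. The wrappers `setBits` and `leadingZeros` are hypothetical names introduced for this example, and whether the compiler lowers them to the POPCNT or LZCNT instructions depends on the compiler and target flags.

```d
import core.bitop : bsr, popcnt;

// Hypothetical wrapper: number of set bits, taken straight from core.bitop.
int setBits(uint x)
{
    return popcnt(x);
}

// Hypothetical wrapper: leading-zero count built on bsr.
// LZCNT yields the operand width (32) for a zero input, whereas the result
// of bsr is undefined for zero, so that case is handled explicitly.
int leadingZeros(uint x)
{
    return x == 0 ? 32 : 31 - bsr(x);
}
```

Both functions take and return plain integers, so they never touch an XMM register.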
Comment #7 by aliloko — 2021-01-07T13:55:33Z
Hello,

Can't implement the following intrinsics for DMD:

_mm_movemask_ps needs MOVMSKPS support. As Marco Leise said 7 years ago, it is an instruction that returns its result in a general-purpose register instead of an XMM register.

----------------------------------------------------
int _mm_movemask_ps (__m128 a) pure @trusted
{
    static if (DMD_with_DSIMD)
    {
        // suggested API ? This API returning an int doesn't exist in core.simd
        int res = __simd_int(XMM.MOVMSKPS, a);
        return res;
    }
    else static if (GDC_with_SSE)
    {
        return __builtin_ia32_movmskps(a);
    }
    else static if (LDC_with_SSE1)
    {
        return __builtin_ia32_movmskps(a);
    }
    else
    {
        // Scalar fallback: collect the sign bit of each 32-bit lane.
        int4 ai = cast(int4)a;
        int r = 0;
        if (ai.array[0] < 0) r += 1;
        if (ai.array[1] < 0) r += 2;
        if (ai.array[2] < 0) r += 4;
        if (ai.array[3] < 0) r += 8;
        return r;
    }
}
----------------------------------------------------

Same remark for:
- _mm_movemask_epi8 (pmovmskb),
- _mm_movemask_pd (movmskpd),
Comment #8 by robert.schadek — 2024-12-07T13:31:59Z
THIS ISSUE HAS BEEN MOVED TO GITHUB

https://github.com/dlang/dmd/issues/17117

DO NOT COMMENT HERE ANYMORE, NOBODY WILL SEE IT