The following C headers should be accessible from Druntime:
mmintrin.h
xmmintrin.h
emmintrin.h
pmmintrin.h
tmmintrin.h
smmintrin.h
nmmintrin.h
ammintrin.h
wmmintrin.h
immintrin.h
zmmintrin.h
Comment #1 by aliloko — 2020-12-11T07:15:10Z
intel-intrinsics has equivalents for:
mmintrin.h
xmmintrin.h
emmintrin.h
pmmintrin.h
Comment #2 by aliloko — 2022-04-25T10:45:40Z
Since then, added support for:
tmmintrin.h
smmintrin.h
nmmintrin.h
And a few others that are only partially done.
Comment #3 by aliloko — 2022-10-27T18:20:56Z
I suggest that this bug should be closed. Moving intel-intrinsics to druntime would be possible of course, but not trivial or desirable I think.
Comment #4 by jack — 2022-10-28T05:02:30Z
(In reply to ponce from comment #3)
> I suggest that this bug should be closed. Moving intel-intrinsics to druntime
> would be possible of course, but not trivial or desirable I think.
I disagree. I think a systems programming language should absolutely have as good out-of-the-box support for SIMD types and intrinsics as clang or gcc (at least for x64, NEON would be nice). Mainstream Intel SIMD is over 20 years old; the idea of a systems language not having great support for this would be as odd as not having support for atomic intrinsics.
BTW core.atomic is a good example of what we already have in druntime. It already special cases atomic operators between inline ASM for DMD and LLVM intrinsic calls for LDC (at least LDC's overloaded version does anyway). So I don't see why we couldn't do the same thing for Intel intrinsics.
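To make the "special case per compiler/target" idea concrete, here is a minimal sketch in C (the language of the headers listed in this issue), not D and not taken from druntime or intel-intrinsics. It assumes the compiler defines `__SSE2__` when SSE2 is available (GCC and Clang do); `add4_i32` is a hypothetical helper name. One function exposes a single interface and picks the intrinsic path or a plain fallback at compile time.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h> /* SSE2 intrinsics, one of the headers listed above */
#endif

/* Hypothetical helper: adds four 32-bit integers lane by lane. */
void add4_i32(const int32_t a[4], const int32_t b[4], int32_t out[4])
{
    assert(a != NULL && b != NULL && out != NULL);
#if defined(__SSE2__)
    /* Intrinsic path: two unaligned loads, one packed add, one unaligned store. */
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_add_epi32(va, vb));
#else
    /* Portable fallback for targets without SSE2: same wrapping add per lane. */
    for (int i = 0; i < 4; ++i)
        out[i] = (int32_t)((uint32_t)a[i] + (uint32_t)b[i]);
#endif
}
```

Callers see only `add4_i32`; which branch was compiled in is invisible to them, which is the property the comparison with core.atomic is pointing at.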
Comment #5 by alphaglosined — 2022-10-28T05:16:55Z
(In reply to Jack Stouffer from comment #4)
> BTW core.atomic is a good example of what we already have in druntime. It
> already special cases atomic operators between inline ASM for DMD and LLVM
> intrinsic calls for LDC (at least LDC's overloaded version does anyway). So
> I don't see why we couldn't do the same thing for Intel intrinsics.
core.atomic is a great example of when intrinsics need to be compiler backed and not emulated by inline assembly.
core.atomic can be 3-4 stack frames deep before the inline assembly actually gets executed. This gives plenty of time for whatever was returned to no longer be true. This is a major problem for lock-free concurrent programming.
It is how you get segfaults, please don't ask me how I know.
Comment #6 by aliloko — 2022-11-01T17:38:45Z
> So I don't see why we couldn't do the same thing for Intel intrinsics.
"intel-intrinsics" need not become an unmaintained thing in druntime.
Comment #7 by aliloko — 2022-11-01T17:40:24Z
(In reply to Jack Stouffer from comment #4)
> I don't see why we couldn't do the same thing for Intel intrinsics.
Because it's a ton of work, which is why compiler writers provided "builtins" but not "intrinsics".
Comment #8 by robert.schadek — 2024-12-07T13:38:17Z