Comment #2 by ilyayaroshenko — 2016-10-09T05:27:11Z
This should be an intrinsic, a part of the language, not DRuntime. Let each compiler implement it as its developers want. Also, making this a function breaks LDC's fastmath optimizations.
Comment #3 by bugzilla — 2016-10-09T10:45:34Z
The following code:
import core.simd;
auto load(in float* p) {
    enum regsz = 16;
    enum N = regsz / float.sizeof;
    alias vec = __vector(float[N]);
    return __simd(XMM.LODUPS, *cast(const vec*) p);
}
auto foo(float f) {
    return load(&f);
}
produces in D with -O -inline for foo():
_D5test23fooFNaNbNifZNhG16v:
    push RBP
    mov RBP,RSP
    movss 010h[RBP],XMM0
    movups XMM0,010h[RBP] <=====
    pop RBP
    ret
So I'm not sure what the problem is.
Comment #4 by ilyayaroshenko — 2016-10-09T12:08:42Z
(In reply to Walter Bright from comment #3)
> So I'm not sure what the problem is.
1. A function can be inlined by LDC but it will break other possible optimisations.
2. Complexity. LLVM provides a portable and generic API.
It allows writing generic SIMD code without multiple library backends for that code.
All target-specific logic for Mir GLAS is concentrated in the following file. Compared with Eigen, this is a very effective solution in terms of the work required per new architecture.
https://github.com/libmir/mir/blob/master/source/mir/glas/internal/config.d
Comment #5 by andrei — 2016-10-09T14:48:05Z
Do you mean unaligned load/store must be generic? Part of the core language or druntime?
Comment #6 by ilyayaroshenko — 2016-10-09T16:01:11Z
(In reply to Andrei Alexandrescu from comment #5)
> Do you mean unaligned load/store must be generic?
Yes. I am really sorry that my explanation was not clear before.
> Part of the core language or druntime?
It does not matter; druntime is OK too. At the same time, LDC's developers should confirm that they are OK with the API and can make it an intrinsic for LDC, without any generic function shell on top of the existing LLVM intrinsics.
Comment #7 by bugzilla — 2016-10-09T20:03:07Z
I propose putting the 'loadUnaligned' function in druntime. The point of having it there is it only has to be ported once, then everyone can use it. There are plenty of other examples of this in druntime.
> making this a function breaks LDC's fastmath optimizations.
... and ...
> A function can be inlined by LDC but it will break other possible optimisations.
I find this baffling. The whole point of inlining is to enable optimizations, not disable them. Even if this were the case, the DMD front end inlines it, and the LLVM back end will never see the function.
Lastly, the link going to the LDC code has this:
version (LDC)
{
    return loadUnaligned!vec(cast(T*) p);
}
meaning that the LDC implementation is already a wrapper template function.
Comment #8 by ilyayaroshenko — 2016-10-10T14:41:00Z
I completely forgot that loadUnaligned is an inlineIR alias.
So LDC can just use an alias for loadUnaligned. It can therefore be a generic function, which will be an alias in LDC's druntime.
Walter, sorry for the noise!
template loadUnaligned(V)
    if (is(typeof(llvmVecType!V)))
{
    ...
    alias inlineIR!(ir, V, const(T)*) loadUnaligned;
}