Surely this code should be allowed to compile?
In C, it is similar to _mm_set1_ps(b[i]);
import core.simd;
import std.stdio;
void load(float4 fl, const float a) {
    fl = a;
}
void main() {
    float4 fl;
    load(fl, 5);
    foreach(f; fl.array)
        writefln("%f", f);
}
Comment #1 by turkeyman — 2012-03-11T12:19:46Z
I think this should be prohibited at all costs. This is a very slow operation on every architecture other than x64.
I'm carefully crafting std.simd with functions that encourage the best results on all platforms, and I'm addressing this issue there.
Allowing this implicit case would ruin many libraries implementing SIMD code on all non-x64 platforms. They should rather be encouraged to use the library appropriately instead.
Comment #2 by dlang — 2012-03-12T18:33:42Z
How should something like this be done then on other architectures?
I'm creating a matrix multiplication library and I tried using a value like this and it didn't work.
I also tried float4 fl = 5; as a test and that didn't set fl to anything. (See my other bug).
Shouldn't that second one cause an error (either while compiling or at runtime)?
Comment #3 by turkeyman — 2012-03-13T11:40:54Z
Note that as yet, constants aren't actually properly supported. There are bugs, and the feature is incomplete.
Down the track, if you want to use scalar variables, you should be encouraged to load them into a float4 using the loadFloat(float f) API as far outside your hot code as possible, and use the resulting 4x float vector instead.
I have a fork with std.simd work in progress if you wanna have a peek: https://github.com/TurkeyMan/phobos/commits/master/std/simd.d
Coming together, still a bit to do.
This library will be efficient on all architectures, if only a little archaic, but it follows D conventions quite closely.
I'd encourage people to build higher level maths libraries on top of std.simd instead of implementing the hardware abstraction themselves. It'll make libraries a whole lot more portable, ctfe-able, and I expect it'll become very highly tuned with use, which will benefit all maths libs.
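The advice in comment #3 (broadcast the scalar once, as far outside the hot code as possible, then reuse the vector) can be sketched roughly as follows. This is only an illustration: `broadcast` and `scale` are hypothetical names, not the std.simd API, and the sketch assumes a target where core.simd defines float4 (for DMD, typically x86_64). Writing through `.array` is used here only because it is the simplest spelling; it may go through the stack.

```d
import core.simd;

// Hypothetical helper (not part of std.simd): copy one scalar into all four lanes.
float4 broadcast(float s)
{
    float4 v;
    v.array[] = s; // element-wise write through the static-array view
    return v;
}

// Multiply every vector in data by the scalar s.
void scale(float4[] data, float s)
{
    float4 k = broadcast(s); // scalar-to-vector conversion happens once, outside the loop
    foreach (ref v; data)
        v = v * k;           // the loop body is pure vector arithmetic
}

void main()
{
    auto data = new float4[2];
    foreach (ref v; data)
        v.array = [1.0f, 2.0f, 3.0f, 4.0f];

    scale(data, 5.0f);
    assert(data[1].array == [5.0f, 10.0f, 15.0f, 20.0f]);
}
```

The point of the shape is that the loop never touches the scalar, so the cost of the conversion is paid once per call rather than once per element.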
Actually GDC and LDC are capable of generating optimal code for scalar to vector assignment.
auto foo(float a)
{
    __vector(float[4]) va = void;
    va = 2 * a;
    return va;
}
> I think this should be prohibited at all costs.
It's not helpful that dmd currently disallows this assignment
because it promotes usage of 'vec.array = val' which uses the stack.
Of course one could write a library wrapper but isn't it much
better to leave this to the compiler?
Comment #6 by turkeyman — 2013-04-09T03:45:14Z
(In reply to comment #5)
> Actually GDC and LDC are capable of generating optimal code for scalar to
> vector assignment.
It's not a portable concept. It's an operation that should generally be avoided/discouraged.
I'd rather supply an explicit function like "v = loadScalar(s);", which is documented with its performance characteristics and makes it completely clear to the programmer that they're using it.
If programmers think v = s; is a benign operation, they'll merrily write bad code.
> auto foo(float a)
> {
> __vector(float[4]) va = void;
> va = 2 * a;
> return va;
> }
>
> > I think this should be prohibited at all costs.
>
> It's not helpful that dmd currently disallows this assignment
> because it promotes usage of 'vec.array = val' which uses the stack.
> Of course one could write a library wrapper but isn't it much
> better to leave this to the compiler?
If I had my way, .array would be removed ;)
Interacting vectors/scalars should be a deliberate and conservative operation. It's very expensive on most architectures. Used within a tight SIMD loop, it will ruin your code.
I probably won't win this argument though... people like to be able to write slow code conveniently ;)
It's not the worst thing in the world, but it's a slippery slope.
Comment #7 by ibuclaw — 2013-12-09T02:08:15Z
*** Issue 10446 has been marked as a duplicate of this issue. ***
Comment #8 by ibuclaw — 2013-12-09T02:14:41Z
Brief description of problem.
DMD rejects the following code, whereas GDC and LDC accept it and are able to generate code for it (whether slow or optimised by constant folding):
import core.simd;
void main() {
    double x = 1.0, y = 2.0;
    double2 a = x; // Error: Floating point constant expression expected
    double2 b = [x, y]; // Error: Floating point constant expression expected
}
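For comparison, a minimal sketch of the `.array` route that comment #5 describes as the thing DMD's restriction pushes people towards. It assumes a target where core.simd defines double2, and it is a stopgap rather than a recommendation: as noted in that comment, writing through `.array` uses the stack.

```d
import core.simd;

void main()
{
    double x = 1.0, y = 2.0;

    double2 a;
    a.array[] = x;    // broadcast the runtime value x to both lanes

    double2 b;
    b.array = [x, y]; // one runtime value per lane

    assert(a.array == [1.0, 1.0]);
    assert(b.array == [1.0, 2.0]);
}
```

Both assignments go through the static-array view of the vector, so no vector initializer with runtime values is needed.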