If an array operation is performed on a short fixed-length array, for example:
float[3] x,y;
x[] += y[] * 4.0;
then it should not become a function call; it should simply be turned into:
x[0] += y[0] * 4.0;
x[1] += y[1] * 4.0;
x[2] += y[2] * 4.0;
I suspect that the threshold for making the function call will occur at a length of at least 9, possibly higher, since the overhead of the function call is very large (it needs to check the capabilities of the processor, for example).
This will allow array operations to provide good performance in the commonly-used case of 2D, 3D and 4D vectors.
For x86, when the code generator supports it, such usage should be turned directly into SSE instructions. This issue is a step towards that longer-term goal.
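To make the proposal concrete, here is a minimal library-level sketch of the lowering described above, not the actual arrayop.c change. The helper name scaledAdd and the cutoff unrollThreshold are hypothetical; the value 8 is only an assumption based on the "at least 9" guess in this report. Short fixed-length arrays get one statement per element, and longer ones fall back to the built-in array operation.

```d
// Hypothetical cutoff: the report only guesses the function call
// starts to pay off at a length of "at least 9".
enum size_t unrollThreshold = 8;

// Hypothetical helper computing x[] += y[] * k for fixed-length arrays.
void scaledAdd(size_t N)(ref float[N] x, ref const float[N] y, float k)
{
    static if (N <= unrollThreshold)
    {
        // Short array: emit one statement per element at compile time,
        // i.e. x[0] += y[0] * k; x[1] += y[1] * k; ...
        static foreach (i; 0 .. N)
            x[i] += y[i] * k;
    }
    else
    {
        // Long array: keep the built-in array operation.
        x[] += y[] * k;
    }
}

void main()
{
    float[3] x = [1.0f, 2.0f, 3.0f];
    float[3] y = [0.5f, 1.0f, 1.5f];
    scaledAdd(x, y, 4.0f);
    assert(x == [3.0f, 6.0f, 9.0f]);
}
```

Because the length is a template parameter, the choice between the two paths is made at compile time, so the short-array case carries no runtime dispatch cost.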
Comment #1 by bearophile_hugs — 2010-07-08T04:37:18Z
Comment #2 by leandro.lucarella — 2010-07-08T06:54:51Z
I'm marking this as a blocker of bug 859 so there is a single bug to track all
the inlining issues. Please do the same if you open more bugs associated to
inlining, or post them directly in bug 859.
Comment #3 by braddr — 2010-07-08T22:50:21Z
undoing false dependency
Comment #4 by leandro.lucarella — 2010-07-09T06:15:55Z
(In reply to comment #3)
> undoing false dependency
Can you elaborate a little on why having bug 859 as a tracker of all missed inlining opportunities is a bad thing?
Thanks
Comment #5 by clugdbug — 2010-07-12T07:01:32Z
It's worth noting that this is NOT a problem with the DMD inliner. This bug will be fixed by making the array operation generator more sophisticated. All changes will be confined to arrayop.c and will not involve the inliner in any way.
Comment #6 by andrei — 2010-07-12T07:50:29Z
I suggest (and have discussed this with Walter too) moving strongly towards making arrays a library type. This is already happening for hashtables.
The compiler should reduce its role to only (a) translating array syntactic sugar (e.g. literals) to calls to that library type, (b) CTFE for arrays (which would be very difficult if CTFE were using the library array type), and (c) figuring out high-level bulk operations like the one in this bug report and optimizing them.
Comment #7 by robert.schadek — 2024-12-13T17:52:21Z