At least on the X86 architecture, atomic increments/decrements can be done much more efficiently using specialized instructions that are not used by the more generic atomicOp.
Comment #1 by jadit2 — 2015-03-30T15:29:47Z
I might take a look at this. Would a "lock xadd" be appropriate for this atomicAdd?
Comment #2 by code — 2015-03-30T20:40:37Z
*** Issue 4769 has been marked as a duplicate of this issue. ***
Comment #3 by code — 2015-03-30T20:56:55Z
Most likely there is no performance difference between "lock xadd [ptr], 1" and "lock inc [ptr]"; the increment instruction is just one byte shorter.
You can ask g++/clang++ for good ideas.
----
#include <atomic>
#include <cstdint> // needed for the fixed-width integer types below

std::atomic<uint8_t> ubyte;
std::atomic<uint16_t> ushort;
std::atomic<uint32_t> uint;
std::atomic<uint64_t> ulong;

int main() {
    ++ubyte;
    ++ushort;
    ++uint;
    ++ulong;
}
----
It seems that we should rather implement the more generic fetch_add and fetch_sub though.
http://en.cppreference.com/w/cpp/atomic/atomic/fetch_add
http://en.cppreference.com/w/cpp/atomic/atomic/fetch_sub
Comment #4 by jadit2 — 2015-03-31T17:58:48Z
I'm having an issue adding an asm block for the fast increment operation. It's been a while since I've used asm productively, so I'm probably missing something simple.
HeadUnshared!(T) atomicOp(string op, T, V1)( ref shared T val, V1 mod ) pure nothrow @nogc
    if( __traits( compiles, mixin( "*cast(T*)&val" ~ op ~ "mod" ) ) )
[....]
    get = set = atomicLoad!(MemoryOrder.raw)( val );
    auto setptr = &set;
    static if(op == "+=") {
        // qword ptr
        asm pure nothrow @nogc
        {
            mov EAX, mod;    //:678
            mov EDX, setptr; //:679
            lock;
            xadd [EDX], EAX;
        }
    }
Error:
src/core/atomic.d(678): Error: bad type/size of operands 'mov'
src/core/atomic.d(679): Error: bad type/size of operands 'mov'
Comment #5 by code — 2015-04-05T20:14:00Z
(In reply to Jonathan Dunlap from comment #4)
Depending on the size of mod, you need to use a correspondingly sized register.
static if (V1.sizeof == 1) asm { mov AL, mod; }
static if (V1.sizeof == 2) asm { mov AX, mod; }
static if (V1.sizeof == 4) asm { mov EAX, mod; }
static if (V1.sizeof == 8) asm { mov RAX, mod; }
Likewise you need to load the pointer (ref) into a fitting register.
mov EDX, val; // x86
mov RDX, val; // x86_64
Comment #6 by code — 2015-04-05T20:16:09Z
Created attachment 1508
atomic fetch add for x64
Implementation of atomicOp!"+=" (fetch_add) for X86_64.
Thanks Martin for the gist sample; I've started to integrate it in this branch:
https://github.com/jadbox/druntime/tree/fetchmod
Currently I'm having an issue where a static if check against __traits(isSame, T, V1) always evaluates to false, even though T and V1 appear to be the same in the unittests.
Comment #9 by jadit2 — 2015-04-07T05:39:21Z
I resolved the issue by replacing the "__traits(isSame, T, V1)" with "is(T == V1)".
Okay, my branch below seems to pass all the x64 unittests I've made for this. It should also work for x86, but I haven't tested that yet. Would someone want to benchmark this?
Comment #10 by jadit2 — 2015-04-07T05:43:56Z
I'm debating whether we should remove atomicFetchSub, since it would simply be an alias for atomicFetchAdd with the modifier negated. Thoughts?