this simple test:
void main()
{
int[3] x = [1,2,3];
}
should just initialize x with the values 1,2,3
what happens in reality is not as nice as this output of obj2asm shows:
_Dmain:
push EBP
mov EBP,ESP
sub ESP,0Ch
push EBX
lea EAX,-0Ch[EBP]
push EAX
push 3
push 3
push 2
push 1
push 3
mov ECX,offset FLAT:_D11TypeInfo_Ai6__initZ@SYM32
push ECX
call near ptr _d_arrayliteralT@PC32
add ESP,014h
mov EDX,EAX
mov EBX,3
push EDX
push EBX
push 4
call near ptr _d_arraycopy@PC32
xor EAX,EAX
add ESP,014h
pop EBX
leave
ret
I'm marking it wrong-code, even though the code does compile and link, it's just wrong ... I'm using 1.034 but I'd guess it's an older bug, so I'm not marking a specific version.
Comment #1 by tomas — 2008-09-15T08:52:08Z
I think this issue might have been introduced in DMD 1.034, as I've hit several issues in LLVMDC, that used to work, due to some initializers suddenly being ArrayLiterals instead of ArrayInitializers. Marking version 1.034.
Comment #2 by clugdbug — 2009-04-08T08:01:56Z
This is a performance issue, not wrong-code.
Comment #3 by bearophile_hugs — 2010-07-19T23:18:32Z
*** Issue 4488 has been marked as a duplicate of this issue. ***
After applying my patch, the sample code generates like follows:
(output of ddbg in 64-bit Windows 7)
c:\d\test.d:1 void main()
00402010: c80c0000 enter 0xc, 0x0
c:\d\test.d:3 int[3] x = [1,2,3];
00402014: c745f401000000 mov dword [ebp-0xc], 0x1
0040201b: c745f802000000 mov dword [ebp-0x8], 0x2
00402022: c745fc03000000 mov dword [ebp-0x4], 0x3
00402029: 31c0 xor eax, eax
test.obj
0040202b: c9 leave
0040202c: c3 ret
Comment #6 by code — 2011-10-07T19:09:44Z
This is also a huge issue when assigning .init to a static array.
ubyte[1024] v;
v = typeof(v).init;
This will generate a dynamically allocated [0, 0, 0 ... ] Arrayliteral for the rhs expression before calling _d_arraycopy. Even worse the allocated ArrayLiteral is initialized using 1024 comma expressions. Using -O the compiler will subsequently almost starve from O(N^2) behavior during comsub eliminations.
Comment #7 by yebblies — 2012-02-01T19:27:49Z
*** Issue 4298 has been marked as a duplicate of this issue. ***
Comment #8 by yebblies — 2012-02-01T20:34:11Z
*** Issue 4881 has been marked as a duplicate of this issue. ***
Comment #9 by yebblies — 2012-02-02T01:49:34Z
*** Issue 2237 has been marked as a duplicate of this issue. ***
Comment #10 by k.hara.pg — 2012-05-05T00:42:06Z
Pull #375 was not sufficient, so removed 'patch' keyword.
Comment #11 by issues.dlang — 2012-10-14T11:36:34Z
*** Issue 8820 has been marked as a duplicate of this issue. ***
Comment #12 by yebblies — 2012-10-28T11:14:56Z
*** Issue 8903 has been marked as a duplicate of this issue. ***
Comment #13 by verylonglogin.reg — 2012-10-30T07:17:29Z
Workaround for those who like "a, b, c" initialization but need more performance (not: it still calls `_d_arraycopy`):
---
T[n] makeStaticArray(T, size_t n)(T[n] data...)
// { return data; }
{ T[n] res; res = data; return res; } // Issue 8914 workaround
void setStaticArray(T, size_t n)(ref T[n] array, T[n] data...)
{ array = data; }
void main()
{
auto x = makeStaticArray(1, 2, 3);
static assert(is(typeof(x) == int[3]));
assert(x == [1, 2, 3]);
int[3] y;
y.setStaticArray(1, 2, 3);
assert(y == [1, 2, 3]);
}
---
Comment #15 by bearophile_hugs — 2013-04-10T19:20:51Z
(In reply to comment #14)
> New D2 fix:
> https://github.com/D-Programming-Language/dmd/pull/1883
From the pull request (dmd -O -inline -g test after):
c:\d\test.d:18 int[3] y = [n, n, n];
004020aa: 6a03 push 0x3
004020ac: 6a05 push 0x5
004020ae: 8d4c241c lea ecx, [esp+0x1c]
004020b2: 51 push ecx
004020b3: e880020000 call 0x402338 __memset32
Isn't calling memset for just 3 integers slower than inlining their assignments? I suggest to not call memset if the number of bytes to be copied is so small (I think LDC is already doing similar optimizations). Maybe a benchmark is also useful here.
c:\d\test.d:20 S[3] z = [s2, s2, s2];
004020b8: 8d542418 lea edx, [esp+0x18]
004020bc: 52 push edx
004020bd: 8d442430 lea eax, [esp+0x30]
004020c1: e86affffff call 0x402030 test.S.__cpctor c:\d\test.d:3
004020c6: 8d5c2418 lea ebx, [esp+0x18]
004020ca: 53 push ebx
004020cb: 8d442434 lea eax, [esp+0x34]
004020cf: e85cffffff call 0x402030 test.S.__cpctor c:\d\test.d:3
004020d4: 53 push ebx
004020d5: 8d442438 lea eax, [esp+0x38]
004020d9: e852ffffff call 0x402030 test.S.__cpctor c:\d\test.d:3
004020de: 83c40c add esp, 0xc
004020e1: 31c0 xor eax, eax
If the s2 variable already contains the struct, then what's the purpose of those calls to 0x402030?
In the "before" there are no calls to struct constructors:
c:\d\test.d:20 S[3] z = [s2, s2, s2];
00403913: 8d542474 lea edx, [esp+0x74]
00403917: b960014200 mov ecx, 0x420160
0040391c: 52 push edx
0040391d: 6a03 push 0x3
0040391f: 6a03 push 0x3
00403921: 51 push ecx
00403922: e8fd0a0000 call 0x404424 __d_arrayliteralTX
00403927: 83c408 add esp, 0x8
0040392a: 8d542470 lea edx, [esp+0x70]
0040392e: 52 push edx
0040392f: 89c6 mov esi, eax
Comment #16 by k.hara.pg — 2013-04-10T19:34:08Z
(In reply to comment #15)
> (In reply to comment #14)
> > New D2 fix:
> > https://github.com/D-Programming-Language/dmd/pull/1883
>
> From the pull request (dmd -O -inline -g test after):
>
>
> c:\d\test.d:18 int[3] y = [n, n, n];
> 004020aa: 6a03 push 0x3
> 004020ac: 6a05 push 0x5
> 004020ae: 8d4c241c lea ecx, [esp+0x1c]
> 004020b2: 51 push ecx
> 004020b3: e880020000 call 0x402338 __memset32
>
> Isn't calling memset for just 3 integers slower than inlining their
> assignments? I suggest to not call memset if the number of bytes to be copied
> is so small (I think LDC is already doing similar optimizations). Maybe a
> benchmark is also useful here.
It is lowered to:
int[3] y = void;
y[] = n;
And currently dmd uses memset for `y[] = n;`. It is another optimization issue.
> c:\d\test.d:20 S[3] z = [s2, s2, s2];
> 004020b8: 8d542418 lea edx, [esp+0x18]
> 004020bc: 52 push edx
> 004020bd: 8d442430 lea eax, [esp+0x30]
> 004020c1: e86affffff call 0x402030 test.S.__cpctor c:\d\test.d:3
> 004020c6: 8d5c2418 lea ebx, [esp+0x18]
> 004020ca: 53 push ebx
> 004020cb: 8d442434 lea eax, [esp+0x34]
> 004020cf: e85cffffff call 0x402030 test.S.__cpctor c:\d\test.d:3
> 004020d4: 53 push ebx
> 004020d5: 8d442438 lea eax, [esp+0x38]
> 004020d9: e852ffffff call 0x402030 test.S.__cpctor c:\d\test.d:3
> 004020de: 83c40c add esp, 0xc
> 004020e1: 31c0 xor eax, eax
>
> If the s2 variable already contains the struct, then what's the purpose of
> those calls to 0x402030?
>
>
> In the "before" there are no calls to struct constructors:
>
> c:\d\test.d:20 S[3] z = [s2, s2, s2];
> 00403913: 8d542474 lea edx, [esp+0x74]
> 00403917: b960014200 mov ecx, 0x420160
> 0040391c: 52 push edx
> 0040391d: 6a03 push 0x3
> 0040391f: 6a03 push 0x3
> 00403921: 51 push ecx
> 00403922: e8fd0a0000 call 0x404424 __d_arrayliteralTX
> 00403927: 83c408 add esp, 0x8
> 0040392a: 8d542470 lea edx, [esp+0x70]
> 0040392e: 52 push edx
> 0040392f: 89c6 mov esi, eax
Before, cpctor(==postblit) calls are done in __d_arrayliteralTX, so they are hidden. Now they are directly called on the stack memory z[0..3].
Comment #17 by bearophile_hugs — 2013-04-10T19:44:37Z
(In reply to comment #16)
> It is lowered to:
> int[3] y = void;
> y[] = n;
>
> And currently dmd uses memset for `y[] = n;`. It is another optimization issue.
OK.
> Before, cpctor(==postblit) calls are done in __d_arrayliteralTX, so they are
> hidden. Now they are directly called on the stack memory z[0..3].
Sorry I have missed it was the postblit, thank you.
Comment #18 by github-bugzilla — 2013-04-11T01:36:48Z
Comment #19 by bearophile_hugs — 2013-04-11T05:26:29Z
The patch seems to work. With it I have removed five optimizations from my code. Very good.
Comment #20 by andrei — 2013-04-11T06:12:47Z
Thanks, Kenji!
Comment #21 by code — 2013-06-06T00:04:57Z
Awesome, thanks for fixing this. This was my number 1 most annoying bug in D. Because when not using a GC the old behavior always leaked memory.
Comment #22 by verylonglogin.reg — 2013-10-24T13:21:28Z
The fix is partially reverted in dmd pull 2682 [1] as it introduced Issue 11238.
Only construction is optimized now:
---
void main()
{
int[2] m = [4, 6]; // still optimized
m = [m[1], m[0]]; // swap, currently calls `_d_arrayliteralTX`
assert(m == [6, 4]); // was [6, 6] with original fix (Issue 11238)
}
---
Opened Issue 11345 for assignment case.
Also some duplicates of current issue and duplicates of duplicates of current issue are now really duplicates of Issue 11345.
[1] https://github.com/D-Programming-Language/dmd/pull/2682
Comment #23 by verylonglogin.reg — 2013-10-24T13:45:30Z
(In reply to comment #22)
> Also some duplicates of current issue and duplicates of duplicates of current
> issue are now really duplicates of Issue 11345.
Looks like Issue 4881 was the only one. Changed its "DUPLICATE of".