Comment #0 by witold.baryluk+d — 2020-05-15T15:59:32Z
https://godbolt.org/z/keEyMi
```
private final struct PTest {
    ulong j;

    pragma(inline, true)
    final bool f() {
        return !(j & 0xfffffffuL); // Works.
    }

    pragma(inline, true)
    final bool g() {
        // Doesn't work in DMD.
        if (!(j & 0xfffffffuL)) {
            j++;
            return true;
        } else {
            return false;
        }
    }
}

void test1() {
    scope p = PTest();
    p.f();
    p.g();
}
```
This fails to compile with `dmd -O -inline`.
:(
1) DMD should be a bit smarter and consider more kinds of functions for inlining. It looks like right now it only considers `return (expr);`-style functions.
2) There should be a dmd compiler switch, or a different pragma, that makes a failed inline a warning rather than an error. `pragma(hint_inline, true)` maybe?
I noticed this problem when I moved some code into a struct on the stack and called its methods, but saw a 20% performance drop. The assembly revealed that one of the functions was not being inlined. Trying to force inlining with the pragma failed as shown above.
Comment #1 by witold.baryluk+d — 2020-05-15T16:00:23Z
BTW. gdc and ldc compile the code perfectly.
Comment #2 by witold.baryluk+d — 2020-12-07T15:42:04Z
DMD 2.094.2 does compile the code now.
But it is mis-compiled.
Here is a simplified case:
```
private final struct PTest {
    ulong j;

    pragma(inline, true)
    final bool g() {
        if (!(j & 0xfffffffuL)) {
            j++;
            return true;
        } else {
            return false;
        }
    }
}

void test1(ulong k) {
    scope p = PTest();
    p.j = k;
    p.g();
    z(&p);
}

void z(PTest* p);
```
produces this output on Linux amd64:
0000000000000000 <a.test1(ulong)>:
0: 55 push %rbp
1: 48 8b ec mov %rsp,%rbp
4: 48 83 ec 10 sub $0x10,%rsp
8: 48 89 7d f8 mov %rdi,-0x8(%rbp)
c: 48 8b 45 f8 mov -0x8(%rbp),%rax
10: 48 89 45 f0 mov %rax,-0x10(%rbp)
14: 48 8d 7d f0 lea -0x10(%rbp),%rdi
18: e8 00 00 00 00 callq 1d <a.test1(ulong)+0x1d>
1d: 48 8d 7d f0 lea -0x10(%rbp),%rdi
21: e8 00 00 00 00 callq 26 <a.test1(ulong)+0x26>
26: 48 8b e5 mov %rbp,%rsp
29: 5d pop %rbp
2a: c3 retq
...
(objdump from binutils 2.35.1)
The assembly doesn't make sense to me. Even if the value returned by p.g() is not used, g() has a possible side effect, and the compiler cannot know whether it will happen because it depends on the unknown value k. p is also consumed by the unknown function z, so it can't be optimized away "because scope" either (it also happens with 'auto').
It happens even without pragma(inline, true), regardless of the compiler switches.
gdc and ldc produce correct assembly.
Changing g() to return void (and removing the return statements) makes DMD generate the correct code:
0000000000000000 <a.test1(ulong)>:
0: 55 push %rbp
1: 48 8b ec mov %rsp,%rbp
4: 48 83 ec 10 sub $0x10,%rsp
8: 48 89 7d f8 mov %rdi,-0x8(%rbp)
c: 48 f7 c7 ff ff ff 0f test $0xfffffff,%rdi
13: 75 04 jne 19 <a.test1(ulong)+0x19>
15: 48 ff 45 f8 incq -0x8(%rbp)
19: 48 8d 7d f8 lea -0x8(%rbp),%rdi
1d: e8 00 00 00 00 callq 22 <a.test1(ulong)+0x22>
22: 48 8b e5 mov %rbp,%rsp
25: 5d pop %rbp
26: c3 retq
...
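For reference, the void-returning variant described above would look roughly like the following. This is a sketch reconstructed from the description, not the exact file that was compiled. The body of z and the `lastSeen` variable are assumptions added only so the snippet links and the side effect can be observed; the report itself only declares z.

```d
private final struct PTest {
    ulong j;

    pragma(inline, true)
    final void g() {
        // Same side effect as the bool version, but nothing is returned.
        if (!(j & 0xfffffffuL)) {
            j++;
        }
    }
}

// Hypothetical helper state, not part of the original report.
ulong lastSeen;

// Hypothetical stub body; the original only has `void z(PTest* p);`.
void z(PTest* p) {
    lastSeen = p.j;
}

void test1(ulong k) {
    scope p = PTest();
    p.j = k;
    p.g(); // Must increment p.j when the low 28 bits of k are zero.
    z(&p);
}
```

With a stub like this, calling `test1` with a value whose low 28 bits are zero should leave `lastSeen` one greater than the argument, which gives a simple runtime check to compare against the bool-returning version.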
Comment #3 by robert.schadek — 2024-12-13T19:08:42Z