Bug 20834 – pragma(inline, true) fails to inline simple functions. fails with -inline

Status
NEW
Severity
major
Priority
P2
Component
dmd
Product
D
Version
D2
Platform
x86_64
OS
Linux
Creation time
2020-05-15T15:59:32Z
Last change time
2024-12-13T19:08:42Z
Assigned to
No Owner
Creator
Witold Baryluk
Moved to GitHub: dmd#19706 →

Comments

Comment #0 by witold.baryluk+d — 2020-05-15T15:59:32Z
https://godbolt.org/z/keEyMi

```
private final struct PTest {
    ulong j;

    pragma(inline, true)
    final bool f() {
        return !(j & 0xfffffffuL);  // Works.
    }

    pragma(inline, true)
    final bool g() {  // Doesn't work in DMD.
        if (!(j & 0xfffffffuL)) {
            j++;
            return true;
        } else {
            return false;
        }
    }
}

void test1() {
    scope p = PTest();
    p.f();
    p.g();
}
```

This fails to compile when using `dmd -O -inline`. :(

1) DMD should be a bit smarter and consider more function types for inlining. It looks like right now it only considers `return (expr);`-style functions.

2) There should be a switch in the dmd compiler, or a different pragma, to make this a warning rather than an error. `pragma(hint_inline, true)` maybe?

I noticed this problem when I tried to move some code into a struct on the stack and call its methods, but saw a 20% performance drop. The assembly revealed that one of the functions was not being inlined. Trying to force inlining with the pragma failed as above.
Comment #1 by witold.baryluk+d — 2020-05-15T16:00:23Z
BTW. gdc and ldc compile the code perfectly.
Comment #2 by witold.baryluk+d — 2020-12-07T15:42:04Z
DMD 2.094.2 does compile the code now. But it is mis-compiled. Here is a simplified case:

```
private final struct PTest {
    ulong j;

    pragma(inline, true)
    final bool g() {
        if (!(j & 0xfffffffuL)) {
            j++;
            return true;
        } else {
            return false;
        }
    }
}

void test1(ulong k) {
    scope p = PTest();
    p.j = k;
    p.g();
    z(&p);
}

void z(PTest* p);
```

It produces this output on Linux amd64:

0000000000000000 <a.test1(ulong)>:
   0: 55                    push %rbp
   1: 48 8b ec              mov %rsp,%rbp
   4: 48 83 ec 10           sub $0x10,%rsp
   8: 48 89 7d f8           mov %rdi,-0x8(%rbp)
   c: 48 8b 45 f8           mov -0x8(%rbp),%rax
  10: 48 89 45 f0           mov %rax,-0x10(%rbp)
  14: 48 8d 7d f0           lea -0x10(%rbp),%rdi
  18: e8 00 00 00 00        callq 1d <a.test1(ulong)+0x1d>
  1d: 48 8d 7d f0           lea -0x10(%rbp),%rdi
  21: e8 00 00 00 00        callq 26 <a.test1(ulong)+0x26>
  26: 48 8b e5              mov %rbp,%rsp
  29: 5d                    pop %rbp
  2a: c3                    retq
  ...

(objdump from binutils 2.35.1)

The assembly doesn't make sense to me. Even if the value returned by p.g() is not used, the function g() has a possible side effect, and the compiler cannot know whether it will happen (because it depends on the unknown value k). Also, p is consumed by the unknown z, so it can't be optimized away either "because scope" (it also happens with 'auto'). It happens even without pragma(inline, true), and no matter the compiler switches. The gcc and ldc produce correct assembly.

Changing g() to return void (and removing the return statements) makes DMD generate the correct code:

0000000000000000 <a.test1(ulong)>:
   0: 55                    push %rbp
   1: 48 8b ec              mov %rsp,%rbp
   4: 48 83 ec 10           sub $0x10,%rsp
   8: 48 89 7d f8           mov %rdi,-0x8(%rbp)
   c: 48 f7 c7 ff ff ff 0f  test $0xfffffff,%rdi
  13: 75 04                 jne 19 <a.test1(ulong)+0x19>
  15: 48 ff 45 f8           incq -0x8(%rbp)
  19: 48 8d 7d f8           lea -0x8(%rbp),%rdi
  1d: e8 00 00 00 00        callq 22 <a.test1(ulong)+0x22>
  22: 48 8b e5              mov %rbp,%rsp
  25: 5d                    pop %rbp
  26: c3                    retq
  ...
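
For reference, the void-returning variant of g() mentioned in comment #2 might look like the minimal sketch below. This is an illustration, not the reporter's exact code: the `run` helper is hypothetical and stands in for test1, returning p.j so that the side effect can be observed without the external z.

```d
struct PTest
{
    ulong j;

    // Same body as g() in the report, but returning void
    // (the return statements are removed).
    pragma(inline, true)
    void g()
    {
        if (!(j & 0xfffffffuL))
            j++;
    }
}

// Hypothetical helper (not in the report): plays the role of test1,
// but returns the field so the effect of g() is visible to the caller.
ulong run(ulong k)
{
    auto p = PTest();
    p.j = k;
    p.g();
    return p.j;
}
```

With this shape, `run(k)` should return `k + 1` when the low 28 bits of `k` are zero and `k` otherwise, which matches the test/jne/incq sequence in the second dump.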
Comment #3 by robert.schadek — 2024-12-13T19:08:42Z
THIS ISSUE HAS BEEN MOVED TO GITHUB https://github.com/dlang/dmd/issues/19706 DO NOT COMMENT HERE ANYMORE, NOBODY WILL SEE IT, THIS ISSUE HAS BEEN MOVED TO GITHUB