Bug 10985 – Compiler doesn't attempt to inline non-templated functions from libraries (even having the full source)

Status
RESOLVED
Resolution
FIXED
Severity
critical
Priority
P2
Component
dmd
Product
D
Version
D2
Platform
All
OS
All
Creation time
2013-09-07T03:37:00Z
Last change time
2014-07-27T22:42:33Z
Keywords
performance, pull
Assigned to
nobody
Creator
dmitry.olsh
Blocks
11418
See also
https://issues.dlang.org/show_bug.cgi?id=13193

Comments

Comment #0 by dmitry.olsh — 2013-09-07T03:37:49Z
First smallish test case that outlines the problem:

import mylib;

void main()
{
    templated("abcde");
    templated("abcde"w);
    nonTemplated(56);
}

//a module
module mylib;

struct MyStuff{
    int value;
    @property int fuzz(){ return value*3/2; }
    @property int buzz(){ return value*2/3; }
}

int templated(S)(S string)
{
    auto myf = MyStuff(string.length);
    return myf.fuzz*myf.buzz;
}

int nonTemplated(int k)
{
    auto myf = MyStuff(k);
    return myf.fuzz + 2 * myf.buzz;
}

Observe in the logs below that the functions are not scanned for inlining even though the full source is right there and is used for import. This severely cripples Phobos performance.

Command line:

C:\dmd2\src\phobos>dmd -lib mylib.d

C:\dmd2\src\phobos>dmd -v -inline test_inline.d mylib.lib
binary C:\dmd2\windows\bin\dmd.exe
version v2.064
config C:\dmd2\windows\bin\sc.ini
parse test_inline
importall test_inline
import object (C:\dmd2\windows\bin\..\..\src\druntime\import\object.di)
import mylib (mylib.d)
semantic test_inline
entry main test_inline.d
semantic2 test_inline
semantic3 test_inline
inline scan test_inline
code test_inline
function D main
function test_inline.main
function mylib.templated!string.templated
function mylib.templated!(immutable(wchar)[]).templated
C:\dmd2\windows\bin\link.exe test_inline,,nul,"mylib.lib"+user32+kernel32/noi;

C:\dmd2\src\phobos>dmd -v -inline test_inline.d mylib.d
binary C:\dmd2\windows\bin\dmd.exe
version v2.064
config C:\dmd2\windows\bin\sc.ini
parse test_inline
parse mylib
importall test_inline
import object (C:\dmd2\windows\bin\..\..\src\druntime\import\object.di)
importall mylib
semantic test_inline
entry main test_inline.d
semantic mylib
semantic2 test_inline
semantic2 mylib
semantic3 test_inline
semantic3 mylib
inline scan test_inline
inline scan mylib
code test_inline
function D main
function test_inline.main
code mylib
function mylib.MyStuff.fuzz
function mylib.MyStuff.buzz
function mylib.nonTemplated
function mylib.templated!string.templated
function mylib.templated!(immutable(wchar)[]).templated
C:\dmd2\windows\bin\link.exe test_inline,,nul,user32+kernel32/noi;
Comment #1 by braddr — 2013-09-07T16:27:03Z
I suspect this is a regression, but haven't yet made the time to try some older releases. Any volunteers?
Comment #2 by dmitry.olsh — 2013-09-07T17:48:45Z
(In reply to comment #1)
> I suspect this is a regression, but haven't yet made the time to try some older
> releases. Any volunteers?

I tested about a dozen old releases. Either it broke repeatedly in between, or it has never worked, going back at least to 2.030 (where I had to remove the @property annotations). E.g. here 2.030 behaves exactly the same (just with less info):

D:\D\inline_test>dmd -v -inline test_inline.d mylib.lib
parse test_inline
semantic test_inline
import object (C:\dmd\windows\bin\..\..\src\druntime\import\object.di)
import mylib (mylib.d)
semantic2 test_inline
semantic3 test_inline
inline scan test_inline
code test_inline
function main
function templated
function templated
C:\dmd\windows\bin\link.exe test_inline,,,"mylib.lib"+user32+kernel32/noi;

D:\D\inline_test>dmd -v -inline test_inline.d mylib.d
parse test_inline
parse mylib
semantic test_inline
import object (C:\dmd\windows\bin\..\..\src\druntime\import\object.di)
semantic mylib
semantic2 test_inline
semantic2 mylib
semantic3 test_inline
semantic3 mylib
inline scan test_inline
inline scan mylib
code test_inline
function main
function templated
function templated
code mylib
function fuzz
function buzz
function nonTemplated
C:\dmd\windows\bin\link.exe test_inline,,,user32+kernel32/noi;
Comment #3 by dmitry.olsh — 2013-09-08T07:48:59Z
(In reply to comment #1)
> I suspect this is a regression, but haven't yet made the time to try some older
> releases.

What's horrible is that the same behaviour propagates to other compilers, for instance LDC. I repeated the same commands and checked with valgrind what gets called in both cases. The result is the same: if the library is linked, there are function calls to these names; if the module is passed explicitly as source, they don't show up. It looks like a deep problem with the frontend in general.
Comment #4 by braddr — 2013-09-08T10:28:43Z
Neither GDC nor LDC rely on the frontend inliner. To make sure the code is or isn't inlined properly, you really need to check the resulting generated code.
Comment #5 by dmitry.olsh — 2013-09-08T13:33:31Z
(In reply to comment #4)
> Neither GDC nor LDC rely on the frontend inliner. To make sure the code is or
> isn't inlined properly, you really need to check the resulting generated code.

Well, for me a full callgraph is as good evidence as any. Anyhow, the two disassembled mains for LDC are below. The problem, like I said, lies deeper: it isn't the fault of DMD's _inliner_, rather the available code is not passed to it. Hence the absence of these functions in DMD's inline scan; they are *not presented* to it at all. Ditto with other compilers.

With libmylib.a passed on the command line:

_Dmain:
    sub RSP,068h
    lea RDI,050h[RSP]
    lea RAX,030h[RSP]
    mov qword ptr 040h[RSP],offset FLAT:.str@32S
    mov qword ptr 038h[RSP],5
    mov ECX,038h[RSP]
    mov 028h[RSP],ECX
    mov 030h[RSP],ECX
    mov 020h[RSP],RDI
    mov RDI,RAX
    mov 018h[RSP],RAX
    call _D5mylib7MyStuff4fuzzMFZi@PC32
    mov RDI,018h[RSP]
    mov 014h[RSP],EAX
    call _D5mylib7MyStuff4buzzMFZi@PC32
    mov qword ptr 060h[RSP],offset FLAT:.str1@32S
    mov qword ptr 058h[RSP],5
    mov RDI,058h[RSP]
    mov ECX,EDI
    mov 048h[RSP],ECX
    mov ECX,048h[RSP]
    mov 050h[RSP],ECX
    mov RDI,020h[RSP]
    mov 010h[RSP],EAX
    call _D5mylib7MyStuff4fuzzMFZi@PC32
    lea RDI,050h[RSP]
    mov 0Ch[RSP],EAX
    call _D5mylib7MyStuff4buzzMFZi@PC32
    mov EDI,038h
    mov 8[RSP],EAX
    call _D5mylib12nonTemplatedFiZi@PC32
    mov ECX,0
    mov 4[RSP],EAX
    mov EAX,ECX
    add RSP,068h
    ret

With mylib.d passed on the command line:

_Dmain:
    mov EAX,0
    mov qword ptr -038h[RSP],offset FLAT:.str@32S
    mov qword ptr -040h[RSP],5
    mov ECX,-040h[RSP]
    mov -050h[RSP],ECX
    mov -048h[RSP],ECX
    mov qword ptr -8[RSP],offset FLAT:.str1@32S
    mov qword ptr -010h[RSP],5
    mov RDX,-010h[RSP]
    mov ECX,EDX
    mov -020h[RSP],ECX
    mov ECX,-020h[RSP]
    mov -018h[RSP],ECX
    mov dword ptr -024h[RSP],038h
    mov ECX,-024h[RSP]
    mov -030h[RSP],ECX
    mov ECX,-030h[RSP]
    mov -028h[RSP],ECX
    ret
Comment #6 by k.hara.pg — 2013-09-15T19:33:24Z
The root cause is in mars.c around line 1580.

https://github.com/D-Programming-Language/dmd/blob/a5086fa49c5cd236297584c07e03be8e52208158/src/mars.c#L1579

    if (global.params.useInline)
    {
        /* The problem with useArrayBounds and useAssert is that the
         * module being linked to may not have generated them, so if
         * we inline functions from those modules, the symbols for them will
         * not be found at link time.
         * We must do this BEFORE generating the .deps file!
         */
        if (!global.params.useArrayBounds && !global.params.useAssert)
        {
            // Do pass 3 semantic analysis on all imported modules,
            // since otherwise functions in them cannot be inlined
            for (size_t i = 0; i < Module::amodules.dim; i++)
            {
                m = Module::amodules[i];
                if (global.params.verbose)
                    printf("semantic3 %s\n", m->toChars());
                m->semantic3();
            }
            if (global.errors)
                fatal();
        }
    }

To turn off both the useArrayBounds and useAssert flags, `-release -noboundscheck` must be specified on the command line.
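
For illustration, the gate described in this comment can be modelled as a small predicate. This is a simplified sketch, not the dmd source: `Params`, its plain boolean fields, and `importsGetSemantic3` are hypothetical names introduced here, and the real flags in dmd may not be plain booleans.

```cpp
// Simplified model of the check quoted from mars.c; NOT the actual dmd code.
// `Params` and `importsGetSemantic3` are hypothetical names for illustration.
struct Params {
    bool useInline;       // inlining requested (-inline)
    bool useArrayBounds;  // array bounds checks enabled
    bool useAssert;       // assert checks enabled
};

// True when imported modules would receive the semantic3 pass, which the
// quoted source comment says is needed before their functions can be inlined.
constexpr bool importsGetSemantic3(Params p) {
    return p.useInline && !p.useArrayBounds && !p.useAssert;
}

// Inlining requested, but both checks still enabled: imports are skipped.
static_assert(!importsGetSemantic3(Params{true, true, true}), "");
// Inlining requested and both checks disabled: imports are analysed.
static_assert(importsGetSemantic3(Params{true, false, false}), "");
```

Under this model, requesting inlining is not enough on its own; either check being left enabled keeps imported modules out of the semantic3 pass, which matches the missing `inline scan` and `semantic3` lines for mylib in the logs from comment #0.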
Comment #7 by k.hara.pg — 2013-09-15T23:26:10Z
Comment #8 by code — 2013-12-06T15:15:44Z
The problem I see is that we don't want to emit code for everything imported, as it would render separate compilation useless. A form of LTO seems like the much more interesting direction to me.
Comment #9 by github-bugzilla — 2013-12-27T17:06:14Z
Comment #10 by github-bugzilla — 2014-05-17T17:59:39Z
Commits pushed to master at https://github.com/D-Programming-Language/dmd

https://github.com/D-Programming-Language/dmd/commit/9cf4601702e24250dff3a61b510bae30a12eb8ae
fix Issue 10985 - Compiler doesn't attempt to inline non-templated functions from libraries (even having the full source)

https://github.com/D-Programming-Language/dmd/commit/c2794ce23737cc4936762a08be7ccfd9e40e025d
Merge pull request #2561 from 9rnsr/fix10985
Issue 10985 - Compiler doesn't attempt to inline non-templated functions from libraries (even having the full source)