First smallish test case that outlines the problem:
// test_inline.d
import mylib;
void main()
{
    templated("abcde");
    templated("abcde"w);
    nonTemplated(56);
}
// a module (mylib.d)
module mylib;
struct MyStuff{
    int value;
    @property int fuzz(){
        return value*3/2;
    }
    @property int buzz(){
        return value*2/3;
    }
}
int templated(S)(S string)
{
    auto myf = MyStuff(string.length);
    return myf.fuzz*myf.buzz;
}
int nonTemplated(int k)
{
    auto myf = MyStuff(k);
    return myf.fuzz + 2 * myf.buzz;
}
Observe in the logs below that the library's functions are not scanned for inlining when only the .lib is passed, even though the full source is right there and is used for the import. This severely cripples Phobos performance.
Command line:
C:\dmd2\src\phobos>dmd -lib mylib.d
C:\dmd2\src\phobos>dmd -v -inline test_inline.d mylib.lib
binary C:\dmd2\windows\bin\dmd.exe
version v2.064
config C:\dmd2\windows\bin\sc.ini
parse test_inline
importall test_inline
import object (C:\dmd2\windows\bin\..\..\src\druntime\import\object.di)
import mylib (mylib.d)
semantic test_inline
entry main test_inline.d
semantic2 test_inline
semantic3 test_inline
inline scan test_inline
code test_inline
function D main
function test_inline.main
function mylib.templated!string.templated
function mylib.templated!(immutable(wchar)[]).templated
C:\dmd2\windows\bin\link.exe test_inline,,nul,"mylib.lib"+user32+kernel32/noi;
C:\dmd2\src\phobos>dmd -v -inline test_inline.d mylib.d
binary C:\dmd2\windows\bin\dmd.exe
version v2.064
config C:\dmd2\windows\bin\sc.ini
parse test_inline
parse mylib
importall test_inline
import object (C:\dmd2\windows\bin\..\..\src\druntime\import\object.di)
importall mylib
semantic test_inline
entry main test_inline.d
semantic mylib
semantic2 test_inline
semantic2 mylib
semantic3 test_inline
semantic3 mylib
inline scan test_inline
inline scan mylib
code test_inline
function D main
function test_inline.main
code mylib
function mylib.MyStuff.fuzz
function mylib.MyStuff.buzz
function mylib.nonTemplated
function mylib.templated!string.templated
function mylib.templated!(immutable(wchar)[]).templated
C:\dmd2\windows\bin\link.exe test_inline,,nul,user32+kernel32/noi;
Comment #1 by braddr — 2013-09-07T16:27:03Z
I suspect this is a regression, but haven't yet made the time to try some older releases. Any volunteers?
Comment #2 by dmitry.olsh — 2013-09-07T17:48:45Z
(In reply to comment #1)
> I suspect this is a regression, but haven't yet made the time to try some older
> releases. Any volunteers?
I tested about a dozen old releases.
Either it broke many times in between, or it has never worked, going back at least to 2.030 (where I had to remove the @property annotations).
E.g. the 2.030 output here is exactly the same (just with less info):
D:\D\inline_test>dmd -v -inline test_inline.d mylib.lib
parse test_inline
semantic test_inline
import object (C:\dmd\windows\bin\..\..\src\druntime\import\object.di)
import mylib (mylib.d)
semantic2 test_inline
semantic3 test_inline
inline scan test_inline
code test_inline
function main
function templated
function templated
C:\dmd\windows\bin\link.exe test_inline,,,"mylib.lib"+user32+kernel32/noi;
D:\D\inline_test>dmd -v -inline test_inline.d mylib.d
parse test_inline
parse mylib
semantic test_inline
import object (C:\dmd\windows\bin\..\..\src\druntime\import\object.di)
semantic mylib
semantic2 test_inline
semantic2 mylib
semantic3 test_inline
semantic3 mylib
inline scan test_inline
inline scan mylib
code test_inline
function main
function templated
function templated
code mylib
function fuzz
function buzz
function nonTemplated
C:\dmd\windows\bin\link.exe test_inline,,,user32+kernel32/noi;
Comment #3 by dmitry.olsh — 2013-09-08T07:48:59Z
(In reply to comment #1)
> I suspect this is a regression, but haven't yet made the time to try some older
> releases.
What's horrible is that the same behaviour propagates to other compilers, for instance LDC.
I repeated the same commands and checked with valgrind what gets called in both cases. The result is the same: if the library is linked, there are function calls to these names; if the module is passed explicitly as source, they don't show up.
It looks like a deep problem with the frontend in general.
Comment #4 by braddr — 2013-09-08T10:28:43Z
Neither GDC nor LDC rely on the frontend inliner. To make sure the code is or isn't inlined properly, you really need to check the resulting generated code.
Comment #5 by dmitry.olsh — 2013-09-08T13:33:31Z
(In reply to comment #4)
> Neither GDC nor LDC rely on the frontend inliner. To make sure the code is or
> isn't inlined properly, you really need to check the resulting generated code.
Well, for me a full callgraph is as good evidence as any. Anyhow, the 2 disassembled mains for LDC are below.
The problem, like I said, lies deeper: it isn't the fault of DMD's _inliner_, but rather that the available code is not passed to it. Hence the absence of these functions in DMD's inline scan; they are *not presented* to it at all. Ditto with other compilers.
With libmylib.a passed on the command line:
_Dmain:
sub RSP,068h
lea RDI,050h[RSP]
lea RAX,030h[RSP]
mov qword ptr 040h[RSP],offset FLAT:.str@32S
mov qword ptr 038h[RSP],5
mov ECX,038h[RSP]
mov 028h[RSP],ECX
mov 030h[RSP],ECX
mov 020h[RSP],RDI
mov RDI,RAX
mov 018h[RSP],RAX
call _D5mylib7MyStuff4fuzzMFZi@PC32
mov RDI,018h[RSP]
mov 014h[RSP],EAX
call _D5mylib7MyStuff4buzzMFZi@PC32
mov qword ptr 060h[RSP],offset FLAT:.str1@32S
mov qword ptr 058h[RSP],5
mov RDI,058h[RSP]
mov ECX,EDI
mov 048h[RSP],ECX
mov ECX,048h[RSP]
mov 050h[RSP],ECX
mov RDI,020h[RSP]
mov 010h[RSP],EAX
call _D5mylib7MyStuff4fuzzMFZi@PC32
lea RDI,050h[RSP]
mov 0Ch[RSP],EAX
call _D5mylib7MyStuff4buzzMFZi@PC32
mov EDI,038h
mov 8[RSP],EAX
call _D5mylib12nonTemplatedFiZi@PC32
mov ECX,0
mov 4[RSP],EAX
mov EAX,ECX
add RSP,068h
ret
With mylib.d passed on the command line:
_Dmain:
mov EAX,0
mov qword ptr -038h[RSP],offset FLAT:.str@32S
mov qword ptr -040h[RSP],5
mov ECX,-040h[RSP]
mov -050h[RSP],ECX
mov -048h[RSP],ECX
mov qword ptr -8[RSP],offset FLAT:.str1@32S
mov qword ptr -010h[RSP],5
mov RDX,-010h[RSP]
mov ECX,EDX
mov -020h[RSP],ECX
mov ECX,-020h[RSP]
mov -018h[RSP],ECX
mov dword ptr -024h[RSP],038h
mov ECX,-024h[RSP]
mov -030h[RSP],ECX
mov ECX,-030h[RSP]
mov -028h[RSP],ECX
ret
Comment #6 by k.hara.pg — 2013-09-15T19:33:24Z
The root cause is in mars.c around line 1580.
https://github.com/D-Programming-Language/dmd/blob/a5086fa49c5cd236297584c07e03be8e52208158/src/mars.c#L1579
if (global.params.useInline)
{
    /* The problem with useArrayBounds and useAssert is that the
     * module being linked to may not have generated them, so if
     * we inline functions from those modules, the symbols for them will
     * not be found at link time.
     * We must do this BEFORE generating the .deps file!
     */
    if (!global.params.useArrayBounds && !global.params.useAssert)
    {
        // Do pass 3 semantic analysis on all imported modules,
        // since otherwise functions in them cannot be inlined
        for (size_t i = 0; i < Module::amodules.dim; i++)
        {
            m = Module::amodules[i];
            if (global.params.verbose)
                printf("semantic3 %s\n", m->toChars());
            m->semantic3();
        }
        if (global.errors)
            fatal();
    }
}
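To turn off both the useArrayBounds and useAssert flags, it is necessary to specify `-release -noboundscheck` on the command line. With only `-inline`, the inner condition is false, so imported modules never get semantic3 and their functions cannot be inlined.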
The problem I see is that we don't want to emit code for everything imported, as that would render separate compilation useless. A form of LTO seems like a much more interesting direction to me.
Comment #9 by github-bugzilla — 2013-12-27T17:06:14Z