Bug 18928 – extern(C++) bad codegen, wrong calling convention

Status
RESOLVED
Resolution
FIXED
Severity
critical
Priority
P1
Component
dmd
Product
D
Version
D2
Platform
x86_64
OS
Windows
Creation time
2018-05-31T18:01:47Z
Last change time
2018-06-10T19:04:37Z
Keywords
C++, industry
Assigned to
No Owner
Creator
Manu

Attachments

IDFilenameSummaryContent-TypeSize
1698ConsoleApp1.zipTest appapplication/x-zip-compressed4033

Comments

Comment #0 by turkeyman — 2018-05-31T18:01:47Z
Created attachment 1698 Test app Crashing from bad codegen, uses wrong registers passing arguments. Test calls a static function and a virtual function. Static function works as expected, virtual call crashes. Open the project in VS2015 (others may work) and build + run. Place a breakpoint in the virtual `addUser`function and notice that it didn't receive its argument correctly, and also fails to return its result correct. Static version of the function calls properly.
Comment #1 by turkeyman — 2018-05-31T19:21:25Z
I'm suspecting an issue with RVO. I think the MSC ABI has changed to guarantee RVO at some point in the last few years... but I'm not sure why the static call and virtual call appear to be called using different calling convention.
Comment #2 by bugzilla — 2018-06-01T03:08:46Z
Can you please post an example that doesn't require the IDDE? Just show the disassembly where the code is wrong, please.
Comment #3 by turkeyman — 2018-06-01T19:12:26Z
You'll have to relate these 2 lines to the .cpp file (attached) in main() Calling code in main(): // static call... auto x = addUser(test, "test"); 00007FF74843C587 lea rdx,[string "test" (07FF748448D44h)] 00007FF74843C58E mov rcx,qword ptr [test] 00007FF74843C592 call addUser (07FF74840C735h) 00007FF74843C597 mov dword ptr [rbp+104h],eax 00007FF74843C59D mov eax,dword ptr [rbp+104h] 00007FF74843C5A3 mov dword ptr [x],eax // virtual call... x = test->addUser("test"); 00007FF74843C5A6 mov rax,qword ptr [test] 00007FF74843C5AA mov rax,qword ptr [rax] 00007FF74843C5AD lea r8,[string "test" (07FF748448D44h)] 00007FF74843C5B4 lea rdx,[rbp+124h] // what is? RVO return address? 00007FF74843C5BB mov rcx,qword ptr [test] 00007FF74843C5BF call qword ptr [rax] 00007FF74843C5C1 mov eax,dword ptr [rax] 00007FF74843C5C3 mov dword ptr [x],eax This is the C++ code calling into D; it appears C++ makes a static call differently to a virtual call? I suspect RDX in the virtual call is the RVO return address? I wonder why the static-call doesn't follow the same call convention? I can attach the DMD disassembly? But it's probably easier to build that yourself. This might reveal that the VC ABI has RVO rules that we don't understand?
Comment #4 by turkeyman — 2018-06-01T19:14:51Z
If you look at the disassembly of userdatabase.d, you'll see that the virtual function is not expecting the string in R8 where VC is putting it. It also doesn't appear to return via the RVO pointer(?) in RDX. The static call matches D's expectations.
Comment #5 by kinke — 2018-06-01T20:34:10Z
(In reply to Manu from comment #3) > This might reveal that the VC ABI has RVO rules that we don't understand? Or rather that their C++ ABI apparently enforces sret/RVO for methods returning structs, even such trivial 32-bit PODs, but doesn't for normal functions. Interesting findings; I'll verify them.
Comment #6 by turkeyman — 2018-06-01T21:27:31Z
I recall a few years ago reading some article about MSVC breaking their ABI to guarantee RVO, because C++11 and move semantics and all that. I guess it needed to be efficient and therefore justified guaranteeing RVO where previously it was an optimisation. It's probably the case that extern(C++) was implemented before C++11 made it to MSVC.
Comment #7 by kinke — 2018-06-01T22:42:41Z
Well it doesn't make too much sense, and even less so to do it for methods only; I rather think this is some legacy COM compatibility stuff. I did a few tests, and it really seems to be specific to methods (instance member functions), which return every struct via sret/RVO, not just non-PODs as regular functions, incl. class member functions. So this is essentially a duplicate of https://issues.dlang.org/show_bug.cgi?id=16987; I thought it was COM-specific back then. I tried fixing it for LDC, but that's sadly not trivial, as the ABI stuff doesn't operate on FuncDeclaration but on TypeFunction, which contains the calling convention but not whether it's a method.
Comment #8 by turkeyman — 2018-06-01T23:31:11Z
So, probability of a prompt fix? Is this just a matter of not having correct information up front, or is the fix complicated by some issue? We have an evaluation here that's kinda been shattered by this, but a fast fix would really help recover the situation if it's possible.
Comment #9 by kinke — 2018-06-02T02:36:51Z
Well I gave it a shot and came up with an ugly hack which might just work, the first test results are promising: https://github.com/ldc-developers/ldc/pull/2720
Comment #10 by turkeyman — 2018-06-04T07:04:24Z
Awesome sauce... sadly, for the immediate moment, I really need a fix in DMD >_<
Comment #11 by r.sagitario — 2018-06-04T07:35:27Z
Comment #12 by github-bugzilla — 2018-06-10T19:04:37Z
Commits pushed to master at https://github.com/dlang/dmd https://github.com/dlang/dmd/commit/20ffec71c10a6f6e1651503dbb63cb1a7015cb20 fix issue 18928 - extern(C++) bad codegen on win64, wrong calling convention With the VC ABI, a member function always returns a struct on the stack no matter its size https://github.com/dlang/dmd/commit/3a79629988efd51d4dda9edb38a6701cd097da89 Merge pull request #8330 from rainers/issue_18928 fix issue 18928 - extern(C++) bad codegen on win64, wrong calling convention merged-on-behalf-of: Mathias LANG <[email protected]>