Bug 21690 – Unable to dynamic cast extern(C++) classes

Status
NEW
Severity
critical
Priority
P2
Component
dmd
Product
D
Version
D2
Platform
x86_64
OS
Linux
Creation time
2021-03-08T20:32:14Z
Last change time
2024-12-13T19:15:01Z
Keywords
C++, safe
Assigned to
No Owner
Creator
thomas.bockman
See also
https://issues.dlang.org/show_bug.cgi?id=23957
Moved to GitHub: dmd#19881 →

Comments

Comment #0 by thomas.bockman — 2021-03-08T20:32:14Z
Since 2.067.1, the program below outputs: D cast was dynamic. C++ cast was static. The extern(C++) class casts need to be dynamic, just like the extern(D) casts: /////////////////////////////////////////////// module app; import std.stdio; @safe: extern(D) { abstract class DA { char foo(); } class DB : DA { override char foo() { return 'B'; } } class DC : DA { override char foo() { return 'C'; } } } extern(C++) { abstract class CA { char foo(); } class CB : CA { override char foo() { return 'b'; } } class CC : CA { override char foo() { return 'c'; } } } void main() { DB db = new DB; DA da = db; DC dc = cast(DC) da; writeln((dc is null)? "D cast was dynamic." : "D cast was static."); CB cb = new CB; CA ca = cb; CC cc = cast(CC) ca; writeln((cc is null)? "C++ cast was dynamic." : "C++ cast was static."); return; } /////////////////////////////////////////////// This is a critical bug, as the results of accessing an incorrectly typed class object could, in the general case, be almost anything, including memory corruption.
Comment #1 by kinke — 2021-03-08T23:38:05Z
The C++ vtable emitted by D does not contain any pointer to any typeinfo AFAIK. Of course we don't generate the C++ typeinfo anyway, but we do generate the D one (for use with the D GC). At least some C++ implementations store the (C++) typeinfo pointer at vtbl index -1. But then again, when we are confronted with some C++ class ref, we have no idea whether it was instantiated on the D side or C++ side and so which vtable its vptr points to. So while getting proper dynamic casts working is probably quite hard, we could probably disallow downcasts like this and require an explicit static cast (`cast(CC) cast(void*) ca`)?
Comment #2 by thomas.bockman — 2021-03-09T00:16:20Z
(In reply to kinke from comment #1) > So while getting proper dynamic casts working is probably quite hard, we > could probably disallow downcasts like this and require an explicit static > cast (`cast(CC) cast(void*) ca`)? That would be the bare minimum, to prevent people from unknowingly shooting themselves in the foot with this. It's an egregious hole in @safe as it stands now, a violation of the principle of least surprise even in @system code, and potentially a major security issue anywhere. So, my recommendation is to make extern(C++) dynamic casts a compile-time error as soon as possible, and notify potential users through the announce forum that anyone who might have dynamic extern(C++) class casts in their code needs to update their compiler as soon as possible to identify latent security issues. Once that has been done the importance of this bug can be dropped down from "critical" to "enhancement". However, it would be really nice to get proper dynamic casts working somehow, as they are a rather fundamental feature of class systems, and extern(C++) is required for C++ interop, and for -betterC (my use case). Additionally, people sometimes prefer extern(C++) classes simply because they are lighter weight than extern(D) without the monitor pointer and certain virtual functions which are unwanted for attribute reasons: https://forum.dlang.org/post/[email protected]
Comment #3 by thomas.bockman — 2021-03-09T00:31:59Z
(In reply to kinke from comment #1) > The C++ vtable emitted by D does not contain any pointer to any typeinfo > AFAIK. Of course we don't generate the C++ typeinfo anyway, but we do > generate the D one (for use with the D GC). At least some C++ > implementations store the (C++) typeinfo pointer at vtbl index -1. But then > again, when we are confronted with some C++ class ref, we have no idea > whether it was instantiated on the D side or C++ side and so which vtable > its vptr points to. Does that mean that passing a D-initialized extern(C++) class instance to C++ code breaks dynamic casting on the C++ side, too? If so, that is a critical issue, as well. Given that all current D compilers are also C++ compilers, would it be possible to just auto-generate an actual C++ function that performs the dynamic_cast<>, and then call (and hopefully inline) that function from the D side?
Comment #4 by kinke — 2021-03-09T10:09:39Z
(In reply to thomas.bockman from comment #3) > Does that mean that passing a D-initialized extern(C++) class instance to > C++ code breaks dynamic casting on the C++ side, too? AFAIK, yes. > Given that all current D compilers are also C++ compilers They most definitely aren't - they are D compilers, with the ability to interop with C++ in many cases, but quite obviously not in all aspects. See point 33.1 and 33.13 on https://dlang.org/spec/cpp_interface.html.
Comment #5 by thomas.bockman — 2021-03-09T18:54:18Z
(In reply to kinke from comment #4) > (In reply to thomas.bockman from comment #3) > > Given that all current D compilers are also C++ compilers > > They most definitely aren't - they are D compilers, with the ability to > interop with C++ in many cases, but quite obviously not in all aspects. See > point 33.1 and 33.13 on https://dlang.org/spec/cpp_interface.html. I was thinking of the common heritage with dmc, clang, and g++, but I will, of course, take your word for it that my suggestion is not helpful.
Comment #6 by thomas.bockman — 2021-03-09T19:11:35Z
(In reply to kinke from comment #4) > (In reply to thomas.bockman from comment #3) > > Does that mean that passing a D-initialized extern(C++) class instance to > > C++ code breaks dynamic casting on the C++ side, too? > > AFAIK, yes. This is a very serious problem. It sounds like D will turn any attempt by C++ code to dynamically cast a D-allocated class instance into a buffer overrun-like bug by trying to access meta-data that isn't actually present, and also possibly by subsequently mis-typing the cast reference (like D does) if the program doesn't just crash in the first step. APIs generally do not advertise whether they attempt dynamic casts internally or not, so there is no way to know what C++ libraries are affected by this bug, short of auditing their source code, which may not even be publicly available. Making extern(C++) dynamic casts a compile-time error in D code is an acceptable, albeit disappointing, solution for D code. But, it won't solve the problem of D code poisoning linked C++ code. Is there ANY way to solve that problem, short of fully implementing this feature on the D side in a compatible way? The only other way I can think of would be to forbid initialization of extern(C++) classes on the D side entirely, at least in @safe code. I suspect that would be a massive and frustrating breaking change for many people, though.
Comment #7 by dfj1esp02 — 2021-03-10T09:07:59Z
Calypso and dpp are the only hybrid D/C++ compilers around. I guess, it won't be too difficult to put a null pointer in C++ typeinfo place as a sentinel.
Comment #8 by thomas.bockman — 2021-03-10T19:42:04Z
(In reply to anonymous4 from comment #7) > Calypso and dpp are the only hybrid D/C++ compilers around. I knew about Calypso, but couldn't remember its name. Thanks for reminding me. > I guess, it won't be too difficult to put a null pointer in C++ typeinfo > place as a sentinel. Good idea. If we can confirm by testing that this really turns unsafe dynamic_casts on the C++ side into null pointer exceptions / segmentation faults, then that plus a compile-time error on the D side should mitigate the security and memory safety issues well enough to drop this from "critical" to "enhancement".
Comment #9 by b2.temp — 2021-03-11T00:22:20Z
dynamic cast of c++ classes can work to a **very** limited extent that is if the target type is **exactly** the same (i.e not one of its derivate). This can be done by comparing the virtual table address of the instance with the virtual table address of the target type. --- extern(C++) class A { void f(){}} extern(C++) class B : A { override void f(){} } extern(C++) class C : A { } extern(C++) T dyncast(T, U)(U u) { const void* u_vtbl = *(cast(void**) u); const void* t_vtbl = &T.classinfo.vtbl[0]; return t_vtbl is u_vtbl ? cast(T) u : null; } void main(string[] args) { B b = new B; A a = b; C c = new C; assert( dyncast!(B)(a) ); assert( !dyncast!(B)(c) ); } --- But in now way D CastExp does that for now.
Comment #10 by thomas.bockman — 2021-03-11T02:17:03Z
(In reply to Basile-z from comment #9) > dynamic cast of c++ classes can work to a **very** limited extent that is if > the target type is **exactly** the same (i.e not one of its derivate). This > can be done by comparing the virtual table address of the instance with the > virtual table address of the target type. Cool! That actually covers a lot of use cases, and is better than nothing. Maybe I'll publish a dub package with a more fleshed-out version of it eventually. However, it seems a little too unpredictable and limited to be worth implementing in the compiler itself, given that kinke mentioned earlier that two instances of the same class may still end up pointing to different vtable addresses if they were instantiated on different sides of the language barrier: (kinke from comment #1) > But then again, when we are confronted with some C++ class ref, we have no idea > whether it was instantiated on the D side or C++ side and so which vtable its > vptr points to.
Comment #11 by b2.temp — 2021-03-11T08:27:17Z
Very hypothetically, let's say that if "super" calls are supported in C++ classes then this means that maybe the super virtual table is stored somewhere. If so then full dynamic casts would be possible. Someone needs to study a bit more the ABI of extern(C++) classes in D. Then maybe a dedicated dyncast() could be added to druntime.
Comment #12 by b2.temp — 2021-06-07T10:10:51Z
(In reply to Basile-z from comment #11) > Very hypothetically, let's say that if "super" calls are supported in C++ > classes then this means that maybe the super virtual table is stored > somewhere. If so then full dynamic casts would be possible. > this supplemental comment was BS. Obviously `super` does not use the vtable. > Someone needs to study a bit more the ABI of extern(C++) classes in D. > Then maybe a dedicated dyncast() could be added to druntime.
Comment #13 by b2.temp — 2021-06-07T10:40:49Z
(In reply to thomas.bockman from comment #10) > However, it seems a little too unpredictable and limited to be worth > implementing in the compiler itself, given that kinke mentioned earlier that > two instances of the same class may still end up pointing to different > vtable addresses if they were instantiated on different sides of the > language barrier: > > (kinke from comment #1) > > But then again, when we are confronted with some C++ class ref, we have no idea > > whether it was instantiated on the D side or C++ side and so which vtable its > > vptr points to. Major problem indeed.
Comment #14 by qs.il.paperinik — 2023-06-01T14:10:18Z
(In reply to kinke from comment #1) > So while getting proper dynamic casts working is probably quite hard, we > could probably disallow downcasts like this and require an explicit static > cast (`cast(CC) cast(void*) ca`)? In C++ terminology at least, this isn’t a `static_cast`, but a `reinterpret_cast`. It works when you only have single inheritance (counting interfaces as inheritance as well). As soon as any form of multiple inheritance (including interfaces) enters the picture, `cast(void*)` won’t do what you want. There are simple, but relevant, pointer value adjustments at play. Example: ```d import std.stdio; extern(C++) abstract class B { int x; abstract void f(); } extern(C++) interface I { void g(); } extern(C++) class D : B, I { override void f() => writeln("f"); override void g() => writeln("g"); } void main() { D d = new D; void* bptr = cast(void*) cast(B) d; void* iptr = cast(void*) cast(I) d; assert(bptr !is iptr); // passes void* vptr = cast(void*) d; B b = cast(B) vptr; I i = cast(I) vptr; b.f(); // prints f as expected i.g(); // should print g, on run.dlang.org, prints f as well } ``` The `extern(C++)` has little to do with it; with `extern(D)` the problems persist. As far as I know, for class handles, it’s syntactically impossible to express a C++ `static_cast` in D; it’s a `dynamic_cast` or – if you go via `void*` – it’s a `reinterpret_cast`. Of course, a derived-to-base `dynamic_cast` can be optimized to be a `static_cast`, but there’s no way to tell the compiler: I *know* this base-class object is in fact a derived object, so adjust the pointer accordingly (and don’t check TypeInfo or whatever and give me UB if I’m wrong). A simple solution would be to introduce `static cast(Type) UnaryExpression` for that. It would be valid only if `Type` and the type of `UnaryExpression` are both a class or interface type, which have a unique inheritance relationship.
Comment #15 by dlang-bot — 2023-06-03T10:59:20Z
@ntrel created dlang/dmd pull request #15293 "Disallow dynamic cast on extern(C++) class" mentioning this issue: - Disallow dynamic cast on extern(C++) class See Issue 21690 - Unable to dynamic cast extern(C++) classes. https://github.com/dlang/dmd/pull/15293
Comment #16 by robert.schadek — 2024-12-13T19:15:01Z
THIS ISSUE HAS BEEN MOVED TO GITHUB https://github.com/dlang/dmd/issues/19881 DO NOT COMMENT HERE ANYMORE, NOBODY WILL SEE IT, THIS ISSUE HAS BEEN MOVED TO GITHUB