Bug 8887 – static arrays passed by value in extern C/C++ functions should not compile

Status
REOPENED
Severity
critical
Priority
P2
Component
dmd
Product
D
Version
D2
Platform
All
OS
All
Creation time
2012-10-24T13:01:56Z
Last change time
2024-12-13T18:02:05Z
Keywords
accepts-invalid, industry
Assigned to
Andrej Mitrovic
Creator
Andrej Mitrovic
Moved to GitHub: dmd#18480 →

Comments

Comment #0 by andrej.mitrovich — 2012-10-24T13:01:56Z
extern(C) void fail(int[4] x); extern(C) int[4] fail2(); extern(C++) void fail3(int[4] x); extern(C++) int[4] fail4(); These should fail since C/C++ compilers always pass arrays by pointer to first element.
Comment #1 by andrej.mitrovich — 2012-11-04T16:32:34Z
Comment #2 by bearophile_hugs — 2012-11-04T16:48:28Z
From Walter: https://github.com/D-Programming-Language/dmd/pull/1215#issuecomment-9770031 > This change breaks existing code, What existing code? (And the D1 code that relies on is now broken). > requires an awkward workaround for existing uses, Even if they exist they are niche usages, right? It's better to keep the D semantics saner/cleaner. As Python Zen says, few special cases are not special enough to justify this hole in the D type system. In D it's important not just to fix holes in the design of C, but also holes coming from changes between D1 and D2. I think this type system hole was an oversight while D2 fixed-sized array semantics argument pass changed. > and has only a marginal benefit. The benefit is helping all future C programmers that don't remember that D passes fixed-sized arrays by value, for years and years to come. From my experience I've seen that lot of people don't remember that. Those C programmers will write extern(C) and they will come in the D.learn group to ask why their code doesn't work. I will try to keep count how many such posts will appear in D.learn. One of the design rules of D is that D acts as C or it gives an error. In this case it silently passes wrong data to C functions, so here there's a breaking of the D design rules. So I think this hole should be fixed.
Comment #3 by public — 2014-11-07T16:11:13Z
Reopining it and marking as critical regression issue. Also with "industry" keyword. This is regression from D1 behaviour that makes all of our existing extern(C) bindings segfault when compiled with D2 compiler. Only workaroun possible to replace it with pointer arguments everywhere, losing important chunk of type safety in process. However I don't like compilation failure here either. extern(C) functions must comply to C ABI and thus pass static arrays by pointer even if normal D functions do that by value. Walter original reasoning about breaking original code is completely unapplicable here as any code that currently compiles with this pattern will segfault at runtime anyway. There can't be any legit working code out there to break.
Comment #4 by schveiguy — 2014-11-07T16:24:29Z
(In reply to Dicebot from comment #3) > However I don't like compilation failure here either. extern(C) functions > must comply to C ABI and thus pass static arrays by pointer even if normal D > functions do that by value. Wouldn't ref work? It seems to work for me. -Steve
Comment #5 by schveiguy — 2014-11-07T16:25:46Z
(In reply to Steven Schveighoffer from comment #4) > (In reply to Dicebot from comment #3) > > > However I don't like compilation failure here either. extern(C) functions > > must comply to C ABI and thus pass static arrays by pointer even if normal D > > functions do that by value. > > Wouldn't ref work? It seems to work for me. Sorry, I quoted the wrong part of your post, it should have been: (In reply to Dicebot from comment #3) > This is regression from D1 behaviour that makes all of our existing > extern(C) bindings segfault when compiled with D2 compiler. Only workaroun > possible to replace it with pointer arguments everywhere, losing important > chunk of type safety in process.
Comment #6 by bearophile_hugs — 2014-11-07T17:26:36Z
(In reply to Dicebot from comment #3) > However I don't like compilation failure here either. extern(C) functions > must comply to C ABI and thus pass static arrays by pointer even if normal D > functions do that by value. I think that a compilation failure is better because it avoids to introduce a special case in the D language. Special cases cause problems, even when they look a little handy.
Comment #7 by public — 2014-11-07T19:23:22Z
The fact that `ref` parameters are allowed for extern(C) functions is even more bizzare than the fact that static array arguments crash programs. Need to check if it works, I would have never guessed to even try it. In general we have "looks like C, works like C" slogan. My point is that this is especially important with function _explicitly_ marked as "should work like C" - any mismatch between extern(C) ABI and real C ABI is an important bug.
Comment #8 by public — 2014-11-07T19:25:08Z
(In reply to bearophile_hugs from comment #6) > I think that a compilation failure is better because it avoids to introduce > a special case in the D language. Special cases cause problems, even when > they look a little handy. There is no special case here - extern(C) by definition must implement C ABI of argument passing. It is simply out of D language domain ones arguments are passed.
Comment #9 by bearophile_hugs — 2014-11-07T19:33:23Z
(In reply to Dicebot from comment #8) > There is no special case here - extern(C) by definition must implement C ABI > of argument passing. It is simply out of D language domain ones arguments > are passed. D fixed-size arrays are values. C language lacks those values, so I think D doesn't have a C ABI to respect.
Comment #10 by public — 2014-11-07T19:36:11Z
C has static arrays. C allows passing static arrays as function arguments. HOW it is passing them is defined by ABI and D rules are irrelevant here.
Comment #11 by schveiguy — 2014-11-07T20:10:39Z
The issue here is -- we use the non-mangling "feature" of C calls to override type checking inside druntime. So even if something is extern(C) it can actually be implemented in D. That function may never need to be called or used in C code at all. Why shouldn't D support ref for C? All it is doing is auto-taking the address of the parameter, which maps perfectly to accepting a pointer or array argument. I think the important pieces of the C ABI to implement are where parameters go, and the lack of name mangling. Other than that, how to pass certain types is more fuzzy. As another example, on a 32-bit system, C's long is 32-bit. But D's long is 64 bit. What to do here? extern(C) foo(long x); Clearly, it's not too important the type of x, but how big it actually is. This leads one to simply type it as int instead, and move on. This hasn't really caused tremendous issues or difficulties. I think really the crux of the push for this bug is more that D2 behaves differently vs. D1 than D2 implements the C ABI incorrectly. Note, I remember the ref trick when we moved to passing static arrays by value, for the system call pipe. This takes an int[2]. To switch this to a pointer not only breaks existing code, but we lose the type requirement that it should be exactly 2 elements wide. The ref solution worked, so we employed it, and I think it was the right move.
Comment #12 by public — 2014-11-07T20:21:46Z
(In reply to Steven Schveighoffer from comment #11) > The issue here is -- we use the non-mangling "feature" of C calls to > override type checking inside druntime. So even if something is extern(C) it > can actually be implemented in D. That function may never need to be called > or used in C code at all. This is abuse and needs to be fixed. We have pragma(mangle) to override mangling. http://dlang.org/attribute.html#linkage is pretty clear on legitimate use cases this feature was designed for. > Why shouldn't D support ref for C? All it is doing is auto-taking the > address of the parameter, which maps perfectly to accepting a pointer or > array argument. Exactly because it is special case - it takes a language feature that does not have clear mapping to C feature and makes use of ABI details to work it as-is. It is not bad on its own but quite puzzling. > As another example, on a 32-bit system, C's long is 32-bit. But D's long is > 64 bit. What to do here? > > extern(C) foo(long x); Ideally this should use C interpretation too but I recognize how it is totally impractical. This is not the case for static array arguments. > Note, I remember the ref trick when we moved to passing static arrays by > value, for the system call pipe. This takes an int[2]. To switch this to a > pointer not only breaks existing code, but we lose the type requirement that > it should be exactly 2 elements wide. The ref solution worked, so we > employed it, and I think it was the right move. You call tweaking own code (that makes use of undefined language area) to address compiler regression a right move?
Comment #13 by schveiguy — 2014-11-07T22:26:15Z
(In reply to Dicebot from comment #12) > (In reply to Steven Schveighoffer from comment #11) > > The issue here is -- we use the non-mangling "feature" of C calls to > > override type checking inside druntime. So even if something is extern(C) it > > can actually be implemented in D. That function may never need to be called > > or used in C code at all. > > This is abuse and needs to be fixed. We have pragma(mangle) to override > mangling. I wasn't aware that we had a C mangling feature. We should change this ASAP, as I don't like the idea of specifying C functions for the sole purpose of escaping typechecking. If we fixed this, perhaps you have a better case for changing this. > http://dlang.org/attribute.html#linkage is pretty clear on legitimate use > cases this feature was designed for. I don't think that's an exhaustive list of the uses. It only identifies one use -- calling existing C or C++ functions. But it can also be used to *implement* functions that are called from C (or D) code. When you implement an extern(C) function in D code, how should it treat a static array parameter? I think it would be more confusing to have this be different based on the extern(C). > > Why shouldn't D support ref for C? All it is doing is auto-taking the > > address of the parameter, which maps perfectly to accepting a pointer or > > array argument. > > Exactly because it is special case - it takes a language feature that does > not have clear mapping to C feature and makes use of ABI details to work it > as-is. It is not bad on its own but quite puzzling. ref is sort of a weird thing to begin with. It automatically takes the address of the parameter, and passes that to the function, then the function dereferences it at every use. It's syntax sugar more than a storage class. And it extends beyond the ABI, because the entire function has to deal with it. > > As another example, on a 32-bit system, C's long is 32-bit. But D's long is > > 64 bit. What to do here? > > > > extern(C) foo(long x); > > Ideally this should use C interpretation too but I recognize how it is > totally impractical. This is not the case for static array arguments. I see it as very similar. Two different interpretations of the same type keyword, in one language you have to tweak it a bit to fit the other, intransigent, existing language. > You call tweaking own code (that makes use of undefined language area) to > address compiler regression a right move? It's not a regression if the breakage is intentional. It was not an accident that the compiler changed static arrays from being passed by reference to passed by value.
Comment #14 by aldacron — 2014-11-08T10:37:55Z
(In reply to Dicebot from comment #7) > The fact that `ref` parameters are allowed for extern(C) functions is even > more bizzare than the fact that static array arguments crash programs. Need > to check if it works, I would have never guessed to even try it. > This is documented. See the section "Passing D Array Arguments to C Functions" at http://dlang.org/interfaceToC.html. If anyone wants examples of existing code that will be broken by this, see Derelict. DerelictODE in particular, in which the matrix and vertex types are typedefed on the C side as fixed-size arrays and passed as such to all the functions that use them. Personally, I don't see the problem with prototyping such C functions with ref. I hope it *doesn't* change.
Comment #15 by public — 2014-11-08T14:29:36Z
(In reply to Steven Schveighoffer from comment #13) > > This is abuse and needs to be fixed. We have pragma(mangle) to override > > mangling. > > I wasn't aware that we had a C mangling feature. We should change this ASAP, > as I don't like the idea of specifying C functions for the sole purpose of > escaping typechecking. If we fixed this, perhaps you have a better case for > changing this. Are there any legit extern(C) declaration in druntime at all or everything needs to be updated? I'll do the PR.
Comment #16 by public — 2014-11-08T14:35:13Z
(In reply to Mike Parker from comment #14) > (In reply to Dicebot from comment #7) > > The fact that `ref` parameters are allowed for extern(C) functions is even > > more bizzare than the fact that static array arguments crash programs. Need > > to check if it works, I would have never guessed to even try it. > > > > This is documented. See the section "Passing D Array Arguments to C > Functions" at http://dlang.org/interfaceToC.html. I don't think complementary articles can be considered part of main spec - need to copy that definitions to `extern(C)` spec itself or at least place a direct link there. > If anyone wants examples of existing code that will be broken by this, see > Derelict. DerelictODE in particular, in which the matrix and vertex types > are typedefed on the C side as fixed-size arrays and passed as such to all > the functions that use them. Personally, I don't see the problem with > prototyping such C functions with ref. I hope it *doesn't* change. I am not proposing to prohibit ref-static-arrays, it would have been too much of a breakage indeed. But situation with plain static array arguments does sound too error-prone and impractical too keep. Think about it that way - because of this issue you can't just take C header and tweak it until it compiles as D code - it can still segfault at runtime after. I will lower importance/priority though because of mentioned workaround with `ref` that makes it possible to preserve API safety
Comment #17 by aldacron — 2014-11-08T16:22:07Z
(In reply to Dicebot from comment #16) > I am not proposing to prohibit ref-static-arrays, it would have been too > much of a breakage indeed. But situation with plain static array arguments > does sound too error-prone and impractical too keep. Think about it that way > - because of this issue you can't just take C header and tweak it until it > compiles as D code - it can still segfault at runtime after. > Yes, I see. I got things mixed up. No argument there. That actually was a problem for me before I realized that ref worked.
Comment #18 by schveiguy — 2014-11-09T18:52:05Z
(In reply to Dicebot from comment #16) > I am not proposing to prohibit ref-static-arrays, it would have been too > much of a breakage indeed. But situation with plain static array arguments > does sound too error-prone and impractical too keep. Think about it that way > - because of this issue you can't just take C header and tweak it until it > compiles as D code - it can still segfault at runtime after. If we can supersede druntime's usage of extern(C) for name mangling, I think I'm OK with making any extern(C) function that accepts a non-ref static array an error. (In reply to Dicebot from comment #15) > Are there any legit extern(C) declaration in druntime at all or everything > needs to be updated? I'll do the PR. I think it's for historical reasons mainly -- the original version of phobos (before druntime) had C implementations of some things, so the compiler always hooked with C mangling. But I think this is a good question to ask the whole community. It could very well be that some people call the functions from C. However, I'm fairly certain most of the extern(C) declarations are simply for name mangling and getting around attribute requirements where needed.
Comment #19 by bugzilla — 2014-11-10T20:30:27Z
Comment #20 by razvan.nitu1305 — 2022-07-14T12:50:20Z
(In reply to Andrej Mitrovic from comment #0) > extern(C) void fail(int[4] x); > extern(C) int[4] fail2(); > extern(C++) void fail3(int[4] x); > extern(C++) int[4] fail4(); > > These should fail since C/C++ compilers always pass arrays by pointer to > first element. The last 2 declarations fails in today's compiler. I guess we can do the same for the first 2.
Comment #21 by robert.schadek — 2024-12-13T18:02:05Z
THIS ISSUE HAS BEEN MOVED TO GITHUB https://github.com/dlang/dmd/issues/18480 DO NOT COMMENT HERE ANYMORE, NOBODY WILL SEE IT, THIS ISSUE HAS BEEN MOVED TO GITHUB