Bug 10763 – (&x)[0 .. 1] doesn't work in CTFE

Status: NEW
Severity: enhancement
Priority: P4
Component: dmd
Product: D
Version: D2
Platform: All
OS: All
Creation time: 2013-08-05T15:42:46Z
Last change time: 2024-12-13T18:10:08Z
Keywords: CTFE
Assigned to: No Owner
Creator: Nils
Moved to GitHub: dmd#18644

Comments

Comment #0 by nilsbossung — 2013-08-05T15:42:46Z
static assert({ int x; int[] a = (&x)[0 .. 1]; return true; }());

Error: pointer & x cannot be sliced at compile time (it does not point to an array)
Comment #1 by clugdbug — 2013-08-12T14:16:59Z
This restriction is intentional. It's a consequence of strictly enforcing C's pointer arithmetic rules. You can only slice a pointer that you can perform pointer arithmetic on. Where x is a variable, C does not guarantee that &x + 1 is a legal address. (For example, it might be 0, if x is at the end of the address space). (Enforcing C's pointer arithmetic enormously simplifies the implementation. Allowing this would create a huge number of special cases).
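
Editorial illustration, not part of the original comment: under the rule described above, a slice has to point into an array. A minimal sketch of one way to satisfy that is to back the value with a one-element static array instead of a lone variable. The function name `sliceOneElementArray` and the variable `storage` are hypothetical.

```d
// Hypothetical helper, not from the report: gives the slice a real
// one-element array to point into instead of a lone variable.
bool sliceOneElementArray()
{
    int[1] storage;              // stands in for `int x;`
    int[] a = storage[0 .. 1];   // slices an array, not a bare pointer
    a[0] = 7;                    // write through the slice
    return storage[0] == 7;      // the write is visible in the array
}

static assert(sliceOneElementArray()); // forces compile-time evaluation
```

This sidesteps the restriction rather than lifting it, so existing code that takes `&x` of a plain variable would still need to be rewritten.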
Comment #2 by timon.gehr — 2013-08-12T14:59:00Z
What kind of special cases? (The above code works in my own implementation.)
Comment #3 by clugdbug — 2013-08-12T17:13:15Z
It's basically the same as issue 10266.
The corner cases arise if you still disallow &x + 1. My guess is that you're allowing it in your implementation?

The problem with allowing it is that we're departing from C. And there's annoying things like:

// global scope
int x;
int *p = &x + 1; // points to junk! - must not compile

Is there really a use case for this unsafe behaviour?
Comment #4 by timon.gehr — 2013-08-12T19:01:50Z
(In reply to comment #3)
> It's basically the same as issue 10266.

Issue 10266 additionally requests allowing reinterpret-casts between T* and T[1]* (my implementation currently rejects this, but allowing it would be easy.)

> The corner cases arise if you still disallow &x + 1. My guess is that you're
> allowing it in your implementation?
> ...

Yes, but dereferencing it is an error. Subtracting one results in the address of x.

> The problem with allowing it is that we're departing from C.

Does C actually disallow adding 0 to a pointer to a local variable? That's what the example is doing. Furthermore, I don't see what the restriction buys in terms of implementation effort. Every program can be rewritten to only contain arrays.

> And there's annoying things like:
>
> // global scope
> int x;
> int *p = &x + 1; // points to junk! - must not compile
>

Agreed, but I think this is not closely related. DMD already allows creating invalid addresses in CTFE by other means.

> Is there really a use case for this unsafe behaviour?

Make more code CTFE-able.
Comment #5 by clugdbug — 2013-08-19T02:37:33Z
(In reply to comment #4)
> (In reply to comment #3)
> > It's basically the same as issue 10266.
>
> Issue 10266 additionally requests allowing reinterpret-casts between T* and
> T[1]* (my implementation currently rejects this, but allowing it would be
> easy.)
>
> > The corner cases arise if you still disallow &x + 1. My guess is that you're
> > allowing it in your implementation?
> > ...
>
> Yes, but dereferencing it is an error. Subtracting one results in the address
> of x.

That is not the issue. The problem is that in C, simply creating the pointer is undefined behaviour. No dereferencing is involved. Note that is undefined behaviour, it's not even implementation-specific behaviour! Simply storing an invalid pointer into a pointer register may generate a hardware exception on some systems.
In C, you are not permitted to do pointer arithmetic unless the pointer points to an array, or one-past-the-end-of-an-array.

> > The problem with allowing it is that we're departing from C.
>
> Does C actually disallow adding 0 to a pointer to a local variable? That's what
> the example is doing.

I'm not sure if that's legal or not. I suspect not, though I think it would always work in practice. But adding 1 to a pointer to a local variable is definitely illegal, and there are systems where it will not work. So the end of the slice is problematic.

> > Is there really a use case for this unsafe behaviour?
>
> Make more code CTFE-able.

But it's undefined behaviour.
Comment #6 by timon.gehr — 2013-08-19T03:25:14Z
(In reply to comment #5)
> (In reply to comment #4)
> > (In reply to comment #3)
> > ...
> > > The corner cases arise if you still disallow &x + 1. My guess is that you're
> > > allowing it in your implementation?
> > > ...
> >
> > Yes, but dereferencing it is an error. Subtracting one results in the address
> > of x.
>
> That is not the issue. The problem is that in C, simply creating the pointer is
> undefined behaviour.

I guess I'll update my implementation eventually to disallow this. (Other related limitations are that it currently allows escaping addresses to locals and simply closes over them, array appends may cause non-determinism and pointers can be freely compared.)

> ...
> > > Is there really a use case for this unsafe behaviour?
> >
> > Make more code CTFE-able.
>
> But it's undefined behaviour.

There is not really a reason why (&x)[0..1] should be UB. But I guess if you want to keep C behaviour and also keep the invariant that slices always point to arrays, this is indeed not fixable.
Comment #7 by ibuclaw — 2013-08-19T03:42:54Z
(In reply to comment #3)
> It's basically the same as issue 10266.
> The corner cases arise if you still disallow &x + 1. My guess is that you're
> allowing it in your implementation?
>
> The problem with allowing it is that we're departing from C. And there's
> annoying things like:
>
> // global scope
> int x;
> int *p = &x + 1; // points to junk! - must not compile
>
>
> Is there really a use case for this unsafe behaviour?

Only one would be in std.math if we want to make the elementary functions CTFE-able (we've discussed this before).

But yes, I think that it is right to disallow it, as there is no clean way to slice up basic types into an array and guarantee ie: format or endian correctness at compile time (cross-compilers, for instance).
Comment #8 by clugdbug — 2013-08-19T04:48:39Z
(In reply to comment #7)
> (In reply to comment #3)
> > It's basically the same as issue 10266.
> > The corner cases arise if you still disallow &x + 1. My guess is that you're
> > allowing it in your implementation?
> >
> > The problem with allowing it is that we're departing from C. And there's
> > annoying things like:
> >
> > // global scope
> > int x;
> > int *p = &x + 1; // points to junk! - must not compile
> >
> >
> > Is there really a use case for this unsafe behaviour?
>
> Only one would be in std.math if we want to make the elementary functions
> CTFE-able (we've discussed this before).

That's why my proposed solution for that is to allow only the complete expression, where the pointer is instantly dereferenced:

(cast(ulong *)cast(void *)&f)[0];

and it really only needs to be allowed for 80-bit reals, since casting float<->int and double<->long is already supported.

The minimal operations are:
- significand <-> ulong
- sign + exponent <-> ushort

That would give us four special-case hacks which are x87 specific. Effectively they are intrinsics with ugly syntax.
The existing code could be modified slightly to only use those four operations, with no performance penalty.

> But yes, I think that it is right to disallow it, as there is no clean way to
> slice up basic types into an array and guarantee ie: format or endian
> correctness at compile time (cross-compilers, for instance).

It's an ugly area.
Comment #9 by ibuclaw — 2013-08-19T05:29:09Z
(In reply to comment #8)
> (In reply to comment #7)
> > (In reply to comment #3)
> > > It's basically the same as issue 10266.
> > > The corner cases arise if you still disallow &x + 1. My guess is that you're
> > > allowing it in your implementation?
> > >
> > > The problem with allowing it is that we're departing from C. And there's
> > > annoying things like:
> > >
> > > // global scope
> > > int x;
> > > int *p = &x + 1; // points to junk! - must not compile
> > >
> > >
> > > Is there really a use case for this unsafe behaviour?
> >
> > Only one would be in std.math if we want to make the elementary functions
> > CTFE-able (we've discussed this before).
>
> That's why my proposed solution for that is to allow only the complete
> expression, where the pointer is instantly dereferenced:
>
> (cast(ulong *)cast(void *)&f)[0];
>
> and it really only needs to be allowed for 80-bit reals, since casting
> float<->int and double<->long is already supported.
>

And (speaking as someone who stubbed out your implementation of float<->int and double<->long cast) the only reason why it's supported is because the backend I implement against can (thankfully) do re-interpreted native casts between basic types such as integer, float, complex and vectors.

You will need to support all reals that have support in std.math. This includes 64-bit, 80-bit, 96-bit (really just 80-bit), 128-bit (likewise), and 128-bit (quadruple). There are only three supported formats really... (double-double will have to keep with partial support for the time being, sorry PPC!)

> The minimal operations are:
> - significand <-> ulong
> - sign + exponent <-> ushort
>
> That would give us four special-case hacks which are x87 specific. Effectively
> they are intrinsics with ugly syntax.
>

I veto any new addition that is x87 specific - or, more accurately endian specific.

Remember its:

version(BigEndian)
    short sign_exp = (cast(ushort*)&x)[0];
else
    short sign_exp = (cast(ushort*)&x)[5];
Comment #10 by ibuclaw — 2013-08-19T07:44:59Z
(In reply to comment #9)
>
> I veto any new addition that is x87 specific - or, more accurately endian
> specific.
>
> Remember its:
>
> version(BigEndian)
>     short sign_exp = (cast(ushort*)&x)[0];
> else
>     short sign_exp = (cast(ushort*)&x)[5];

Wrote a quick toy routine to paint real->ushort[real.sizeof/2] (based on backend routine that interprets a value as a vector).

--- pseudo code ---
Expression* e = RealExp(42.0L);
size_t len = native_encode_expr(e, buffer);

(gdb) p buffer
"\000\000\000\000\000\000\000\250\004@\000\000\000\000\000\000`\365f\001\000\000\000\000\300\341\377\377\377\177\000\000\000\000\000\000\000\000\000\000\023\340Z\000\000\000\000\000`As\001\000\000\000\000\006\000\000\000\000\000\000"

tree cst = native_interpret_array (TypeSArray(ushort, 8), buffer, len);

(gdb) p debug_tree(cst)
{[0]=0, [1]=0, [2]=0, [3]=43008, [4]=16388, [5]=0, [6]=0, [7]=0}
---

OK, lets check this output against run-time results.

---
writeln(*cast(ushort[8]*)(&x));
=> [0, 0, 0, 43008, 16388, 0, 32672, 0]

Which looks like at a first glance that the real->ushort[real.sizeof/2] conversion isn't correct... up until the point you realise that the '32672' value is just garbage in padding.

So... this might be very well doable, but will have to be *extremely* careful about it. Also, I'm assuming that CTFE is able to get values from constant static arrays?
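
Editorial illustration, not part of the original comment: the two non-zero words in the dump above can be cross-checked by hand, assuming the 80-bit x87 extended layout (15-bit exponent with bias 16383, 64-bit significand with an explicit leading 1). Since 42 = 1.3125 * 2^5, the sign-and-exponent word should be 16383 + 5 and the significand should be 42 shifted so that its top bit lands in bit 63. The constant names below are hypothetical, and the sketch uses integer arithmetic only, so it does not depend on how any compiler paints a real.

```d
// Illustrative check of the values printed in the comment above; the
// constant names are hypothetical and nothing here touches a real.
enum int    exponentBias = 16383;  // bias of the 15-bit x87 exponent
enum int    exponent     = 5;      // 42 = 1.3125 * 2^5
enum ushort signExp      = cast(ushort)(exponentBias + exponent); // sign bit is 0
enum ulong  significand  = 42UL << (63 - exponent); // explicit leading 1 in bit 63
enum ushort topWord      = cast(ushort)(significand >> 48); // highest 16 bits

static assert(signExp == 16388);   // matches [4] in the dump
static assert(topWord == 43008);   // matches [3] in the dump
static assert((significand & 0xFFFF_FFFF_FFFF) == 0); // [0]..[2] are zero
```

Words [5] to [7] carry no value bits in this layout, which is consistent with the 32672 in the run-time output being padding.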
Comment #11 by ibuclaw — 2013-08-19T10:22:55Z
(In reply to comment #10)
>
> So... this might be very well doable, but will have to be *extremely* careful
> about it. Also, I'm assuming that CTFE is able to get values from constant
> static arrays?

Adapted code so that it does the following:

real <-> ushort[8]: RealExp <-> VectorExp(ushort[8]) <-> ArrayLiteralExp(ushort[8])

Result?

---
ushort[8] foo(real x) { return *cast(ushort[8]*)(&x); }
real bar(ushort[8] x) { return *cast(real*)(&x); }

pragma(msg, foo(42.0L));
pragma(msg, bar(foo(42.0L)));
static assert(foo(42.0L) == [0,0,0,43008,16388,0,0,0]);
static assert(bar(foo(42.0L)) == 42.0L);
pragma(msg, "Success!");
---

$ gdc -c paint.d
[cast(ushort)0u, cast(ushort)0u, cast(ushort)0u, cast(ushort)43008u, cast(ushort)16388u, cast(ushort)0u, cast(ushort)0u, cast(ushort)0u]
4.2e+1
Success!

Only downside is that it is restricted to T[x].sizeof == real.sizeof. So real<->ulong[2] only works with 128bit reals on 64bit, but could look into getting around that later...

Don, I think I'm ready to test trial this in GDC if you are willing to implement this in DMD?

Regards
Iain.
Comment #12 by ibuclaw — 2013-10-10T04:26:12Z
(In reply to comment #11)
>
> Don, I think I'm ready to test trial this in GDC if you are willing to
> implement this in DMD?
>

Added support in GDC (but no front-end support) in case you want to go down this route.

https://github.com/D-Programming-GDC/GDC/commit/262a5bd22754e0fa8176c1cef523bde33d1559df
Comment #13 by robert.schadek — 2024-12-13T18:10:08Z
THIS ISSUE HAS BEEN MOVED TO GITHUB

https://github.com/dlang/dmd/issues/18644

DO NOT COMMENT HERE ANYMORE, NOBODY WILL SEE IT, THIS ISSUE HAS BEEN MOVED TO GITHUB