Bug 17654 – return value incorrectly considered unique when casting to another pointer type

Status: NEW
Severity: normal
Priority: P3
Component: dmd
Product: D
Version: D2
Platform: All
OS: All
Creation time: 2017-07-15T05:33:05Z
Last change time: 2024-12-13T18:53:22Z
Keywords: accepts-invalid
Assigned to: No Owner
Creator: ag0aep6g
Moved to GitHub: dmd#19282

Comments

Comment #0 by ag0aep6g — 2017-07-15T05:33:05Z
Found by Namal in D.learn: http://forum.dlang.org/post/[email protected]

Original code:

----
void main()
{
    import std.algorithm;
    import std.string;

    char[] line;
    auto bytes = line.representation.dup;
    bytes.sort;
    string result = bytes.assumeUTF; /* should be rejected */
}
----

Reduced to show it's a compiler bug:

----
char[] assumeUTF(ubyte[] str) pure
{
    return cast(char[]) str;
}

void main()
{
    ubyte[] b = ['a', 'b', 'c'];
    string s = assumeUTF(b); /* should be rejected */
    assert(s == "abc"); /* passes */
    b[0] = '!';
    assert(s == "abc"); /* fails */
}
----

Another variant to show it's not about arrays or the char type:

----
ubyte* toBytePointer(uint* p) pure
{
    return cast(ubyte*) p;
}

void main()
{
    uint* i = new uint;
    immutable ubyte* b = toBytePointer(i); /* should be rejected */
    *i = 0xFF_FF_FF_FF;
    assert(*b != 0xFF); /* fails */
}
----
Comment #1 by schveiguy — 2017-07-15T11:15:10Z
I actually think it's a design problem. assumeUTF is marked pure. The input is ubyte and the output is char. This means the compiler can reasonably assume the output is unrelated to the input and therefore unique. This is quite a pickle. We can't very well unmark it pure, and I think the compiler logic is sound.
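
For context, the uniqueness rule being discussed can be sketched as follows. This is an illustrative example, not code from the report; `make` and `same` are hypothetical names. D allows the result of a pure function to convert implicitly to immutable when the compiler concludes that the result cannot refer to any of the arguments. The sketch shows one case where that conclusion is clearly justified and one where the compiler refuses the conversion.

```d
// Pure factory: the only parameter is a size_t, so the returned array
// cannot alias anything the caller still holds.
int[] make(size_t n) pure
{
    auto a = new int[n];
    foreach (i, ref e; a)
        e = cast(int) i;
    return a;
}

// Pure pass-through: the result obviously aliases the argument.
int[] same(int[] a) pure
{
    return a;
}

void main()
{
    // Accepted: the result of make() is treated as unique.
    immutable int[] fresh = make(3);
    assert(fresh == [0, 1, 2]);

    int[] x = [1, 2, 3];
    // Rejected by the compiler, because the int[] parameter could be
    // (and here is) the returned array:
    // immutable int[] bad = same(x);
    assert(same(x) is x);
}
```

The disagreement in this thread is about functions that sit between these two cases: the parameter type differs from the return type, yet a cast lets the result alias the argument anyway.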
Comment #2 by ag0aep6g — 2017-07-15T12:58:51Z
(In reply to Steven Schveighoffer from comment #1)
> I actually think it's a design problem. assumeUTF is marked pure. The input
> is ubyte and the output is char. This means the compiler can reasonably
> assume the output is unrelated to the input and therefore unique.
>
> This is quite a pickle. We can't very well unmark it pure, and I think the
> compiler logic is sound.

I don't agree that the compiler logic is sound. The casts are valid. The compiler cannot assume that they don't occur.

It even happens with classes (no cast needed):

----
class B { int x; }
class C : B {}

B toB(C c) pure { return c; }

void main()
{
    C c = new C;
    c.x = 1;
    immutable B b = toB(c); /* should be rejected */
    assert(b.x == 1); /* passes */
    c.x = 2;
    assert(b.x == 1); /* fails */
}
----
Comment #3 by schveiguy — 2017-07-15T14:21:25Z
I'm not sure about the UB rules for D and aliasing. In C you definitely can run into things like the array cast being considered unrelated.

The class case is definitely a bug.
Comment #4 by ag0aep6g — 2017-07-15T14:48:36Z
(In reply to Steven Schveighoffer from comment #3)
> I'm not sure the UB rules for D and aliasing. In C you definitely can run
> into things like the array cast being considered unrelated.

As far as I know, C's strict aliasing rule isn't exactly uncontroversial. Personally, I think it's an abomination.

> The class case is definitely a bug.

Even with the strict aliasing rule, there is a type that is allowed to alias others. In C it's char. That would be ubyte in D, I guess. The non-class examples all involve ubyte. So even with C-like strict aliasing, they should be rejected.
Comment #5 by schveiguy — 2017-07-15T15:27:09Z
Then the example could be changed to wchar and ushort.
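
A minimal sketch of what that wchar/ushort variant might look like (this code is not from the report; `toWchars` is a hypothetical name). It demonstrates only the part that is certain: the result of the cast aliases the input, so it is not unique. The implicit conversion to immutable is left commented out, because whether a given compiler version accepts it is exactly what this issue is about.

```d
// Reinterprets the same memory; no copy is made.
wchar[] toWchars(ushort[] data) pure
{
    return cast(wchar[]) data;
}

void main()
{
    ushort[] u = [0x61, 0x62, 0x63]; // code units for "abc"
    wchar[] w = toWchars(u);

    // The result points at the caller's array.
    assert(w.ptr is cast(wchar*) u.ptr);

    // A write through the input is visible through the result.
    u[0] = 0x21; // '!'
    assert(w[0] == '!');

    // The conversion at issue, following the pattern of the reduced
    // example in comment #0; it should be rejected:
    // immutable(wchar)[] s = toWchars(u);
}
```

Since neither wchar nor ushort plays the role that char plays in C's aliasing rules, this variant sidesteps the objection that ubyte may alias anything.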
Comment #6 by robert.schadek — 2024-12-13T18:53:22Z
This issue has been moved to GitHub: https://github.com/dlang/dmd/issues/19282

Do not comment here anymore; nobody will see it.