Bug 16394 – TypeInfo.init() for static arrays returns single element instead of whole array
Status
RESOLVED
Resolution
FIXED
Severity
enhancement
Priority
P1
Component
dmd
Product
D
Version
D2
Platform
x86_64
OS
Linux
Creation time
2016-08-15T21:42:00Z
Last change time
2016-08-30T11:35:40Z
Keywords
industry, pull, spec
Assigned to
nobody
Creator
eyal
Comments
Comment #0 by eyal — 2016-08-15T21:42:51Z
unittest {
void prop(T)() {
assert(T.sizeof == typeid(T).tsize, "typeid(" ~ T.stringof ~ ").tsize broken");
assert(T.sizeof == typeid(T).init().length, "typeid(" ~ T.stringof ~ ").init broken");
}
prop!int;
prop!(int[3]);
}
Crashes for the 4th assertion, because init() returns a void[] array of size 4 instead of 12.
Comment #1 by eyal — 2016-08-15T22:25:13Z
As a result, std.algorithm:initializeAll is also buggy:
import std.stdio;
unittest {
struct Int {
~this() {}
int x = 3;
}
import std.algorithm : initializeAll;
Int[2] xs = [Int(1), Int(2)];
struct R {
bool done;
bool empty() { return done; }
ref Int[2] front() { return xs; }
void popFront() { done = true; }
}
writeln(xs);
initializeAll(R());
writeln(xs);
}
Prints out:
[Int(1, 7FE7FED92000), Int(2, 7FE7FED92000)]
[Int(3, null), Int(0, 73)]
The second field being printed for Int seems like *yet another* bug.
Comment #2 by schveiguy — 2016-08-15T22:26:05Z
It has been this way since the beginning (traced it back to this commit: https://github.com/dlang/druntime/commit/1532910c769ab718a528c94c9157fecebd753c66)
Looking at the code, there really isn't a way to fix it without altering the Object API or the compiler, both of which would break a lot of things.
We could have the compiler store the entire static array data into the data section. This seems wasteful to say the least.
Is there not a way to handle TypeInfo_StaticArray specially? As far as I know, it's going to be the only type that behaves this way.
Alternatively, you can check tsize() to see what the real size should be, and then if that doesn't match initializer().length (use initializer instead of init(), as init() is deprecated), then replicate if your plan is to initialize to the initializer value.
Comment #3 by eyal — 2016-08-15T22:36:33Z
The workaround is pretty clear.
But this is nasty! Anyone who uses typeid(T).init in a generic way (not caring what T is) is going to be broken.
It would be better to have typeid(staticArr).init throw a compile-time error than to return the wrong result.
We've spent many hours reproducing and chasing this bug in a production build.
Additionally, x = x.init would work without blowing up the stack when x is large, and then accessing typeid() would be unnecessary.
Comment #4 by schveiguy — 2016-08-15T22:44:48Z
typeid(T).initializer is not meant to be used without understanding what it does in all cases. There are other subtle issues with it.
I think this is just a documentation issue. I can see where this would cause problems :(
(In reply to Eyal from comment #3)
> Additionally, x = x.init would work without blowing up the stack when x is
> large, and then accessing typeid() would be unnecessary.
I'm not sure what this means.
Comment #5 by eyal — 2016-08-15T22:51:00Z
I disagree, a function which has subtle corner cases that cause cryptic bugs is never a "documentation issue". Maybe if it were called "unsafe_init_val" -- you'd know that documentation is crucial here. Otherwise you go on your merry way until you explode.
I've seen only 2 uses of the init value on generic types, in our production code and in std.lib.algorithm:initailizeAll. *Both of these cases* had cryptic, terrible bugs due to this behavior of typeid(T).init! If D's standard library can't get the use of typeid().init right, is it a legitimate function to expose?
> > Additionally, x = x.init would work without blowing up the stack when x is
> > large, and then accessing typeid() would be unnecessary.
> I'm not sure what this means.
The only reason we need to use typeid(x).init() is because we cannot really do:
x = x.init;
The reason is that this kind of assignment always goes through a stack allocation. When x.init is a large value, it unnecessarily allocates a huge chunk of stack. Our stacks are of limited size (fibers). This explodes.
So we work around it using typeid(x).init() and explicit memory copies (similar to what initializeAll does).
If the compiler had properly implemented x = x.init without superfluous huge stack allocations, we'd have no trouble because we wouldn't need to use typeid().init() in the first place (though perhaps this is an LDC-specific problem)
Comment #6 by schveiguy — 2016-08-15T22:56:31Z
I mean it's a doc issue in that the documentation doesn't reflect what initializer() actually does (and has always done). There isn't much to say except that I don't think we can change the behavior at this point without breaking things.
It's been this way since 2009, so most people don't use it or care about it, or they would have hit this issue before. I realize we can't get your time back looking for this issue, but I think at this point, the best thing to do is fix the docs and fix any code that was done using this incorrect assumption.
(In reply to Eyal from comment #5)
> The reason is that this kind of assignment always goes through a stack
> allocation. When x.init is a large value, it unnecessarily allocates a huge
> chunk of stack. Our stacks are of limited size (fibers). This explodes.
I think this is a legitimate improvement to suggest.
Comment #7 by ag0aep6g — 2016-08-15T23:05:03Z
(In reply to Steven Schveighoffer from comment #6)
> I mean it's a doc issue in that the documentation doesn't reflect what
> initializer() actually does (and has always done).
I.e., it's a long-standing bug.
> There isn't much to say
> except that I don't think we can change the behavior at this point without
> breaking things.
Pretty much every bug fix can be considered a breaking change. I don't think we need to live with this, just because someone may be relying on behavior that's clearly going against documentation.
> It's been this way since 2009, so most people don't use it or care about it,
> or they would have hit this issue before.
If they don't use it, their code won't get broken by the fix.
> I realize we can't get your time
> back looking for this issue, but I think at this point, the best thing to do
> is fix the docs and fix any code that was done using this incorrect
> assumption.
I disagree. I think fixing the code to behave as documented is the way to go.
Comment #8 by schveiguy — 2016-08-16T00:04:26Z
(In reply to ag0aep6g from comment #7)
> I.e., it's a long-standing bug.
Or functioning as designed, just not properly documented.
> Pretty much every bug fix can be considered a breaking change. I don't think
> we need to live with this, just because someone may be relying on behavior
> that's clearly going against documentation.
This part of TypeInfo is meant to be used internally. I don't think this corner case warrants a compiler change, as the current state is completely usable.
> If they don't use it, their code won't get broken by the fix.
Every time we change the compiler, it may break other things. I don't think it's worth the risk for such a small improvement. Not to mention it will bloat all object files. Why break things for the vast majority of people when the fix is simple for those who have been misled by the incomplete docs?
Comment #9 by eyal — 2016-08-16T05:19:13Z
> Or functioning as designed, just not properly documented.
> This part of TypeInfo is meant to be used internally. I don't think this corner case warrants a compiler change, as the current state is completely usable.
Not by Weka or by Andrei Alexandrescu, apparently, as both inserted terrible bugs into production code with this.
Do you think a warning in the manual would have prevented these bugs? Do you doubt that the next usages of this function will create similar bugs?
> Every time we change the compiler, it may break other things. I don't think it's worth the risk for such a small improvement. Not to mention it will bloat all object files. Why break things for the vast majority of people when the fix is simple for those who have been misled by the incomplete docs?
Note the "fix" I am requesting does not require bloating the object files. A compile-time error about use of the "init" value of a static array's typed id is enough. Generating a special TypeInfo object for static arrays doesn't sound like rocket science.
Claiming the interface should be terrible to simplify the implementation is classic worse-is-better, and I think D aspires to be truly better.
Comment #10 by schveiguy — 2016-08-16T12:46:08Z
A compile time error isn't possible, as this is a virtual function. It would have to be a runtime error, which I think is even worse.
Let's just fix the docs, any bugs that are in druntime/Phobos, and move on. I don't understand the pushback on this, TypeInfo.initializer is for druntime internals, and those who need it for low-level memory management code. It's not commonly used, so there aren't many places to fix. The current implementation is entirely reasonable (why store N copies of the same thing?).
Comment #11 by ag0aep6g — 2016-08-16T12:49:09Z
Alright, these are arguments for changing the documentation instead of the implementation:
1) Changing the behavior would break existing code.
2) It's hard or impossible to fix in a satisfactory way. In particular: binary bloat.
If I'm missing or misrepresenting anything, please yell at me.
Re 1: I agree that it may break existing code, which isn't good. I don't assign a lot of value to this, though. I.e., yes it's a bit bad, but in my opinion the bad doesn't outweigh the good we would get from having the function behave as documented. The documented behavior is far more sensible (to the user) than the actual behavior. The code that would get broken by a fix is likely workaround code for the inconsistency.
Re 2: I don't know how much bloat a fix would add, but consider that we can embed a static array in a struct, and `initializer` works as documented then. So that case has the bloat and everything seems to be just fine.
In code:
----
struct S { char[3] a; }
void main()
{
assert(typeid(S).initializer == "\xFF\xFF\xFF"); /* passes */
assert(typeid(char[3]).initializer == "\xFF\xFF\xFF"); /* fails */
}
----
Comment #12 by schveiguy — 2016-08-16T12:55:01Z
3) Changing the compiler potentially introduces new issues. Changing the docs does not.
Comment #13 by schveiguy — 2016-08-16T12:58:20Z
changing the compiler when the existing code works JUST FINE as long as it's properly understood and documented seems like a horrible idea to me.
It would be a different story if it was not possible to properly initialize a static array using the TypeInfo.
The current code also returns an array that is null, but with a non-zero length when the initializer would otherwise be all zeros to save on binary space. That's a weird behavior, but somehow we live just fine with that. I don't see why we can't document the other weird, yet legitimate, behavior.
Comment #14 by ag0aep6g — 2016-08-16T13:59:27Z
(In reply to Steven Schveighoffer from comment #12)
> 3) Changing the compiler potentially introduces new issues. Changing the
> docs does not.
(In reply to Steven Schveighoffer from comment #13)
> changing the compiler when the existing code works JUST FINE as long as it's
> properly understood and documented seems like a horrible idea to me.
Ok, I guess we just disagree on this. I don't think we should shy away from fixing issues properly (<- just my opinion) because the code is so fragile that every touch could break stuff elsewhere.
> It would be a different story if it was not possible to properly initialize
> a static array using the TypeInfo.
It's possible, but even when documented, `initializer`'s behavior is surprising. I.e., it's likely to be missed/forgotten, bug-prone.
> The current code also returns an array that is null, but with a non-zero
> length when the initializer would otherwise be all zeros to save on binary
> space. That's a weird behavior, but somehow we live just fine with that. I
> don't see why we can't document the other weird, yet legitimate, behavior.
In no particular order:
1) That's documented. Yes, I know, we would make the other weirdness documented, too. But it isn't yet, so we an opportunity to do it differently, whereas the all-zeroes ship has sailed. If all-zeroes were undocumented today, I'm not sure what way of dealing with it I would prefer.
2) All-zeroes is the same for all types. You don't have to check for one specific kind of type, and then handle the result of `initializer` differently. You have to check all return values of `initializer` equally. That's significantly more consistent. It fails way earlier.
By the way, I would say that TypeInfo_StaticArray.initializer violates substitutability when it behaves differently from the others. Documenting it on the base class that a derived class behaves differently just makes it a substitutability violation that we try to handwave our way out of.
3) The point is basically: "We already have one corner case, let's add one more." I don't think that's the way to go. Minimizing the number corner cases should be the goal.
Overall, having `initializer` return an array seems like a bad design decision, since we try to bend/break it whenever the result consists of repeated values. Maybe we should come up with something that handles repetitive cases more gracefully, and then get rid of `initializer` (slowly with a full deprecation cycle, of course).
Comment #15 by schveiguy — 2016-08-16T14:30:15Z
I just disagree that the current behavior is a bug. Note that this wasn't documented to begin with, and whoever documented it (probably me) didn't realize the corner case behavior. This was never a code bug IMO, it's working as designed.
The runtime is full of undocumented implementation behaviors. Having worked many times on fixing things, there's quite a few cases where the compiler and runtime interact in weird undocumented ways. Documenting them is the first step, then if we want to change it, then we change it with a proper discussion with the main players.
I'm working on 2 PRs, one to fix the docs, and one to fix std.algorithm.mutation.initializeAll (the only affected code in druntime or Phobos).
Even if we get a dmd upgrade at some point, I'd like to at least fix the current problems.
Interestingly enough, TypeInfo.initializer is rarely needed, because most of the time, you have the static type, and that is usually plenty to get this working.
Comment #16 by schveiguy — 2016-08-16T14:38:59Z
(In reply to Eyal from comment #1)
> Prints out:
> [Int(1, 7FE7FED92000), Int(2, 7FE7FED92000)]
> [Int(3, null), Int(0, 73)]
>
> The second field being printed for Int seems like *yet another* bug.
The other field being printed is the context pointer, since your struct is nested in the unittest.
Comment #17 by eyal — 2016-08-16T14:41:28Z
> The other field being printed is the context pointer, since your struct is nested in the unittest.
Yeah, I've since figured it out - but was surprised because it is inconsistent with code blocks like: x=>x+1 which is inferred to be a function, not a delegate. So I expected structs to be similarly inferred to static structs when they do not actually need the context ptr.
(In reply to Eyal from comment #17)
> Yeah, I've since figured it out - but was surprised because it is
> inconsistent with code blocks like: x=>x+1 which is inferred to be a
> function, not a delegate. So I expected structs to be similarly inferred to
> static structs when they do not actually need the context ptr.
I was with you on that recently when dealing with Voldemort types. But static must be added to make it a non-nested struct, even if you don't use any of the function locals.
Worth an enhancement request probably.
Comment #20 by eyal — 2016-08-16T14:45:35Z
It could be nice to add a setInit method that sets a given buffer to the initial value, with no corner cases at all.
Comment #21 by schveiguy — 2016-08-16T14:47:01Z
(In reply to Eyal from comment #20)
> It could be nice to add a setInit method that sets a given buffer to the
> initial value, with no corner cases at all.
Agreed. I was thinking the same thing.
Comment #22 by github-bugzilla — 2016-08-30T11:35:39Z