Bug 21745 – Closure created in struct constructor passed to class constructor refers to expired stack frame

Status
RESOLVED
Resolution
INVALID
Severity
normal
Priority
P1
Component
dmd
Product
D
Version
D2
Platform
All
OS
All
Creation time
2021-03-22T06:57:03Z
Last change time
2021-03-25T20:51:48Z
Keywords
industry, safe
Assigned to
No Owner
Creator
FeepingCreature

Comments

Comment #0 by default_357-line — 2021-03-22T06:57:03Z
Despite clearly escaping, a delegate created in a struct constructor and passed to a class constructor is allocated on the heap. Consider a class: class Bar { int delegate() dg; this(int delegate() dg) { this.dg = dg; } } And a struct: struct Foo { int i; Bar bar; this(int i) { this.i = i; this.bar = new Bar({ return this.i; }); } } And a helper function, just to ensure that the variable gets a deterministic location on the stack: Foo getFoo(int i) { return Foo(i); } Then when we void main() { auto foo = getFoo(5); getFoo(6); writefln!"%s"(foo.bar.dg()); assert(foo.bar.dg() == 5); } We see that dg returns 6, despite belonging to the constructor where i was 5.
Comment #1 by default_357-line — 2021-03-22T06:57:33Z
is not* not allocated on the heap, sorry.
Comment #2 by razvan.nitu1305 — 2021-03-22T13:20:19Z
As a workaround, simply returning `i` in the delegate seems to work correctly: struct Foo { int i; Bar bar; this(int i) { this.i = i; this.bar = new Bar({ return i; }); } }
Comment #3 by razvan.nitu1305 — 2021-03-22T13:34:51Z
(In reply to RazvanN from comment #2) > As a workaround, simply returning `i` in the delegate seems to work > correctly: > > struct Foo > { > int i; > Bar bar; > this(int i) > { > this.i = i; > this.bar = new Bar({ return i; }); > } > } I am not entirely sure, but the issue might be that in the original report the closure variable is `this`, whereas in my workaround the closure variable is `i`. Therefore, once memory is allocated on the heap, in the first case it simply saves the `this` pointer (that points to some stack allocated data) not the actual struct instance.
Comment #4 by default_357-line — 2021-03-22T16:07:03Z
I think that's correct. Since it's a struct, it should save "this" by value though. typeof(this) is Foo, not Foo*.
Comment #5 by razvan.nitu1305 — 2021-03-22T16:19:30Z
(In reply to FeepingCreature from comment #4) > I think that's correct. > > Since it's a struct, it should save "this" by value though. typeof(this) is > Foo, not Foo*. I've investigated this a bit, but given my n00bism with the backend I did not get very far. The issue seems to revolve around how struct values are passed between stack frames. For example, if we add some fields to struct Foo, the example runs correctly: struct Foo { int i, l, k, t; Bar bar; this(int i) { this.i = i; this.bar = new Bar({ return this.i; }); } } This is what I could come up with so far: at [1] the type of the this variable is changed from `struct type` to a pointer, so this branch [2] is never taken. I assume that the bug is somewhere inside el_bin [3] or el_una [4] since adding more fields seems to make it work. I suspect that the bug has something to do with how the struct is passed (via registers or a memcmp) and probably the compiler forgets that a closure should be allocated in this case. I'm going to try to take a deeper dive if somebody else doesn't beat me to it. [1] https://github.com/dlang/dmd/blob/master/src/dmd/toir.d#L895 [2] https://github.com/dlang/dmd/blob/master/src/dmd/toir.d#L909 [3] https://github.com/dlang/dmd/blob/master/src/dmd/toir.d#L899 [4] https://github.com/dlang/dmd/blob/master/src/dmd/toir.d#L900
Comment #6 by dfj1esp02 — 2021-03-24T07:41:37Z
(In reply to FeepingCreature from comment #4) > Since it's a struct, it should save "this" by value though. Closures don't capture variables by value, you would end up with two different copies of the variable: one on the stack, another in the closure, but the closure should share one instance of the variable. If you want two copies, do it explicitly: struct Foo { int i; Bar bar; this(int i) { this.i = i; Foo f = this; this.bar = new Bar({ return f.i; }); } }
Comment #7 by default_357-line — 2021-03-24T08:55:22Z
This is indeed a problematic case. We want "this" to reference the variable on which the struct method was called. But we also want the stackframe to be heap allocated, so we can return a closure. It seems those two requirements may simply not be simultaneously satisfiable? I'm not sure which one should take precedence.
Comment #8 by bugzilla — 2021-03-25T09:04:48Z
Just to confirm it *is* allocating the closure on the heap. If you disassemble the object file, the call to _d_allocmemory is definitive.
Comment #9 by bugzilla — 2021-03-25T10:03:32Z
The problem is, struct Foo, as written, fits in registers. Therefore, it is returned from functions in registers. The {return this.i;} lambda is indeed a closure allocated on the heap, and the `this` pointer is put in the closure. But where is Foo located? Foo getFoo(int i) { return Foo(i); } is returned in registers. It then gets copied into the local foo. The `this` pointer in the closure is now pointing to an expired stack frame. Note that Foo itself is not stored in the closure, a pointer to Foo is stored. Add a member to Foo, and what happens? It no longer is returned in registers. It is returned via a hidden pointer to a Foo variable located in the caller's stack frame, in this case the local main.foo. The `this` pointer in the lambda always points to main.foo. When getFoo is called again, a separate main._TMP is created for the return value. So this works. But only by the happenstance of NRVO optimization putting the Foo instance in main() so it isn't expired. The fundamental problem is the `this` is saved in a closure, but what `this` is referring to goes out of scope and so the `this` pointer is pointing to garbage. D should detect this. And indeed it does, if you add @safe and compile with -dip1000: @safe: class Bar { int delegate() dg; this(int delegate() @safe dg) { this.dg = dg; } } struct Foo { int i; Bar bar; this(int i) { this.i = i; this.bar = new Bar({ return this.i; }); // line 19 } } Foo getFoo(int i) { return Foo(i); } void main() { auto foo = getFoo(5); getFoo(6); assert(foo.bar.dg() == 5); } dmd test -dip1000 yields: test.d(19): Error: reference to local `this` assigned to non-scope parameter `dg` calling test3.Bar.this This kind of subtle stack corruption problem is what DIP1000 is designed to detect. I recommend: 1. redesign the code so that Foo is allocated on the heap, or at least in a location that outlives the pointer to it in a delegate 2. consider using @safe and -dip1000
Comment #10 by razvan.nitu1305 — 2021-03-25T10:34:29Z
(In reply to Walter Bright from comment #9) Thanks for the thourough explanation > > dmd test -dip1000 yields: > > test.d(19): Error: reference to local `this` assigned to non-scope > parameter `dg` calling test3.Bar.this > > This kind of subtle stack corruption problem is what DIP1000 is designed to > detect. > The error is super weird. As a user who does not understand the inner-mechanics of how closures are allocated, this error tells me nothing. How is `this` of type `Foo` assigned to a delegate? What am I supposed to do as a user to get rid of this error? > I recommend: > > 1. redesign the code so that Foo is allocated on the heap, or at least in a > location that outlives the pointer to it in a delegate > > 2. consider using @safe and -dip1000
Comment #11 by dfj1esp02 — 2021-03-25T14:16:20Z
(In reply to FeepingCreature from comment #7) > This is indeed a problematic case. We want "this" to reference the variable > on which the struct method was called. But we also want the stackframe to be > heap allocated, so we can return a closure. It seems those two requirements > may simply not be simultaneously satisfiable? I'm not sure which one should > take precedence. They are separated in time. First the struct is allocated, then the constructor is called on it, it's too late for the constructor to change the allocation strategy, because the allocation already happened.
Comment #12 by bugzilla — 2021-03-25T20:51:48Z
> super weird It can indeed be, because what's happening is hidden behind implicit variables, complex types, templates, etc. This is why I recommend rewriting confusing examples in terms of simple pointers and types. In this case, int i; int* p = &i; int func = { return *p; } return &func; i is a local variable. p is a local variable that references i. func references p. &func escapes the stack frame. Therefore, p is allocated in a heap allocated closure, rather than on the stack. BUT p refers to another local i, which IS allocated on the stack. Hence a reference to the local stack frame escapes, and hello stack corruption. Unfortunately, this kind of corruption can often appear to work and pass all the tests. But it will at some point start failing, and can be hard to find. To fix the problem, i must be allocated in longer lived storage than the lifetime of &func.