Bug 14126 – GITHEAD - GC seemingly corrupting memory

Status
RESOLVED
Resolution
FIXED
Severity
regression
Priority
P1
Component
dmd
Product
D
Version
D2
Platform
x86_64
OS
Linux
Creation time
2015-02-05T03:11:00Z
Last change time
2015-02-18T03:39:07Z
Assigned to
schveiguy
Creator
puneet

Comments

Comment #0 by puneet — 2015-02-05T03:11:41Z
Here is a reduced test. I am making sure that _foo does not get any value other than 0 or 1. But still when I run this code with the GITHEAD, I get a message: How can _foo have any other value than 0 or 1? 16777216?? really???? core.exception.InvalidMemoryOperationError@(0) Works fine with release 2.066.1. import std.stdio; import std.conv; class FooBar { private Foo[] _foos; this() { _foos.length = 512; foreach(ref foo; _foos) foo = Foo(1); } } struct Foo { private uint _foo = 0; this(uint foo) { assert(foo == 1); _foo = foo; } ~this() { if (_foo != 0 && _foo != 1) { writeln("How can _foo have any other value than 0 or 1?"); writeln(_foo, "?? really????"); assert(false, "Unexpected Value: " ~ _foo.to!string()); } } } void main() { auto bar = new FooBar(); }
Comment #1 by blah38621 — 2015-02-05T03:46:02Z
The invalid memory operation exception is expected, as you are trying to do an allocation in a destructor. As a note, it doesn't appear that the issue is with the main class, as this still produces the same error: import std.stdio; import std.conv; class FooBar { Foo[] _foos; this() { _foos = new Foo[512]; foreach (i; 0.._foos.length) _foos[i] = Foo(1); } } struct Foo { uint _foo = 0; this(uint foo) { writeln("Foo init"); assert(foo == 1); _foo = foo; } ~this() { if (_foo != 0 && _foo != 1) { writeln("How can _foo have any other value than 0 or 1?"); writeln(_foo, "?? really????"); assert(false, "Unexpected Value: " ~ _foo.to!string()); } } } void main() { auto bar = new FooBar(); writeln("End of main"); }
Comment #2 by puneet — 2015-02-05T03:58:32Z
Reducing further. Does not print "End of main" now. //// import std.stdio; import std.conv; class FooBar { private Foo[] _foos; this() { _foos.length = 512; } } struct Foo { private uint _foo = 1; ~this() { if (_foo != 1) { writeln("How can _foo have any other value than 0 or 1?"); writeln(_foo, "?? really????"); assert(false, "Unexpected Value: " ~ _foo.to!string()); } } } void main() { auto bar = new FooBar(); writeln("End of main"); }
Comment #3 by blah38621 — 2015-02-05T04:01:51Z
The 0 there isn't really unexpected, although T.init should have foo as 1, so that shouldn't be getting called...
Comment #4 by puneet — 2015-02-05T04:06:42Z
Yes 0 is not that unexpected. It does not get called with 2.066. Also, I was wrong. "End of Main" is always getting printed.
Comment #5 by bugzilla — 2015-02-06T03:23:23Z
Sounds from the comments that this bug is invalid.
Comment #6 by puneet — 2015-02-06T03:33:46Z
I am reopening the issue. It is not invalid. I know an assertion should not be called from a destructor, I am doing that to avoid printing multiple messages. The real bug is that I am getting unexpected value of element _foo in the destructor.
Comment #7 by blah38621 — 2015-02-06T03:34:28Z
It is a valid bug, just not sure of the cause yet.
Comment #8 by sinkuupump — 2015-02-06T08:30:51Z
It seems FooBar class isn't required. import core.stdc.stdio; struct Foo { private uint _foo = 1; ~this() { printf("%d\n", _foo); } } void main() { Foo[] foos; foos.length = 512; } This prints "1" with 2.066.1, but prints "16777216" with 2.067a. On my system, array length must be >= 510 to reproduce the bug.
Comment #9 by sinkuupump — 2015-02-06T08:43:13Z
Looks like the destructor is called with wrong 'this' pointer. import core.stdc.stdio; struct Foo { private uint _foo = 1; ~this() { printf("%p\n", &this); } } void main() { Foo[] foos; foos.length = 512; foreach (ref f; foos) printf("%p\n", &f); printf("--------------\n"); } Output: 0x7fa2a6420010 0x7fa2a6420014 0x7fa2a6420018 ... -------------- ... 0x7fa2a6420019 0x7fa2a6420015 0x7fa2a6420011
Comment #10 by ketmar — 2015-02-06T11:48:13Z
i can confirm it for GNU/Linux, x86. all `this` pointers are exactly one byte off.
Comment #11 by ketmar — 2015-02-06T12:13:59Z
seems that the bug is due to the code in `lifetime.d:finalize_array2()`. for big blocks (those which are bigger or equal to PAGESIZE), it does this: `p += LARGEPAD;`, and LARGEPAD is: LARGEPREFIX = 16, // 16 bytes padding at the front of the array LARGEPAD = LARGEPREFIX + 1, seems that LARGEPAD must be used only in overall block size calculations, but not as real padding (real padding is LARGEPREFIX in this case). yet i'm not an expert in D memory management, so i can't say if i'm really right here. fixing `lifetime.d:finalize_array2()` to use `LARGEPREFIX` instead seems to fix the bug. and hey, why anoyone would use 17-byte PREFIX padding anyway?
Comment #12 by andrej.mitrovich — 2015-02-06T12:22:16Z
I've read somewhere that struct dtors are now called when the struct is allocated on the heap. This might reveal some previously undetected bugs. Just my 2 cents..
Comment #13 by ketmar — 2015-02-06T12:46:35Z
i remember that there was some fixes with arrays of structs and dtors, yes. so this seems to be just a typo in the code — using wrong constant to calculare real start of the array. with mentioned fix it "works for me", and the fix seems to be logical. now we have to summon someone knowledgable to either confirm that and fix mainline code, or to reject it, so we can start searching elsewhere. ;-)
Comment #14 by schveiguy — 2015-02-06T15:18:58Z
(In reply to Ketmar Dark from comment #11) > seems that the bug is due to the code in `lifetime.d:finalize_array2()`. > > for big blocks (those which are bigger or equal to PAGESIZE), it does this: > `p += LARGEPAD;`, and LARGEPAD is: > > LARGEPREFIX = 16, // 16 bytes padding at the front of the array > LARGEPAD = LARGEPREFIX + 1, > > seems that LARGEPAD must be used only in overall block size calculations, > but not as real padding (real padding is LARGEPREFIX in this case). > > yet i'm not an expert in D memory management, so i can't say if i'm really > right here. fixing `lifetime.d:finalize_array2()` to use `LARGEPREFIX` > instead seems to fix the bug. and hey, why anoyone would use 17-byte PREFIX > padding anyway? This is exactly the problem. The + 1 is used to determine overall size of array allocation. The +1 is the sentinel byte added to the end of the block to prevent cross-block referencing. For example, imagine you had a block of 16 bytes: ubyte[] x = new ubyte[16]; If you put this into an exact block of 16 bytes, then did this: x = x[$..$]; Now x.ptr is technically referring to the NEXT block in memory. That's why the sentinel is used. For smaller blocks, the sentinel is really the array 'used' size, but for larger blocks that can be extended, we put the 'used' size at the beginning. The resulting data must be 16-byte aligned, so we skip ahead 16 bytes (LARGEPREFIX), and we add 1 byte to the end to prevent cross-block issues. I will generate a PR.
Comment #15 by ketmar — 2015-02-06T15:32:01Z
thank you for your explanation. maybe it can be written somewhere in source code as comment, so people reading the sources have their question answered even before they asked it? ;-)
Comment #16 by schveiguy — 2015-02-06T15:44:46Z
Comment #17 by schveiguy — 2015-02-06T15:48:00Z
(In reply to Ketmar Dark from comment #15) > thank you for your explanation. maybe it can be written somewhere in source > code as comment, so people reading the sources have their question answered > even before they asked it? ;-) https://github.com/schveiguy/druntime/blob/issue14126/src/rt/lifetime.d#L235 Been there since I added it 5 years ago :)
Comment #18 by ketmar — 2015-02-06T16:03:27Z
hm. i should start reading the whole file at least once before suggesting "improvements" that were already done. ;-)
Comment #19 by github-bugzilla — 2015-02-06T16:59:18Z
Commits pushed to master at https://github.com/D-Programming-Language/druntime https://github.com/D-Programming-Language/druntime/commit/21aba9efeaf9eacd7da3d2585519b083355b9b78 Fixed issue 14126 -- Used wrong constant for offset when finalizing a large array. https://github.com/D-Programming-Language/druntime/commit/1c5aef8b9d58718b633d2a45ca09b5f9dd1ab6bc Merge pull request #1159 from schveiguy/issue14126 Fixed issue 14126 -- GC seemingly corrupting memory
Comment #20 by github-bugzilla — 2015-02-07T20:33:11Z
Comment #21 by github-bugzilla — 2015-02-18T03:39:07Z