Bug 23978 – [REG 2.103.0] ICE: dip1021 memory corruption

Status
RESOLVED
Resolution
FIXED
Severity
regression
Priority
P1
Component
dmd
Product
D
Version
D2
Platform
x86_64
OS
Linux
Creation time
2023-06-07T22:07:30Z
Last change time
2023-06-16T09:07:53Z
Keywords
pull
Assigned to
No Owner
Creator
Iain Buclaw

Comments

Comment #0 by ibuclaw — 2023-06-07T22:07:30Z
This is not very deterministic. Run a few times to trigger.
---
dmd -lowmem -preview=dip1021 pr110113.d -o-
---
class LUBench { }
void lup(ulong , ulong , int , int = 1)
{
    new LUBench;
}
void lup_3200(ulong iters, ulong flops)
{
    lup(iters, flops, 3200);
}
void raytrace()
{
    struct V
    {
        float x, y, z;
        auto normalize() { }
        struct Tid { }
        auto spawnLinked() { }
        string[] namesByTid;
        class MessageBox { }
        auto cross() { }
    }
}
Comment #1 by ibuclaw — 2023-06-07T22:09:59Z
(gdb) bt #0 0x00005644ce0c033d in _D3dmd4root3aav15dmd_aaGetRvalueFNaNbNiPSQBnQBmQBk2AAPvZQd (key=0x7f9ec39f5560, aa=0x7f9ec265d360) at src/dmd/root/aav.d:127 #1 0x00005644ce0c0660 in _D3dmd4root3aav__T10AssocArrayTCQBe10identifier10IdentifierTCQCh7dsymbol7DsymbolZQCl7opIndexMFNaNbNixCQDwQCsQCjZQCa (this=..., key=0x7f9ec39f5560) at src/dmd/root/aav.d:313 #2 0x00005644cdf373fc in DsymbolTable::lookup(Identifier const*) (this=0x7f9ec2ec08c0, ident=0x7f9ec39f5560) at src/dmd/dsymbol.d:2399 #3 0x00005644cdf3509a in ScopeDsymbol::search(Loc const&, Identifier*, int) (this=0x7f9ec39f8400, loc=..., ident=0x7f9ec39f5560, flags=8) at src/dmd/dsymbol.d:1474 #4 0x00005644cdf31890 in StructDeclaration::search(Loc const&, Identifier*, int) (this=0x7f9ec39f8400, loc=..., ident=0x7f9ec39f5560, flags=8) at src/dmd/dstruct.d:277 #5 0x00005644ce008817 in _D3dmd6opover15search_functionFCQBe7dsymbol12ScopeDsymbolCQCe10identifier10IdentifierZCQDhQCd7Dsymbol (funcid=0x7f9ec39f5560, ad=0x7f9ec39f8400) at src/dmd/opover.d:1435 #6 0x00005644cdf30d97 in search_toString(StructDeclaration*) (sd=0x7f9ec39f8400) at src/dmd/dstruct.d:51 #7 0x00005644ce030afd in semanticTypeInfoMembers(StructDeclaration*) (sd=0x7f9ec39f8400) at src/dmd/semantic3.d:1650 #8 0x00005644ce03086c in Semantic3Visitor::visit(AggregateDeclaration*) (this=0x7ffd1ab29230, ad=0x7f9ec39f8400) at src/dmd/semantic3.d:1590 #9 0x00005644ce026566 in ParseTimeVisitor<ASTCodegen>::visit(StructDeclaration*) (this=0x7ffd1ab29230, s=0x7f9ec39f8400) at src/dmd/parsetimevisitor.d:88 #10 0x00005644cdf31f6e in StructDeclaration::accept(Visitor*) (this=0x7f9ec39f8400, v=0x7ffd1ab29230) at src/dmd/dstruct.d:490 #11 0x00005644ce02b736 in semantic3(Dsymbol*, Scope*) (dsym=0x7f9ec39f8400, sc=0x7f9ec317f2e0) at src/dmd/semantic3.d:83 #12 0x00005644cdf98f02 in ExpressionSemanticVisitor::visit(DeclarationExp*) (this=0x7ffd1ab293a8, e=0x7f9ec39f1c00) at src/dmd/expressionsem.d:5471 #13 0x00005644cdf7dae2 in DeclarationExp::accept(Visitor*) 
(this=0x7f9ec39f1c00, v=0x7ffd1ab293a8) at src/dmd/expression.d:4212 #14 0x00005644cdfafe44 in expressionSemantic(Expression*, Scope*) (e=0x7f9ec39f1c00, sc=0x7f9ec317f2e0) at src/dmd/expressionsem.d:12566 #15 0x00005644ce0361e8 in StatementSemanticVisitor::visit(ExpStatement*) (this=0x7ffd1ab29458, s=0x7f9ec39f1bd0) at src/dmd/statementsem.d:206 #16 0x00005644ce032736 in ExpStatement::accept(Visitor*) (this=0x7f9ec39f1bd0, v=0x7ffd1ab29458) at src/dmd/statement.d:490 #17 0x00005644ce0360cc in statementSemantic(Statement*, Scope*) (s=0x7f9ec39f1bd0, sc=0x7f9ec317f2e0) at src/dmd/statementsem.d:148 #18 0x00005644ce0364af in StatementSemanticVisitor::visit(CompoundStatement*) (this=0x7ffd1ab29778, cs=0x7f9ec39f1c30) at src/dmd/statementsem.d:269 #19 0x00005644ce032c5a in CompoundStatement::accept(Visitor*) (this=0x7f9ec39f1c30, v=0x7ffd1ab29778) at src/dmd/statement.d:633 #20 0x00005644ce0360cc in statementSemantic(Statement*, Scope*) (s=0x7f9ec39f1c30, sc=0x7f9ec317f2e0) at src/dmd/statementsem.d:148 #21 0x00005644ce02cd57 in Semantic3Visitor::visit(FuncDeclaration*) (this=0x7ffd1ab2a220, funcdecl=0x7f9ec39f7cc0) at src/dmd/semantic3.d:598 #22 0x00005644cdfbb2ca in FuncDeclaration::accept(Visitor*) (this=0x7f9ec39f7cc0, v=0x7ffd1ab2a220) at src/dmd/func.d:2857 #23 0x00005644ce02b736 in semantic3(Dsymbol*, Scope*) (dsym=0x7f9ec39f7cc0, sc=0x7f9ec317dcf0) at src/dmd/semantic3.d:83 #24 0x00005644ce02bb87 in Semantic3Visitor::visit(Module*) (this=0x7ffd1ab2a2e0, mod=0x7f9ec39f7000) at src/dmd/semantic3.d:205 #25 0x00005644cdf1a31e in Module::accept(Visitor*) (this=0x7f9ec39f7000, v=0x7ffd1ab2a2e0) at src/dmd/dmodule.d:1259 #26 0x00005644ce02b736 in semantic3(Dsymbol*, Scope*) (dsym=0x7f9ec39f7000, sc=0x0) at src/dmd/semantic3.d:83 #27 0x00005644cdea0e55 in _D3dmd4mars7tryMainFmPPxaKSQz7globals5ParamZi (params=..., argv=0x7ffd1ab2abe8, argc=5) at src/dmd/mars.d:473 #28 0x00005644cdea2a9a in _Dmain (_param_0=...) 
at src/dmd/mars.d:962 #29 0x00005644ce1ddabb in rt.dmain2._d_run_main2(char[][], ulong, extern(C) int(char[][]) function).runAll().__lambda2() () #30 0x00005644ce1dd96a in rt.dmain2._d_run_main2(char[][], ulong, extern(C) int(char[][]) function).tryExec(scope void() delegate) () #31 0x00005644ce1dda43 in rt.dmain2._d_run_main2(char[][], ulong, extern(C) int(char[][]) function).runAll() () #32 0x00005644ce1dd96a in rt.dmain2._d_run_main2(char[][], ulong, extern(C) int(char[][]) function).tryExec(scope void() delegate) () #33 0x00005644ce1dd8d3 in _d_run_main2 () #34 0x00005644ce1dd69c in _d_run_main () #35 0x00005644cdea2a58 in main (argc=5, argv=0x7ffd1ab2abe8) at src/dmd/mars.d:918
Comment #2 by ibuclaw — 2023-06-07T22:13:49Z
--- 125│ while (e) 126│ { 127├───────────> if (key == e.key) 128│ return e.value; 129│ e = e.next; 130│ } --- (gdb) p key $1 = (void *) 0x7f9ec39f5560 (gdb) p e $2 = (dmd.root.aav.aaA *) 0x20ec8348ec8b4855 (gdb) p aa.b[i] $3 = (dmd.root.aav.aaA *) 0x7f9ec2666a00 (gdb) p *aa.b[i] $4 = {next = 0x5644ce2fb510 <vtable for dmd.declaration.ThisDeclaration>, keyValue = {key = 0x7f9ec39ec5e0, value = 0x7f9ec2642660}} (gdb) p *('dmd.declaration.ThisDeclaration'*)aa.b[i] $5 = {<dmd.declaration.VarDeclaration> = {<dmd.declaration.Declaration> = {<dmd.dsymbol.Dsymbol> = {<dmd.ast_node.ASTNode> = {<dmd.root.rootobject.RootObject> = {<No data fields>}, <No data fields>}, ident = 0x7f9ec39ec5e0, parent = 0x7f9ec2642660, csym = 0x0, loc = {filename = 0x0, linnum = 0, charnum = 0}, _scope = 0x0, prettystring = 0x0, atts = 0x0, errors = false, semanticRun = 2 '\002', localNum = 0}, type = 0x7f9ec265d660, originalType = 0x7f9ec265d660, storage_class = 268697636, visibility = {kind = 5 '\005', pkg = 0x0}, _linkage = 1 '\001', inuse = 0, adFlags = 0 '\000', isym = 0x0, mangleOverride = 0x0}, _init = 0x0, nestedrefs = {length = 0, data = 0x0, smallarray = {0x0}}, aliasTuple = 0x0, lastVar = 0x0, edtor = 0x0, range = 0x0, maybes = 0x0, endlinnum = 0, offset = 0, sequenceNumber = 610, alignment = {value = 1234, pack = false}, ctfeAdrOnStack = 4294967295, bitFields = 256, canassign = 0 '\000', isdataseg = 2 '\002'}, <No data fields>} (gdb) p *(('dmd.declaration.ThisDeclaration'*)aa.b[i]).ident $6 = {<dmd.root.rootobject.RootObject> = {<No data fields>}, value = 88, isAnonymous_ = false, name = "this"} (gdb) p *(('dmd.declaration.ThisDeclaration'*)aa.b[i]).type $7 = {<dmd.ast_node.ASTNode> = {<dmd.root.rootobject.RootObject> = {<No data fields>}, <No data fields>}, ty = 8 '\b', mod = 1 '\001', deco = 0x7f9ec37e8500 "xS8pr1101138raytraceFZ1V", mcache = 0x7f9ec2649b40, pto = 0x0, rto = 0x0, arrayof = 0x0, vtinfo = 0x0, ctype = 0x0}
Comment #3 by maxhaton — 2023-06-07T22:15:29Z
Is dip1021 required?
Comment #4 by ibuclaw — 2023-06-07T22:16:57Z
It looks like one of the elements in the aaA array is pointing at a live class object, a ThisDeclaration. So the lookup loads bogus data from it and segfaults after dereferencing a part of memory it shouldn't have.
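To make the failure mode concrete, here is a minimal sketch of a chained hash-table lookup of the kind shown at aav.d:127. The `Node` layout and the `bucketIndex` function are simplified, hypothetical stand-ins, not dmd's actual definitions. If a bucket slot is overwritten with a pointer to a class object, the first word of that object (its vtable pointer) gets read as `next`, and the loop walks off into unrelated memory.

```d
// Hypothetical, simplified model of a chained hash table keyed by pointer.
struct Node
{
    Node* next;         // next entry in the same bucket
    const(void)* key;   // keys are compared by address
    void* value;
}

/// Illustrative bucket index only; this is not the hash dmd uses.
/// Assumes `nbuckets` is a power of two.
size_t bucketIndex(const(void)* key, size_t nbuckets)
{
    return (cast(size_t) key >> 4) & (nbuckets - 1);
}

/// Walks the chain for `key`, trusting every `next` pointer it finds.
void* lookup(Node*[] buckets, const(void)* key)
{
    for (auto e = buckets[bucketIndex(key, buckets.length)]; e; e = e.next)
    {
        if (key is e.key)
            return e.value;
    }
    return null;
}

void main()
{
    int k1, k2, v1;
    auto buckets = new Node*[4];
    buckets[bucketIndex(&k1, buckets.length)] = new Node(null, &k1, &v1);

    assert(lookup(buckets, &k1) is cast(void*) &v1); // present key
    assert(lookup(buckets, &k2) is null);            // absent key
}
```

Nothing in `lookup` can tell a valid `Node*` from a stray pointer, which is why the corruption only shows up as a crash several dereferences later.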
Comment #5 by ibuclaw — 2023-06-07T22:21:06Z
(In reply to mhh from comment #3)
> Is dip1021 required?
AFAICT, all that dip1021 does is make the compiler call GC.malloc/GC.free more often than it usually does.
https://github.com/dlang/dmd/blob/master/compiler/src/dmd/ob.d
I'd consider the main trigger to be `-lowmem`, so it may well be the GC itself that is at fault. Given that git blame points to a change in `_d_newclass`, this seems the most plausible explanation.
Comment #6 by ibuclaw — 2023-06-07T22:21:32Z
git blame says the regressing commit is https://github.com/dlang/dmd/pull/14837
Comment #7 by dlang-bugzilla — 2023-06-08T05:34:22Z
I have not been able to reproduce this. It would be good to have a test case which more reliably reproduces the bug. I tried wrapping the code into a static foreach, but that did not help.
Comment #8 by dlang-bugzilla — 2023-06-08T08:40:11Z
Also because we need something to put in the test suite to prevent this from regressing again.
Comment #9 by ibuclaw — 2023-06-08T13:30:11Z
The printf dumps really don't give any hint as to what went wrong, but there is at least a common pattern I see for each time it crashes. --------------------- Mem.xrealloc((nil), 624) = 0x7f3686b63000 <-- !!! This address ends up as aa.b ... ... Mem.xrealloc(0x7f3686b63000, 832) = 0x7f3686b8d400 <-- !!! marked free ... ... ---------------------------------------------------------- from: dmd_aaRehash++ this = 0x7f3686b95a20 nodes = 9 b_length = 4 [ b[0] 0x7f3686b95a38 = 0x7f3686b95a58 { next = 0x7f3686b98c20 key = 0x7f3687f32fc0 value = 0x7f3687f2c500 } b[1] 0x7f3686b95a40 = 0x7f3686b98bc0 { next = (nil) key = 0x7f3687f36060 value = 0x7f3687f37000 } b[2] 0x7f3686b95a48 = 0x7f3686b98b80 { next = 0x7f3686b98be0 key = 0x7f3687f32fe0 value = 0x7f3687f2c600 } b[3] 0x7f3686b95a50 = 0x7f3686b98ba0 { next = 0x7f3686b98c00 key = 0x7f3687f36000 value = 0x7f3687f2c700 } ] Mem.xmalloc(768) = 0x7f3686b63000 <-- !!! rehash wants memory, reuse old address ---------------------------------------------------------- from: dmd_aaRehash-- this = 0x7f3686b95a20 nodes = 9 b_length = 32 [ b[0] 0x7f3686b63000 = (nil) b[1] 0x7f3686b63008 = (nil) b[2] 0x7f3686b63010 = (nil) b[3] 0x7f3686b63018 = (nil) b[4] 0x7f3686b63020 = (nil) b[5] 0x7f3686b63028 = (nil) b[6] 0x7f3686b63030 = (nil) b[7] 0x7f3686b63038 = (nil) <-- !!! 
This index is null b[8] 0x7f3686b63040 = (nil) b[9] 0x7f3686b63048 = (nil) b[10] 0x7f3686b63050 = (nil) b[11] 0x7f3686b63058 = (nil) b[12] 0x7f3686b63060 = (nil) b[13] 0x7f3686b63068 = (nil) b[14] 0x7f3686b63070 = (nil) b[15] 0x7f3686b63078 = (nil) b[16] 0x7f3686b63080 = (nil) b[17] 0x7f3686b63088 = (nil) b[18] 0x7f3686b63090 = (nil) b[19] 0x7f3686b63098 = 0x7f3686b98ba0 { next = (nil) key = 0x7f3687f36000 value = 0x7f3687f2c700 } b[20] 0x7f3686b630a0 = 0x7f3686b95a58 { next = (nil) key = 0x7f3687f32fc0 value = 0x7f3687f2c500 } b[21] 0x7f3686b630a8 = 0x7f3686b98bc0 { next = (nil) key = 0x7f3687f36060 value = 0x7f3687f37000 } b[22] 0x7f3686b630b0 = 0x7f3686b98b80 { next = (nil) key = 0x7f3687f32fe0 value = 0x7f3687f2c600 } b[23] 0x7f3686b630b8 = 0x7f3686b98c00 { next = (nil) key = 0x7f3687f36040 value = 0x7f3687f37330 } b[24] 0x7f3686b630c0 = 0x7f3686b98c20 { next = (nil) key = 0x7f3687f360a0 value = 0x7f3687f2c800 } b[25] 0x7f3686b630c8 = (nil) b[26] 0x7f3686b630d0 = 0x7f3686b98be0 { next = (nil) key = 0x7f3687f36080 value = 0x7f3687f34200 } b[27] 0x7f3686b630d8 = (nil) b[28] 0x7f3686b630e0 = 0x7f3686b98c60 { next = (nil) key = 0x7f3687f360e0 value = (nil) } b[29] 0x7f3686b630e8 = (nil) b[30] 0x7f3686b630f0 = 0x7f3686b98c40 { next = (nil) key = 0x7f3687f360c0 value = 0x7f3687f37660 } b[31] 0x7f3686b630f8 = (nil) ] ... ... 
Mem.xmalloc(80) = 0x7f3686ba2b40 ---------------------------------------------------------- from: dmd_aaGet this = 0x7f3686ba2b40 nodes = 0 ---------------------------------------------------------- from: dmd_aaGet== this = 0x7f3686ba2b40 nodes = 1 b_length = 4 [ b[0] 0x7f3686ba2b58 = (nil) b[1] 0x7f3686ba2b60 = 0x7f3686ba2b78 { next = (nil) key = 0x7f3687f285e0 value = (nil) } b[2] 0x7f3686ba2b68 = (nil) b[3] 0x7f3686ba2b70 = (nil) ] ---------------------------------------------------------- from: dmd_aaGet this = 0x7f3686ba2b40 nodes = 1 b_length = 4 [ b[0] 0x7f3686ba2b58 = (nil) b[1] 0x7f3686ba2b60 = 0x7f3686ba2b78 { next = (nil) key = 0x7f3687f285e0 value = 0x7f3686b9fb00 <-- !!! This is what corrupts aa.b[7] } b[2] 0x7f3686ba2b68 = (nil) b[3] 0x7f3686ba2b70 = (nil) ] ... ... ---------------------------------------------------------- from: dmd_aaGetRvalue this = 0x7f3686ba2b40 <-- !!! Last access of this AA nodes = 2 b_length = 4 [ b[0] 0x7f3686ba2b58 = (nil) b[1] 0x7f3686ba2b60 = 0x7f3686ba2b78 { next = (nil) key = 0x7f3687f285e0 value = 0x7f3686b9fb00 } b[2] 0x7f3686ba2b68 = (nil) b[3] 0x7f3686ba2b70 = 0x7f3686b9a220 { next = (nil) key = 0x7f3687f2ade0 value = 0x7f3686b9fc00 } ] ---------------------------------------------------------- from: dmd_aaGetRvalue this = 0x7f3686b95a20 nodes = 9 b_length = 32 [ b[0] 0x7f3686b63000 = (nil) b[1] 0x7f3686b63008 = (nil) b[2] 0x7f3686b63010 = (nil) b[3] 0x7f3686b63018 = (nil) b[4] 0x7f3686b63020 = (nil) b[5] 0x7f3686b63028 = (nil) b[6] 0x7f3686b63030 = (nil) b[7] 0x7f3686b63038 = (nil) <-- !!! 
Still null b[8] 0x7f3686b63040 = (nil) b[9] 0x7f3686b63048 = (nil) b[10] 0x7f3686b63050 = (nil) b[11] 0x7f3686b63058 = (nil) b[12] 0x7f3686b63060 = (nil) b[13] 0x7f3686b63068 = (nil) b[14] 0x7f3686b63070 = (nil) b[15] 0x7f3686b63078 = (nil) b[16] 0x7f3686b63080 = (nil) b[17] 0x7f3686b63088 = (nil) b[18] 0x7f3686b63090 = (nil) b[19] 0x7f3686b63098 = 0x7f3686b98ba0 { next = (nil) key = 0x7f3687f36000 value = 0x7f3687f2c700 } b[20] 0x7f3686b630a0 = 0x7f3686b95a58 { next = (nil) key = 0x7f3687f32fc0 value = 0x7f3687f2c500 } b[21] 0x7f3686b630a8 = 0x7f3686b98bc0 { next = (nil) key = 0x7f3687f36060 value = 0x7f3687f37000 } b[22] 0x7f3686b630b0 = 0x7f3686b98b80 { next = (nil) key = 0x7f3687f32fe0 value = 0x7f3687f2c600 } b[23] 0x7f3686b630b8 = 0x7f3686b98c00 { next = (nil) key = 0x7f3687f36040 value = 0x7f3687f37330 } b[24] 0x7f3686b630c0 = 0x7f3686b98c20 { next = (nil) key = 0x7f3687f360a0 value = 0x7f3687f2c800 } b[25] 0x7f3686b630c8 = (nil) b[26] 0x7f3686b630d0 = 0x7f3686b98be0 { next = (nil) key = 0x7f3687f36080 value = 0x7f3687f34200 } b[27] 0x7f3686b630d8 = (nil) b[28] 0x7f3686b630e0 = 0x7f3686b98c60 { next = (nil) key = 0x7f3687f360e0 value = 0x7f3687f37990 } b[29] 0x7f3686b630e8 = (nil) b[30] 0x7f3686b630f0 = 0x7f3686b98c40 { next = (nil) key = 0x7f3687f360c0 value = 0x7f3687f37660 } b[31] 0x7f3686b630f8 = (nil) ] ... ... ---------------------------------------------------------- from: dmd_aaGetRvalue this = 0x7f3686b95a20 nodes = 9 b_length = 32 [ b[0] 0x7f3686b63000 = (nil) b[1] 0x7f3686b63008 = (nil) b[2] 0x7f3686b63010 = (nil) b[3] 0x7f3686b63018 = (nil) b[4] 0x7f3686b63020 = (nil) b[5] 0x7f3686b63028 = (nil) b[6] 0x7f3686b63030 = (nil) b[7] 0x7f3686b63038 = 0x7f3686b9fb00 { <-- !!! Now has value?!?! 
next = 0x559ae6f9b510 key = 0x7f3687f285e0 value = 0x7f3686b8c330 } b[8] 0x7f3686b63040 = (nil) b[9] 0x7f3686b63048 = (nil) b[10] 0x7f3686b63050 = (nil) b[11] 0x7f3686b63058 = (nil) b[12] 0x7f3686b63060 = (nil) b[13] 0x7f3686b63068 = (nil) b[14] 0x7f3686b63070 = (nil) b[15] 0x7f3686b63078 = (nil) b[16] 0x7f3686b63080 = (nil) b[17] 0x7f3686b63088 = (nil) b[18] 0x7f3686b63090 = (nil) b[19] 0x7f3686b63098 = 0x7f3686b98ba0 { next = (nil) key = 0x7f3687f36000 value = 0x7f3687f2c700 } b[20] 0x7f3686b630a0 = 0x7f3686b95a58 { next = (nil) key = 0x7f3687f32fc0 value = 0x7f3687f2c500 } b[21] 0x7f3686b630a8 = 0x7f3686b98bc0 { next = (nil) key = 0x7f3687f36060 value = 0x7f3687f37000 } b[22] 0x7f3686b630b0 = 0x7f3686b98b80 { next = (nil) key = 0x7f3687f32fe0 value = 0x7f3687f2c600 } b[23] 0x7f3686b630b8 = 0x7f3686b98c00 { next = (nil) key = 0x7f3687f36040 value = 0x7f3687f37330 } b[24] 0x7f3686b630c0 = 0x7f3686b98c20 { next = (nil) key = 0x7f3687f360a0 value = 0x7f3687f2c800 } b[25] 0x7f3686b630c8 = (nil) b[26] 0x7f3686b630d0 = 0x7f3686b98be0 { next = (nil) key = 0x7f3687f36080 value = 0x7f3687f34200 } b[27] 0x7f3686b630d8 = (nil) b[28] 0x7f3686b630e0 = 0x7f3686b98c60 { next = (nil) key = 0x7f3687f360e0 value = 0x7f3687f37990 } b[29] 0x7f3686b630e8 = (nil) b[30] 0x7f3686b630f0 = 0x7f3686b98c40 { next = (nil) key = 0x7f3687f360c0 value = 0x7f3687f37660 } b[31] 0x7f3686b630f8 = (nil) ] Segmentation fault (core dumped) fail
Comment #10 by ibuclaw — 2023-06-08T13:35:36Z
What the Mem.xmalloc/xrealloc calls are telling me is that the D front-end with `-lowmem` is reusing some memory that was previously allocated (and subsequently freed) for some other purpose.
---
Despite the non-determinism, some things are always constant:
1. The object that causes the segfault is a ThisDeclaration.
2. The AA struct always has 9 nodes and a bucket size of 32.
3. It's always array index 7 that has a value assigned seemingly out of nowhere.
---
Is it plausible that there might still be references within the AST to memory xrealloc'd or xfree'd by the front-end? I could at least believe that can happen.
Why did it take the switch from function _d_newclass to template _d_newclassT to hit this? I still haven't a clue, but it is very clear that before `_d_newclassT`, it was impossible to hit this segfault.
Comment #11 by ibuclaw — 2023-06-08T15:52:11Z
(In reply to Vladimir Panteleev from comment #7)
> I have not been able to reproduce this. It would be good to have a test case
> which more reliably reproduces the bug. I tried wrapping the code into a
> static foreach, but that did not help.
(In reply to Vladimir Panteleev from comment #8)
> Also because we need something to put in the test suite to prevent this from
> regressing again.
Can you drop this into compiler/test/compilable/test23978.d?
---
// REQUIRED_ARGS: -preview=dip1021 -lowmem
// PERMUTE_ARGS: -debug=A -debug=B -debug=C -debug=D -debug=E -debug=F -debug=G -debug=H
class LUBench { }
void lup(ulong , ulong , int , int = 1)
{
    new LUBench;
}
void lup_3200(ulong iters, ulong flops)
{
    lup(iters, flops, 3200);
}
void raytrace()
{
    struct V
    {
        float x, y, z;
        auto normalize() { }
        struct Tid { }
        auto spawnLinked() { }
        string[] namesByTid;
        class MessageBox { }
        auto cross() { }
    }
}
---
The long list of permutations should make the test compile ~256 times. Enough to ensure that it never succeeds on any of my dev machines.
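The ~256 figure follows from the eight `-debug=` flags, assuming the test runner tries every subset of the PERMUTE_ARGS list: 2^8 = 256. The sketch below only checks that arithmetic; `allSubsets` is a hypothetical helper written for illustration and is not part of the dmd test runner.

```d
/// Returns every subset of `flags`, including the empty one.
string[][] allSubsets(string[] flags)
{
    string[][] result;
    foreach (mask; 0 .. 1UL << flags.length)
    {
        string[] subset;
        foreach (i, f; flags)
        {
            // Bit i of the mask decides whether flag i is included.
            if (mask & (1UL << i))
                subset ~= f;
        }
        result ~= subset;
    }
    return result;
}

void main()
{
    auto flags = ["-debug=A", "-debug=B", "-debug=C", "-debug=D",
                  "-debug=E", "-debug=F", "-debug=G", "-debug=H"];
    assert(allSubsets(flags).length == 256);
}
```

The flags themselves do nothing to the test source; they only multiply the number of compiler runs so that a low-probability crash becomes very likely to show up at least once.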
Comment #12 by dlang-bot — 2023-06-09T13:50:28Z
@dkorpel created dlang/dmd pull request #15302 "Fix 23978 - ICE: EscapeBy[] is malloced, but contains GC-allocated objects" fixing this issue: - Fix 23978 - ICE: EscapeBy[] is malloced, but contains GC-allocated objects https://github.com/dlang/dmd/pull/15302
Comment #13 by ibuclaw — 2023-06-09T17:18:30Z
From valgrind. --- ==1582870== Invalid read of size 8 ==1582870== at 0x6A33B5: _D3dmd4root3aav15dmd_aaGetRvalueFNaNbNiPSQBnQBmQBk2AAPvZQd (aav.d:127) ==1582870== by 0x6A36D7: _D3dmd4root3aav__T10AssocArrayTCQBe10identifier10IdentifierTCQCh7dsymbol7DsymbolZQCl7opIndexMFNaNbNixCQDwQCsQCjZQCa (aav.d:313) ==1582870== by 0x512F97: DsymbolTable::lookup(Identifier const*) (dsymbol.d:2408) ==1582870== by 0x510B6D: ScopeDsymbol::search(Loc const&, Identifier*, int) (dsymbol.d:1470) ==1582870== by 0x50D2D7: StructDeclaration::search(Loc const&, Identifier*, int) (dstruct.d:279) ==1582870== by 0x5E8A0A: _D3dmd6opover15search_functionFCQBe7dsymbol12ScopeDsymbolCQCe10identifier10IdentifierZCQDhQCd7Dsymbol (opover.d:1424) ==1582870== by 0x49DC50: _D3dmd5clone19hasIdentityOpEqualsFCQBh9aggregate20AggregateDeclarationPSQCs6dscope5ScopeZCQDk4func15FuncDeclaration (clone.d:462) ==1582870== by 0x49DF98: _D3dmd5clone13buildOpEqualsFCQBb7dstruct17StructDeclarationPSQCh6dscope5ScopeZCQCz4func15FuncDeclaration (clone.d:519) ==1582870== by 0x523A57: DsymbolSemanticVisitor::visit(StructDeclaration*) (dsymbolsem.d:4790) ==1582870== by 0x50D9E1: StructDeclaration::accept(Visitor*) (dstruct.d:502) ==1582870== by 0x514E65: dsymbolSemantic(Dsymbol*, Scope*) (dsymbolsem.d:131) ==1582870== by 0x576A2B: ExpressionSemanticVisitor::visit(DeclarationExp*) (expressionsem.d:5607) ==1582870== Address 0x20ec8348ec8b485d is not stack'd, malloc'd or (recently) free'd --- Prodding this in vgdb --- (gdb) p aa.b $10 = (dmd.root.aav.aaA **) 0x5ebb990 (gdb) monitor who_points_at 0x5ebb990 ==1582870== Searching for pointers to 0x5ebb990 ==1582870== *0x5ef8600 points at 0x5ebb990 Address 0x5ef8600 is in a rw- anonymous segment (gdb) p aa $11 = (dmd.root.aav.AA *) 0x5ef8600 --- There is nobody referencing the base address that was GC.realloc'd but the AA. So blaming xrealloc is the wrong thing here. Next info to retrieve, look at each address between &aa.b[0] .. &aa.b[b_length].
Comment #14 by ibuclaw — 2023-06-09T17:36:23Z
Hit! --- (gdb) monitor who_points_at 0x5ebb998 // &aa.b[1] ==1582870== Searching for pointers to 0x5ebb998 (gdb) monitor who_points_at 0x5ebb9a0 // &aa.b[2] ==1582870== Searching for pointers to 0x5ebb9a0 (gdb) monitor who_points_at 0x5ebb9a8 // &aa.b[3] ==1582870== Searching for pointers to 0x5ebb9a8 (gdb) monitor who_points_at 0x5ebb9b0 // &aa.b[4] ==1582870== Searching for pointers to 0x5ebb9b0 (gdb) monitor who_points_at 0x5ebb9b8 // &aa.b[5] ==1582870== Searching for pointers to 0x5ebb9b8 (gdb) monitor who_points_at 0x5ebb9c0 // &aa.b[6] ==1582870== Searching for pointers to 0x5ebb9c0 (gdb) monitor who_points_at 0x5ebb9c8 // &aa.b[7] ==1582870== Searching for pointers to 0x5ebb9c8 ==1582870== *0x5eef430 points at 0x5ebb9c8 Address 0x5eef430 is in a rw- anonymous segment ==1582870== *0xd964580 points at 0x5ebb9c8 Address 0xd964580 is in a rw- anonymous segment --- No hint as to where those references are at run-time however, it is clear that the ThisDeclaration/VarDeclaration object is a live object in the AST. --- (gdb) p aa.b[5] $57 = (dmd.root.aav.aaA *) 0x5efa4a0 (gdb) p aa.b[6] $58 = (dmd.root.aav.aaA *) 0x0 (gdb) p aa.b[7] $59 = (dmd.root.aav.aaA *) 0xde2c100 (gdb) p aa.b[8] $60 = (dmd.root.aav.aaA *) 0x5efa480 (gdb) monitor who_points_at 0x5efa4a0 // aa.b[5] ==1582870== Searching for pointers to 0x5efa4a0 ==1582870== *0x5ebb9b8 points at 0x5efa4a0 // &aa.b[5] points at (gdb) monitor who_points_at 0x5efa480 // aa.b[8] ==1582870== Searching for pointers to 0x5efa480 ==1582870== *0x5ebb9d0 points at 0x5efa480 // &aa.b[8] points at Address 0x5ebb9d0 is in a rw- anonymous segment (gdb) monitor who_points_at 0xde2c100 // aa.b[7] ==1582870== Searching for pointers to 0xde2c100 ==1582870== *0x5ebb9c8 points at 0xde2c100 // &aa.b[7] points at Address 0x5ebb9c8 is in a rw- anonymous segment ==1582870== *0xdddef80 points at 0xde2c100 // and many more... 
Address 0xdddef80 is in a rw- anonymous segment ==1582870== *0xdddef90 points at 0xde2c100 Address 0xdddef90 is in a rw- anonymous segment ==1582870== *0xde1bec0 points at 0xde2c100 Address 0xde1bec0 is in a rw- anonymous segment ==1582870== *0xde25490 points at 0xde2c100 Address 0xde25490 is in a rw- anonymous segment ==1582870== *0xde254b0 points at 0xde2c100 Address 0xde254b0 is in a rw- anonymous segment ==1582870== *0xde254c8 points at 0xde2c100 Address 0xde254c8 is in a rw- anonymous segment ==1582870== *0xde2c3c8 points at 0xde2c100 Address 0xde2c3c8 is in a rw- anonymous segment ==1582870== *0xde2d0e8 points at 0xde2c100 Address 0xde2d0e8 is in a rw- anonymous segment ==1582870== *0xde30550 points at 0xde2c100 Address 0xde30550 is in a rw- anonymous segment ==1582870== *0xde30880 points at 0xde2c100 Address 0xde30880 is in a rw- anonymous segment ==1582870== *0xde35428 points at 0xde2c100 Address 0xde35428 is in a rw- anonymous segment ==1582870== *0xde35568 points at 0xde2c100 Address 0xde35568 is in a rw- anonymous segment ==1582870== *0xde35668 points at 0xde2c100 Address 0xde35668 is in a rw- anonymous segment ==1582870== *0xde3d980 points at 0xde2c100 Address 0xde3d980 is in a rw- anonymous segment ==1582870== *0xde694a8 points at 0xde2c100 Address 0xde694a8 is in a rw- anonymous segment ==1582870== *0xde73b28 points at 0xde2c100 Address 0xde73b28 is in a rw- anonymous segment ==1582870== *0xde795a8 points at 0xde2c100 Address 0xde795a8 is in a rw- anonymous segment ==1582870== *0xde7c168 points at 0xde2c100 Address 0xde7c168 is in a rw- anonymous segment ==1582870== tid 1 register RAX pointing at 0xde2c100
Comment #15 by ibuclaw — 2023-06-09T20:49:27Z
The new title seems wrong to me. EscapeBy[] is GC allocated.
Comment #16 by dkorpel — 2023-06-09T21:24:50Z
I changed it back to something more generic. My initial diagnosis was wrong indeed, but it still seems to be related to the `Mem.xrealloc` call in dmd.escape.checkMutableArguments, since changing that makes the test case pass.
Comment #17 by ibuclaw — 2023-06-10T08:13:47Z
(In reply to Dennis from comment #16)
> I changed it back to something more generic. My initial diagnosis was wrong
> indeed, but it still seems to be related to the `Mem.xrealloc` call in
> dmd.escape.checkMutableArguments, since changing that makes the test case
> pass.
Indeed, I still stand by my initial assessment that there are live references to memory being marked as free by the GC. To clarify, these are being explicitly marked free by the program, rather than the GC scan failing to find live references.
Valgrind/vgdb confirms this at the point in the program immediately before the segfault occurs.
```
(gdb) monitor who_points_at 0x5ebb9c8
==1582870== Searching for pointers to 0x5ebb9c8
==1582870== *0x5eef430 points at 0x5ebb9c8
 Address 0x5eef430 is in a rw- anonymous segment
==1582870== *0xd964580 points at 0x5ebb9c8
 Address 0xd964580 is in a rw- anonymous segment
```
Expected output is for valgrind to find no references, because the memory block is actively being used as part of a dynamic array (starting at 0x5ebb990).
We know from the initial printf debug traces of malloc/realloc addresses that the memory block was first allocated for another purpose, then subsequently marked as freed in the GC.
```
Mem.xrealloc((nil), 624) = 0x5ebb990
...
Mem.xrealloc(0x5ebb990, 832) = 0x5e63000
...
Mem.xmalloc(768) = 0x5ebb990
```
Catching the moment xrealloc is called the second time and dumping all memory references would confirm or disprove my suspicions.
Both GC.realloc and GC.free mark the base pointer as "free" in the GC regardless of whether there are any other live references to the memory block. This makes both Mem.xrealloc and Mem.xfree unsafe if they are used for non-trivial data structures.
Comment #18 by ibuclaw — 2023-06-10T16:41:02Z
Back to valgrind/vgdb, set breakpoint in Mem::xrealloc if (p == 0x5eba660) ``` Thread 1 hit Breakpoint 2, Mem::xrealloc(void*, unsigned long) (p=0x5eba660, size=832) at src/dmd/root/rmem.d:92 92 if (isGCEnabled) (gdb) monitor who_points_at 0x5eba698 ==61853== Searching for pointers to 0x5eba698 ==61853== *0x5eba690 points at 0x5eba698 Address 0x5eba690 is in a rw- anonymous segment ``` ^--- We have 1 reference to the address that later causes issues. ``` (gdb) monitor who_points_at 0x5eba690 ==61853== Searching for pointers to 0x5eba690 ``` ^--- But we still don't know where that reference is coming from. ``` (gdb) p *(void**)0x5eba698 $10 = (void *) 0x5e8aa00 (gdb) p **(void***)0x5eba698 $11 = (void *) 0x8f5a80 <vtable for dmd.declaration.VarDeclaration> ``` ^--- It is initialized with a non-null value, that appears to be a VarDeclaration class object. ``` (gdb) up #1 0x0000000000549df5 in _D3dmd6escape21checkMutableArgumentsFPSQBl6dscope5ScopeCQCc4func15FuncDeclarationCQDc5mtype12TypeFunctionCQEa10expression10ExpressionPSQFd4root5array__T5ArrayTQCcZQlbZb (gag=false, arguments=0x4fb97e0, ethis=0x0, tf=0x584de00, fd=0x4fbf990, sc=0x5883a10) at src/dmd/escape.d:103 103 auto newPtr = cast(EscapeBy*)mem.xrealloc(escapeBy.ptr, len * EscapeBy.sizeof); (gdb) p escapeBy $12 = {{er = {byref = {length = 0, data = 0x0, smallarray = {0x0}}, byvalue = { length = 0, data = {0x5e8aa00}, smallarray = {0x5e8aa00}}, byfunc = { length = 0, data = 0x0, smallarray = {0x0}}, byexp = {length = 0, data = 0x0, smallarray = {0x0}}, refRetRefTransition = {length = 0, data = 0x0, smallarray = {false}}, expRetRefTransition = {length = 0, data = 0x0, smallarray = {false}}}, param = 0x5e8fd80, isMutable = false}, {er = {byref = {length = 0, data = 0x0, smallarray = { 0x0}}, byvalue = {length = 0, data = {0x5e8ab00}, smallarray = { 0x5e8ab00}}, byfunc = {length = 0, data = 0x0, smallarray = {0x0}}, byexp = {length = 0, data = 0x0, smallarray = {0x0}}, refRetRefTransition = {length = 0, data = 
0x0, smallarray = {false}}, expRetRefTransition = {length = 0, data = 0x0, smallarray = {false}}}, param = 0x5e8fdb0, isMutable = false}, {er = {byref = {length = 0, data = 0x0, smallarray = {0x0}}, byvalue = {length = 0, data = 0x0, smallarray = {0x0}}, byfunc = {length = 0, data = 0x0, smallarray = { 0x0}}, byexp = {length = 0, data = 0x0, smallarray = {0x0}}, refRetRefTransition = {length = 0, data = 0x0, smallarray = {false}}, expRetRefTransition = {length = 0, data = 0x0, smallarray = {false}}}, param = 0x5e8fde0, isMutable = false}} ``` ^--- Ah-ha! there's an address we've seen before. ``` (gdb) p escapeBy.ptr[0] $13 = {er = {byref = {length = 0, data = 0x0, smallarray = {0x0}}, byvalue = { length = 0, data = {0x5e8aa00}, smallarray = {0x5e8aa00}}, byfunc = { length = 0, data = 0x0, smallarray = {0x0}}, byexp = {length = 0, data = 0x0, smallarray = {0x0}}, refRetRefTransition = {length = 0, data = 0x0, smallarray = {false}}, expRetRefTransition = {length = 0, data = 0x0, smallarray = {false}}}, param = 0x5e8fd80, isMutable = false} (gdb) p escapeBy.ptr[0].er.byvalue $14 = {length = 0, data = {0x5e8aa00}, smallarray = {0x5e8aa00}} (gdb) p escapeBy.ptr[0].er.byvalue.data.ptr $15 = (dmd.declaration.VarDeclaration **) 0x5eba698 (gdb) p &escapeBy.ptr[0].er.byvalue.data.ptr $16 = (dmd.declaration.VarDeclaration ***) 0x5eba690 ``` ^--- And there it is. 0x5eba690 is an address that's being pointed to by the escapeByStorage global variable. So the memory range from the base `.ptr` (0x5eba660) has a `data.ptr` self reference to a part of itself (0x5eba698) thanks to the smallarray optimization in Array(T). We can all guess what's going to happen when we realloc this memory, but let's finish stepping through runtime anyway for completeness sake. 
``` Thread 1 hit Breakpoint 3, _D3dmd6escape21checkMutableArgumentsFPSQBl6dscope5ScopeCQCc4func15FuncDeclarationCQDc5mtype12TypeFunctionCQEa10expression10ExpressionPSQFd4root5array__T5ArrayTQCcZQlbZb (gag=false, arguments=0x4fb97e0, ethis=0x0, tf=0x584de00, fd=0x4fbf990, sc=0x5883a10) at src/dmd/escape.d:105 105 memset(newPtr + escapeBy.length, 0, (len - escapeBy.length) * EscapeBy.sizeof); (gdb) p newPtr $25 = (dmd.escape.checkMutableArguments.EscapeBy *) 0x5eef400 (gdb) p *newPtr $26 = {er = {byref = {length = 0, data = 0x0, smallarray = {0x0}}, byvalue = { length = 0, data = {0x5e8aa00}, smallarray = {0x5e8aa00}}, byfunc = { length = 0, data = 0x0, smallarray = {0x0}}, byexp = {length = 0, data = 0x0, smallarray = {0x0}}, refRetRefTransition = {length = 0, data = 0x0, smallarray = {false}}, expRetRefTransition = {length = 0, data = 0x0, smallarray = {false}}}, param = 0x5e8fd80, isMutable = false} (gdb) p newPtr.er.byvalue.data.ptr $27 = (dmd.declaration.VarDeclaration **) 0x5eba698 ``` ^--- No surprises there, the "newPtr" returned by the GC has a reference to the old escapeBy.ptr memory (which the GC has just marked as free too). ``` (gdb) p &newPtr.er.byvalue.data.ptr $28 = (dmd.declaration.VarDeclaration ***) 0x5eef430 ``` ^--- Confirmed! There's the pointer reference from the first valgrind run we were looking for (0x5eef430). ``` Thread 1 received signal SIGTRAP, Trace/breakpoint trap. 0x00000000006a33b5 in _D3dmd4root3aav15dmd_aaGetRvalueFNaNbNiPSQBnQBmQBk2AAPvZQd (key=0x4fbca00, aa=0x5ef7660) at src/dmd/root/aav.d:127 127 if (key == e.key) (gdb) p aa.b $34 = (dmd.root.aav.aaA **) 0x5eba660 (gdb) p &aa.b[7] $35 = (dmd.root.aav.aaA **) 0x5eba698 (gdb) monitor who_points_at 0x5eba698 ==61853== Searching for pointers to 0x5eba698 ==61853== *0x5eef430 points at 0x5eba698 Address 0x5eef430 is in a rw- anonymous segment <-- !!! Here ==61853== *0xd964580 points at 0x5eba698 Address 0xd964580 is in a rw- anonymous segment ```
Comment #19 by ibuclaw — 2023-06-10T16:41:50Z
Lesson learned from this: don't use `Mem.xrealloc` on a `dmd.root.array.Array(T)` type!
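The hazard can be reproduced in isolation. The sketch below uses a hypothetical `SmallArray` struct as a simplified stand-in for `dmd.root.array.Array(T)` (it is not dmd's definition), and it simulates a moving realloc with `malloc` plus `memcpy` so that the old block stays valid and the stale pointer can be inspected safely. After the bitwise move, `data` still points at the old block's inline `smallarray` storage.

```d
import core.stdc.stdlib : free, malloc;
import core.stdc.string : memcpy;

// Hypothetical model of an array with small-array optimization:
// `data` may alias the inline `smallarray` buffer of the same struct.
struct SmallArray
{
    size_t length;
    int*[] data;
    int*[1] smallarray;

    /// Stores one element in the inline slot (only case modelled here).
    void push(int* p)
    {
        assert(length == 0);
        data = smallarray[];
        data[0] = p;
        length = 1;
    }

    /// True if `data` points at this struct's own inline storage.
    bool usesInlineStorage() const
    {
        return data.ptr is smallarray.ptr;
    }
}

/// Simulates a realloc that moves the block: a plain bitwise copy.
SmallArray* moveBitwise(const SmallArray* old)
{
    auto fresh = cast(SmallArray*) malloc(SmallArray.sizeof);
    assert(fresh !is null);
    memcpy(fresh, old, SmallArray.sizeof);
    return fresh;
}

void main()
{
    int x = 42;
    auto oldBlock = cast(SmallArray*) malloc(SmallArray.sizeof);
    assert(oldBlock !is null);
    *oldBlock = SmallArray.init;
    oldBlock.push(&x);
    assert(oldBlock.usesInlineStorage);

    auto newBlock = moveBitwise(oldBlock);
    // The copy carries a pointer into the OLD block, not into itself.
    assert(!newBlock.usesInlineStorage);
    assert(newBlock.data.ptr is oldBlock.smallarray.ptr);

    // One possible repair: re-point the slice after the move.
    newBlock.data = newBlock.smallarray[0 .. newBlock.length];
    assert(newBlock.usesInlineStorage);
    assert(*newBlock.data[0] == 42);

    free(oldBlock);
    free(newBlock);
}
```

With a real realloc the old block is released at the point of the move, so the stale `data.ptr` becomes a write path into whatever the allocator hands that memory to next, which matches the corrupted `aa.b[7]` slot seen in the traces. The re-pointing step is shown only to illustrate the invariant that a bitwise move breaks; it is not a description of how the merged fix works.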
Comment #20 by dlang-bot — 2023-06-10T18:16:02Z
dlang/dmd pull request #15302 "Fix 23978 - ICE: dip1021 memory corruption" was merged into stable: - 32a4a5dc52cd6f6fc812ae76ac7d654518d4492d by Dennis Korpel: Fix 23978 - ICE: EscapeBy[] is malloced, but contains GC-allocated objects https://github.com/dlang/dmd/pull/15302
Comment #21 by dlang-bot — 2023-06-16T09:07:47Z
dlang/dmd pull request #15325 "merge stable" was merged into master: - 167b0504293926b3f9cecbb67b05e1e50f2150d5 by Dennis: Fix 23978 - ICE: EscapeBy[] is malloced, but contains GC-allocated objects (#15302) https://github.com/dlang/dmd/pull/15325
Comment #22 by dlang-bot — 2023-06-16T09:07:53Z
dlang/dmd pull request #15310 "Merge stable" was merged into master: - 167b0504293926b3f9cecbb67b05e1e50f2150d5 by Dennis: Fix 23978 - ICE: EscapeBy[] is malloced, but contains GC-allocated objects (#15302) https://github.com/dlang/dmd/pull/15310