Bug 3214 – Incorrect DWARF line number debugging information on Linux

Status: RESOLVED
Resolution: FIXED
Severity: normal
Priority: P2
Component: dmd
Product: D
Version: D1 (retired)
Platform: x86
OS: Linux
Creation time: 2009-07-29T12:58:00Z
Last change time: 2014-04-18T09:12:07Z
Keywords: patch, wrong-code
Assigned to: nobody
Creator: nfxjfg
Depends on: 3540
Blocks: 4044

Attachments

ID  | Filename | Summary                                          | Content-Type | Size
435 | patch    | dmd patch to fix the problem (against dmd 1.046) | text/plain   | 377

Comments

Comment #0 by nfxjfg — 2009-07-29T12:58:59Z
Here's an example showing that dmd most likely generates incorrect DWARF line number debugging information under Linux. The example consists of several modules, and I don't know whether it can be reduced further or where the problem actually is. (I'm happy that it's reproducible at all.)

a.d
>>>
module a;

import b;

void main(char[][] args) {
    throw new Exception("huh");
}
<<<

b.d
>>>
module b;

import c;

void foo1() {
}
<<<

c.d
>>>
module c;

class Foo2 : Exception {
    this() {
        super("huh");
    }
}
<<<

Compile exactly like this:

    dmd -c a.d b.d c.d -od. -g
    dmd a.o b.o c.o -ofa -g

Now check whether the symbol "_D1c4Foo25_ctorMFZC1c4Foo2" resolves to the correct source location:

1. Get the address:
    nm a|grep _D1c4Foo25_ctorMFZC1c4Foo2 |cut -f2 -d: | cut -f1 -d' '
   (outputs something like "08049b64")
2. Resolve the address:
    addr2line -e a 0x08049b64

This outputs "/tmp/test/xx/b.d:5", although the symbol is defined in c.d. The address is resolved to the wrong file, which suggests that the compiler generates incorrect debugging information.
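The two lookup steps above can be combined into one small helper. This is only a sketch, not part of the original report: it assumes the binutils tools `nm` and `addr2line` are on the PATH, and the function names `find_symbol_address` and `resolve_symbol` are made up for illustration.

```python
import subprocess


def find_symbol_address(nm_output, symbol):
    """Return the hex address of `symbol` in `nm` text output, or None."""
    for line in nm_output.splitlines():
        fields = line.split()
        # Defined symbols have three fields: address, type letter, name.
        # Undefined symbols have no address column and are skipped.
        if len(fields) == 3 and fields[2] == symbol:
            return fields[0]
    return None


def resolve_symbol(binary, symbol):
    """Run nm and addr2line on `binary`; return addr2line's answer as text."""
    nm_output = subprocess.run(
        ["nm", binary], capture_output=True, text=True, check=True
    ).stdout
    address = find_symbol_address(nm_output, symbol)
    if address is None:
        raise LookupError("symbol not found: " + symbol)
    result = subprocess.run(
        ["addr2line", "-e", binary, "0x" + address],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


# Usage, run in the directory that contains the linked executable "a":
#   print(resolve_symbol("a", "_D1c4Foo25_ctorMFZC1c4Foo2"))
```

With the bug present, the reported file would be b.d rather than c.d, as described in the comment.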
Comment #1 by nfxjfg — 2009-07-29T21:19:56Z
Created attachment 435
dmd patch to fix the problem (against dmd 1.046)

I think I found out why it doesn't work: when compiling several files at once, the object writer's state isn't reset correctly. Now I don't know how the hell the backend works; I just noticed that the DWARF writer still tried to write line numbers from the previous file (also notice what's added by objlinnum()). In this case, c.o contained some line number information from b.o, which is why addr2line resolved the address mentioned above to b.d.

I'm posting a trivial 3-line patch that seems to solve the problem. It simply makes obj_init() forget about the SegData from the previous object file. I don't know if it's correct or complete, but with my test cases (the one posted above and a larger, non-trivial heap of code) it seemed to work very well. The patch causes a small memory leak, but that'd be trivial to fix (maybe not setting seg_max to 0 does it).

There's just one more problem. When resolving some invalid address with addr2line, I get this output:

BFD: Dwarf Error: mangled line number section (bad file number).
BFD: Dwarf Error: mangled line number section (bad file number).
BFD: Dwarf Error: Could not find abbrev number 1014.
BFD: Dwarf Error: Could not find abbrev number 119.
BFD: Dwarf Error: Could not find abbrev number 651.
BFD: Dwarf Error: Could not find abbrev number 53.
BFD: Dwarf Error: Could not find abbrev number 84.
BFD: Dwarf Error: Could not find abbrev number 657.
BFD: Dwarf Error: Could not find abbrev number 1230.

I have no idea what's up with that, and I can't reproduce it with a simpler test case.
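The kind of state leak described in this comment can be illustrated with a toy model. The sketch below is not dmd backend code: `LineTableWriter`, `begin_object` and the line numbers are hypothetical stand-ins, with `begin_object` playing the role the comment assigns to obj_init(). It only shows how one writer reused across several files ends up putting the previous file's line entries into the next object unless its state is cleared.

```python
class LineTableWriter:
    """Toy stand-in for an object writer that collects (file, line) entries."""

    def __init__(self, reset_between_files):
        self.reset_between_files = reset_between_files
        self.entries = []  # plays the role of the per-segment line data

    def begin_object(self):
        # Analogue of the proposed fix: forget the previous file's entries.
        if self.reset_between_files:
            self.entries = []

    def add_line(self, source_file, line):
        self.entries.append((source_file, line))

    def end_object(self):
        # Snapshot of what would be written into this object's line table.
        return list(self.entries)


def compile_all(writer, modules):
    """Feed each (file, lines) pair through the same writer, one object each."""
    objects = {}
    for source_file, lines in modules:
        writer.begin_object()
        for line in lines:
            writer.add_line(source_file, line)
        objects[source_file] = writer.end_object()
    return objects


# Illustrative line numbers only; they are not taken from real compiler output.
modules = [("b.d", [5]), ("c.d", [4])]
buggy = compile_all(LineTableWriter(reset_between_files=False), modules)
fixed = compile_all(LineTableWriter(reset_between_files=True), modules)

print(buggy["c.d"])  # [('b.d', 5), ('c.d', 4)]  stale b.d entry leaks in
print(fixed["c.d"])  # [('c.d', 4)]
```

In the unreset variant every later object also carries all earlier entries, so the line tables grow with each file compiled in the same invocation.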
Comment #2 by robert — 2009-09-08T05:49:22Z
Don't forget to add the patch keyword along with the patch so it doesn't get missed :)
Comment #3 by nfxjfg — 2009-11-21T18:28:15Z
Also see 3540.
Comment #4 by nfxjfg — 2009-11-22T21:40:49Z
The problem still persists in dmd 1.051 and in the dmd svn version (as of revision 267). I assume there is some other problem that makes fixing the bug less simple than it seems?
Comment #5 by nfxjfg — 2010-04-01T05:58:19Z
It's strange... With my patch, my executable has a size of 16 MB, with a .debug_line section size of ca. 500 KB. Without my patch, it's 32 MB, of which .debug_line takes up 16 MB. PS: With the fix for bug 3987, the addr2line error messages go away (when compiling with -gc). Which means my patch doesn't cause additional problems at least in this aspect.
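Section sizes like the ones quoted in this comment can be read from the output of the binutils command `size -A <executable>`. The following is a rough sketch with made-up helper names (`section_sizes`, `section_share`); note that it computes a share of the summed section sizes, which is not the same as the size of the file on disk.

```python
def section_sizes(size_output):
    """Parse `size -A` text output into {section_name: size_in_bytes}."""
    sizes = {}
    for line in size_output.splitlines():
        fields = line.split()
        # Section rows have three columns: name, size, address.
        # The header row and the "Total" row fail this check and are skipped.
        if len(fields) == 3 and fields[1].isdigit():
            sizes[fields[0]] = int(fields[1])
    return sizes


def section_share(sizes, section):
    """Fraction of the summed section sizes taken up by `section`."""
    total = sum(sizes.values())
    return sizes.get(section, 0) / total if total else 0.0


# Usage, assuming the output of `size -A a` was saved to a string `text`:
#   sizes = section_sizes(text)
#   print(sizes.get(".debug_line"), section_share(sizes, ".debug_line"))
```

Comparing the `.debug_line` entry for a build with and without the patch gives the same kind of before/after figures as reported above.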
Comment #6 by bugzilla — 2011-04-04T12:36:33Z