Bug 21656 – [REG2.091] Wrong file read during exception stringification leads to SIGBUS

Status
RESOLVED
Resolution
FIXED
Severity
regression
Priority
P1
Component
druntime
Product
D
Version
D2
Platform
All
OS
Linux
Creation time
2021-02-22T11:00:48Z
Last change time
2021-12-13T13:23:49Z
Keywords
pull
Assigned to
No Owner
Creator
Vladimir Panteleev

Comments

Comment #0 by dlang-bugzilla — 2021-02-22T11:00:48Z
/////////////// bug.d //////////////
void main()
{
    try
        throw new Exception("Test");
    catch (Exception e)
        e.toString();
}
////////////////////////////////////

To reproduce:

    dmd bug.d && mkdir -p a && cd a && touch bug && PATH=.. bug

Introduced in https://github.com/dlang/druntime/pull/2330

There might be two bugs here:

1. The wrong file is being read. This is the more important bug. It might contain something else entirely, or even point to something that the program really should not access. (In my case, it pointed to a directory hosting a FUSE filesystem provided by the program, which therefore had the same name as the program, and this led to a deadlock.)

2. Perhaps an empty file should not cause a SIGBUS. The executable file may have been modified between the point when the program was started and when Druntime attempts to read it. For example, if the developer is recompiling the program while an older instance is still running, said instance could conceivably see an empty file as the compiler is writing the new version.
Comment #1 by kinke — 2021-02-22T11:40:45Z
This seems to boil down to the usage of `program_invocation_name` on Linux to obtain the name of the executable file (though that hasn't changed in the linked druntime PR). It's apparently just `argv[0]` and so not really robust. Not sure if BSD's `getprogname()` is better in this regard. Too bad `std.file.thisExePath()` is in Phobos and thus not usable.
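A minimal sketch of the difference, assuming Linux with glibc: it prints `program_invocation_name` next to the target of the `/proc/self/exe` symlink. The helper `resolveSelfExe` is a hypothetical name used only for illustration and is not part of druntime. In the reproduction above, the first value would be the bare name `bug`, which resolves against the current directory to the empty file.

```d
import core.stdc.stdio : printf;
import core.sys.posix.unistd : readlink;

// glibc extension: a copy of the argv[0] pointer.
extern (C) extern __gshared char* program_invocation_name;

/// Hypothetical helper: reads the /proc/self/exe symlink into `buf`.
/// Returns the filled slice, or null if readlink fails.
char[] resolveSelfExe(char[] buf) @nogc nothrow
{
    const len = readlink("/proc/self/exe", buf.ptr, buf.length);
    return len > 0 ? buf[0 .. len] : null;
}

void main()
{
    char[4096] buf = void;
    const path = resolveSelfExe(buf[]);

    // argv[0] is whatever the caller passed; it may be relative or a bare name.
    printf("argv[0]:        %s\n", program_invocation_name);
    // The symlink target is the absolute path recorded by the kernel.
    printf("/proc/self/exe: %.*s\n", cast(int) path.length, path.ptr);
}
```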
Comment #2 by dlang-bugzilla — 2021-02-22T11:43:31Z
(In reply to kinke from comment #1)
> (though that hasn't changed in the linked druntime PR)

But, why would bisection point to that PR as the earliest point for the observed SIGBUS, then?

> Too bad `std.file.thisExePath()` is in Phobos and thus not usable.

I don't see why it couldn't be moved to Druntime. It looks like it would be a good fit.
Comment #3 by dlang-bugzilla — 2021-02-22T11:55:34Z
BTW, not sure about other systems, but on Linux, reading /proc/self/exe is going to be more reliable than via thisExePath. The reason being that /proc/self/exe is a special directory entry which continues to be valid (openable) even when the target that it (as a symlink) is pointing at is gone. So, perhaps just moving thisExePath to Druntime is not the best option.
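A rough sketch of that idea, assuming Linux: open `/proc/self/exe` directly and hand back a file descriptor instead of a path. The function name `openSelfExe` is hypothetical, and the empty-file check is an illustrative guard for the second point in comment #0, not necessarily what druntime does.

```d
import core.stdc.stdio : printf;
import core.sys.posix.fcntl : open, O_RDONLY;
import core.sys.posix.sys.stat : fstat, stat_t;
import core.sys.posix.unistd : close;

/// Hypothetical helper: opens the running executable through /proc/self/exe.
/// Returns a file descriptor, or -1 if the file cannot be opened or is empty.
int openSelfExe() @nogc nothrow
{
    const fd = open("/proc/self/exe", O_RDONLY);
    if (fd < 0)
        return -1;

    stat_t st;
    // Touching a mapped page that lies beyond the end of the file raises
    // SIGBUS, so reject empty files before anything tries to map them.
    if (fstat(fd, &st) != 0 || st.st_size <= 0)
    {
        close(fd);
        return -1;
    }
    return fd;
}

void main()
{
    const fd = openSelfExe();
    printf("fd=%d\n", fd);
    if (fd >= 0)
        close(fd);
}
```

Returning a descriptor rather than a string avoids a second path lookup, which is where a stale or relative name could resolve to a different file.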
Comment #4 by petar.p.kirov — 2021-02-22T15:29:38Z
Could we move it to druntime and then make it use `/proc/self/exe` on Linux?
Comment #5 by dlang-bugzilla — 2021-02-22T16:05:28Z
It would have to be a different function. `thisExePath` returns a string, but what we really need is a file descriptor. (And, we can't return the string literal "/proc/self/exe" from `thisExePath` on Linux, that would break other uses such as reading files adjacent to the program's executable.)
Comment #6 by doob — 2021-02-22T20:15:40Z
`thisExePath` uses `/proc/self/exe` on Linux, but it uses `readlink` to read out the path, instead of opening the actual file.

`getprogname` is no better than `program_invocation_name`. They have more or less the same semantics. `getprogname` returns what was stored by `setprogname`. This is what the man pages say about `setprogname`:

> The setprogname() function sets the name of the program to be the last component of the progname argument.

The best would probably be to copy the implementation of `thisExePath`, but change it to open the file and return a file descriptor.

A file descriptor to the executable also seems to be available in the auxiliary vector on Linux [1]. This can be accessed with the function `getauxval` or directly after the `envp` parameter in the C main function. I see references to the auxiliary vector in *BSD systems as well. Might be worth checking out.

BTW, there does not seem to be a reliable way to get the path to the current executable on OpenBSD.

[1] http://articles.manugarg.com/aboutelfauxiliaryvectors.html
[2] https://lwn.net/Articles/519085/
Comment #7 by dlang-bugzilla — 2021-02-23T02:30:56Z
(In reply to Jacob Carlborg from comment #6)
> A file descriptor to the executable also seems to be available in the
> auxiliary vector on Linux [1]. This can be accessed with the function
> `getauxval` or directly after the `envp` parameter in the C main function. I
> see references to the auxiliary vector in *BSD systems as well. Might be
> worth checking out.

Interesting, though as far as I can see, that file descriptor is only available to interpreters, and is no longer there at the time that the interpreted program has begun execution. Otherwise, it would take up a file descriptor slot, and thus would be a very noticeable fourth addition to the standard stdin/stdout/stderr streams. So it doesn't look like we can use this unfortunately.
Comment #8 by doob — 2021-02-23T08:59:03Z
You are right that the file descriptor (AT_EXECFD) won't work. I just tested that, and it doesn't return a valid value. But, perhaps even better, it's possible to access the address of the program headers of the executable (AT_PHDR) [1]. Here's an experiment I did on Linux:

# cat main.d
import core.stdc.stdio;
import core.internal.elf.dl;

extern (C) ulong getauxval(ulong type);

void main()
{
    const aux = cast(void*) getauxval(3); // 3 == AT_PHDR
    const baseAddress = SharedObject.thisExecutable.baseAddress;
    printf("aux=%p baseAddress=%p diff=%ld\n", aux, baseAddress, aux - baseAddress);
}

# dmd -run main.d
aux=0x55eb0f1e4040 baseAddress=0x55eb0f1e4000 diff=64

It returns a slightly different address than `SharedObject.thisExecutable.baseAddress`, but it's close enough that it cannot be a coincidence. Perhaps someone with more knowledge of `core.internal.elf` can explain the difference.

[1] https://man7.org/linux/man-pages/man3/getauxval.3.html
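One likely explanation for the 64-byte offset: AT_PHDR is the address of the program header table, and in a typical 64-bit ELF file that table directly follows the ELF header, which is 64 bytes long. The sketch below tests this by stepping back 64 bytes from AT_PHDR and looking for the ELF magic. It assumes 64-bit Linux and the usual header layout (`e_phoff == 64`); `guessBaseFromPhdr` and `elf64HeaderSize` are hypothetical names introduced only for illustration.

```d
import core.stdc.stdio : printf;

extern (C) ulong getauxval(ulong type) @nogc nothrow;

enum AT_PHDR = 3;          // auxiliary vector key: address of the program headers
enum elf64HeaderSize = 64; // size of the ELF64 file header in bytes

/// Hypothetical helper: guesses the load address of the executable's first
/// byte, assuming the program headers directly follow the ELF header.
const(ubyte)* guessBaseFromPhdr() @nogc nothrow
{
    const phdr = cast(const(ubyte)*) getauxval(AT_PHDR);
    return phdr is null ? null : phdr - elf64HeaderSize;
}

void main()
{
    const base = guessBaseFromPhdr();
    // A mapped ELF image starts with the magic bytes 0x7f 'E' 'L' 'F'.
    const isElf = base !is null
        && base[0] == 0x7f && base[1] == 'E' && base[2] == 'L' && base[3] == 'F';
    printf("guessed base=%p, ELF magic found: %s\n", base, isElf ? "yes".ptr : "no".ptr);
}
```

A more robust variant would read `e_phoff` from the ELF header instead of hard-coding 64, since the format does not require the two structures to be adjacent.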
Comment #9 by kinke — 2021-02-23T10:26:29Z
I've just hit a related issue, a regression in master which was unfortunately cherry-picked for LDC 1.25 final: https://github.com/dlang/druntime/pull/3382

The problem in this specific case was that the unittest process changes the working dir (unit-threaded SandBox) and then an exception is thrown. The executable can thus no longer be opened, and no file/line info is available (assuming it does not segfault first). I guess this scenario is 1000x more likely than the executable file being in Nirvana at the time an exception is thrown, and a real problem.
Comment #10 by dlang-bot — 2021-12-07T20:10:38Z
@kinke created dlang/druntime pull request #3643 "Fix Issue 21656 - Read correct ELF executable for DWARF file/line infos" fixing this issue: - Fix Issue 21656 - Read correct ELF executable for DWARF file/line infos By porting Phobos' `thisExePath()` to druntime (for ELF platforms), in a `@nogc nothrow` fashion for low-level usage in exception backtraces. https://github.com/dlang/druntime/pull/3643
Comment #11 by dlang-bot — 2021-12-13T13:23:49Z
dlang/druntime pull request #3643 "Fix Issue 21656 - Read correct ELF executable for DWARF file/line infos" was merged into stable: - dc17630863449c04128d912bca3275313fb2a0bc by Martin Kinkelin: Fix Issue 21656 - Read correct ELF executable for DWARF file/line infos By porting Phobos' `thisExePath()` to druntime (for ELF platforms), in a `@nogc nothrow` fashion for low-level usage in exception backtraces. https://github.com/dlang/druntime/pull/3643