Bug 6085 – The filename part of a thrown core.exception.UnicodeException is incomprehensible
Status
RESOLVED
Resolution
FIXED
Severity
normal
Priority
P2
Component
druntime
Product
D
Version
D2
Platform
Other
OS
Mac OS X
Creation time
2011-06-01T01:57:00Z
Last change time
2011-06-06T12:30:15Z
Keywords
diagnostic, patch
Assigned to
nobody
Creator
kennytm
Comments
Comment #0 by kennytm — 2011-06-01T01:57:56Z
Comment #1 by kennytm — 2011-06-01T02:00:01Z
Argh, the Unicode seems to have made the post disappear. Let me try again.
Test case:
==================================
void main() {
    string s = "\xff\xff\xff\0\0\0";
    foreach (dchar c; s) {}
}
==================================
core.exception.UnicodeException@<hundred lines of garbage skipped>
opEqualsMFC6ObjectZb(0): invalid UTF-8 sequence
----------------
5 x 0x000096d6 onUnicodeError + 66
6 x 0x000151e1 dchar rt.util.utf.decode(const(char[]), ref uint) + 373
7 x 0x00012218 _aApplycd1 + 68
<snip>
----------------
==================================
The problem seems to be that the __FILE__ string passed to onUnicodeError is corrupt. Furthermore, the __LINE__ displayed is always 0.
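For anyone who wants to look at the affected fields directly rather than read them off the trace, here is a minimal sketch. It assumes only that UnicodeException derives from Throwable and so exposes file, line and msg; the names DecodeFailure and tryDecode are hypothetical helpers invented for this illustration, not part of druntime.

```d
import core.exception : UnicodeException;

/// Hypothetical record of what a caught UnicodeException reports.
struct DecodeFailure
{
    bool failed;   // true if decoding threw
    string file;   // copied from Throwable.file
    size_t line;   // copied from Throwable.line
    string msg;    // copied from Throwable.msg
}

/// Hypothetical helper: walks `s` as dchars, which forces UTF-8 decoding
/// in the runtime (the same path as the test case above).
DecodeFailure tryDecode(string s)
{
    try
    {
        foreach (dchar c; s) {}
    }
    catch (UnicodeException e)
    {
        return DecodeFailure(true, e.file, e.line, e.msg);
    }
    return DecodeFailure(false);
}
```

Calling tryDecode("\xff\xff\xff\0\0\0") and printing the file and line fields should show the same values that appear after the @ in the trace, so on an affected runtime the file field is expected to be the garbage string and line to be 0.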
Comment #2 by kennytm — 2011-06-01T02:17:14Z
The corrupted output is caused by a wrong signature: the declaration of onUnicodeError in rt/util/utf.d lacks the file and line parameters.
diff --git a/src/rt/util/utf.d b/src/rt/util/utf.d
index d7aeac1..cdbc27c 100644
--- a/src/rt/util/utf.d
+++ b/src/rt/util/utf.d
@@ -28,7 +28,7 @@
module rt.util.utf;
-extern (C) void onUnicodeError( string msg, size_t idx );
+extern (C) void onUnicodeError( string msg, size_t idx, string file = __FILE__, size_t line = __LINE__ );
/*******************************
* Test if c is a valid UTF-32 character.
This patch, however, only reports the __FILE__ and __LINE__ of the druntime function, not those of the user code that actually triggers the error. I think this can't be fixed without modifying DMD, because _aApplycd1 doesn't carry the line information. With the patch applied, the output becomes:
core.exception.UnicodeException@src/rt/util/utf.d(290): invalid UTF-8 sequence
----------------
5 x 0x000096b2 onUnicodeError + 66
6 x 0x000151ce dchar rt.util.utf.decode(const(char[]), ref uint) + 390
7 x 0x000121f4 _aApplycd1 + 68
8 x 0x000027e5 _Dmain + 37
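The reason the patched trace points at src/rt/util/utf.d rather than at user code is that __FILE__ and __LINE__ used as default arguments are filled in at the place where the function is called. A small sketch of that behaviour follows; Location, here and wrapper are hypothetical names made up for this illustration, with wrapper standing in for a runtime function such as decode.

```d
/// Hypothetical record of a source location.
struct Location
{
    string file;
    size_t line;
}

/// The defaults are evaluated at each call site of here().
Location here(string file = __FILE__, size_t line = __LINE__)
{
    return Location(file, line);
}

/// Stands in for a runtime function: it calls here() itself, so the
/// location recorded is always this call, whoever calls wrapper().
Location wrapper()
{
    return here();
}
```

Two calls to wrapper() from different lines return the same line number, while two direct calls to here() on consecutive lines return consecutive line numbers. Getting the user's location would require every layer, including the compiler-generated call to _aApplycd1, to forward file and line explicitly.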