Bug 1358 – ICE(root.c) on Unicode codepoints greater than 0x7FFFFFFF

Status
RESOLVED
Resolution
FIXED
Severity
normal
Priority
P2
Component
dmd
Product
D
Version
D1 (retired)
Platform
x86
OS
Linux
Creation time
2007-07-20T15:57:00Z
Last change time
2014-02-16T15:26:24Z
Keywords
ice-on-invalid-code, patch
Assigned to
bugzilla
Creator
aziz.koeksal

Attachments

ID    Filename          Summary                  Content-Type   Size
350   1358patch.patch   Patch against DMD2.029   text/plain     399

Comments

Comment #0 by aziz.koeksal — 2007-07-20T15:57:45Z
auto foo = '\U80000000'; // invalid UTF character \U80000000
rebuild: root.c:1490: void OutBuffer::writeUTF8(unsigned int): Assertion `0' failed.
Comment #1 by thomas-dloop — 2007-07-23T14:57:25Z
Comment #2 by aziz.koeksal — 2007-07-24T07:01:31Z
Sorry, the example is wrong. This should fire the error:
auto bla = "\U80000000";
src/Parser.d(125): invalid UTF character \U80000000
rebuild: root.c:1490: void OutBuffer::writeUTF8(unsigned int): Assertion `0' failed.
Comment #3 by clugdbug — 2009-05-05T01:55:15Z
Created attachment 350
Patch against DMD2.029

This is a trivial one. After printing the error message, just change it to a valid char to avoid the later ICE.
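
The approach described above can be sketched roughly as follows. This is not the attached patch and not dmd's actual code: `writeUtf8` and `sanitizeEscape` are hypothetical stand-ins, and the choice of '?' as the replacement character is an assumption made for illustration. The idea is only that the bad value is reported once and then swapped for something the encoder can handle, so the assertion is never reached.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical stand-in for an encoder that, like the one in the report,
// asserts when handed a value it cannot encode.
void writeUtf8(std::string &out, std::uint32_t c)
{
    assert(c <= 0x10FFFF); // precondition: callers pass only encodable values
    if (c <= 0x7F) {
        out += static_cast<char>(c);
    } else if (c <= 0x7FF) {
        out += static_cast<char>(0xC0 | (c >> 6));
        out += static_cast<char>(0x80 | (c & 0x3F));
    } else if (c <= 0xFFFF) {
        out += static_cast<char>(0xE0 | (c >> 12));
        out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (c & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (c >> 18));
        out += static_cast<char>(0x80 | ((c >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (c & 0x3F));
    }
}

// Hypothetical lexer-side helper: report the invalid escape, count the error,
// and continue with a valid placeholder ('?' is an assumed choice).
std::uint32_t sanitizeEscape(std::uint32_t c, int &errorCount)
{
    if (c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) {
        std::fprintf(stderr, "invalid UTF character \\U%08x\n",
                     static_cast<unsigned>(c));
        ++errorCount;
        return '?';
    }
    return c;
}
```

With this arrangement, `writeUtf8(buf, sanitizeEscape(0x80000000u, errors))` prints the diagnostic, increments `errors`, and appends a single '?' instead of aborting.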
Comment #4 by bugzilla — 2009-07-09T02:45:46Z
Fixed in dmd 1.046 and 2.031
Comment #5 by eljay — 2009-08-07T06:58:39Z
The Unicode codespace is 0 to 0x10FFFF, which is a 21-bit space. So \U80000000 is not a valid Unicode codepoint. Even for ISO 10646, which is a 31-bit space (and has an interesting relationship to Unicode), \U80000000 is not a valid codepoint either. I'd expect \U80000000 to be a "gigo bug". Still, it should not cause an ICE.
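
The range arithmetic in this comment can be checked with a small sketch. The names below (`kUnicodeMax`, `kUcs4Max`, `inUnicodeCodespace`, `inUcs4Space`, `checkCodespaceClaims`) are made up for illustration and do not come from dmd; the 31-bit upper bound is taken from the comment's description of ISO 10646.

```cpp
#include <cassert>
#include <cstdint>

// Upper bounds as described in the comment (names are hypothetical).
constexpr std::uint32_t kUnicodeMax = 0x10FFFF;   // top of the Unicode codespace
constexpr std::uint32_t kUcs4Max    = 0x7FFFFFFF; // top of a 31-bit space

constexpr bool inUnicodeCodespace(std::uint32_t c) { return c <= kUnicodeMax; }
constexpr bool inUcs4Space(std::uint32_t c)        { return c <= kUcs4Max; }

// Runtime checks of the claims made in the comment.
void checkCodespaceClaims()
{
    // 0x10FFFF fits in 21 bits but not in 20.
    assert(kUnicodeMax < (1u << 21));
    assert(kUnicodeMax >= (1u << 20));

    // 0x7FFFFFFF is the largest 31-bit value.
    assert(kUcs4Max == (1u << 31) - 1);

    // The value from the report lies outside both ranges.
    assert(!inUnicodeCodespace(0x80000000u));
    assert(!inUcs4Space(0x80000000u));
}
```

Calling `checkCodespaceClaims()` passes silently, which matches the point above: 0x80000000 needs 32 bits and so falls outside both the 21-bit and the 31-bit range.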