Bug 5995 – string append negative integer causes segfault

Status: RESOLVED
Resolution: FIXED
Severity: normal
Priority: P2
Component: dmd
Product: D
Version: D2
Platform: All
OS: All
Creation time: 2011-05-13T01:48:00Z
Last change time: 2017-07-05T20:57:43Z
Keywords: bootcamp
Assigned to: lucia.mcojocaru
Creator: andrden

Comments

Comment #0 by andrden — 2011-05-13T01:48:27Z
Digital Mars D Compiler v2.052

$ dmd SpawnBug.d
$ ./SpawnBug
Segmentation fault (core dumped)
$ cat SpawnBug.d
void main(){
    string ret;
    int i = -1;
    ret ~= i;
}
Comment #1 by kennytm — 2011-05-13T02:14:37Z
This may or may not be expected. Appending any non-Unicode (> 0x10ffff) character will halt the program. In _d_arrayappendcd (https://github.com/D-Programming-Language/druntime/blob/master/src/rt/lifetime.d#L1762, BTW why this is in rt/lifetime.d?!):

else if (c <= 0x10FFFF)
{
    ...
}
else
    assert(0); // invalid utf character - should we throw an exception instead?
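For readers unfamiliar with the branch chain quoted above, here is a minimal sketch of how a code point can be encoded into UTF-8 on append. It is not the druntime source: `appendUtf8` is a hypothetical helper written for illustration, and only the final `else assert(0)` mirrors the quoted snippet. An `int` of -1 converted to `dchar` becomes 0xFFFFFFFF, so it falls through every range check and reaches that last branch.

```d
// Hypothetical helper for illustration only; not the code in rt/lifetime.d.
void appendUtf8(ref char[] dst, dchar c)
{
    if (c <= 0x7F)
    {
        // 1 byte: plain ASCII
        dst ~= cast(char) c;
    }
    else if (c <= 0x7FF)
    {
        // 2 bytes: 110xxxxx 10xxxxxx
        dst ~= cast(char)(0xC0 | (c >> 6));
        dst ~= cast(char)(0x80 | (c & 0x3F));
    }
    else if (c <= 0xFFFF)
    {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        dst ~= cast(char)(0xE0 | (c >> 12));
        dst ~= cast(char)(0x80 | ((c >> 6) & 0x3F));
        dst ~= cast(char)(0x80 | (c & 0x3F));
    }
    else if (c <= 0x10FFFF)
    {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        dst ~= cast(char)(0xF0 | (c >> 18));
        dst ~= cast(char)(0x80 | ((c >> 12) & 0x3F));
        dst ~= cast(char)(0x80 | ((c >> 6) & 0x3F));
        dst ~= cast(char)(0x80 | (c & 0x3F));
    }
    else
    {
        // Values above 0x10FFFF (including cast(dchar) -1) end up here.
        assert(0, "invalid UTF character");
    }
}

unittest
{
    char[] buf;
    appendUtf8(buf, 'a');
    appendUtf8(buf, '\u00E9'); // U+00E9 takes two bytes in UTF-8
    assert(buf == "a\u00E9");
    assert(buf.length == 3);
}
```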
Comment #2 by dlang-bugzilla — 2011-05-13T04:14:04Z
Calling onUnicodeError would be more appropriate.
Comment #3 by schveiguy — 2011-05-16T10:31:52Z
Hm... I think in general it is a design flaw to allow int to implicitly cast to dchar. I think that is the source of the problem. Going from (d|w)char to the appropriate width integer should be fine, but going the other way seems prone to error. Note that in lifetime.d, the assert(0) should not lead to a segmentation fault.
Comment #4 by issues.dlang — 2011-05-16T10:44:52Z
Implicitly converting to the same-size _unsigned_ integral type might be fine, but converting to a signed type would be a narrowing conversion. I'd still argue that converting between any of the character types and any of the integral types should require a cast though simply because they're not only different types, they're different types of types. The character types are for characters and the integral types are for integers. Regardless, no implicit conversion should be permitted when it's a narrowing conversion. Narrowing conversions should require casts.
Comment #5 by timon.gehr — 2011-05-16T11:01:47Z
(In reply to comment #3)
> Hm... I think in general it is a design flaw to allow int to implicitly cast to
> dchar.
>
> I think that is the source of the problem.
>
> Going from (d|w)char to the appropriate width integer should be fine, but going
> the other way seems prone to error.
>
> Note that in lifetime.d, the assert(0) should not lead to a segmentation fault.

assert(0) emits asm{hlt;} when compiled in release mode. Encountering hlt _is_ a segmentation fault, so this is just fine.
Comment #6 by timon.gehr — 2011-05-16T11:07:36Z
(In reply to comment #4)
> Implicitly converting to the same-size _unsigned_ integral type might be fine,
> but converting to a signed type would be a narrowing conversion. I'd still
> argue that converting between any of the character types and any of the
> integral types should require a cast though simply because they're not only
> different types, they're different types of types. The character types are for
> characters and the integral types are for integers. Regardless, no implicit
> conversion should be permitted when it's a narrowing conversion. Narrowing
> conversions should require casts.

How is that "narrowing"? No information is lost.

@Topic:

void main(){
    uint i=-1;  //fine
    dchar c=-1; //compile time error
}
Comment #7 by issues.dlang — 2011-05-16T11:19:03Z
dchar is unsigned. int is signed. They don't cover the same range of values. Converting from one to the other in either direction is a narrowing conversion. I expect that the only reason that uint i = -1; compiles is to make it easy to create the unsigned value whose equivalent is -1 or some other reason related to C code. But personally, I don't think that it should compile without a cast, because you cannot represent -1 in a uint.
Comment #8 by schveiguy — 2011-05-16T11:26:52Z
(In reply to comment #6)
> void main(){
>     uint i=-1;  //fine
>     dchar c=-1; //compile time error
> }

Just tried this and it indeed produces an error:

Error: cannot implicitly convert expression (-1) of type int to dchar

So I wonder why this works? Seems inconsistent:

int i = -1;
dchar c = i;

Also, the reporter's issue seems to be inconsistent with that error.
Comment #9 by briancschott — 2014-06-06T20:12:58Z
Still present in 2.065.
Comment #10 by hsteoh — 2014-09-11T19:04:57Z
Still present in git HEAD (2.067b).
Comment #11 by hsteoh — 2014-10-30T22:19:32Z
Should appending invalid codepoints append the Unicode replacement character instead?
Comment #12 by schveiguy — 2014-10-31T11:59:34Z
(In reply to hsteoh from comment #11)
> Should appending invalid codepoints append the Unicode replacement character
> instead?

I think implicit casting of int to dchar should be invalid altogether. See my 2011 comment.
Comment #13 by erictsau — 2015-04-09T01:55:25Z
Still present in 2.067
Comment #14 by lucia.mcojocaru — 2016-11-22T14:19:44Z
This is a problem in the compiler.

https://github.com/dlang/dmd/blob/master/src/dcast.d#L66
https://github.com/dlang/dmd/blob/master/src/mtype.d#L4150

I will open a PR shortly to disable implicit cast of int -> dchar.

Should we disable the implicit cast of all integral types to chars? For example, is it expected to make an implicit cast from uint to dchar? (The compiler itself seems to rely on implicit casts of uint -> dchar. Compiling the compiler with this cast disabled produces some errors.)

This is also enabled: bool -> dchar. Not sure if it is desirable.

Expression.implicitCastTo(z of type bool) => dchar
Comment #15 by schveiguy — 2016-11-22T15:27:39Z
Lucia, I think nothing should implicitly cast to dchar. Not bool, int, or even char or wchar. But something this drastic needs approval from Walter and Andrei. Of course, we definitely need a deprecation step before completely banning it -- this will certainly break a lot of code.
Comment #16 by andrei — 2016-11-22T16:16:08Z
This bug has a simple fix - throw a runtime exception (e.g. by onUnicodeError) instead of assert(0). We shouldn't change language rules on account of this. Thanks!
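A rough sketch of the behaviour proposed here, written as user-level code rather than as a druntime patch. `appendChecked` is a hypothetical helper, and the plain `Exception` is only a stand-in for whatever error the runtime would raise (the thread suggests onUnicodeError); `std.utf.isValidDchar` is the existing Phobos check for a valid code point.

```d
import std.exception : assertThrown;
import std.utf : isValidDchar;

// Hypothetical helper: validate before appending, and throw instead of halting.
void appendChecked(ref string s, int value)
{
    // A negative int becomes a value above 0x10FFFF once viewed as dchar.
    immutable c = cast(dchar) value;
    if (!isValidDchar(c))
        throw new Exception("invalid code point"); // stand-in for a Unicode error
    s ~= c;
}

unittest
{
    string s = "ab";
    appendChecked(s, 'c');
    assert(s == "abc");

    // The reporter's case: -1 is rejected with a catchable exception.
    assertThrown(appendChecked(s, -1));
    assert(s == "abc"); // the string is left unchanged
}
```

With a check like this, the caller can catch the failure and recover, which is not possible when the program is halted.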
Comment #17 by schveiguy — 2016-11-22T17:02:04Z
There are two problems, one is that the OP's code compiles, the other is that it segfaults. Arguably, fixing the first problem will fix the second. But just fixing the second leaves other problems still intact.

Also, note that this succeeds, but likely does not do what the writer wants:

string s = "123456";
s ~= 7;

Guess what this does (yes, it compiles)?

s ~= 123456;
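If the intent in the example above is to append the decimal digits, an explicit conversion avoids the integer being interpreted as a code point. This is a small sketch of that alternative, assuming the writer wanted text; `appendNumber` is a hypothetical helper, while `std.conv.to` is the standard Phobos conversion.

```d
import std.conv : to;

// Hypothetical helper: append the textual form of n rather than the
// character whose code point is n.
string appendNumber(string s, int n)
{
    return s ~ n.to!string;
}

unittest
{
    assert(appendNumber("123456", 7) == "1234567");
    assert(appendNumber("x", -1) == "x-1");
}
```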
Comment #18 by lucia.mcojocaru — 2016-11-24T10:59:49Z
As Andrei suggested, here is the quick fix: PR https://github.com/dlang/druntime/pull/1696 Language design changes should be discussed with Walter and Andrei in depth.
Comment #19 by github-bugzilla — 2016-12-06T05:04:51Z
Commits pushed to master at https://github.com/dlang/druntime

https://github.com/dlang/druntime/commit/316e6d2607b4b22794ef75a331ad27d970717cda
fix issue 5995

https://github.com/dlang/druntime/commit/6dbbadbac4a0567ba49f0e616fccc8c597fec771
Merge pull request #1696 from somzzz/issue_5995

fix issue 5995 - string append negative integer causes segfault
Comment #20 by github-bugzilla — 2016-12-27T13:11:03Z
Comment #21 by github-bugzilla — 2017-01-07T03:01:56Z
Comment #22 by github-bugzilla — 2017-01-16T23:24:25Z
Comment #23 by dlang-bugzilla — 2017-07-05T20:57:43Z
*** Issue 16545 has been marked as a duplicate of this issue. ***