Bug 14459 – String literal merge bug causes incorrect runtime program behavior

Status
RESOLVED
Resolution
FIXED
Severity
critical
Priority
P1
Component
dmd
Product
D
Version
D2
Platform
All
OS
All
Creation time
2015-04-17T22:04:00Z
Last change time
2017-07-22T12:36:20Z
Keywords
pull, wrong-code
Assigned to
nobody
Creator
gorox

Attachments

IDFilenameSummaryContent-TypeSize
1514strlit16bug.dstrlit16bug.dtext/plain1091

Comments

Comment #0 by gorox — 2015-04-17T22:04:53Z
Created attachment 1514 strlit16bug.d The front end propagates const char* string literals in such a way that backend must merge identical constants or else funny things may happen at runtime. The DMD backend merges string literals but only caches 16 unique strings, so a carefully crafted program with more than 16 string literals can produce erroneous results. See attached code, which follows this pattern: { const char* s0 = "hi0"; const(char)* p0 = s0; assert(p0 == s0); const char* s1 = "hi1"; const(char)* p1 = s1; ... const char* s15 = "hi15"; const(char)* p15 = s15; assert(p0 == s0); // ok const char* s16 = "hi16"; const(char)* p16 = s16; assert(p0 == s0); // fails } This was uncovered while digging into an LDC issue #898 and trying to understand how DMD front end propagates string literals for const char*. https://github.com/ldc-developers/ldc/issues/898 Output from attached strlit16bug.d $ dmd --version DMD64 D Compiler v2.067.0 Copyright (c) 1999-2014 by Digital Mars written by Walter Bright $ dmd -run strlit16bug [email protected](40): Assertion failure ---------------- 5 dmd_runN2D2U4 0x000000010855ee54 void strlit16bug.__assert(int) + 44 6 dmd_runN2D2U4 0x000000010855ee21 _Dmain + 193 7 dmd_runN2D2U4 0x0000000108573b94 D2rt6dmain211_d_run_mainUiPPaPUAAaZiZ6runAllMFZ9__lambda1MFZv + 40 8 dmd_runN2D2U4 0x0000000108573ad9 void rt.dmain2._d_run_main(int, char**, extern (C) int function(char[][])*).tryExec(scope void delegate()) + 45 9 dmd_runN2D2U4 0x0000000108573b39 void rt.dmain2._d_run_main(int, char**, extern (C) int function(char[][])*).runAll() + 45 10 dmd_runN2D2U4 0x0000000108573ad9 void rt.dmain2._d_run_main(int, char**, extern (C) int function(char[][])*).tryExec(scope void delegate()) + 45 11 dmd_runN2D2U4 0x0000000108573a4c _d_run_main + 504 12 dmd_runN2D2U4 0x000000010855ee6c main + 20 13 libdyld.dylib 0x00007fff8dd375c9 start + 1 14 ??? 0x0000000000000001 0x0 + 1 $
Comment #1 by yebblies — 2015-04-19T02:48:53Z
Why do you think string literals must be merged? I would guess it's in 'implementation defined' territory, and identity equality should not be relied on.
Comment #2 by ketmar — 2015-04-19T02:51:52Z
yet in the case when i'm assigning pointer to pointer i'm expecting that two pointers are the same. i'd never expect the following fail, under no circumstances: const char* s16 = "hi16"; const(char)* p16 = s16; assert(p0 == s0); // fails
Comment #3 by gorox — 2015-04-19T16:48:06Z
(In reply to yebblies from comment #1) > Why do you think string literals must be merged? I would guess it's in > 'implementation defined' territory, and identity equality should not be > relied on. Yes I agree in general it is implementation defined for any D compiler to merge string literals. In this case, one pointer is initialized to the the value of another, so they do need to be equal. The dmd front end code propagates const char* literals into unique StringExps when used. Now the backend must merge StringExps with identical literals or else bad code can be generated. In LDC which uses dmd front end, it was worse because string literals were merged late and even simpler code was wrong. See https://github.com/ldc-developers/ldc/issues/898 for details. In the snippet, it leads to p0 == s0, as they should be, then a few lines later, suddenly p0 != s0. Anyway, it is in the backend el.c stable[16] where the cache is defined. Also, the code in the frond end that does the const propagation of string literals is fromConstInitializer() in optimize.c
Comment #4 by code — 2015-04-23T11:24:02Z
Indeed, p0 == s0 must never fail as they are both pointers and one is initialized from the other, irrespective of what we guarantee or don't guarantee about string literal merging.
Comment #5 by k.hara.pg — 2015-08-06T16:01:15Z
https://github.com/D-Programming-Language/dmd/pull/4866(In reply to Dan Olson from comment #3) > Anyway, it is in the backend el.c stable[16] where the cache is defined. > Also, the code in the frond end that does the const propagation of string > literals is fromConstInitializer() in optimize.c Fixed the bug in optimize.c: https://github.com/D-Programming-Language/dmd/pull/4866 Honestly I didn't know about this issue until now (my random bugzila walk fortunately hit). I'm curious why the detailed analysis result couldn't put out actual PR.
Comment #6 by code — 2015-08-06T16:36:51Z
(In reply to Kenji Hara from comment #5) > I'm curious why the detailed analysis result couldn't put > out actual PR. As for me, that's because Daniel already knew how to fix the issue, but I couldn't convince him that the behavior is actually wrong at DConf (or rather, the day after).
Comment #7 by k.hara.pg — 2015-08-06T17:05:43Z
I think it's clearly a bug. Because: 1. The identiry of the string literal is saved into a variable, then copied to one another variable. The two pointer variables comparison must match. 2. But the excessive front end 'optimization' didn't save the identity. It's a problem. A point is the variable s0 is a stack allocated variable. Its value is evaluated and determined at runtime. The optimization wrongly beyond the compile time/runtime evaluation boundary.
Comment #8 by github-bugzilla — 2015-08-06T19:33:36Z
Commits pushed to master at https://github.com/D-Programming-Language/dmd https://github.com/D-Programming-Language/dmd/commit/3252ddcf36f53928d04b62a601c0b7bb9d911cba fix Issue 14459 - String literal merge bug causes incorrect runtime program behavior https://github.com/D-Programming-Language/dmd/commit/fd8e527b2a950eaa437426e8201572e0da51000f Merge pull request #4866 from 9rnsr/fix14459 Issue 14459 - String literal merge bug causes incorrect runtime program behavior
Comment #9 by github-bugzilla — 2017-07-22T12:36:20Z
Commits pushed to dmd-cxx at https://github.com/dlang/dmd https://github.com/dlang/dmd/commit/3252ddcf36f53928d04b62a601c0b7bb9d911cba fix Issue 14459 - String literal merge bug causes incorrect runtime program behavior https://github.com/dlang/dmd/commit/fd8e527b2a950eaa437426e8201572e0da51000f Merge pull request #4866 from 9rnsr/fix14459