Bug 20148 – void initializated bool can be both true and false

Status
RESOLVED
Resolution
FIXED
Severity
normal
Priority
P3
Component
dmd
Product
D
Version
D2
Platform
All
OS
All
Creation time
2019-08-20T22:21:12Z
Last change time
2024-06-05T11:18:39Z
Keywords
backend, pull, safe
Assigned to
No Owner
Creator
ag0aep6g
See also
https://issues.dlang.org/show_bug.cgi?id=24582

Comments

Comment #0 by ag0aep6g — 2019-08-20T22:21:12Z
This is a spin-off from issue 19968.

This program can exhibit undefined behavior even though `main` is @safe and `f` is correctly @trusted:

----
void main() @safe
{
    bool b = void;
    f(b);
}

void f(bool cond) @trusted
{
    import core.stdc.stdlib: free, malloc;
    byte b;
    void* p = cond ? &b : malloc(1);
    if(!cond) free(p);
}
----

Typical output:

----
munmap_chunk(): invalid pointer
Error: program killed by signal 6
----

That means `free` is being called on `&b`. That operation has undefined behavior. But that can only happen if `cond` is both true and false at the same time.

Surely, an @trusted function should be allowed to assume that a bool is either true or false, and not both.
Comment #1 by simen.kjaras — 2019-08-21T07:58:51Z
So instead of closing the obvious hole of @safe functions using void initialization we're just poking at symptoms here and there?
Comment #2 by hsteoh — 2020-06-05T23:31:09Z
There's more to it than a hole in @safe. Look at the disassembly below, there seems to be a codegen bug as well:

-------------------
000000000003f698 <_Dmain>:
   3f698: 55                   push   %rbp
   3f699: 48 8b ec             mov    %rsp,%rbp
   3f69c: 48 83 ec 10          sub    $0x10,%rsp
   3f6a0: 40 8a 7d f8          mov    -0x8(%rbp),%dil
   3f6a4: e8 07 00 00 00       callq  3f6b0 <@trusted void test.f(bool)>
   3f6a9: 31 c0                xor    %eax,%eax
   3f6ab: c9                   leaveq
   3f6ac: c3                   retq
   3f6ad: 00 00                add    %al,(%rax)
        ...

000000000003f6b0 <@trusted void test.f(bool)>:
   3f6b0: 55                   push   %rbp
   3f6b1: 48 8b ec             mov    %rsp,%rbp
   3f6b4: 48 83 ec 20          sub    $0x20,%rsp
   3f6b8: 89 7d f8             mov    %edi,-0x8(%rbp)
   3f6bb: c6 45 e8 00          movb   $0x0,-0x18(%rbp)
   3f6bf: 40 80 7d f8 00       rex cmpb $0x0,-0x8(%rbp)
   3f6c4: 74 06                je     3f6cc <@trusted void test.f(bool)+0x1c>
   3f6c6: 48 8d 45 e8          lea    -0x18(%rbp),%rax
   3f6ca: eb 0a                jmp    3f6d6 <@trusted void test.f(bool)+0x26>
   3f6cc: bf 01 00 00 00       mov    $0x1,%edi
   3f6d1: e8 9a fc ff ff       callq  3f370 <malloc@plt>
   3f6d6: 48 89 45 f0          mov    %rax,-0x10(%rbp)
   3f6da: 8a 4d f8             mov    -0x8(%rbp),%cl
   3f6dd: 80 f1 01             xor    $0x1,%cl
   3f6e0: 74 09                je     3f6eb <@trusted void test.f(bool)+0x3b>
   3f6e2: 48 8b 7d f0          mov    -0x10(%rbp),%rdi
   3f6e6: e8 65 f9 ff ff       callq  3f050 <free@plt>
   3f6eb: c9                   leaveq
   3f6ec: c3                   retq
---------------

In main(), the value of -0x8(%rbp), apparently where main.b is stored, is loaded into the lower register %dil. But in f(), the value of the entire register %edi is stored in a local variable (coincidentally -0x8(%rbp), but points to a different place because this is now the local scope of the callee). Then a few instructions down this local variable is tested for having all 0's in its value: even though only the lower part of the register was actually loaded in main!

Then after the if-statement, the (lower byte of the) local variable -0x8(%rbp) is loaded into %cl and compared against a literal 1.

Even though technically this codegen works if b is either 0 or 1, it seems inconsistent at best (why compare the entire 32-bit value to 0 when checking for false, but only the lower byte when checking for true?), and in this case outright wrong when b is uninitialized and therefore can have any random garbage value other than 0 or 1.
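
For illustration, here is a minimal sketch of the two tests in f(), modeled on a plain `ubyte` so that the program itself has fully defined behavior. The helper names `takesTrueBranch` and `takesNotBranch` are hypothetical; they are assumed to mirror the `cmpb $0x0` / `je` pair and the `xor $0x1,%cl` / `je` pair from the listing, and are not taken from the compiler.

```d
import std.stdio : writefln;

// Models `cmpb $0x0,-0x8(%rbp); je ...`:
// the `cond ? &b : malloc(1)` expression picks `&b` whenever the byte is non-zero.
bool takesTrueBranch(ubyte raw) @safe pure nothrow @nogc
{
    return raw != 0;
}

// Models `xor $0x1,%cl; je ...`:
// the body of `if (!cond)` runs whenever the byte is anything other than exactly 1.
bool takesNotBranch(ubyte raw) @safe pure nothrow @nogc
{
    return (raw ^ 1) != 0;
}

void main()
{
    // 0 and 1 are the valid bool representations; 2 stands in for garbage (assumption).
    ubyte[3] values = [0, 1, 2];
    foreach (raw; values)
        writefln("raw=%d  cond-branch=%s  !cond-branch=%s",
                 raw, takesTrueBranch(raw), takesNotBranch(raw));
}
```

For 0 and 1 exactly one helper returns true. For 2 both return true, which corresponds to `free` being called on `&b`.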
Comment #3 by hsteoh — 2020-06-05T23:44:42Z
Actually, as far as this bug is concerned, @safe is a red herring, and so is void initialization. Proof:

---------
bool schrodingersCat() @safe
{
    union U { bool b; int i; }
    U u;
    u.i = 2;
    return u.b;
}

void main() @safe
{
    import std.stdio;
    bool b = schrodingersCat();
    if (b) writeln("alive");
    if (!b) writeln("dead");
}
---------

Output:

---------
alive
dead
---------

Apparently, D semantics exhibit quantum mechanical effects!
Comment #4 by Patrick.Schluter — 2020-06-06T09:21:34Z
(In reply to hsteoh from comment #2)
> There's more to it than a hole in @safe. Look at the disassembly below,
> there seems to be a codegen bug as well:
>
> -------------------
> 000000000003f698 <_Dmain>:
> 3f698: 55 push %rbp
> 3f699: 48 8b ec mov %rsp,%rbp
> 3f69c: 48 83 ec 10 sub $0x10,%rsp
> 3f6a0: 40 8a 7d f8 mov -0x8(%rbp),%dil

The bug is here, and only in dmd! gdc and ldc use movzx to load the EDI register, not mov. When b is initialized the error doesn't manifest, because dmd loads EDI from the EAX register that it had used to zero the byte.

That said, the example doesn't compile with the -O option. It then fails with:
<source>(4): Error: variable b used before set

> Even though technically this codegen works if b is either 0 or 1, it seems
> inconsistent at best (why compare the entire 32-bit value to 0 when checking
> for false, but only the lower byte when checking for true?), and in this
> case outright wrong when b is uninitialized and therefore can have any
> random garbage value other than 0 or 1.

This is the C integer promotion rule: bool is really just an integral type with 2 values instead of being a real special thing (see Java for the drawbacks of that).
Comment #5 by dlang-bot — 2023-06-28T10:10:41Z
@dkorpel created dlang/dmd pull request #15362 "Fix 20148 - void initializated bool can be both true and false" fixing this issue:

- Fix 20148 - void initializated bool can be both true and false

https://github.com/dlang/dmd/pull/15362
Comment #6 by bugzilla — 2023-07-21T00:44:48Z
(In reply to hsteoh from comment #2)
> There's more to it than a hole in @safe. Look at the disassembly below,
> there seems to be a codegen bug as well:
>
> -------------------
> 000000000003f698 <_Dmain>:
> 3f698: 55 push %rbp
> 3f699: 48 8b ec mov %rsp,%rbp
> 3f69c: 48 83 ec 10 sub $0x10,%rsp
> 3f6a0: 40 8a 7d f8 mov -0x8(%rbp),%dil
> 3f6a4: e8 07 00 00 00 callq 3f6b0 <@trusted void
> test.f(bool)>
> 3f6a9: 31 c0 xor %eax,%eax
> 3f6ab: c9 leaveq
> 3f6ac: c3 retq
> 3f6ad: 00 00 add %al,(%rax)
> ...
>
> 000000000003f6b0 <@trusted void test.f(bool)>:
> 3f6b0: 55 push %rbp
> 3f6b1: 48 8b ec mov %rsp,%rbp
> 3f6b4: 48 83 ec 20 sub $0x20,%rsp
> 3f6b8: 89 7d f8 mov %edi,-0x8(%rbp)
> 3f6bb: c6 45 e8 00 movb $0x0,-0x18(%rbp)
> 3f6bf: 40 80 7d f8 00 rex cmpb $0x0,-0x8(%rbp)
> 3f6c4: 74 06 je 3f6cc <@trusted void
> test.f(bool)+0x1c>
> 3f6c6: 48 8d 45 e8 lea -0x18(%rbp),%rax
> 3f6ca: eb 0a jmp 3f6d6 <@trusted void
> test.f(bool)+0x26>
> 3f6cc: bf 01 00 00 00 mov $0x1,%edi
> 3f6d1: e8 9a fc ff ff callq 3f370 <malloc@plt>
> 3f6d6: 48 89 45 f0 mov %rax,-0x10(%rbp)
> 3f6da: 8a 4d f8 mov -0x8(%rbp),%cl
> 3f6dd: 80 f1 01 xor $0x1,%cl
> 3f6e0: 74 09 je 3f6eb <@trusted void
> test.f(bool)+0x3b>
> 3f6e2: 48 8b 7d f0 mov -0x10(%rbp),%rdi
> 3f6e6: e8 65 f9 ff ff callq 3f050 <free@plt>
> 3f6eb: c9 leaveq
> 3f6ec: c3 retq
> ---------------
>
> In main(), the value of -0x8(%rbp), apparently where main.b is stored, is
> loaded into the lower register %dil. But in f(), the value of the entire
> register %edi is stored in a local variable (coincidentally -0x8(%rbp), but
> points to a different place because this is now the local scope of the
> callee). Then a few instructions down this local variable is tested for
> having all 0's in its value: even though only the lower part of the register
> was actually loaded in main!
>
> Then after the if-statement, the (lower byte of the) local variable
> -0x8(%rbp) is loaded into %cl and compared against a literal 1.
>
> Even though technically this codegen works if b is either 0 or 1, it seems
> inconsistent at best (why compare the entire 32-bit value to 0 when checking
> for false, but only the lower byte when checking for true?), and in this
> case outright wrong when b is uninitialized and therefore can have any
> random garbage value other than 0 or 1.

The code gen looks correct to me. The cmp is a byte compare instruction which only looks at the least significant byte, where the bool was stored.
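
To illustrate the byte-compare point, here is a rough sketch on plain integers of a 32-bit register whose upper bits hold stale data. `loadLowByte`, `loadZeroExtended` and `byteCompareNonZero` are hypothetical names assumed to stand in for `mov ...,%dil`, `movzx` and `cmpb $0x0`; the stale value 0xDEADBEEF is an arbitrary assumption.

```d
import std.stdio : writefln;

// Models `mov mem,%dil`: only the low 8 bits of the register are replaced.
uint loadLowByte(uint reg, ubyte value) @safe pure nothrow @nogc
{
    return (reg & 0xFFFF_FF00) | value;
}

// Models `movzx mem,%edi`: the byte is zero-extended into the whole register.
uint loadZeroExtended(ubyte value) @safe pure nothrow @nogc
{
    return value;
}

// Models `cmpb $0x0,mem` on the spilled register: only the low byte is inspected.
bool byteCompareNonZero(uint spilled) @safe pure nothrow @nogc
{
    return (spilled & 0xFF) != 0;
}

void main()
{
    enum uint stale = 0xDEAD_BEEF; // arbitrary leftover register contents (assumption)
    ubyte[3] values = [0, 1, 2];
    foreach (value; values)
    {
        const viaMov = loadLowByte(stale, value);
        const viaMovzx = loadZeroExtended(value);
        writefln("value=%d  mov=0x%08X  movzx=0x%08X  same byte test: %s",
                 value, viaMov, viaMovzx,
                 byteCompareNonZero(viaMov) == byteCompareNonZero(viaMovzx));
    }
}
```

In this model the byte test gives the same answer for both load styles, so the stale upper bits make no difference. What matters is a low byte that is neither 0 nor 1.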
Comment #7 by dkorpel — 2024-04-01T16:20:48Z
Comment #8 by hsteoh — 2024-04-01T17:24:30Z
The specified PR does not fully fix the problem. Proof:

--------
bool schrodingersCat() @safe
{
    union Box { bool b; ubyte y; }
    Box u;
    u.y = 2;
    return u.b;
}

void main() @safe
{
    import std.stdio;
    bool b = schrodingersCat();
    if (b) writeln("alive");
    if (!b) writeln("dead");
}
--------

Output:

--------
alive
dead
--------
Comment #9 by dkorpel — 2024-04-01T21:34:40Z
That has nothing to do with void initialization though, so I filed it as issue 24477.