Bug 11508 – [REG 2.064] Wrong code with -O on x86_64 for char comparisons
Status: RESOLVED
Resolution: FIXED
Severity: regression
Priority: P2
Component: dmd
Product: D
Version: D2
Platform: x86_64
OS: All
Creation time: 2013-11-12T20:39:00Z
Last change time: 2013-11-16T17:53:56Z
Keywords: pull, wrong-code
Assigned to: nobody
Creator: dlang-bugzilla
Comments
Comment #0 by dlang-bugzilla — 2013-11-12T20:39:58Z
bool isWordChar(char c)
{
    return c=='_' || c=='-' || c=='+' || c=='.';
}

void main()
{
    assert(isWordChar('_'));
}
Running "dmd -O -m64 -run test.d" trips the assert above.
This happens with 2.064.2 and git HEAD; it does not happen with 2.063.
Comment #1 by tchajed+d — 2013-11-15T19:20:32Z
I wasn't able to isolate the issue in the compiler, but the problem appears to start at https://github.com/D-Programming-Language/dmd/commit/fd999804a0d79fcbbfac39191d4e8f4ba7872467, where it looks like an optimization for sequences of OR operations was added that does something strange to this code. Just prior to this commit, the isWordChar function looked like this (in earlier versions there was no optimization, and the code was a straightforward sequence of cmp and je):
0000000000000050 <_D4dbug10isWordCharFaZb>:
  50:  55                    push   %rbp
  51:  48 8b ec              mov    %rsp,%rbp
  54:  48 83 ec 10           sub    $0x10,%rsp
  58:  89 7d f8              mov    %edi,-0x8(%rbp)
  5b:  0f b6 45 f8           movzbl -0x8(%rbp),%eax
  5f:  83 c0 d5              add    $0xffffffd5,%eax
  62:  83 f8 34              cmp    $0x34,%eax
  65:  77 0a                 ja     71 <_D4dbug10isWordCharFaZb+0x21>
  67:  b9 0d 00 00 00        mov    $0xd,%ecx
  6c:  0f a3 c1              bt     %eax,%ecx
  6f:  72 04                 jb     75 <_D4dbug10isWordCharFaZb+0x25>
  71:  31 c0                 xor    %eax,%eax
  73:  eb 05                 jmp    7a <_D4dbug10isWordCharFaZb+0x2a>
  75:  b8 01 00 00 00        mov    $0x1,%eax
  7a:  48 8b e5              mov    %rbp,%rsp
  7d:  5d                    pop    %rbp
  7e:  c3                    retq
After the commit it looks like this (but only on x86_64):
  70:  55                    push   %rbp
  71:  48 8b ec              mov    %rsp,%rbp
  74:  48 83 ec 10           sub    $0x10,%rsp
  78:  89 7d f8              mov    %edi,-0x8(%rbp)
  7b:  0f b6 45 f8           movzbl -0x8(%rbp),%eax
  7f:  83 c0 d5              add    $0xffffffd5,%eax
  82:  83 f8 34              cmp    $0x34,%eax
  85:  77 0f                 ja     96 <_D4dbug10isWordCharFaZb+0x26>
  87:  48 b9 0d 00 00 00 00  movabs $0x1000000000000d,%rcx
  8e:  00 10 00
  91:  0f a3 c1              bt     %eax,%ecx
  94:  72 04                 jb     9a <_D4dbug10isWordCharFaZb+0x2a>
  96:  31 c0                 xor    %eax,%eax
  98:  eb 05                 jmp    9f <_D4dbug10isWordCharFaZb+0x2f>
  9a:  b8 01 00 00 00        mov    $0x1,%eax
  9f:  48 8b e5              mov    %rbp,%rsp
  a2:  5d                    pop    %rbp
  a3:  c3                    retq
I don't understand the codebase well enough to see how to fix the issue, but hope this helps!
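One possible reading of the second listing, offered as a sketch rather than a confirmed diagnosis: the four characters are rebased to '+' (0x2B), so '_' lands at bit 52 of the mask loaded by movabs. The following bt, however, uses the 32-bit registers %eax and %ecx, and the register form of bt takes its bit offset modulo the operand size. Bit 52 would therefore be tested as bit 20 of 0xd, which is clear. The D model below is illustrative only. The names base, mask, bt32, bt64, isWordCharAsEmitted, and isWordCharWide are hypothetical and do not come from the dmd sources.

```d
// Hypothetical model of the range-and-bitmask test in the listing above.
// None of these names exist in dmd; they only mirror the emitted instructions.

enum char base = '+'; // smallest of the four characters (0x2B)

// One bit per accepted character, indexed by (c - base).
enum ulong mask =
    (1UL << ('+' - base)) | (1UL << ('-' - base)) |
    (1UL << ('.' - base)) | (1UL << ('_' - base));

// Same constant as in: movabs $0x1000000000000d,%rcx
static assert(mask == 0x1000000000000dUL);

// Models `bt %eax,%ecx`: 32-bit operands, bit offset taken modulo 32.
bool bt32(uint bits, uint offset)
{
    return ((bits >> (offset & 31)) & 1) != 0;
}

// Models a 64-bit `bt %rax,%rcx`: bit offset taken modulo 64.
bool bt64(ulong bits, ulong offset)
{
    return ((bits >> (offset & 63)) & 1) != 0;
}

// Follows the second listing: rebase, range check, then a 32-bit bit test.
bool isWordCharAsEmitted(char c)
{
    immutable uint idx = cast(uint) c - base; // add $0xffffffd5,%eax
    if (idx > 0x34)                           // cmp $0x34,%eax ; ja
        return false;
    return bt32(cast(uint) mask, idx);        // only the low half of rcx is seen
}

// Same steps, assuming the bit test were done on the full 64-bit mask.
bool isWordCharWide(char c)
{
    immutable uint idx = cast(uint) c - base;
    if (idx > 0x34)
        return false;
    return bt64(mask, idx);
}

unittest
{
    // '_' - '+' == 52; modulo 32 that is bit 20, which is clear in 0xd.
    assert(!isWordCharAsEmitted('_'));
    assert(isWordCharWide('_'));

    // The other three characters sit in bits 0, 2 and 3 and pass either way.
    foreach (c; "+-.")
    {
        assert(isWordCharAsEmitted(c));
        assert(isWordCharWide(c));
    }
}
```

The model can be checked with "dmd -unittest -main -run model.d" (the file name is arbitrary). If this reading is right, only '_' is affected, which matches the failing assert in the original test case.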