Bug 21038 – wchar and dchar string alignment should be 2 and 4, respectively
Status: RESOLVED
Resolution: FIXED
Severity: major
Priority: P1
Component: dmd
Product: D
Version: D2
Platform: x86_64
OS: Linux
Creation time: 2020-07-11T17:15:35Z
Last change time: 2020-08-13T14:34:28Z
Keywords: pull, wrong-code
Assigned to: No Owner
Creator: Tim
Comments
Comment #0 by tim.dlang — 2020-07-11T17:15:35Z
The result of wcslen can be too low when compiling with dmd 2.093. Consider the following two files:
//////////////////// testabcd.d ///////////////////
import core.stdc.stddef;
import core.stdc.stdio;
import core.stdc.wchar_;
const(wchar_t)* name = "abcd";
void test()
{
    size_t length = wcslen(name);
    printf("length: %zd\n", length);
    printf("data: \"");
    for(const(wchar_t)* s = name; *s; s++)
        printf("%c", *s);
    printf("\"\n");
}
///////////////////////////////////////////////////
//////////////////// testxyzw.d ///////////////////
import testabcd;
void main()
{
    test();
}
///////////////////////////////////////////////////
Running it results in the following output:
length: 3
data: "abcd"
The correct length would be 4. When compiling with ldc it works as expected.
The filenames are important. When testxyzw.d is renamed to testx.d it produces the expected output.
Comment #1 by ag0aep6g — 2020-07-11T20:12:20Z
The string data is getting misaligned. wcslen assumes properly aligned data. testabcd.d can be reduced to this:
----
alias wchar_t = dchar;
const(wchar_t)* name = "abcd";
void test()
{
    assert((cast(size_t) name) % wchar_t.sizeof == 0); /* Fails. Should pass. */
}
----
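The same check can be generalized to both wide string types named in the issue title. This is only a minimal sketch: `isAligned` is a hypothetical helper introduced here for illustration, and whether the assertions actually fail on an affected compiler depends on what else is placed in the read-only data section ahead of the literals.

----
// Hypothetical helper (not part of druntime or Phobos): true if `p`
// is aligned to a multiple of the element size of T.
bool isAligned(T)(const(T)* p)
{
    return (cast(size_t) p) % T.sizeof == 0;
}

void main()
{
    const(wchar)* w = "xz";   // expected alignment: 2
    const(dchar)* d = "abcd"; // expected alignment: 4

    assert(isAligned(w));
    assert(isAligned(d));
}
----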
Comment #2 by bugzilla — 2020-08-07T02:39:44Z
For the program:
alias wchar_t = dchar;
const(wchar_t)* x = "xz";
const(wchar_t)* name = "abcd";
void test()
{
    assert((cast(size_t) name) % wchar_t.sizeof == 0); /* Fails. Should pass. */
}
the output generated is:
Section 6 .rodata PROGBITS,ALLOC,SIZE=0x0030(48),OFFSET=0x0040,ALIGN=16
0040: 78 0 0 0 7a 0 0 0 0 0 0 0 61 0 0 0 x...z.......a...
0050: 62 0 0 0 63 0 0 0 64 0 0 0 0 0 0 0 b...c...d.......
0060: 4 10 0 0 0 0 0 0 74 65 73 74 0 0 0 0 ........test....
It's a surprise to me that a 4 byte element array is supposed to be aligned to 8 bytes. I'm not seeing where this is a requirement?
Comment #3 by bugzilla — 2020-08-07T02:43:55Z
Oh, I see now. For:
alias wchar_t = dchar;
const(wchar)* x = "xz";
const(wchar_t)* name = "abcd";
void test()
{
    assert((cast(size_t) name) % wchar_t.sizeof == 0); /* Fails. Should pass. */
}
the result is:
Section 6 .rodata PROGBITS,ALLOC,SIZE=0x0030(48),OFFSET=0x0040,ALIGN=16
0040: 78 0 7a 0 0 0 61 0 0 0 62 0 0 0 63 0 x.z...a...b...c.
0050: 0 0 64 0 0 0 0 0 0 0 0 0 0 0 0 0 ..d.............
0060: 4 10 0 0 0 0 0 0 74 65 73 74 0 0 0 0 ........test....
which is wrongly aligned on a 2 byte boundary.
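The misalignment follows directly from the byte counts in the dump above: the wchar literal "xz" plus its terminator occupies 6 bytes starting at 0x40, so the dchar literal "abcd" begins at 0x46, which is 2 modulo 4. The sketch below reproduces that arithmetic. `alignUp` is a hypothetical helper written for this illustration, not dmd's code, and the actual change in the pull request may be implemented differently.

----
import core.stdc.stdio : printf;

// Hypothetical helper: round `offset` up to the next multiple of
// `alignment`, which is assumed to be a power of two.
size_t alignUp(size_t offset, size_t alignment)
{
    return (offset + alignment - 1) & ~(alignment - 1);
}

void main()
{
    enum size_t base = 0x40;                      // first data offset in the dump
    enum size_t xzBytes = (2 + 1) * wchar.sizeof; // "xz" plus terminator: 6 bytes
    enum size_t unpadded = base + xzBytes;        // 0x46, where "abcd" was placed

    static assert(unpadded % dchar.sizeof == 2);            // misaligned for dchar
    static assert(alignUp(unpadded, dchar.sizeof) == 0x48); // padded to 4 bytes

    printf("unpadded: 0x%zx, aligned: 0x%zx\n",
           unpadded, alignUp(unpadded, dchar.sizeof));
}
----

With two bytes of padding after "xz", the dchar data would start at 0x48 and the assertion in the test case would hold.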
Comment #4 by dlang-bot — 2020-08-07T03:58:35Z
@WalterBright created dlang/dmd pull request #11528 "fix Issue 21038 - wchar and dchar string alignment should be 2 and 4,…" fixing this issue:
- fix Issue 21038 - wchar and dchar string alignment should be 2 and 4, respectively
https://github.com/dlang/dmd/pull/11528
Comment #5 by dlang-bot — 2020-08-13T14:34:28Z
dlang/dmd pull request #11528 "fix Issue 21038 - wchar and dchar string alignment should be 2 and 4,…" was merged into master:
- 46994f578813b365050ce19ed0e0bcc132e7555b by Walter Bright:
fix Issue 21038 - wchar and dchar string alignment should be 2 and 4, respectively
https://github.com/dlang/dmd/pull/11528