Bug 21038 – wchar and dchar string alignment should be 2 and 4, respectively

Status
RESOLVED
Resolution
FIXED
Severity
major
Priority
P1
Component
dmd
Product
D
Version
D2
Platform
x86_64
OS
Linux
Creation time
2020-07-11T17:15:35Z
Last change time
2020-08-13T14:34:28Z
Keywords
pull, wrong-code
Assigned to
No Owner
Creator
Tim

Comments

Comment #0 by tim.dlang — 2020-07-11T17:15:35Z
The result of wcslen can be too low when compiling with dmd 2.093. Consider the following two files:

//////////////////// testabcd.d ///////////////////
import core.stdc.stddef;
import core.stdc.stdio;
import core.stdc.wchar_;

const(wchar_t)* name = "abcd";

void test()
{
    size_t length = wcslen(name);
    printf("length: %zd\n", length);
    printf("data: \"");
    for(const(wchar_t)* s = name; *s; s++)
        printf("%c", *s);
    printf("\"\n");
}
///////////////////////////////////////////////////

//////////////////// testxyzw.d ///////////////////
import testabcd;

void main()
{
    test();
}
///////////////////////////////////////////////////

Running it results in the following output:

length: 3
data: "abcd"

The correct length would be 4. When compiling with ldc it works as expected. The filenames are important. When testxyzw.d is renamed to testx.d it produces the expected output.
Comment #1 by ag0aep6g — 2020-07-11T20:12:20Z
The string data is getting misaligned. wcslen assumes properly aligned data. testabcd.d can be reduced to this:

----
alias wchar_t = dchar;
const(wchar_t)* name = "abcd";
void test()
{
    assert((cast(size_t) name) % wchar_t.sizeof == 0); /* Fails. Should pass. */
}
----
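The reduced check above can be generalized to both wide character types. The following is only a minimal sketch: `isAligned` is a hypothetical helper introduced here for illustration (it is not part of druntime or of the original report), and whether each check passes depends on the compiler version and on what else ends up in the read-only data section.

```d
import core.stdc.stdio : printf;

// Hypothetical helper: true when the pointer value is a multiple of the
// element size, which is the property the reduced test case asserts.
bool isAligned(T)(const(T)* p)
{
    return (cast(size_t) p) % T.sizeof == 0;
}

// One literal of each wide character type.
const(wchar)* w = "xz";
const(dchar)* d = "abcd";

void main()
{
    // Prints 1 when the literal's data is aligned to its element size.
    printf("wchar literal aligned to %zu: %d\n", wchar.sizeof, isAligned(w) ? 1 : 0);
    printf("dchar literal aligned to %zu: %d\n", dchar.sizeof, isAligned(d) ? 1 : 0);
}
```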
Comment #2 by bugzilla — 2020-08-07T02:39:44Z
For the program:

alias wchar_t = dchar;
const(wchar_t)* x = "xz";
const(wchar_t)* name = "abcd";
void test()
{
    assert((cast(size_t) name) % wchar_t.sizeof == 0); /* Fails. Should pass. */
}

the output generated is:

Section 6 .rodata PROGBITS,ALLOC,SIZE=0x0030(48),OFFSET=0x0040,ALIGN=16
0040: 78 0 0 0 7a 0 0 0 0 0 0 0 61 0 0 0 x...z.......a...
0050: 62 0 0 0 63 0 0 0 64 0 0 0 0 0 0 0 b...c...d.......
0060: 4 10 0 0 0 0 0 0 74 65 73 74 0 0 0 0 ........test....

It's a surprise to me that a 4 byte element array is supposed to be aligned to 8 bytes. I'm not seeing where this is a requirement?
Comment #3 by bugzilla — 2020-08-07T02:43:55Z
Oh, I see now. For:

alias wchar_t = dchar;
const(wchar)* x = "xz";
const(wchar_t)* name = "abcd";
void test()
{
    assert((cast(size_t) name) % wchar_t.sizeof == 0); /* Fails. Should pass. */
}

the result is:

Section 6 .rodata PROGBITS,ALLOC,SIZE=0x0030(48),OFFSET=0x0040,ALIGN=16
0040: 78 0 7a 0 0 0 61 0 0 0 62 0 0 0 63 0 x.z...a...b...c.
0050: 0 0 64 0 0 0 0 0 0 0 0 0 0 0 0 0 ..d.............
0060: 4 10 0 0 0 0 0 0 74 65 73 74 0 0 0 0 ........test....

which is wrongly aligned on a 2 byte boundary.
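The offsets in the dump can be reproduced with a little arithmetic. The sketch below is illustrative only: `literalSize` and `alignUp` are hypothetical helpers, and the round-up shown is the usual power-of-two technique, not necessarily what the compiler fix does. It assumes the section starts at offset 0x40, as in the dump, and that each literal is followed by one zero terminator element.

```d
import core.stdc.stdio : printf;

// Hypothetical helper: bytes occupied by a zero-terminated literal of
// `length` elements, each `elemSize` bytes wide.
size_t literalSize(size_t length, size_t elemSize)
{
    return (length + 1) * elemSize;
}

// Hypothetical helper: round `offset` up to the next multiple of
// `alignment`, which must be a power of two.
size_t alignUp(size_t offset, size_t alignment)
{
    return (offset + alignment - 1) & ~(alignment - 1);
}

void main()
{
    enum size_t base = 0x40; // section offset taken from the dump

    // The wchar literal "xz" takes 6 bytes, so tightly packed data
    // places the following dchar literal at 0x46.
    size_t packed = base + literalSize(2, wchar.sizeof);
    printf("packed:  0x%zx, mod 4 = %zu\n", packed, packed % dchar.sizeof);

    // Rounding up to dchar.sizeof gives a 4-byte aligned start.
    size_t padded = alignUp(packed, dchar.sizeof);
    printf("aligned: 0x%zx, mod 4 = %zu\n", padded, padded % dchar.sizeof);
}
```

With a dchar "xz", as in comment #2, the first literal takes 12 bytes, so the second one starts at 0x4c, which is already a multiple of 4. That is why the problem only shows up when a narrower literal precedes the dchar data.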
Comment #4 by dlang-bot — 2020-08-07T03:58:35Z
@WalterBright created dlang/dmd pull request #11528 "fix Issue 21038 - wchar and dchar string alignment should be 2 and 4,…" fixing this issue:

- fix Issue 21038 - wchar and dchar string alignment should be 2 and 4, respectively

https://github.com/dlang/dmd/pull/11528
Comment #5 by dlang-bot — 2020-08-13T14:34:28Z
dlang/dmd pull request #11528 "fix Issue 21038 - wchar and dchar string alignment should be 2 and 4,…" was merged into master:

- 46994f578813b365050ce19ed0e0bcc132e7555b by Walter Bright:
fix Issue 21038 - wchar and dchar string alignment should be 2 and 4, respectively

https://github.com/dlang/dmd/pull/11528