Comment #0 by dlang-bugzilla — 2011-03-07T17:18:53Z
void main()
{
    string s, s2;
    s = "Привет";
    foreach(c; s)
        s2 ~= c;
    assert(s == s2);
}
When appending, DMD now seems to treat each individual char as a whole code point (as if it were automatically promoted to dchar).
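To make the symptom concrete, here is a minimal C++ sketch of the difference between appending a UTF-8 code unit unchanged and appending it after promotion to a code point. It is an illustration only, not DMD or runtime code: the helper names appendCodePoint, appendAsCodeUnits and appendAsCodePoints are hypothetical, and the encoder does not reject surrogate values.

```cpp
#include <cassert>
#include <string>

// Encode one Unicode code point as UTF-8 and append it to `out`.
// Hypothetical helper for illustration; surrogates are not rejected.
void appendCodePoint(std::string& out, char32_t cp)
{
    assert(cp <= 0x10FFFF);
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Expected behaviour: every code unit is copied through unchanged.
std::string appendAsCodeUnits(const std::string& s)
{
    std::string out;
    for (char c : s)
        out += c;
    return out;
}

// Behaviour resembling the report: every code unit is treated as a
// code point in its own right and re-encoded before being appended.
std::string appendAsCodePoints(const std::string& s)
{
    std::string out;
    for (unsigned char c : s)
        appendCodePoint(out, c);
    return out;
}
```

With the 12-byte UTF-8 encoding of "Привет" as input, appendAsCodeUnits returns the input unchanged, while appendAsCodePoints returns 24 bytes, because every byte of that string is 0x80 or above and is re-encoded as two bytes. Pure ASCII input comes out identical either way, so an ASCII-only test would not expose the problem.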
Comment #1 by sohgo — 2011-03-09T04:35:00Z
The same problem also happens on FreeBSD 8.2 with DMD 1.067.
But the problem does not happen with DMD 1.066.
Comment #2 by clugdbug — 2011-03-10T01:14:55Z
I think this is a foreach problem.
Probably triggered by the fix to bug 4389.
Comment #3 by dlang-bugzilla — 2011-03-10T01:17:37Z
It doesn't look like a foreach problem. This fails too:
void main()
{
    string s, s2;
    s = "Привет";
    for (int i=0; i<s.length; i++)
        s2 ~= s[i];
    assert(s == s2);
}
Comment #4 by clugdbug — 2011-03-10T04:26:27Z
(In reply to comment #3)
> It doesn't look like a foreach problem. This fails too:
Hmm. You're right. And yet it works fine on D2.
It's inserting a call to _d_arrayappendcd, which means the append has been changed into char[] ~ dchar.
Comment #5 by clugdbug — 2011-03-10T07:13:37Z
It was indeed caused by the fix to bug 4389, which wasn't tight enough.
s ~= c shouldn't turn c into a dchar if c already has the element type of s (i.e., char[] ~= char should go through unaltered). That leaves wchar[] ~ char, which I think is inevitably a mess if c is outside the ASCII range.
expression.c, line 8593. CatAssignExp::semantic()
    {   // Append array
        e2 = e2->castTo(sc, e1->type);
        type = e1->type;
        e = this;
    }
    else if (tb1->ty == Tarray &&
        (tb1next->ty == Tchar || tb1next->ty == Twchar) &&
+       e2->type->ty != tb1next->ty &&
        e2->implicitConvTo(Type::tdchar)
       )
    {   // Append dchar to char[] or wchar[]
        e2 = e2->castTo(sc, Type::tdchar);
        type = e1->type;
        e = this;
        /* Do not allow appending wchar to char[] because if wchar happens
         * to be a surrogate pair, nothing good can result.
         */
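The effect of the added condition can be sketched as a small standalone C++ model. It is a simplification and not the DMD code: CharTy, convertsToDchar, promotesBeforePatch, promotesAfterPatch and checkModel are hypothetical names, only the three character types are modelled, all three are assumed to convert implicitly to dchar, and the other branches of CatAssignExp::semantic() are ignored.

```cpp
#include <cassert>

// Simplified stand-in for the character type kinds (hypothetical).
enum class CharTy { Char, WChar, DChar };

// Assumption: char, wchar and dchar all convert implicitly to dchar.
bool convertsToDchar(CharTy)
{
    return true;
}

// Condition without the added line: any convertible right-hand side
// appended to char[] or wchar[] is promoted to dchar.
bool promotesBeforePatch(CharTy elem, CharTy rhs)
{
    return (elem == CharTy::Char || elem == CharTy::WChar) &&
           convertsToDchar(rhs);
}

// Condition with the added line: no promotion when the right-hand side
// already has the array's element type.
bool promotesAfterPatch(CharTy elem, CharTy rhs)
{
    return (elem == CharTy::Char || elem == CharTy::WChar) &&
           rhs != elem &&
           convertsToDchar(rhs);
}

// Usage example: the cases discussed in this thread.
void checkModel()
{
    // char[] ~= char: promoted before the patch, left alone after it.
    assert(promotesBeforePatch(CharTy::Char, CharTy::Char));
    assert(!promotesAfterPatch(CharTy::Char, CharTy::Char));
    // char[] ~= dchar: still takes the dchar path.
    assert(promotesAfterPatch(CharTy::Char, CharTy::DChar));
    // wchar[] ~= char: still promoted, the case described as a mess
    // for values outside the ASCII range.
    assert(promotesAfterPatch(CharTy::WChar, CharTy::Char));
}
```

In this model the only cases that change are the ones where the appended value already has the array's element type, which matches the failing test cases above.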
Comment #6 by sohgo — 2011-03-10T19:27:38Z
(In reply to comment #5)
I've tried Don's patch, and it works well in my environment.
That's great.
Thank you.