Bug 8754 – Function commonPrefix returns invalid string when passing two cyrillic utf-8 strings
Status
RESOLVED
Resolution
DUPLICATE
Severity
normal
Priority
P2
Component
phobos
Product
D
Version
D2
Platform
All
OS
All
Creation time
2012-10-04T07:53:00Z
Last change time
2013-01-07T02:14:00Z
Assigned to
nobody
Creator
lxyd.dlang
Comments
Comment #0 by lxyd.dlang — 2012-10-04T07:53:02Z
Run this demo:
--------
import std.algorithm, std.stdio, std.encoding;
void main() {
    // The Cyrillic letters 'б' and 'в' each take two bytes in UTF-8; only the first byte is shared
    auto p = commonPrefix("б", "в");
    writeln(p.length); // 1 code unit; should be 0
    assert(isValid(p)); // fails: incomplete code point
}
--------
I'm just learning D, so I'm not sure this is a real bug, but commonPrefix seems to be designed to treat strings in a special way, and that way seems to be wrong for strings :)
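To see why the result gets cut in the middle of a character, it may help to look at the UTF-8 code units directly. The sketch below is only an illustration: it assumes the source file is saved as UTF-8, and `codeUnitPrefixLength` is a hypothetical helper written for this comment, not part of Phobos and not necessarily how commonPrefix is implemented.

```d
import std.utf : validate, UTFException;

// Hypothetical helper: length of the common prefix measured in raw
// code units (bytes for a UTF-8 string), with no decoding at all.
size_t codeUnitPrefixLength(string a, string b)
{
    size_t i = 0;
    while (i < a.length && i < b.length && a[i] == b[i])
        ++i;
    return i;
}

void main()
{
    string a = "б"; // U+0431, UTF-8 bytes D0 B1
    string b = "в"; // U+0432, UTF-8 bytes D0 B2

    // Both letters start with the same lead byte and differ in the second one.
    assert(a.length == 2 && b.length == 2);
    assert(a[0] == 0xD0 && a[1] == 0xB1);
    assert(b[0] == 0xD0 && b[1] == 0xB2);

    // Comparing code units therefore reports one unit in common...
    auto n = codeUnitPrefixLength(a, b);
    assert(n == 1);

    // ...and slicing at that index leaves a lone lead byte, which is not valid UTF-8.
    bool rejected = false;
    try
    {
        validate(a[0 .. n]);
    }
    catch (UTFException e)
    {
        rejected = true;
    }
    assert(rejected);
}
```

A prefix computed this way is only safe if it is cut on a code point boundary, which is what comparing decoded characters achieves.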
Let me suggest this separate implementation of commonPrefix for strings (I tried to mimic the original code):
--------
import std.functional, std.traits, std.range;
auto commonPrefix(alias pred = "a == b", R1, R2)(R1 r1, R2 r2)
    if (isSomeString!R1 && isSomeString!R2) {
    auto result = r1.save;
    // front and popFront decode whole code points, so r1 always stops on a character boundary
    for (; !r1.empty && !r2.empty && binaryFun!pred(r1.front, r2.front);
           r1.popFront(), r2.popFront()) {}
    return result[0 .. $ - r1.length];
}
--------
Once again, I'm just learning D and I'm not sure whether this code is fully correct, but it seems to work fine with strings. I'm also not sure whether this separate implementation should be marked @trusted and pure.
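A quick way to exercise the suggested approach is sketched below. It repeats the same loop under the hypothetical name `commonPrefixStr` (chosen here only to avoid clashing with std.algorithm's commonPrefix) and checks a few inputs, assuming both arguments are plain `string`s and the source file is UTF-8. It is a minimal check, not a complete test suite.

```d
import std.functional : binaryFun;
import std.range.primitives : empty, front, popFront, save;
import std.traits : isSomeString;
import std.utf : validate;

// Hypothetical name; same logic as the suggestion above.
auto commonPrefixStr(alias pred = "a == b", R1, R2)(R1 r1, R2 r2)
    if (isSomeString!R1 && isSomeString!R2)
{
    auto result = r1.save;
    // front yields a decoded code point, popFront skips the whole encoded sequence
    while (!r1.empty && !r2.empty && binaryFun!pred(r1.front, r2.front))
    {
        r1.popFront();
        r2.popFront();
    }
    // r1 now holds the unmatched tail, so drop that many code units from the end
    return result[0 .. $ - r1.length];
}

void main()
{
    // The case from this report: no whole character is shared.
    auto p = commonPrefixStr("б", "в");
    assert(p.length == 0);
    validate(p); // throws UTFException if p were not valid UTF-8

    // Three shared Cyrillic letters, two code units each.
    auto q = commonPrefixStr("привет", "прибор");
    assert(q == "при");
    assert(q.length == 6);
    validate(q);

    // ASCII input behaves as before.
    assert(commonPrefixStr("hello", "help") == "hel");
}
```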
BTW, the documentation has a mistake too:
"The type of the result is the same as $(D takeExactly(r1, n))".
But takeExactly always returns takeExactly.Result, whereas commonPrefix can return a slice.
Comment #1 by issues.dlang — 2013-01-07T02:14:00Z
*** This issue has been marked as a duplicate of issue 8890 ***