Bug 203 – std.format.doFormat() pads width incorrectly on Unicode strings
Status: RESOLVED
Resolution: FIXED
Severity: normal
Priority: P2
Component: phobos
Product: D
Version: D1 (retired)
Platform: x86
OS: All
Creation time: 2006-06-17T12:13:00Z
Last change time: 2014-02-15T13:28:58Z
Keywords: spec
Assigned to: bugzilla
Creator: matti.niemenmaa+dbugzilla
Comments
Comment #0 by matti.niemenmaa+dbugzilla — 2006-06-17T12:13:48Z
import std.string;

void main() {
    assert(format("%8s", "foo") == "     foo");
    assert(format("%8s", "foobar") == "  foobar");
    assert(format("%8s", "hello") == "   hello");
    assert(format("%8s", "h\u00e9ll\u00f4") == "   h\u00e9ll\u00f4");
    // this passes, though it shouldn't:
    // assert(format("%8s", "h\u00e9ll\u00f4") == " h\u00e9ll\u00f4");
}
--
In the above, the last assertion fails.
One would expect the last two strings, having five characters each, to both be padded in the front by three spaces: however, it appears the byte count is being used for determining the length and not the actual character count, and so the last string is padded by only one space.
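The two counts behind this can be checked directly. The following is only a minimal sketch: it assumes a D2 compiler and Phobos' std.range.walkLength, neither of which is what the report was filed against, and it does not call format() at all.

```d
import std.range : walkLength;
import std.stdio : writeln;

void main()
{
    string s = "h\u00e9ll\u00f4";

    // .length on a string counts UTF-8 code units (bytes):
    // 3 ASCII letters plus 2 bytes each for U+00E9 and U+00F4.
    assert(s.length == 7);

    // walkLength iterates the string by decoded code point.
    assert(s.walkLength == 5);

    // Padding needed to reach a field width of 8 under each count.
    writeln(8 - s.length);     // prints 1
    writeln(8 - s.walkLength); // prints 3
}
```

The one-space result matches the padding observed in the report, and the three-space result matches the padding expected.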
Comment #1 by thomas-dloop — 2007-04-29T02:09:33Z
> One would expect the last two strings, having five characters each,
> to both be padded in the front by three spaces: however, it appears
> the byte count is being used for determining the length and not the
> actual character count, and so the last string is padded by only one
> space.
The only relevant documentation I found is:
> Width
> Specifies the minimum field width. If the width is a *, the next
> argument, which must be of type int, is taken as the width. If
> the width is negative, it is as if the - was given as a Flags
> character.
"field width" could be interpreted as either "byte length" or
"UTF codepoint count".
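To make the two readings concrete, here is a rough sketch of both as padding helpers. The names padLeftBytes and padLeftCodePoints are hypothetical and not part of Phobos, and the code assumes D2 (std.array.replicate, std.range.walkLength); it is not how doFormat() is implemented.

```d
import std.array : replicate;
import std.range : walkLength;

// Hypothetical helper: treats "field width" as a byte (code unit) count.
string padLeftBytes(string s, size_t width)
{
    immutable n = s.length;
    return n >= width ? s : replicate(" ", width - n) ~ s;
}

// Hypothetical helper: treats "field width" as a code point count.
string padLeftCodePoints(string s, size_t width)
{
    immutable n = s.walkLength;
    return n >= width ? s : replicate(" ", width - n) ~ s;
}

void main()
{
    // For ASCII input the two readings agree.
    assert(padLeftBytes("hello", 8) == "   hello");
    assert(padLeftCodePoints("hello", 8) == "   hello");

    // For non-ASCII input they differ by the number of extra bytes.
    assert(padLeftBytes("h\u00e9ll\u00f4", 8) == " h\u00e9ll\u00f4");
    assert(padLeftCodePoints("h\u00e9ll\u00f4", 8) == "   h\u00e9ll\u00f4");
}
```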
Comment #2 by bugzilla — 2008-06-24T01:57:14Z
I suggest it's codepoint count, as field width is for display purposes.