Bug 203 – std.format.doFormat() pads width incorrectly on Unicode strings

Status
RESOLVED
Resolution
FIXED
Severity
normal
Priority
P2
Component
phobos
Product
D
Version
D1 (retired)
Platform
x86
OS
All
Creation time
2006-06-17T12:13:00Z
Last change time
2014-02-15T13:28:58Z
Keywords
spec
Assigned to
bugzilla
Creator
matti.niemenmaa+dbugzilla

Comments

Comment #0 by matti.niemenmaa+dbugzilla — 2006-06-17T12:13:48Z
import std.string;

void main()
{
    assert(format("%8s", "foo") == "     foo");
    assert(format("%8s", "foobar") == "  foobar");
    assert(format("%8s", "hello") == "   hello");
    assert(format("%8s", "h\u00e9ll\u00f4") == "   h\u00e9ll\u00f4");
    // this passes, though it shouldn't:
    assert(format("%8s", "h\u00e9ll\u00f4") == " h\u00e9ll\u00f4");
}

In the above, the last assertion fails. One would expect the last two strings, having five characters each, to both be padded in the front by three spaces. However, it appears that the byte count, not the actual character count, is being used to determine the length, and so the last string is padded by only one space.
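The mismatch described above can be checked directly. The following is a minimal sketch, assuming a current D2 compiler and Phobos rather than the D1 library the report was filed against: `.length` on a `string` gives the number of UTF-8 code units (bytes), while `walkLength` decodes the string and counts code points.

```d
import std.range : walkLength;

void main()
{
    string s = "h\u00e9ll\u00f4";

    // UTF-8 code units: 'h', 'l', 'l' take one byte each,
    // U+00E9 and U+00F4 take two bytes each.
    assert(s.length == 7);

    // Code points: what a reader would count as characters here.
    assert(s.walkLength == 5);

    // Padding to width 8 from the byte count leaves 8 - 7 = 1 space;
    // padding from the code point count leaves 8 - 5 = 3 spaces.
    assert(8 - s.length == 1);
    assert(8 - s.walkLength == 3);
}
```

The one-space result matches the padding observed in the report, which is consistent with the byte count being used.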
Comment #1 by thomas-dloop — 2007-04-29T02:09:33Z
> One would expect the last two strings, having five characters each,
> to both be padded in the front by three spaces: however, it appears
> the byte count is being used for determining the length and not the
> actual character count, and so the last string is padded by only one
> space.

The only relevant documentation I found is:

> Width
> Specifies the minimum field width. If the width is a *, the next
> argument, which must be of type int, is taken as the width. If
> the width is negative, it is as if the - was given as a Flags
> character.

"field width" could be interpreted as either "byte length" or "UTF codepoint count".
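For reference, the width rules quoted from the documentation can be exercised as follows. This is a small sketch that assumes a current D2 Phobos, where `format` lives in `std.format`, and it uses ASCII input only so that the byte count and the codepoint count agree.

```d
import std.format : format;

void main()
{
    // Fixed minimum field width: right-justified by default.
    assert(format("%8s", "foo") == "     foo");

    // '*' takes the width from the next argument, which must be an int.
    assert(format("%*s", 8, "foo") == "     foo");

    // A negative width behaves as if the '-' flag was given (left-justify).
    assert(format("%*s", -8, "foo") == "foo     ");
    assert(format("%-8s", "foo") == "foo     ");
}
```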
Comment #2 by bugzilla — 2008-06-24T01:57:14Z
I suggest it's codepoint count, as field width is for display purposes.
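The codepoint-count interpretation can be sketched as below. `padLeft` is a hypothetical helper written for illustration only; it is not the code that went into Phobos, and it assumes a current D2 compiler.

```d
import std.array : replicate;
import std.range : walkLength;

// Hypothetical helper: right-justify s in a field of at least
// `width` code points, padding on the left with spaces.
string padLeft(string s, size_t width)
{
    immutable n = s.walkLength; // code points, not bytes
    return n >= width ? s : replicate(" ", width - n) ~ s;
}

void main()
{
    // Both strings have five code points, so both get three spaces.
    assert(padLeft("hello", 8) == "   hello");
    assert(padLeft("h\u00e9ll\u00f4", 8) == "   h\u00e9ll\u00f4");

    // Strings already at or past the width are returned unchanged.
    assert(padLeft("foobarbaz", 8) == "foobarbaz");
}
```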
Comment #3 by bugzilla — 2008-07-09T22:30:39Z
Fixed in dmd 1.032 and 2.016.