Bug 11432 – formattedRead and slurp %s format code miss tab as whitespace

Status
RESOLVED
Resolution
INVALID
Severity
normal
Priority
P2
Component
phobos
Product
D
Version
D2
Platform
x86
OS
Windows
Creation time
2013-11-03T09:12:46Z
Last change time
2019-12-07T09:52:52Z
Assigned to
No Owner
Creator
bearophile_hugs

Comments

Comment #0 by bearophile_hugs — 2013-11-03T09:12:46Z
This is a C program. Note the string s, which contains three fields, each separated by a single tab:

#include <stdio.h>
#include <stdlib.h>
int main() {
    char* s = "red\t10\t20";
    char t[20];
    int a, b;
    sscanf(s, "%s %d %d", &t, &a, &b);
    printf(">%s<>%d<>%d<\n", t, a, b);
    return 0;
}

It prints the output I expect:

>red<>10<>20<

The syntax of scanf says regarding the %s code:
http://www.mkssoftware.com/docs/man3/scanf.3.asp

s
    A character string is expected; the corresponding argument should be a character pointer pointing to an array of characters large enough to accept the string and a terminating \0, which is added automatically. A white-space character terminates the input field. The conversion specifier hS is equivalent.

I think this is a similar D program:

import std.format, std.stdio;
void main() {
    string s = "red\t10\t20";
    string t;
    int a, b;
    formattedRead(s, "%s %d %d", &t, &a, &b);
    writef(">%s<>%d<>%d<\n", t, a, b);
}

But it prints:

>red	10	20<>0<>0<

As you see, the tab is considered part of the first string field.

This causes me trouble when I use slurp as shown below. Suppose I have this "data.txt" text file with Unix-style newlines, where the string is separated from the integer by just one tab character:

red	10
blue	20

(So the whole file is: "red\t10\nblue\t20").

If I run this code:

import std.file: slurp;
void main() {
    slurp!(string, int)("data.txt", "%s %d");
}

I get a stack trace (dmd 2.064beta4):

std.conv.ConvException@...\dmd2\src\phobos\std\conv.d(2009): Unexpected end of input when converting from type char[] to type int
--------
0x0040E269 in pure @safe int std.conv.parse!(int, char[]).parse(ref char[]) at C:\dmd2\src\phobos\std\conv.d(2010)
...

To avoid the stack trace I have to put a tab between the two format specifiers:

import std.file: slurp;
void main() {
    slurp!(string, int)("data.txt", "%s\t%d");
}
Comment #1 by bugzilla — 2019-12-07T09:52:52Z
IMHO the problem is that formattedRead is not identical to scanf, but this is not well documented. You should use

formattedRead(s, "%s\t%d\t%d", &t, &a, &b);

The same is true for the slurp example, where you already found out that you need \t.
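For reference, the suggested format string can be tried in a small self-contained D program. This is only a sketch: the Record struct and the parseRecord helper are hypothetical names introduced for illustration and are not part of Phobos. It assumes the input uses exactly one tab between fields, as in the report.

```d
import std.format : formattedRead;
import std.stdio : writefln;

// Hypothetical record type, used only in this example.
struct Record
{
    string name;
    int a;
    int b;
}

// Hypothetical helper: parses one tab-separated line such as "red\t10\t20".
Record parseRecord(string line)
{
    Record r;
    // The literal \t in the format string matches the tab in the input,
    // so the %s field ends at the tab instead of swallowing it.
    formattedRead(line, "%s\t%d\t%d", &r.name, &r.a, &r.b);
    return r;
}

void main()
{
    auto r = parseRecord("red\t10\t20");
    writefln(">%s<>%d<>%d<", r.name, r.a, r.b); // prints >red<>10<>20<
}
```

Unlike scanf, a space in the format string here does not stand in for a tab after %s, so the separator written in the format has to be the one actually present in the data.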