Bug 11432 – formattedRead and slurp %s format code miss tab as whitespace
Status
RESOLVED
Resolution
INVALID
Severity
normal
Priority
P2
Component
phobos
Product
D
Version
D2
Platform
x86
OS
Windows
Creation time
2013-11-03T09:12:46Z
Last change time
2019-12-07T09:52:52Z
Assigned to
No Owner
Creator
bearophile_hugs
Comments
Comment #0 by bearophile_hugs — 2013-11-03T09:12:46Z
This is a C program. Note that the string s contains three fields, each separated from the next by a single tab:
#include <stdio.h>
#include <stdlib.h>
int main() {
    const char* s = "red\t10\t20";
    char t[20];
    int a, b;
    /* Pass t, not &t: %s expects a char*, and the array decays to one. */
    sscanf(s, "%s %d %d", t, &a, &b);
    printf(">%s<>%d<>%d<\n", t, a, b);
    return 0;
}
It prints the output I expect:
>red<>10<>20<
The scanf documentation says this about the %s conversion:
http://www.mkssoftware.com/docs/man3/scanf.3.asp
s
A character string is expected; the corresponding argument should be a character pointer pointing to an array of characters large enough to accept the string and a terminating \0, which is added automatically. A white-space character terminates the input field. The conversion specifier hS is equivalent.
I think this is a similar D program:
import std.format, std.stdio;
void main() {
    string s = "red\t10\t20";
    string t;
    int a, b;
    formattedRead(s, "%s %d %d", &t, &a, &b);
    writef(">%s<>%d<>%d<\n", t, a, b);
}
But it prints:
>red 10 20<>0<>0<
As you can see, the tabs are treated as part of the first string field. This causes me trouble when I use slurp, as shown below.
Suppose I have this "data.txt" text file with Unix-style newlines, where the string is separated from the integer by a single tab character:
red 10
blue 20
(So the whole file is: "red\t10\nblue\t20").
If I run this code:
import std.file: slurp;
void main() {
    slurp!(string, int)("data.txt", "%s %d");
}
I get a stacktrace (dmd 2.064beta4):
std.conv.ConvException@...\dmd2\src\phobos\std\conv.d(2009): Unexpected end of input when converting from type char[] to type int
--------
0x0040E269 in pure @safe int std.conv.parse!(int, char[]).parse(ref char[]) at C:\dmd2\src\phobos\std\conv.d(2010)
...
To avoid the exception I have to put a tab between the two format specifiers:
import std.file: slurp;
void main() {
    slurp!(string, int)("data.txt", "%s\t%d");
}
Comment #1 by bugzilla — 2019-12-07T09:52:52Z
IMHO the problem is that formattedRead does not behave identically to scanf, and this difference is not well documented. You should use
formattedRead(s, "%s\t%d\t%d", &t, &a, &b);
The same is true for the slurp example, where you already found out that you need \t.