Comment #0 by cooper.charles.m — 2015-03-28T20:10:46Z
std.stdio.rawRead is 50-75% slower than core.stdc.stdio.fread in a tight loop. A thin wrapper should match C stdio performance, or users will be unhappy.
// stdioperf.d
struct mystruct {
long[4] data;
}
void main() {
enum bool CSTDIO = false;
mystruct foo;
static if (CSTDIO) {
import core.stdc.stdio : stdin,fread;
while (0 != fread(&foo, foo.sizeof, 1, stdin)) {}
} else {
static import std.stdio;
while (0 != std.stdio.stdin.rawRead((&foo)[0..1]).length) {}
}
}
//EOF
$ dmd --version
DMD64 D Compiler v2.067.0
Copyright (c) 1999-2014 by Digital Mars written by Walter Bright
$ dmd -O -inline -release -noboundscheck stdioperf.d
$ time dd if=/dev/zero bs=1M count=8192 | ./stdioperf
8192+0 records in
8192+0 records out
8589934592 bytes (8.6 GB) copied, 7.0038 s, 1.2 GB/s
real 0m7.005s
user 0m5.792s
sys 0m6.924s
$ gdc --version
gdc (Debian 4.9.2-10) 4.9.2
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ gdc -O3 -fno-bounds-check -fno-assert -fno-invariants -fno-in -fno-out stdioperf.d
$ time dd if=/dev/zero bs=1M count=8192 | ./a.out
8192+0 records in
8192+0 records out
8589934592 bytes (8.6 GB) copied, 6.07485 s, 1.4 GB/s
real 0m6.076s
user 0m4.908s
sys 0m6.684s
With CSTDIO = true (performance is the same regardless of compiler):
$ gdc -O3 stdioperf.d
$ time dd if=/dev/zero bs=1M count=8192 | ./a.out
8192+0 records in
8192+0 records out
8589934592 bytes (8.6 GB) copied, 4.18047 s, 2.1 GB/s
real 0m4.182s
user 0m2.888s
sys 0m3.888s
Profiling suggests three sources of overhead: the compiler fails to inline the calls to std.exception.enforce, errnoEnforce is called even when fread's return value indicates success, and the result buffer is re-sliced on every call.
The following patch to d/4.9/std/stdio.d (front end D 2.065) confirms this, reducing the performance gap to ~2% (gdc -O2). It also gets rid of the undocumented null return value:
609c609,611
< enforce(buffer.length, "rawRead must take a non-empty buffer");
---
> if (!buffer.length) {
> enforce(buffer.length, "rawRead must take a non-empty buffer");
> }
625,626c627,631
< errnoEnforce(!error);
< return result ? buffer[0 .. result] : null;
---
> if (result < buffer.length) {
> errnoEnforce(!error);
> return buffer[0..result];
> }
> return buffer;
$ gdc -O3 stdioperf.d mystdio.d
$ time dd if=/dev/zero bs=1M count=8192 | ./a.out
8192+0 records in
8192+0 records out
8589934592 bytes (8.6 GB) copied, 4.26723 s, 2.0 GB/s
real 0m4.269s
user 0m2.960s
sys 0m3.788s
The patch to dmd 2.067 phobos is similar except the line numbers are different:
715c715,717
< enforce(buffer.length, "rawRead must take a non-empty buffer");
---
> if (!buffer.length) {
> enforce(false, "rawRead must take a non-empty buffer");
> }
733,734c735,739
< errnoEnforce(!error);
< return result ? buffer[0 .. result] : null;
---
> if (result < buffer.length) {
> errnoEnforce(!error);
> return buffer[0..result];
> }
> return buffer;
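For readers who prefer not to apply the diff mentally, here is a rough standalone sketch of what the patched logic amounts to. It is not the Phobos source: rawReadSketch is a hypothetical free function that takes a C FILE* directly, whereas the real rawRead is a method on File and uses its internal handle and error state.

```d
import core.stdc.stdio : FILE, fread, ferror;
import std.exception : enforce, errnoEnforce;

// Hypothetical stand-in for File.rawRead, operating on a raw FILE*.
T[] rawReadSketch(T)(FILE* fp, T[] buffer)
{
    // Call enforce only on the (rare) failure path, so the hot path
    // does not depend on the compiler inlining it.
    if (!buffer.length)
        enforce(false, "rawRead must take a non-empty buffer");

    immutable result = fread(buffer.ptr, T.sizeof, buffer.length, fp);

    // Short read: either EOF or an error. Check the error state only here.
    if (result < buffer.length)
    {
        errnoEnforce(!ferror(fp));
        return buffer[0 .. result]; // empty slice at EOF, never null
    }

    // Full read: hand back the caller's slice without re-slicing.
    return buffer;
}
```

The common case (a full read) then costs one length test, one fread call, and one comparison.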
I also suggest updating the documentation of std.stdio.File.rawRead to include an example of idiomatic usage:
while (1) if (0 == rawRead(...).length) break;
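A fuller version of that idiom, as it might appear in the docs, could look like the sketch below. Record and countRecords are made-up names for illustration. The loop relies only on rawRead returning a slice whose length is zero once no complete element could be read.

```d
import std.stdio : File;

// Hypothetical fixed-size record, mirroring mystruct in the benchmark.
struct Record
{
    long[4] data;
}

// Counts the complete records readable from f.
// A trailing partial record is not counted, because fread (and hence
// rawRead) reports only whole elements.
size_t countRecords(File f)
{
    Record[1] buf;
    size_t n;
    // rawRead returns the filled part of the buffer; empty means done.
    while (f.rawRead(buf[]).length != 0)
        ++n;
    return n;
}
```

Calling countRecords(std.stdio.stdin) reproduces the read loop from stdioperf.d while also counting the records.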
Charles
Comment #1 by github-bugzilla — 2015-04-03T17:08:49Z