Bug 14256 – Poor IO performance on 64-bit dmd 2.066 (OS X)

Status
REOPENED
Severity
normal
Priority
P3
Component
dmd
Product
D
Version
D2
Platform
x86_64
OS
Mac OS X
Creation time
2015-03-08T02:48:59Z
Last change time
2024-12-13T18:41:01Z
Assigned to
No Owner
Creator
Paul M
Moved to GitHub: dmd#18954

Attachments

ID | Filename | Summary | Content-Type | Size
1482 | gen1.d | test1 generator (1000000 lines of length 100) | text/plain | 109
1483 | gen2.d | test2 generator (10000 lines of length 10000) | text/plain | 109
1484 | gen3.d | test3 generator (100 lines of length 1000000) | text/plain | 109
1485 | a.py | Python2 program | text/plain | 238
1486 | test_byLine.d | Original D program by Paul M | text/plain | 282
1487 | test_read_splitLines.d | read + splitLines version by Paul M | text/plain | 263
1488 | test_readln.d | readln version suggested by Daniel Kozak (http://stackoverflow.com/a/28926137/1488799) | text/plain | 331

Comments

Comment #0 by pmmagic — 2015-03-08T02:48:59Z
I'm getting relatively poor I/O performance on medium to large text files in D code compiled with DMD64 D Compiler v2.066 on OS X 10.10.2. Below is an example D program and an equivalent Python program. The D code was compiled with "dmd -O -release -inline -m64". The timings on my system, when iterating over a ~470 MB file (~3.6M lines), are as follows:

// D times
real 0m19.146s
user 0m18.932s
sys  0m0.190s

# Python times
real 0m1.544s
user 0m1.062s
sys  0m0.479s

// D code
import std.stdio;
import std.string;

int main(string[] args)
{
    if (args.length < 2) {
        return 1;
    }
    auto infile = File(args[1]);
    uint linect = 0;
    foreach (line; infile.byLine())
        linect += 1;
    writeln("There are: ", linect, " lines.");
    return 0;
}

# Python code
import sys

if __name__ == "__main__":
    if (len(sys.argv) < 2):
        sys.exit()
    infile = open(sys.argv[1])
    linect = 0
    for line in infile.readlines():
        linect += 1
    print "There are %d lines" % linect

This was originally asked on StackOverflow. One of the respondents (username Gassa) suggested that I file this as an issue here. See the original exchange at: http://stackoverflow.com/questions/28922323/improving-line-wise-i-o-operations-in-d
Comment #1 by pmmagic — 2015-03-08T02:59:53Z
Here is a link to the gzip-compressed file that I was using for testing: ftp://flybase.net/genomes/Drosophila_melanogaster/dmel_r6.03_FB2014_06/gff/dmel-all-no-analysis-r6.03.gff.gz

When decompressed, this gives a 471 MB file with 3615492 lines.
Comment #2 by gassa — 2015-03-08T12:38:35Z
I tested the same on Win64 with dmd -m32 and dmd -m64 (options "-O -release -inline -noboundscheck"). I have three test files: 1000000 lines of length 100 each (gen1.d), 10000 lines of length 10000 each (gen2.d), 100 lines of length 1000000 each (gen3.d).

Here are my test results:

Entry                                     test1    test2    test3
Number of lines                         1000000    10000      100
Length of each line                         100    10000  1000000
-----------------------------------------------------------------
Python 2.7.5 x32:                          0.60     0.41     0.36
Python 2.7.7 x64:                          0.59     0.42     0.36
DMD 2.067.0-b3 byLine -m32:                0.53     1.27     1.77
DMD 2.067.0-b3 byLine -m64:                2.32     1.98     2.06
DMD 2.067.0-b3 readln -m32:                0.52     1.29     1.89
DMD 2.067.0-b3 readln -m64:                2.33     2.02     2.05
DMD 2.067.0-b3 read+splitLines -m32:       0.40     0.39     0.37
DMD 2.067.0-b3 read+splitLines -m64:       0.37     0.28     0.28

Here, I see at least two separate concerns:

1. The functions byLine and readln suffer from long lines with -m32.
2. The functions byLine and readln are a few times slower than possible with -m64.

The behavior of read + splitLines is satisfactory (outperforms Python in both 32-bit and 64-bit mode) in my setup.
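The D attachments being compared are not reproduced in this archive. As a rough illustration of the two strategies in the table (line-by-line iteration versus reading the whole file and splitting it), here is a minimal Python sketch. The function names are hypothetical and the code says nothing about how Phobos implements byLine, readln, or splitLines.

```python
def count_lines_iter(path):
    """Count lines by iterating over the file one line at a time."""
    count = 0
    with open(path, "rb") as f:
        for _ in f:
            count += 1
    return count


def count_lines_bulk(path):
    """Count lines by reading the whole file, then splitting on newline bytes."""
    with open(path, "rb") as f:
        data = f.read()
    lines = data.split(b"\n")
    # A trailing newline leaves one empty final element; drop it so both
    # functions agree on files that end with a newline.
    if lines and lines[-1] == b"":
        lines.pop()
    return len(lines)
```

The bulk version holds the entire file in memory at once, which is the usual trade-off of this approach.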
Comment #3 by gassa — 2015-03-08T12:40:40Z
Created attachment 1482 test1 generator (1000000 lines of length 100)
Comment #4 by gassa — 2015-03-08T12:40:55Z
Created attachment 1483 test2 generator (10000 lines of length 10000)
Comment #5 by gassa — 2015-03-08T12:41:11Z
Created attachment 1484 test3 generator (100 lines of length 1000000)
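The contents of the three generator attachments are not shown in this archive. A minimal Python sketch of an equivalent generator follows. It assumes each line consists of random lowercase letters, and the output file names are hypothetical. The attached gen1.d, gen2.d, and gen3.d may fill the lines differently.

```python
import random
import string


def write_test_file(path, num_lines, line_length, seed=0):
    """Write num_lines lines of line_length random lowercase letters each."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same file
    with open(path, "w", newline="\n") as out:
        for _ in range(num_lines):
            out.write("".join(rng.choices(string.ascii_lowercase, k=line_length)))
            out.write("\n")


# The three shapes described in the thread: (number of lines, line length).
# The file names are assumptions made for this sketch.
TEST_SHAPES = {
    "test1.txt": (1000000, 100),
    "test2.txt": (10000, 10000),
    "test3.txt": (100, 1000000),
}
```

Calling write_test_file(name, *shape) for each entry of TEST_SHAPES produces three files of about 100 million characters each, so only the line structure differs between tests.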
Comment #6 by gassa — 2015-03-08T12:41:35Z
Created attachment 1485 Python2 program
Comment #7 by gassa — 2015-03-08T12:42:47Z
Created attachment 1486 Original D program by Paul M
Comment #8 by gassa — 2015-03-08T12:43:40Z
Created attachment 1487 read + splitLines version by Paul M
Comment #9 by gassa — 2015-03-08T12:46:49Z
Created attachment 1488 readln version suggested by Daniel Kozak (http://stackoverflow.com/a/28926137/1488799)
Comment #10 by dlang-bugzilla — 2017-06-25T11:21:14Z
From a cursory benchmark, the D programs perform as well as or better than the Python version for me, except for the splitLines version. Note that splitLines also splits by Unicode line delimiters, not just \r and \n, which is why it's slower. Replacing splitLines(s) with split(s, '\n') makes the program much faster. Please reopen if you think the problem persists.
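Python's str.splitlines makes the same kind of distinction, so it can illustrate the point without relying on Phobos details. The sketch below is an analogy only. It does not claim that the delimiter sets of D's splitLines and Python's splitlines are identical.

```python
# One string containing U+2028 (LINE SEPARATOR), '\n', and form feed.
text = "one\u2028two\nthree\x0cfour\n"

# splitlines() treats all three characters as line boundaries.
unicode_split = text.splitlines()

# split("\n") only looks for the single newline character.
newline_split = text.split("\n")

print(unicode_split)
print(newline_split)
```

Searching for one fixed character is a simpler job than checking every position against a set of delimiters, which is consistent with the speed difference described above.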
Comment #11 by gassa — 2017-06-25T12:04:00Z
I've just checked with a more recent version (dmd 2.074.1), and the picture is the same as two years ago here (Win64, file is on SSD), so reopened. In the readln program, I allocated a buffer of 1100000 bytes (the posted version had 10000, and so failed on test 3).

The current table:

Entry                                     test1    test2    test3
Number of lines                         1000000    10000      100
Length of each line                         100    10000  1000000
-----------------------------------------------------------------
Python 2.7.5 x32:                          0.68     0.44     0.36
Python 2.7.10 x64:                         0.55     0.36     0.33
DMD 2.074.1 byLine -m32:                   0.27     0.73     1.05
DMD 2.074.1 byLine -m64:                   1.45     1.31     1.43
DMD 2.074.1 readln -m32:                   0.25     0.63     1.00
DMD 2.074.1 readln -m64:                   1.55     1.54     1.46
DMD 2.074.1 read+splitLines -m32:          0.35     0.39     0.31
DMD 2.074.1 read+splitLines -m64:          0.41     0.31     0.32

The times of 1 second or above are clearly problematic.

In Python, string storage is low-level but the number of lines affects the Pythonic part, so test1 is slower.

In D -m32, the byLine and readln versions get slower as the length of lines grows, possibly due to reallocation when constructing a string. I'd say 3x slower than Python on large strings feels like too much.

In D -m64, the byLine and readln versions still take 1.3+ seconds on all tests, more than 2x slower than Python, which is sad.

As earlier, the read+splitLines version is the fastest on all tests in both -m32 and -m64, so speed is definitely possible, just not as out-of-the-box as the other two versions.

Ivan Kazmenko.
Comment #12 by dlang-bugzilla — 2017-06-25T12:16:25Z
Can you please test with -m32mscoff, in addition to -m32 and -m64? I predict that the numbers are going to be very close to -m64. If so, then that indicates that we are limited by Microsoft's C runtime, in which case there is nothing that can be done short of rewriting std.stdio to not use C I/O.
Comment #13 by dlang-bugzilla — 2017-06-25T12:17:36Z
(In reply to Ivan Kazmenko from comment #11)
> The current table:

Also, if you have a script that generates this table, posting it here would be helpful.
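No such script appears in the thread. A minimal sketch of one, in Python, might look like the following. The function names, the choice of best-of-N wall-clock timing, and the column widths are all assumptions made for this sketch, and the thread does not say how the posted numbers were measured.

```python
import subprocess
import sys
import time


def best_time(cmd, runs=3):
    """Return the best wall-clock time in seconds over several runs of cmd."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


def format_row(label, times):
    """Format one table row: a left-aligned label, then right-aligned times."""
    return "{:<40}".format(label + ":") + "".join("{:>9.2f}".format(t) for t in times)


def build_table(entries, inputs, runs=3):
    """Time every entry on every input file.

    entries maps a row label to a command prefix (a list of arguments);
    the input file path is appended as the last argument.
    """
    return [
        format_row(label, [best_time(prefix + [path], runs) for path in inputs])
        for label, prefix in entries.items()
    ]
```

A call such as build_table({"Python x64": [sys.executable, "a.py"]}, ["test1.txt", "test2.txt", "test3.txt"]) would yield one formatted row per entry. The test file names here are hypothetical.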
Comment #14 by jrdemail2000-dlang — 2017-06-28T05:21:28Z
I've benchmarked File.byLine on OS X and Linux and they are quite fast on these platforms. I have not tested Windows, but have seen reports indicating it is quite slow there. I also know that performance on OS X was poor prior to 2.068, when it was dramatically improved.

The improvement in 2.068 was via PR #3089 (https://github.com/dlang/phobos/pull/3089). This changed File.byLine to use getdelim() on platforms supporting it, including OS X and most Linux versions. It's not clear if a similar change was made for Windows.

This can be seen in part in the source file (https://github.com/dlang/phobos/blob/master/std/stdio.d) by searching for HAS_GETDELIM and NO_GETDELIM. Most platforms are listed as one or the other, Windows does not appear to be included and may still use a slow implementation.
Comment #15 by dlang-bugzilla — 2017-06-28T06:25:37Z
(In reply to Jon Degenhardt from comment #14)
> This can be seen in part in the source file
> (https://github.com/dlang/phobos/blob/master/std/stdio.d) by searching for
> HAS_GETDELIM and NO_GETDELIM. Most platforms are listed as one or the other,
> Windows does not appear to be included and may still use a slow
> implementation.

The DigitalMars and Microsoft C runtime versions have their own implementation of readlnImpl targeting those runtimes.
Comment #16 by robert.schadek — 2024-12-13T18:41:01Z
THIS ISSUE HAS BEEN MOVED TO GITHUB https://github.com/dlang/dmd/issues/18954 DO NOT COMMENT HERE ANYMORE, NOBODY WILL SEE IT, THIS ISSUE HAS BEEN MOVED TO GITHUB