Comment #0 by secondaryAccount — 2015-08-19T19:26:26Z
Created attachment 1542
benchmarked code
http://forum.dlang.org/thread/[email protected]?page=9#post-mr2ef5:241e2a:241:40digitalmars.com
The code is in the attachment.
Timings with the test input described below:
dmd -O -inline -release -noboundscheck: 3700-3800 ms
gdc -O3 -march=native -frelease -fno-bounds-check: ~1000 ms
ldc2 -O3 -release -disable-boundscheck: ~800 ms
versions:
dmd 2.068
gdc based on 4.9.2
ldc 0.15.2beta1
OS: linux X86_64
Some notes:
- the benchmark calls cosineSimilarity 1 million times with different input and sums the return values (large text input file + IO functions omitted here; I can add them if helpful).
- timing with std.datetime around the loop - no IO included.
- pragma(inline, true) shows that dmd is unable to inline scalarProduct and normSquared. Disabling inlining for ldc causes no noticeable slowdown.
- elements of SparseVector are sorted by index.
- SparseVector.length is usually between 50 and 100 and maximal index is 47,000
- v1 and v2 are not pointing to the same data.
- gap between dmd and ldc/gdc is much smaller when replacing "real" with double.
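For readers without the attachment, here is a minimal sketch of the kind of kernel the notes describe. It is not the attached code: the function names cosineSimilarity, scalarProduct and normSquared come from the report, but the Element layout, its field names, the double element type and the `real` accumulators are assumptions made for illustration. It relies on the elements being sorted by index, as stated above.

```d
import std.math : sqrt;
import std.stdio : writefln;

// Hypothetical element layout; the attachment's actual definition may differ.
struct Element
{
    size_t index;
    double value;
}

// Assumed representation: elements sorted by ascending index.
alias SparseVector = Element[];

// Dot product of two index-sorted sparse vectors via a merge-style walk.
real scalarProduct(const SparseVector v1, const SparseVector v2)
{
    real sum = 0; // `real` accumulator (assumption; the report contrasts real with double)
    size_t i = 0, j = 0;
    while (i < v1.length && j < v2.length)
    {
        if (v1[i].index == v2[j].index)
        {
            sum += v1[i].value * v2[j].value;
            ++i;
            ++j;
        }
        else if (v1[i].index < v2[j].index)
            ++i;
        else
            ++j;
    }
    return sum;
}

// Sum of squared values of one sparse vector.
real normSquared(const SparseVector v)
{
    real sum = 0;
    foreach (e; v)
        sum += e.value * e.value;
    return sum;
}

// Cosine of the angle between v1 and v2; returns 0 if either vector has zero norm.
real cosineSimilarity(const SparseVector v1, const SparseVector v2)
{
    immutable real denom = sqrt(normSquared(v1) * normSquared(v2));
    return denom == 0 ? 0 : scalarProduct(v1, v2) / denom;
}

void main()
{
    SparseVector a = [Element(1, 1.0), Element(5, 2.0)];
    SparseVector b = [Element(5, 3.0), Element(9, 4.0)];
    writefln("%.6f", cosineSimilarity(a, b));
}
```

Changing the `real` declarations to double in a sketch like this is one way to try the comparison mentioned in the last note.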
Comment #1 by secondaryAccount — 2015-08-19T20:37:07Z
Created attachment 1543
compressed benchmark input file
reduced benchmark input file to pass size limit.
Comment #2 by secondaryAccount — 2015-08-19T20:45:30Z
Created attachment 1544
full benchmark code for reduced input file
The attached input file contains 1400 example vectors. This benchmark program calls cosineSimilarity 700 x 700 times (controlled by slices / foreach loops in main). This is sufficient to reproduce the issue.
The timings in the bug report are based on 1000 x 1000 calls of cosineSimilarity and 2000 example vectors.
The path to the input file is hard-coded (same directory) on line 11.
Comment #3 by robert.schadek — 2024-12-13T18:44:17Z