Bug 18114 – [Reg 2.078] regex performance regression

Status
RESOLVED
Resolution
FIXED
Severity
regression
Priority
P1
Component
phobos
Product
D
Version
D2
Platform
x86_64
OS
All
Creation time
2017-12-22T20:27:55Z
Last change time
2018-05-14T14:31:23Z
Keywords
pull
Assigned to
No Owner
Creator
Jon Degenhardt

Attachments

IDFilenameSummaryContent-TypeSize
1669find_regex.dfind_regex.d: Reduced program, it counts regex matches in input linestext/plain601
1670gen_strings.dgen_strings.d: Generates input for find_regex.dtext/plain12124

Comments

Comment #0 by jrdemail2000-dlang — 2017-12-22T20:27:55Z
My standard performance benchmarks show a regex regression moving from DMD 2.077.1 to DMD 2.078.0-beta1. The benchmark is the one described here: https://github.com/eBay/tsv-utils-dlang/blob/master/docs/Performance.md#regular-expression-filter-benchmark. No other performance regressions were seen, leading me to believe it is likely specific to regex, rather than some other element of the program. This program takes a user supplied regex as a command line argument. It creates a Regex!char instance from the command line arg. Then it reads a file line-by-line, searching each line for presence of the regex using matchFirst. It's a fielded-search, so the matchFirst call is limited to a portion of the line. The benchmark times (MacBook Pro, OS X, 16GB ram, SSD drives): - DMD 2.077.1: 11.40 seconds - DMD 2.078.0-beta1: 15.10 seconds Approximately a 30% increase. I ran it about 10 times, with very consistent results. If it would be helpful, I could produce a small standalone program to validate the behavior.
Comment #1 by jrdemail2000-dlang — 2017-12-22T20:34:21Z
Regex in the test: '[RD].*(ION[0-2])' Compilation flags: dmd -release -O -boundscheck=off -inline
Comment #2 by jrdemail2000-dlang — 2017-12-23T20:16:25Z
Created attachment 1669 find_regex.d: Reduced program, it counts regex matches in input lines
Comment #3 by jrdemail2000-dlang — 2017-12-23T20:18:32Z
Created attachment 1670 gen_strings.d: Generates input for find_regex.d
Comment #4 by jrdemail2000-dlang — 2017-12-23T20:51:10Z
The two programs attached can be used to compare regex match performance. Compile find_regex.d with both DMD 2.077.1 and DMD 2.078.0-beta1. eg. $ dmd2.078.0-beta1 -release -O -inline -boundscheck=off find_regex.d -of=find_regex.dmd2.078.0-beta1 $dmd2.077.1 -release -O -inline -boundscheck=off find_regex.d -of=find_regex.dmd2.077.1 Build gen_strings also. Then run like: $ ./gen_strings 10000000 | ./find_regex.dmd2.077.1 'abc' 106438 matches; 3780.769 milliseconds $ ./gen_strings 10000000 | ./find_regex.dmd2.078.0-beta1 'abc' 106438 matches; 6315.638 milliseconds Times on my MacBook Pro for a couple of regexes (replace 'abc' in the commands above): | Regex | 2.077.1 | 2.078.0-beta1 | | 'abc' | 3780.769 ms | 6315.638 ms | | '(aa)+(cc)+g' | 5306.169 ms | 7753.435 ms | | '(aa|ax).+[gxb][cyw]' | 9888.508 ms | 12324.285 ms |
Comment #5 by code — 2018-01-02T20:29:50Z
Introduced by https://github.com/dlang/phobos/pull/5722. There were some major internal changes in the std.regex module. I see a 60% increase in issued instructions for the 'abc' regex, but also much better ILP with this WIP PR. https://github.com/dlang/phobos/pull/5981 There are quite a lot of very slow LOOP instructions used to copy fat struct parameters around. Those seem to clog the pipeline. https://github.com/dlang/dmd/blob/7f43f0f88df876aec60a22a6a74940020fb1ef50/src/dmd/backend/cod1.c#L4046 https://stackoverflow.com/questions/35742570/why-is-the-loop-instruction-slow-couldnt-intel-have-implemented-it-efficiently#35743699 Whether or not the increase in instructions with better ILP is intentional or another bug, I don't know. Guess Dmitry can help to figure this out.
Comment #6 by code — 2018-01-04T21:50:40Z
avoid inefficient codegen: https://github.com/dlang/dmd/pull/7599 WIP avoid unnecessary copies: https://github.com/dlang/phobos/pull/5981
Comment #7 by github-bugzilla — 2018-02-24T11:36:36Z
Commits pushed to master at https://github.com/dlang/phobos https://github.com/dlang/phobos/commit/59520969ef73eaf0691972ee00b389e5bbc4c8fb fix Issue 18114 - regex performance regression - reduce copying of fat structs - optimize layouts and opAssign of Captures https://github.com/dlang/phobos/commit/460693c26f79a6aaa771ecdd42f791f4c3f5fb18 Merge pull request #5981 from MartinNowak/issue18114 fix Issue 18114 - regex performance regression merged-on-behalf-of: Martin Nowak <[email protected]>
Comment #8 by jrdemail2000-dlang — 2018-03-04T01:03:52Z
I re-ran the benchmark with the new DMD 2.079.0 release (OS X). Numbers below are fastest of several runs, but were generally consistent: | Regex | 2.077.1 | 2.078.0-beta1 | 2.079.0 | |-----------------------+--------------+---------------+--------------| | 'abc' | 3819.314 ms | 6202.892 ms | 5077.937 ms | | '(aa)+(cc)+g' | 5457.678 ms | 8209.269 ms | 6672.057 ms | | '(aa|ax).+[gxb][cyw]' | 10121.181 ms | 12448.443 ms | 11254.978 ms | These results are a material improvement over 2.078.0-beta1, but still a fair bit off 2.077.1, by 10-25% depending on the test. Perhaps no longer a "performance regression", but there's still an opportunity for improvement. Would it be more appropriate to leave this open as an enhancement request?
Comment #9 by jrdemail2000-dlang — 2018-05-14T14:31:23Z
The final performance fix was included in LDC 1.10.0-beta1. For this release the standard benchmark I used for the TSV Utilities improved as follows: LDC 1.7.0 (before regression): 8.37 seconds LDC 1.8.0 (after regression): 10.01 seconds LDC 1.9.0 (first fixes): 9.44 seconds LDC 1.10.0-beta1 (second fix): 5.85 seconds First fixes: Phobos PR 5981, DMD PR 7599 Second fix: Phobos PR 6268 The benchmark test used reads a TSV file line-by-line and checks individual fields for regex matches. A significant amount of processing time is IO, so the percentage gain on the regex portion is higher than the overall gain. The overall gain from LDC 1.7.0 is 30%. Test was run on MacOS, MacMini with 16GB RAM, SSD drives. The file used was 2.7GB, 14 million lines. Test info can be found here: https://github.com/eBay/tsv-utils-dlang/blob/master/docs/ComparativeBenchmarks2018.md Great result!