Created attachment 1113
Console Screenshot with Error Showing
Comment #4 by phshaffer — 2012-06-06T05:43:36Z
Dmitry Olshansky recommended I submit this as a bug.
The program is executed as : icomp2 fold.txt fnew.txt
It should search fold.txt for certain text patterns and then check whether all "found" text also appears in fnew.txt. fold.txt and fnew.txt are identical, so all "found" text should appear in fnew.txt as well.
I added some diagnostic loop counters for troubleshooting:
writeln(cntOld," ",cntNew," ",matchOld.hit," ",matchNew.hit);
As the screenshot shows after several iterations, it crashes with ->
core.exception.AssertError@C:\D\dmd2\windows\bin\..\..\src\phobos\std\regex.d(6050): not enough preallocated memory
Comment #5 by dmitry.olsh — 2012-06-06T06:13:01Z
(In reply to comment #4)
> Dmitry Olshansky recommended I submit this as a bug.
>
Yup, 'cause I'm the only one to fix it, at least in the near future ;)
Thanks, I'm on it. We'd better get it fixed in 2.060.
Comment #6 by dmitry.olsh — 2012-06-07T04:35:32Z
I've studied it a bit, and here are the details:
it only happens when re-running the same match object many times:
foreach(v; match(...)) // no bug
vs
auto m = match(....);
foreach(v; m) // does run out of memory
In your case I see from the comments that you try hard to do eager evaluation: first find all matches, then work through two arrays of them. Yet that's not what the program does; it still performs N*M regex searches because
auto uniCapturesNew = match(uniFileOld, regex(...));
just starts the engine and finds the 1st match. Then you copy the engine state on each iteration of the nested loop (this copy operation is apparently what's bogus) and run the engine until all matches are found. Next iteration of the loop: another copy.
So in your case I strongly suggest this magic recipe, which works for all lazy ranges:
auto allMatches = array(match(....));
and work with arrays from now on.
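For illustration, here is a minimal sketch of that recipe applied to the old/new comparison described in comment #4. It is not the reporter's actual program: the helper name collectHits and the inline sample text are assumptions made up for this example, and the pattern is borrowed from the reduced test case. Each input is scanned exactly once and the results are plain arrays, so the nested comparison never touches the regex engine again.

```d
import std.algorithm : canFind, map;
import std.array : array;
import std.regex : match, regex;
import std.stdio : writeln;

// Hypothetical helper: run the regex over the text once and
// return the full text of every match as a plain array.
string[] collectHits(string text)
{
    auto re = regex(r"^NAME = (?P<comp>[a-zA-Z0-9_]+):*(?P<blk>[a-zA-Z0-9_]*)", "gm");
    // match() yields a lazy range; array() forces it to run to the end once.
    return match(text, re).map!(m => m.hit).array;
}

void main()
{
    // Stand-in for the contents of fold.txt and fnew.txt (identical here).
    string oldText = "\nNAME = XPAW01_STA:STATION\nNAME = XPAW01_STA\n";
    string newText = oldText;

    auto oldHits = collectHits(oldText);
    auto newHits = collectHits(newText);

    // Plain array lookups from here on; no engine state is copied.
    foreach (hit; oldHits)
        writeln(hit, " -> ", newHits.canFind(hit) ? "found" : "missing");
}
```

If the named captures are needed later, keeping array(match(...)) itself (an array of capture objects) instead of mapping to .hit should work the same way.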
Anyway, the root cause is now clear and I've reduced it to:
import std.regex;
string data = "
NAME = XPAW01_STA:STATION
NAME = XPAW01_STA
";
// Main function
void main(){
auto uniFileOld = data;
auto uniCapturesNew = match(uniFileOld, regex(r"^NAME = (?P<comp>[a-zA-Z0-9_]+):*(?P<blk>[a-zA-Z0-9_]*)","gm"));
for(int i=0; i<20; i++)
{ foreach (matchNew; uniCapturesNew) {} }
}
Comment #10 by ilyayaroshenko — 2014-01-04T14:42:08Z
Created attachment 1310
regex example
This regexp fails with
"аллея Театральная, д. 3, стр. 1".
Works fine in SublimeText3.
----------------
core.exception.AssertError@/usr/include/dmd/phobos/std/regex.d(5393): not enough preallocated memory
----------------
/tmp/.rdmd-1000/rdmd-test.d-F2E4C955E1856CA0235A274413477A45/test(_d_assert_msg+0x45) [0x5055f1]
/tmp/.rdmd-1000/rdmd-test.d-F2E4C955E1856CA0235A274413477A45/test(pure nothrow @trusted std.regex.Thread!(ulong).Thread* std.regex.ThompsonMatcher!(char, std.regex.Input!(char).Input.BackLooper).ThompsonMatcher.allocate()+0x88) [0x4e80b0]
/tmp/.rdmd-1000/rdmd-test.d-F2E4C955E1856CA0235A274413477A45/test(pure nothrow @trusted std.regex.Thread!(ulong).Thread* std.regex.ThompsonMatcher!(char, std.regex.Input!(char).Input.BackLooper).ThompsonMatcher.createStart(ulong, uint)+0x59) [0x4e84d9]
/tmp/.rdmd-1000/rdmd-test.d-F2E4C955E1856CA0235A274413477A45/test(@trusted std.regex.ThompsonMatcher!(char, std.regex.Input!(char).Input.BackLooper).ThompsonMatcher.MatchResult std.regex.ThompsonMatcher!(char, std.regex.Input!(char).Input.BackLooper).ThompsonMatcher.matchOneShot(std.regex.Group!(ulong).Group[], uint)+0xf9) [0x4e7eb1]
/tmp/.rdmd-1000/rdmd-test.d-F2E4C955E1856CA0235A274413477A45/test(@trusted void std.regex.ThompsonMatcher!(char).ThompsonMatcher.eval!(true).eval(std.regex.Thread!(ulong).Thread*, std.regex.Group!(ulong).Group[])+0x1672) [0x4e646a]
/tmp/.rdmd-1000/rdmd-test.d-F2E4C955E1856CA0235A274413477A45/test(@trusted std.regex.ThompsonMatcher!(char).ThompsonMatcher.MatchResult std.regex.ThompsonMatcher!(char).ThompsonMatcher.matchOneShot(std.regex.Group!(ulong).Group[], uint)+0x150) [0x4e2f88]
/tmp/.rdmd-1000/rdmd-test.d-F2E4C955E1856CA0235A274413477A45/test(@trusted bool std.regex.ThompsonMatcher!(char).ThompsonMatcher.match(std.regex.Group!(ulong).Group[])+0x9d) [0x4e2aa5]
/tmp/.rdmd-1000/rdmd-test.d-F2E4C955E1856CA0235A274413477A45/test(ref @trusted std.regex.__T10RegexMatchTAyaS273std5regex15ThompsonMatcherZ.RegexMatch std.regex.__T10RegexMatchTAyaS273std5regex15ThompsonMatcherZ.RegexMatch.__ctor!(std.regex.Regex!(char).Regex).__ctor(immutable(char)[], std.regex.Regex!(char).Regex)+0x1ae) [0x4ee856]
/tmp/.rdmd-1000/rdmd-test.d-F2E4C955E1856CA0235A274413477A45/test(@safe std.regex.__T10RegexMatchTAyaS273std5regex15ThompsonMatcherZ.RegexMatch std.regex.match!(immutable(char)[], std.regex.Regex!(char).Regex).match(immutable(char)[], std.regex.Regex!(char).Regex)+0x63) [0x4fa423]
/tmp/.rdmd-1000/rdmd-test.d-F2E4C955E1856CA0235A274413477A45/test(_Dmain+0x78ff) [0x4bd29f]
/tmp/.rdmd-1000/rdmd-test.d-F2E4C955E1856CA0235A274413477A45/test(void rt.dmain2._d_run_main(int, char**, extern (C) int function(char[][])*).runAll().void __lambda1()+0x18) [0x507b3c]
/tmp/.rdmd-1000/rdmd-test.d-F2E4C955E1856CA0235A274413477A45/test(void rt.dmain2._d_run_main(int, char**, extern (C) int function(char[][])*).tryExec(scope void delegate())+0x2a) [0x507a96]
/tmp/.rdmd-1000/rdmd-test.d-F2E4C955E1856CA0235A274413477A45/test(void rt.dmain2._d_run_main(int, char**, extern (C) int function(char[][])*).runAll()+0x30) [0x507afc]
/tmp/.rdmd-1000/rdmd-test.d-F2E4C955E1856CA0235A274413477A45/test(void rt.dmain2._d_run_main(int, char**, extern (C) int function(char[][])*).tryExec(scope void delegate())+0x2a) [0x507a96]
/tmp/.rdmd-1000/rdmd-test.d-F2E4C955E1856CA0235A274413477A45/test(_d_run_main+0x1a3) [0x507a17]
/tmp/.rdmd-1000/rdmd-test.d-F2E4C955E1856CA0235A274413477A45/test(main+0x25) [0x502a7d]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f967ddb0de5]
Comment #11 by dmitry.olsh — 2014-01-06T08:38:28Z
(In reply to comment #10)
> Created an attachment (id=1310) [details]
> regex example
>
> This regexp fails with
> "аллея Театральная, д. 3, стр. 1".
>
Somewhat reduced test case:
void main(){
import std.regex;
auto r = regex(`([а-яА-Я\-_]+\s*)+(?<=[\s\.,\^])`);
match("аллея Театральная", r);
}
Investigation shows it's related to lookaround.
P.S. In future, I suggest posting new bugs as new reports, even if the symptoms are similar to some older bug. REOPENED is for cases where the same issue happens again (regression, patch reverted, etc.).
Comment #12 by ilyayaroshenko — 2014-01-06T09:15:56Z
(In reply to comment #11)
Ok, Thanks!