Bug 93 – Template regex example fails without -release switch
Status
RESOLVED
Resolution
INVALID
Severity
trivial
Priority
P5
Component
dmd
Product
D
Version
D1 (retired)
Platform
x86
OS
All
Creation time
2006-04-08T13:41:00Z
Last change time
2014-02-15T02:09:49Z
Assigned to
bugzilla
Creator
godaves
Comments
Comment #0 by godaves — 2006-04-08T13:41:27Z
Without the -release switch, the template regex example from the 2006 SDWest presentation fails to link on both Linux and Windows.
http://www.digitalmars.com/d/templates-revisited.html
The linker error on Windows is:
Error 42: Symbol Undefined _array_5regex
--- errorlevel 1
The linker error on Linux is:
test_regex.o(.gnu.linkonce.t_D5regex49__T10regexMatchVG12aa12_5b612d7a5d2a5c732a5c772aZ10regexMatchFAaZAAa+0x3a): In function `_D5regex49__T10regexMatchVG12aa12_5b612d7a5d2a5c732a5c772aZ10regexMatchFAaZAAa':
: undefined reference to `_array_5regex'
test_regex.o(.gnu.linkonce.t_D5regex30__T9testRangeVAaa1_61VAaa1_7aZ9testRangeFAaZi+0x16): In function `_D5regex30__T9testRangeVAaa1_61VAaa1_7aZ9testRangeFAaZi':
: undefined reference to `_array_5regex'
test_regex.o(.gnu.linkonce.t_D5regex30__T9testRangeVAaa1_61VAaa1_7aZ9testRangeFAaZi+0x33): In function `_D5regex30__T9testRangeVAaa1_61VAaa1_7aZ9testRangeFAaZi':
: undefined reference to `_array_5regex'
test_regex.o(.gnu.linkonce.t_D5regex78__T14testZeroOrMoreS55_D5regex30__T9testRangeVAaa1_61VAaa1_7aZ9testRangeFAaZiZ14testZeroOrMoreFAaZi+0x3d): In function `_D5regex78__T14testZeroOrMoreS55_D5regex30__T9testRangeVAaa1_61VAaa1_7aZ9testRangeFAaZiZ14testZeroOrMoreFAaZi':
: undefined reference to `_array_5regex'
test_regex.o(.gnu.linkonce.t_D5regex32__T9testRangeVG1aa1_00VG1aa1_20Z9testRangeFAaZi+0x15): In function `_D5regex32__T9testRangeVG1aa1_00VG1aa1_20Z9testRangeFAaZi':
: undefined reference to `_array_5regex'
test_regex.o(.gnu.linkonce.t_D5regex32__T9testRangeVG1aa1_00VG1aa1_20Z9testRangeFAaZi+0x29): more undefined references to `_array_5regex' follow
collect2: ld returned 1 exit status
--- errorlevel 1
Source Code
-----------
test_regex.d:
-------------
import std.stdio;
import temp_regex;
void main()
{
auto exp = &regexMatch!(r"[a-z]*\s*\w*");
writefln("matches: %s", exp("hello world"));
}
;---
temp_regex.d
------------
module temp_regex;
const int testFail = -1;
/**
* Compile pattern[] and expand to a custom generated
* function that will take a string str[] and apply the
* regular expression to it, returning an array of matches.
*/
template regexMatch(char[] pattern)
{
char[][] regexMatch(char[] str)
{
char[][] results;
int n = regexCompile!(pattern).fn(str);
if (n != testFail && n > 0)
results ~= str[0..n];
return results;
}
}
/******************************
* The testXxxx() functions are custom generated by templates
* to match each predicate of the regular expression.
*
* Params:
* char[] str the input string to match against
*
* Returns:
* testFail failed to have a match
* n >= 0 matched n characters
*/
/// Always match
template testEmpty()
{
int testEmpty(char[] str) { return 0; }
}
/// Match if testFirst(str) and testSecond(str) match
template testUnion(alias testFirst, alias testSecond)
{
int testUnion(char[] str)
{
int n1 = testFirst(str);
if (n1 != testFail)
{
int n2 = testSecond(str[n1 .. $]);
if (n2 != testFail)
return n1 + n2;
}
return testFail;
}
}
/// Match if first part of str[] matches text[]
template testText(char[] text)
{
int testText(char[] str)
{
if (str.length &&
text.length <= str.length &&
str[0..text.length] == text
)
return text.length;
return testFail;
}
}
/// Match if testPredicate(str) matches 0 or more times
template testZeroOrMore(alias testPredicate)
{
int testZeroOrMore(char[] str)
{
if (str.length == 0)
return 0;
int n = testPredicate(str);
if (n != testFail)
{
int n2 = testZeroOrMore!(testPredicate)(str[n .. $]);
if (n2 != testFail)
return n + n2;
return n;
}
return 0;
}
}
/// Match if term1[0] <= str[0] <= term2[0]
template testRange(char[] term1, char[] term2)
{
int testRange(char[] str)
{
if (str.length && str[0] >= term1[0]
&& str[0] <= term2[0])
return 1;
return testFail;
}
}
/// Match if ch[0]==str[0]
template testChar(char[] ch)
{
int testChar(char[] str)
{
if (str.length && str[0] == ch[0])
return 1;
return testFail;
}
}
/// Match if str[0] is a word character
template testWordChar()
{
int testWordChar(char[] str)
{
if (str.length &&
(
(str[0] >= 'a' && str[0] <= 'z') ||
(str[0] >= 'A' && str[0] <= 'Z') ||
(str[0] >= '0' && str[0] <= '9') ||
str[0] == '_'
)
)
{
return 1;
}
return testFail;
}
}
/*****************************************************/
/**
* Returns the front of pattern[] up until
* the end or a special character.
*/
template parseTextToken(char[] pattern)
{
static if (pattern.length > 0)
{
static if (isSpecial!(pattern))
const char[] parseTextToken = "";
else
const char[] parseTextToken =
pattern[0..1] ~ parseTextToken!(pattern[1..$]);
}
else
const char[] parseTextToken="";
}
/**
* Parses pattern[] up to and including terminator.
* Returns:
* token[] everything up to terminator.
* consumed number of characters in pattern[] parsed
*/
template parseUntil(char[] pattern,char terminator,bool fuzzy=false)
{
static if (pattern.length > 0)
{
static if (pattern[0] == '\\')
{
static if (pattern.length > 1)
{
const char[] nextSlice = pattern[2 .. $];
alias parseUntil!(nextSlice,terminator,fuzzy) next;
const char[] token = pattern[0 .. 2] ~ next.token;
const uint consumed = next.consumed+2;
}
else
{
pragma(msg,"Error: expected character to follow \\");
static assert(false);
}
}
else static if (pattern[0] == terminator)
{
const char[] token="";
const uint consumed = 1;
}
else
{
const char[] nextSlice = pattern[1 .. $];
alias parseUntil!(nextSlice,terminator,fuzzy) next;
const char[] token = pattern[0..1] ~ next.token;
const uint consumed = next.consumed+1;
}
}
else static if (fuzzy)
{
const char[] token = "";
const uint consumed = 0;
}
else
{
pragma(msg,"Error: expected " ~
terminator ~
" to terminate group expression");
static assert(false);
}
}
/**
* Parse contents of character class.
* Params:
* pattern[] = rest of pattern to compile
* Output:
* fn = generated function
* consumed = number of characters in pattern[] parsed
*/
template regexCompileCharClass2(char[] pattern)
{
static if (pattern.length > 0)
{
static if (pattern.length > 1)
{
static if (pattern[1] == '-')
{
static if (pattern.length > 2)
{
alias testRange!(pattern[0..1], pattern[2..3]) termFn;
const uint thisConsumed = 3;
const char[] remaining = pattern[3 .. $];
}
else // length is 2
{
pragma(msg,
"Error: expected char following '-' in char class");
static assert(false);
}
}
else // not '-'
{
alias testChar!(pattern[0..1]) termFn;
const uint thisConsumed = 1;
const char[] remaining = pattern[1 .. $];
}
}
else
{
alias testChar!(pattern[0..1]) termFn;
const uint thisConsumed = 1;
const char[] remaining = pattern[1 .. $];
}
alias regexCompileCharClassRecurse!(termFn,remaining) recurse;
alias recurse.fn fn;
const uint consumed = recurse.consumed + thisConsumed;
}
else
{
alias testEmpty!() fn;
const uint consumed = 0;
}
}
/**
* Used to recursively parse character class.
* Params:
* termFn = generated function up to this point
* pattern[] = rest of pattern to compile
* Output:
* fn = generated function including termFn and
* parsed character class
* consumed = number of characters in pattern[] parsed
*/
template regexCompileCharClassRecurse(alias termFn,char[] pattern)
{
static if (pattern.length > 0 && pattern[0] != ']')
{
alias regexCompileCharClass2!(pattern) next;
alias testOr!(termFn,next.fn,pattern) fn;
const uint consumed = next.consumed;
}
else
{
alias termFn fn;
const uint consumed = 0;
}
}
/**
* At start of character class. Compile it.
* Params:
* pattern[] = rest of pattern to compile
* Output:
* fn = generated function
* consumed = number of characters in pattern[] parsed
*/
template regexCompileCharClass(char[] pattern)
{
static if (pattern.length > 0)
{
static if (pattern[0] == ']')
{
alias testEmpty!() fn;
const uint consumed = 0;
}
else
{
alias regexCompileCharClass2!(pattern) charClass;
alias charClass.fn fn;
const uint consumed = charClass.consumed;
}
}
else
{
pragma(msg,"Error: expected closing ']' for character class");
static assert(false);
}
}
/**
* Look for and parse '*' postfix.
* Params:
* test = function compiling regex up to this point
* pattern[] = rest of pattern to compile
* Output:
* fn = generated function
* consumed = number of characters in pattern[] parsed
*/
template regexCompilePredicate(alias test, char[] pattern)
{
static if (pattern.length > 0 && pattern[0] == '*')
{
alias testZeroOrMore!(test) fn;
const uint consumed = 1;
}
else
{
alias test fn;
const uint consumed = 0;
}
}
/**
* Parse escape sequence.
* Params:
* pattern[] = rest of pattern to compile
* Output:
* fn = generated function
* consumed = number of characters in pattern[] parsed
*/
template regexCompileEscape(char[] pattern)
{
static if (pattern.length > 0)
{
static if (pattern[0] == 's')
{
// whitespace char
alias testRange!("\x00","\x20") fn;
}
else static if (pattern[0] == 'w')
{
//word char
alias testWordChar!() fn;
}
else
{
alias testChar!(pattern[0 .. 1]) fn;
}
const uint consumed = 1;
}
else
{
pragma(msg,"Error: expected char following '\\'");
static assert(false);
}
}
/**
* Parse and compile regex represented by pattern[].
* Params:
* pattern[] = rest of pattern to compile
* Output:
* fn = generated function
*/
template regexCompile(char[] pattern)
{
static if (pattern.length > 0)
{
static if (pattern[0] == '[')
{
const char[] charClassToken =
parseUntil!(pattern[1 .. $],']').token;
alias regexCompileCharClass!(charClassToken) charClass;
const char[] token = pattern[0 .. charClass.consumed+2];
const char[] next = pattern[charClass.consumed+2 .. $];
alias charClass.fn test;
}
else static if (pattern[0] == '\\')
{
alias regexCompileEscape!(pattern[1..pattern.length]) escapeSequence;
const char[] token = pattern[0 .. escapeSequence.consumed+1];
const char[] next =
pattern[escapeSequence.consumed+1 .. $];
alias escapeSequence.fn test;
}
else
{
const char[] token = parseTextToken!(pattern);
static assert(token.length > 0);
const char[] next = pattern[token.length .. $];
alias testText!(token) test;
}
alias regexCompilePredicate!(test, next) term;
const char[] remaining = next[term.consumed .. next.length];
alias regexCompileRecurse!(term,remaining).fn fn;
}
else
alias testEmpty!() fn;
}
template regexCompileRecurse(alias term,char[] pattern)
{
static if (pattern.length > 0)
{
alias regexCompile!(pattern) next;
alias testUnion!(term.fn, next.fn) fn;
}
else
alias term.fn fn;
}
/// Utility function for parsing
template isSpecial(char[] pattern)
{
static if (
pattern[0] == '*' ||
pattern[0] == '+' ||
pattern[0] == '?' ||
pattern[0] == '.' ||
pattern[0] == '[' ||
pattern[0] == '{' ||
pattern[0] == '(' ||
pattern[0] == ')' ||
pattern[0] == '$' ||
pattern[0] == '^' ||
pattern[0] == '\\'
)
const isSpecial = true;
else
const isSpecial = false;
}
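For readers who want the mechanism without the full listing, here is a minimal sketch of how two of the predicate templates above compose into a matcher for [a-z]*. It is written for a current D2 compiler (string and enum in place of the D1 char[] and const used above), which is an assumption and not what the report was built with. The alias name lowerRun is hypothetical, and the composition is spelled out by hand instead of being produced by regexCompile.

```d
import std.stdio : writeln;

enum int testFail = -1;

/// Match if term1[0] <= str[0] <= term2[0]
template testRange(string term1, string term2)
{
    int testRange(string str)
    {
        if (str.length && str[0] >= term1[0] && str[0] <= term2[0])
            return 1;
        return testFail;
    }
}

/// Match if testPredicate(str) matches 0 or more times
template testZeroOrMore(alias testPredicate)
{
    int testZeroOrMore(string str)
    {
        if (str.length == 0)
            return 0;
        int n = testPredicate(str);
        if (n != testFail)
            return n + testZeroOrMore!(testPredicate)(str[n .. $]);
        return 0;
    }
}

// Hypothetical hand-built equivalent of the matcher for "[a-z]*".
alias lowerRun = testZeroOrMore!(testRange!("a", "z"));

void main()
{
    string input = "hello world";
    int n = lowerRun(input);             // counts the leading lowercase run
    writeln("matched: ", input[0 .. n]); // prints "matched: hello"
}
```

Each instantiation is an ordinary function emitted into whichever module instantiates it. Note the bounds-checked slices such as str[n .. $] inside the generated functions.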
Comment #1 by clugdbug — 2006-04-11T02:13:19Z
This isn't a blocker.
Comment #2 by godaves — 2006-04-11T09:08:52Z
"Blocker: Blocks development and/or testing work." It's a blocker if you run into that bug and want to use Contract Programming during the course of development and testing. After all, that's a major part of the language. Let Walter make the call.
Comment #3 by clugdbug — 2006-04-12T09:55:50Z
I've tried to reproduce this on Windows with DMD 0.153. It always compiles for me.
I also don't understand the reference to Contract Programming in message #3 (there's no contract programming in this code).
Comment #4 by godaves — 2006-04-12T21:10:59Z
(In reply to comment #3)
> I've tried to reproduce this on Windows with DMD 0.153. It always compiles for
> me.
> I also don't understand the reference to Contract Programming in message #3
> (there's no contract programming in this code).
>
The linker error happens because of array bounds checking code that is omitted with -release. I recreated it, but it is arguably my mistake (read on).
I copied the code into two files, test_regex.d and temp_regex.d.
Then I recompiled:
C:\Zz\temp>dmd test_regex.d
C:\dmd\bin\..\..\dm\bin\link.exe test_regex,,,user32+kernel32/noi;
OPTLINK (R) for Win32 Release 7.50B1
Copyright (C) Digital Mars 1989 - 2001 All Rights Reserved
test_regex.obj(test_regex)
Error 42: Symbol Undefined _array_10temp_regex
--- errorlevel 1
Then I recompiled again with -release and ran it:
C:\Zz\temp>dmd test_regex.d -release
C:\dmd\bin\..\..\dm\bin\link.exe test_regex,,,user32+kernel32/noi;
C:\Zz\temp>test_regex
matches: [hello]
That recreates the problem, and I should have specified the exact steps better.
But, if I recompile w/o -release like so:
C:\Zz\temp>dmd test_regex.d temp_regex.d
C:\dmd\bin\..\..\dm\bin\link.exe test_regex+temp_regex,,,user32+kernel32/noi;
C:\Zz\temp>test_regex
matches: [hello]
Then it works. I didn't compile in temp_regex.d (or link in the .obj compiled separately) because the code in temp_regex.d consists entirely of const declarations and templates. Being used to C/C++ #include <header>, I just compiled the main() module.
So under normal circumstances (e.g. the regex code is linked into a lib and that lib is linked with the app.) this 'bug' would probably not have happened, so along with the other things you pointed out, I lowered the Severity for it to 'trivial' and priority to 'informational'.
This is a potentially frustrating inconsistency between the compiler switches because, as the templates are always instantiated in the declarative scope, the compiler-generated stuff is (correctly) generated for the same scope. I say potentially frustrating because sometimes compiler-generated stuff is "out of sight, out of mind", at least for me.
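To make the "compiler-generated stuff" concrete, here is a minimal sketch of the kind of bounds check involved. It assumes a current D2 compiler and a default (non -release) build. The helper names firstN and sliceIsChecked are hypothetical. The slice mirrors the str[0..n] in regexMatch. The undefined _array_10temp_regex symbol above appears to be the failure path for checks like this one, emitted with the module that declares the template. That is a reading of the linker output, not something confirmed against the compiler source.

```d
import core.exception : RangeError;
import std.stdio : writeln;

/// Slice off the first n characters; bounds-checked in a default build.
string firstN(string str, size_t n)
{
    return str[0 .. n];
}

/// Returns true if an out-of-range slice is trapped by a run-time check.
bool sliceIsChecked()
{
    try
    {
        auto s = firstN("hello", 10); // 10 > "hello".length
        return false;
    }
    catch (RangeError e)
    {
        return true;
    }
}

void main()
{
    writeln("in-range slice: ", firstN("hello world", 5));
    writeln("bounds check active: ", sliceIsChecked());
}
```

With -release the check in unannotated code like this is dropped, so the out-of-range slice is no longer trapped. The sketch is only meaningful in a default build.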
Walter probably spotted this right away from the linker error and just ignored it or sat back and chuckled as the e-mails went back and forth <g>
(The reference to contract programming is because the -release switch omits pre- and post-contracts, along with asserts, invariants, etc. So what I was referring to is that if you ran into this bug, the -release switch needed to get the program to build would also remove your CP code, hence "blocker".)
Thanks,
- Dave
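The point about -release and contract programming can be sketched as follows. This is a hedged illustration for a current D2 compiler, which spells the function body of a contract-bearing function with do where D1 used body. The functions half and contractIsChecked are hypothetical and are not part of the reported code.

```d
import core.exception : AssertError;
import std.stdio : writeln;

/// Halves an even number; the in/out contracts run only in non-release builds.
int half(int x)
in
{
    assert(x % 2 == 0, "x must be even");
}
out (result)
{
    assert(result * 2 == x);
}
do
{
    return x / 2;
}

/// Returns true if the precondition on half() is enforced at run time.
bool contractIsChecked()
{
    try
    {
        half(3); // violates the precondition
        return false;
    }
    catch (AssertError e)
    {
        return true;
    }
}

void main()
{
    writeln("half(8) = ", half(8));
    writeln("contract active: ", contractIsChecked());
}
```

In a default build the second line reports true. Built with -release, the contracts are compiled out and it reports false. That is the trade-off described above: using -release to get past the link error also disables these checks.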