Bug 14591 – [SPEC] Ambiguity between extern(Pascal) and template value parameters
Status
RESOLVED
Resolution
WONTFIX
Severity
normal
Priority
P1
Component
dmd
Product
D
Version
D2
Platform
All
OS
All
Creation time
2015-05-16T12:29:58Z
Last change time
2020-08-21T23:05:31Z
Keywords
mangling, pull
Assigned to
No Owner
Creator
Iain Buclaw
Comments
Comment #0 by ibuclaw — 2015-05-16T12:29:58Z
When demangling a symbol, it's impossible to tell whether a 'V' we have encountered is for an extern(Pascal) calling convention or a template value parameter.
Two examples:
_D8demangle32__T4testTS8demangle3fooVnnZ3barZ3bazFZv
_D8demangle27__T4testTS8demangle3fooVnnZ3bar3bazFZv
One should be demangled to:
demangle.test!(demangle.foo(none, none).bar).baz()
and the other to:
demangle.test!(demangle.foo, null).bar.baz()
Because of this, I suggest that the mangle character for the Pascal calling convention be changed to one that is not shared with TemplateArgX (or, even better, that we remove Pascal entirely).
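As an illustration of how the two examples differ, here is a minimal C sketch that only follows the length prefixes and reports what is left after the template instance. The helper names `skip_length_prefixed` and `after_template_instance` are hypothetical and are not taken from any existing demangler. The sketch shows that the outer length (32 versus 27) is what separates the two symbols; a parser that has already descended into the template arguments still sees the same `VnnZ` in both.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from binutils): skip one length-prefixed name,
   i.e. a decimal Number followed by that many characters.  Returns a
   pointer to whatever follows, or NULL if the input is malformed.  */
const char *
skip_length_prefixed (const char *mangled)
{
  size_t len = 0;

  assert (mangled != NULL);
  if (!isdigit ((unsigned char) *mangled))
    return NULL;
  while (isdigit ((unsigned char) *mangled))
    len = len * 10 + (size_t) (*mangled++ - '0');
  if (strlen (mangled) < len)
    return NULL;
  return mangled + len;
}

/* Skip "_D", the module name and the template instance, and return the
   remainder of the symbol, or NULL on malformed input.  */
const char *
after_template_instance (const char *symbol)
{
  const char *p;

  assert (symbol != NULL);
  if (strncmp (symbol, "_D", 2) != 0)
    return NULL;
  p = skip_length_prefixed (symbol + 2);   /* 8demangle */
  if (p == NULL)
    return NULL;
  return skip_length_prefixed (p);         /* 32__T... or 27__T... */
}
```

For the first symbol the remainder is `3bazFZv` (so `bar` lies inside the template instance); for the second it is `3bar3bazFZv` (so `bar` follows it).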
Comment #1 by r.sagitario — 2017-04-11T07:11:06Z
I think this ambiguity doesn't exist: a template value only occurs inside a template argument list, and any previous argument terminates without needing lookahead. It's either SymbolName, QualifiedName (terminating on a SymbolName), 'Z' or a type, and a type terminates with a QualifiedName at worst.
Comment #2 by ibuclaw — 2017-04-12T07:31:12Z
I think this has been fixed by clarifying the spec in dlang.org/#1511:
https://github.com/dlang/dlang.org/commit/6230e592c983ae742ac5ebae8db060748eb08fb8#diff-7bd92f948c3c1d8d0d16a465bb464b99L241
There's ambiguity without this distinction in the grammar.
Furthermore, if that wasn't enough, dlang.org/#1626 puts any further ambiguity to rest.
https://github.com/dlang/dlang.org/pull/1626
The binutils D demangler was implemented 2-3 years ago according to the spec as it stood at the time, and so does the following:
---
if (ISDIGIT (*mangled))
  mangled = dlang_parse_symbol (decl, mangled);
else if (strncmp (mangled, "_D", 2) == 0)
  {
    mangled += 2;
    mangled = dlang_parse_symbol (decl, mangled);
  }
---
With the updated spec now in, I think the corrective action on my side is to separate the handling of MangleName, QualifiedName, and SymbolName into different functions, so that the above becomes:
---
if (ISDIGIT (*mangled))
  mangled = dlang_parse_symbol (decl, mangled);
else if (strncmp (mangled, "_D", 2) == 0)
  mangled = dlang_parse_mangle (decl, mangled);
---
Comment #3 by ibuclaw — 2017-04-12T08:10:38Z
Or maybe not. Here's one symbol that fails the testsuite once I have made (some) fix-ups and removed the Pascal ambiguity check:
_D3std6traits37__T7fqnTypeTC6ObjectVbi0Vbi0Vbi0Vbi0Z13addQualifiersFAyabbbbZAya
To break it down:
_D3std6traits37__T7fqnTypeTC6ObjectVbi0Vbi0Vbi0Vbi0Z13addQualifiersFAyabbbbZAya
MangledName -> _D QualifiedName Type
3std6traits37__T7fqnTypeTC6ObjectVbi0Vbi0Vbi0Vbi0Z13addQualifiersFAyabbbbZAya
QualifiedName -> SymbolName QualifiedName
SymbolName -> LName
LName -> 3 std
6traits37__T7fqnTypeTC6ObjectVbi0Vbi0Vbi0Vbi0Z13addQualifiersFAyabbbbZAya
QualifiedName -> SymbolName QualifiedName
SymbolName -> LName
LName -> 6 traits
37__T7fqnTypeTC6ObjectVbi0Vbi0Vbi0Vbi0Z13addQualifiersFAyabbbbZAya
QualifiedName -> SymbolName QualifiedName
SymbolName -> TemplateInstanceName
TemplateInstanceName -> 37 __T LName TemplateArgs Z
LName -> 7 fqnType
TC6ObjectVbi0Vbi0Vbi0Vbi0Z13addQualifiersFAyabbbbZAya
TemplateArg -> T Type
Type -> C QualifiedName
6ObjectVbi0Vbi0Vbi0Vbi0Z13addQualifiersFAyabbbbZAya
QualifiedName -> SymbolName TypeFunctionNoReturn QualifiedName
SymbolName -> LName
LName -> 6 Object
Vbi0Vbi0Vbi0Vbi0Z13addQualifiersFAyabbbbZAya
TypeFunctionNoReturn -> CallConvention Parameters ParamClose
CallConvention -> V # <-- Pascal!!!
bi0Vbi0Vbi0Vbi0Z13addQualifiersFAyabbbbZAya
Parameters -> Parameter Parameters
Parameter -> Type
Type -> bool
Type -> int
Type -> Found '0' # <-- bad symbol!
Comment #4 by ibuclaw — 2017-04-12T08:11:10Z
Or am I missing something here...
Comment #5 by ibuclaw — 2017-04-12T08:23:15Z
(In reply to Iain Buclaw from comment #4)
> Or am I missing something here...
I don't think I am. The parser is in the middle of a QualifiedName, and peeking at the next character matches CallConvention, so we can't know for sure whether this is really a TypeFunctionNoReturn or the next TemplateArg. Backtracking is therefore required if the first attempt fails.
I'd like to avoid this backtracking. As per the first post, this could be solved either by adding a stop symbol to mark the end of a QualifiedName, or by giving Pascal another identifier.
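To make the single-character clash concrete, the following C sketch intersects the two sets of prefix characters a parser might see at this point. The CallConvention letters `F`, `U`, `W`, `V`, `R` are an assumption based on the D ABI spec of that period (only `V` for Pascal is confirmed in this thread), the TemplateArg prefixes `T`, `V`, `S` are the ones named in this thread, and `shared_prefixes` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed CallConvention letters; only 'V' (Pascal) is confirmed here.  */
const char call_conventions[] = "FUWVR";
/* TemplateArg prefixes: 'T' Type, 'V' Type Value, 'S' symbol.  */
const char template_arg_prefixes[] = "TVS";

/* Copy every character of A that also occurs in B into OUT (which is
   NUL-terminated on return) and return how many were copied.  OUT must
   have room for strlen (A) + 1 characters.  */
size_t
shared_prefixes (const char *a, const char *b, char *out)
{
  size_t n = 0;

  assert (a != NULL && b != NULL && out != NULL);
  for (; *a != '\0'; a++)
    if (strchr (b, *a) != NULL)
      out[n++] = *a;
  out[n] = '\0';
  return n;
}
```

Under these assumptions the only shared character is `V`, which is why giving Pascal a different letter would let a one-character peek decide between TypeFunctionNoReturn and the next TemplateArg.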
Comment #6 by r.sagitario — 2017-04-12T22:02:19Z
I think you are right. Any of the other TemplateArg prefixes 'S' (TypeStruct), 'H' (TypeAssocArray) and 'T' (TypeTypedef) should be affected as well.
Comment #7 by r.sagitario — 2017-04-12T22:45:23Z
There is also still an inaccuracy in the grammar. The actual implementation for TemplateArgX is
TemplateArgX:
'T' Type
| 'V' Type Value
| 'S' Number QualifiedName
| 'S' Number MangledName
;
where Number is the length of the full name.
MangledName includes C, C++ and pragma(mangle) manglings.
Number QualifiedName causes two concatenated Numbers, Issue 3043.
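A small C sketch of the concatenated-Numbers problem follows. The argument string is built by hand from the rule above (it is not compiler output): the full name `8demangle3foo` is 13 characters long, so the alias argument would read `S` `13` `8demangle3foo`. The helper `read_number` is hypothetical and reads digits greedily, as a straightforward demangler would.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: read a decimal Number greedily and report how
   many digits were consumed through DIGITS.  */
size_t
read_number (const char *mangled, size_t *digits)
{
  size_t value = 0;
  size_t n = 0;

  assert (mangled != NULL && digits != NULL);
  while (isdigit ((unsigned char) mangled[n]))
    value = value * 10 + (size_t) (mangled[n++] - '0');
  *digits = n;
  return value;
}
```

Called on `138demangle3foo` (the text after the `S`), this returns 138 with three digits consumed, although the intended split is 13 followed by `8demangle3foo`. Nothing in the digits themselves says where the first Number ends.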
Comment #8 by r.sagitario — 2017-04-12T22:52:50Z
> Any of the other TemplateArg prefixes 'S' (TypeStruct), 'H' (TypeAssocArray) and 'T' (TypeTypedef) should be affected as well.
Probably not, as the rule above avoids trouble with symbol aliases, and types only contain QualifiedNames, which only contain function types.
Comment #9 by ibuclaw — 2017-04-14T09:11:58Z
(In reply to Rainer Schuetze from comment #8)
> > Any of the other TemplateArg prefixes 'S' (TypeStruct), 'H' (TypeAssocArray) and 'T' (TypeTypedef) should be affected as well.
>
> Probably not, as the rule above avoids trouble with symbol aliases, and
> types only contain QualifiedNames, which only contain function types.
Yeah, I think I would have spotted it otherwise.
I think all I can do on my side is have a boolean function that attempts to parse TypeFunctionNoReturn and returns true only if a digit immediately follows (i.e. it has matched the start of the next QualifiedName in the grammar rule).
It means in the worst case I'm going over the section of the symbol twice, but at least I don't have a special case for Pascal. :-(
I'm going to drop this down to Normal, as the implementation on my side is now "satisfactory", given that the handlers for MangledName and QualifiedName have been split into two separate routines.
Still, this should be kept open on the wider basis that it prevents the mangling grammar from being context-free.
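The boolean probe described in this comment could look roughly like the C sketch below. It is a heavily simplified illustration, not the binutils code: `looks_like_function_type` is a hypothetical name, the call conventions are assumed to be `F`, `U`, `W`, `V`, `R`, every parameter is assumed to be a one-letter lowercase basic type, and ParamClose is taken to be `X`, `Y` or `Z`. Real parameter types are far richer.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Return true only if MANGLED starts with something that reads as
   CallConvention Parameters ParamClose AND a digit follows, i.e. the
   next QualifiedName could start right after it.  Nothing is consumed,
   so the caller can still fall back to the TemplateArg reading.  */
bool
looks_like_function_type (const char *mangled)
{
  assert (mangled != NULL);

  /* CallConvention (assumed letter set).  */
  if (*mangled == '\0' || strchr ("FUWVR", *mangled) == NULL)
    return false;
  mangled++;

  /* Parameters: simplified to one-letter basic types only.  */
  while (islower ((unsigned char) *mangled))
    mangled++;

  /* ParamClose.  */
  if (*mangled == '\0' || strchr ("XYZ", *mangled) == NULL)
    return false;
  mangled++;

  /* Commit only if the next QualifiedName starts here.  */
  return isdigit ((unsigned char) *mangled) != 0;
}
```

In this sketch, `Vbi0Vbi0Vbi0Vbi0Z13addQualifiers...` from comment #3 is rejected at the `0`, so the `V` is left to be read as a template value, while `FZ12TestToString` is accepted. `VnnZ3bar` from comment #0 is also accepted, so a probe of this kind narrows the ambiguity rather than removing it.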
Comment #12 by r.sagitario — 2017-06-02T19:05:27Z
Here's a symbol from the phobos unittests that hits this ambiguity and is pretty difficult to demangle, even with backtracking:
_D3std8typecons118__T8NullableTC3std8typecons19__unittestL3090_156FZ12TestToStringVC3std8typecons19__unittestL3090_156FZ12TestToStringnZ8Nullable6__initZ