Bug 23999 – literal suffixes dont mix well with template instantiations
Status
RESOLVED
Resolution
WONTFIX
Severity
enhancement
Priority
P1
Component
dmd
Product
D
Version
D2
Platform
All
OS
All
Creation time
2023-06-19T18:44:18Z
Last change time
2024-03-21T05:27:08Z
Keywords
pull
Assigned to
No Owner
Creator
Puneet Goel
Comments
Comment #0 by puneet — 2023-06-19T18:44:18Z
Currently, DMD allows variable declaration without whitespace after a template instance that takes a string as a parameter. This leads to potentially ambiguous code. Consider:
```
class Foo(string str) {
    enum STR = str;
}
class Bar {
    Foo!q{foo}bb;
    Foo!q{foo}cc;
}
void main() {
    Bar p;
    pragma(msg, p.tupleof[0].stringof);
    pragma(msg, p.tupleof[1].stringof);
}
// The compiler outputs:
//
// p.bb
// p.c
```
Comment #1 by b2.temp — 2023-06-19T20:26:09Z
Integer literals are affected in the same way:
```
class Foo(alias str) {
    enum STR = str;
}
class Bar {
    Foo!2LUNGS;
}
void main() {
    Bar p;
    pragma(msg, p.tupleof[0].stringof);
}
```
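As a minimal sketch of what happens in the integer case (assuming the maximal munch rule quoted from the spec later in this thread): `2LU` is a valid `ulong` literal, so the lexer stops there and the remaining `NGS` becomes the field name. The `static assert` checks below are illustrative additions, not part of the original report.

```d
// `LU` is a valid integer suffix, so `2LU` is a complete ulong literal.
static assert(is(typeof(2LU) == ulong));

class Foo(alias str) {
    enum STR = str;
}

class Bar {
    Foo!2LUNGS; // tokens: Foo ! 2LU NGS ;
}

// The declared field is named `NGS` and has type Foo!(2LU).
static assert(__traits(identifier, Bar.tupleof[0]) == "NGS");
static assert(is(typeof(Bar.tupleof)[0] == Foo!(2LU)));

void main() {}
```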
Comment #2 by dkorpel — 2023-06-20T20:15:03Z
There is no ambiguity.
> The source text is split into tokens using the maximal munch algorithm, i.e., the lexical analyzer assumes the longest possible token.
https://dlang.org/spec/lex.html#source_text
The longest possible tokens in your example are `q{foo}` and `q{foo}c` respectively, leaving `bb` and `c` as identifier tokens.
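The token split described above can be checked at compile time. The following is a sketch only: it reuses the declarations from comment #0, and the `static assert` lines are additions for illustration.

```d
// Token strings accept the string postfixes `c`, `w`, and `d`, so a
// directly following `c` is consumed as part of the literal.
static assert(is(typeof(q{foo}c) == string));
static assert(is(typeof(q{foo}w) == wstring));
static assert(is(typeof(q{foo}d) == dstring));

class Foo(string str) {
    enum STR = str;
}

class Bar {
    Foo!q{foo}bb; // tokens: Foo ! q{foo}  bb ;
    Foo!q{foo}cc; // tokens: Foo ! q{foo}c c  ;
}

// The field names are what is left over after the literal token.
static assert(__traits(identifier, Bar.tupleof[0]) == "bb");
static assert(__traits(identifier, Bar.tupleof[1]) == "c");

void main() {}
```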
Comment #3 by puneet — 2023-06-21T01:46:22Z
By ambiguity I meant how it would be perceived by an end user.
"Programs are meant to be read by humans and only incidentally for computers to execute."
Donald Knuth
Comment #4 by puneet — 2023-06-21T01:52:56Z
There needs to be a mandatory delimiter (whitespace) between the limiter, which is a part of the type (template instance) and the object being declared.
Comment #5 by puneet — 2023-06-21T01:54:08Z
s/limiter/literal
Comment #6 by dlang-bot — 2023-06-21T12:19:45Z
@ntrel created dlang/dmd pull request #15339 "Fix Issue 23999 - literal suffixes dont mix well with template instan…" fixing this issue:
- Fix Issue 23999 - literal suffixes dont mix well with template instantiations
https://github.com/dlang/dmd/pull/15339
Comment #7 by dkorpel — 2023-06-21T12:48:30Z
(In reply to Puneet Goel from comment #4)
> There needs to be a mandatory delimiter (whitespace) between the limiter,
> which is a part of the type (template instance) and the object being
> declared.
This is impossible to express in the lexical grammar, and it isn't the compiler's job to enforce good style; that's something for a linter.
Comment #8 by nick — 2023-06-21T15:51:21Z
> This is impossible to express in the lexical grammar
I have changed the pull to a warning rather than an error, like the dangling else warning.
> it isn't the compiler's job to enforce good style
Style is a matter of preference. No reasonable person's preference can allow one of these suffixes right next to an identifier, because people read the suffix as part of the identifier.
Comment #9 by dkorpel — 2023-06-21T19:54:47Z
(In reply to Nick Treleaven from comment #8)
> No reasonable person's preference can allow one of these suffixes right next to an identifier, because people read the suffix as part of the identifier.
I agree, but it's not the compiler's job to enforce readable code. This is a valid statement, though no reasonable person would write this:
```
for ({{}for ({}0;){}}0;){}for (X:{}0;){}cast(e)q{u}w~r"o"w;
```
Yes, there are cases of parser footguns that are prevented by the compiler, such as a lambda returning a function literal `() => {}`. Such exceptions should not be the norm though, and this issue's particular case is so minor that I don't think it's worth the added complexity. Consider that there are also (minor) second-order effects, such as this change breaking a D minification tool.
D-scanner would be the right place to implement such a check: https://github.com/dlang-community/D-Scanner
Comment #10 by puneet — 2023-06-22T10:07:44Z
I think it is a bug (or maybe I should call it unexpected compiler behavior) at a more fundamental level. Consider how both clang and gcc treat literal suffix errors differently compared to Dlang:
```
$ cat /tmp/test.d
ulong test = 44LUNG;
$ ldc2 /tmp/test.d
/tmp/test.d(1): Error: semicolon expected following auto declaration, not `NG`
/tmp/test.d(1): Error: no identifier for declarator `NG`
$ cat /tmp/test.c
int long unsigned test = 44LUNG;
$ gcc /tmp/test.c
/tmp/test.c:1:26: error: invalid suffix "LUNG" on integer constant
    1 | int long unsigned test = 44LUNG;
      |                          ^~~~~~
$ clang /tmp/test.c
/tmp/test.c:1:28: error: invalid suffix 'LUNG' on integer constant
int long unsigned test = 44LUNG;
                           ^
1 error generated.
```
Comment #11 by puneet — 2023-06-22T10:08:58Z
Should I reopen this error, or file another bug for unexpected/wrong literal suffix parsing?
Comment #12 by maxhaton — 2023-06-22T16:40:28Z
Why is this impossible to express? You can write a new rule to match it and then mark it illegal if nothing else. (That and the grammar is not the whole).
I think it's worth banning this, C does for example.
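A rough sketch of the kind of rule meant here, written as a standalone check rather than as a change to the DMD lexer. The function name `suffixTouchesIdentifier` is hypothetical, and the check only handles decimal integer literals with `L`/`u`/`U` suffixes; it is not how DMD or D-Scanner implement anything.

```d
import std.ascii : isAlphaNum, isDigit;

/// Hypothetical helper: returns true when `s` starts with a decimal
/// integer literal whose suffix is immediately followed by an
/// identifier character, as in `44LUNG`.
bool suffixTouchesIdentifier(string s)
{
    size_t i = 0;
    while (i < s.length && isDigit(s[i]))
        ++i;
    if (i == 0)
        return false; // no leading digits, so not an integer literal

    // Consume at most one `L` and one `u`/`U`, in either order.
    bool seenL = false, seenU = false;
    while (i < s.length)
    {
        immutable char c = s[i];
        if (c == 'L' && !seenL)
            seenL = true;
        else if ((c == 'u' || c == 'U') && !seenU)
            seenU = true;
        else
            break;
        ++i;
    }
    if (!seenL && !seenU)
        return false; // literal has no suffix

    // Flag the literal if an identifier character follows directly.
    return i < s.length && (isAlphaNum(s[i]) || s[i] == '_');
}

static assert(suffixTouchesIdentifier("44LUNG"));
static assert(suffixTouchesIdentifier("2LUNGS"));
static assert(!suffixTouchesIdentifier("44LU;"));
static assert(!suffixTouchesIdentifier("44 NG"));

void main() {}
```

A real implementation would presumably operate on the token stream (literal token followed by an identifier token with no intervening whitespace) instead of re-scanning the source text.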
Comment #13 by bugzilla — 2024-03-21T05:27:08Z
Interestingly, ImportC sez:
test.c(1): Error: missing comma or semicolon after declaration of `test`, found `NG` instead