Bug 23999 – literal suffixes dont mix well with template instantiations

Status: RESOLVED
Resolution: WONTFIX
Severity: enhancement
Priority: P1
Component: dmd
Product: D
Version: D2
Platform: All
OS: All
Creation time: 2023-06-19T18:44:18Z
Last change time: 2024-03-21T05:27:08Z
Keywords: pull
Assigned to: No Owner
Creator: Puneet Goel

Comments

Comment #0 by puneet — 2023-06-19T18:44:18Z
Currently, DMD allows variable declaration without whitespace after a template instance that takes a string as a parameter. This leads to potentially ambiguous code. Consider:

class Foo(string str) {
  enum STR = str;
}

class Bar {
  Foo!q{foo}bb;
  Foo!q{foo}cc;
}

void main() {
  Bar p;
  pragma(msg, p.tupleof[0].stringof);
  pragma(msg, p.tupleof[1].stringof);
}

// The compiler outputs:
//
// p.bb
// p.c
Comment #1 by b2.temp — 2023-06-19T20:26:09Z
Integer literals are affected in the same way:

```
class Foo(alias str) {
  enum STR = str;
}

class Bar {
  Foo!2LUNGS;
}

void main() {
  Bar p;
  pragma(msg, p.tupleof[0].stringof);
}
```
Comment #2 by dkorpel — 2023-06-20T20:15:03Z
There is no ambiguity.

> The source text is split into tokens using the maximal munch algorithm, i.e., the lexical analyzer assumes the longest possible token.

https://dlang.org/spec/lex.html#source_text

The longest possible tokens in your example are `q{foo}` and `q{foo}c` respectively, leaving `bb` and `c` as identifier tokens.
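The tokenization described above can be checked at compile time. The following is a minimal sketch, assuming a reasonably recent DMD or LDC; `Foo` and `Bar` mirror comment #0, while `Baz` and `Qux` are hypothetical names introduced here for the integer case from comment #1, and the `static assert` lines are illustrative additions rather than anything from the original report.

```d
// Illustrative sketch only; Foo and Bar mirror comment #0, while Baz and Qux
// are hypothetical names added here for the integer case from comment #1.
class Foo(string str) { enum STR = str; }
class Baz(alias v) { enum V = v; }

class Bar
{
    Foo!q{foo}bb; // lexed as: Foo ! q{foo}  bb ;  (field named `bb`)
    Foo!q{foo}cc; // lexed as: Foo ! q{foo}c c  ;  (field named `c`)
}

class Qux
{
    Baz!2LUNGS;   // lexed as: Baz ! 2LU NGS ;     (field named `NGS`)
}

// The postfix characters c, w and d belong to the string literal token
// and select its type.
static assert(is(typeof(q{foo}c) == string));
static assert(is(typeof(q{foo}w) == wstring));
static assert(is(typeof(q{foo}d) == dstring));

// `LU` is an integer suffix, so `2LU` is one ulong literal token.
static assert(is(typeof(2LU) == ulong));

// The declared field names show where the lexer split the tokens.
static assert(__traits(hasMember, Bar, "bb"));
static assert(__traits(hasMember, Bar, "c"));
static assert(!__traits(hasMember, Bar, "cc"));
static assert(__traits(hasMember, Qux, "NGS"));

void main() {}
```

If the file compiles, every assertion held; the program itself does nothing at run time.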
Comment #3 by puneet — 2023-06-21T01:46:22Z
By ambiguity I meant how it would be perceived by an end user. "Programs are meant to be read by humans and only incidentally for computers to execute." Donald Knuth
Comment #4 by puneet — 2023-06-21T01:52:56Z
There needs to be a mandatory delimiter (whitespace) between the limiter, which is a part of the type (template instance) and the object being declared.
Comment #5 by puneet — 2023-06-21T01:54:08Z
s/limiter/literal
Comment #6 by dlang-bot — 2023-06-21T12:19:45Z
@ntrel created dlang/dmd pull request #15339 "Fix Issue 23999 - literal suffixes dont mix well with template instan…" fixing this issue: - Fix Issue 23999 - literal suffixes dont mix well with template instantiations https://github.com/dlang/dmd/pull/15339
Comment #7 by dkorpel — 2023-06-21T12:48:30Z
(In reply to Puneet Goel from comment #4)
> There needs to be a mandatory delimiter (whitespace) between the limiter,
> which is a part of the type (template instance) and the object being
> declared.

This is impossible to express in the lexical grammar, and it isn't the compiler's job to enforce good style, that's something for a linter.
Comment #8 by nick — 2023-06-21T15:51:21Z
> This is impossible to express in the lexical grammar

I have changed the pull to a warning rather than an error, like the dangling else warning.

> it isn't the compiler's job to enforce good style

Style is a matter of preference. No reasonable person's preference can allow one of these suffixes right next to an identifier, because people read the suffix as part of the identifier.
Comment #9 by dkorpel — 2023-06-21T19:54:47Z
(In reply to Nick Treleaven from comment #8)
> No reasonable person's preference can allow one of these suffixes right next to an identifier, because people read the suffix as part of the identifier.

I agree, but it's not the compiler's job to enforce readable code. This is a valid statement, though no reasonable person would write this:

```
for ({{}for ({}0;){}}0;){}for (X:{}0;){}cast(e)q{u}w~r"o"w;
```

Yes, there are cases of parser footguns that are prevented by the compiler, such as a lambda returning a function literal `() => {}`. Such exceptions should not be the norm though, and this issue's particular case is so minor that I don't think it's worth the added complexity. Consider how there's also (minor) second order effects, like how this change could break a D minification tool.

D-scanner would be the right place to implement such a check:
https://github.com/dlang-community/D-Scanner
Comment #10 by puneet — 2023-06-22T10:07:44Z
I think it is a bug (or maybe I should call it unexpected compiler behavior) at a more fundamental level. Consider how both clang and gcc treat literal suffix errors differently compared to Dlang:

$ cat /tmp/test.d
ulong test = 44LUNG;
$ ldc2 /tmp/test.d
/tmp/test.d(1): Error: semicolon expected following auto declaration, not `NG`
/tmp/test.d(1): Error: no identifier for declarator `NG`

$ cat /tmp/test.c
int long unsigned test = 44LUNG;
$ gcc /tmp/test.c
/tmp/test.c:1:26: error: invalid suffix "LUNG" on integer constant
    1 | int long unsigned test = 44LUNG;
      |                          ^~~~~~
$ clang /tmp/test.c
/tmp/test.c:1:28: error: invalid suffix 'LUNG' on integer constant
int long unsigned test = 44LUNG;
                           ^
1 error generated.
Comment #11 by puneet — 2023-06-22T10:08:58Z
Should I reopen this error, or file another bug for unexpected/wrong literal suffix parsing?
Comment #12 by maxhaton — 2023-06-22T16:40:28Z
Why is this impossible to express? You can write a new rule to match it and then mark it illegal if nothing else. (That and the grammar is not the whole). I think it's worth banning this, C does for example.
Comment #13 by bugzilla — 2024-03-21T05:27:08Z
Interestingly, ImportC sez:

test.c(1): Error: missing comma or semicolon after declaration of `test`, found `NG` instead