Bug 14073 – Allow token strings to use other types of brackets as delimiters

Status
RESOLVED
Resolution
WONTFIX
Severity
enhancement
Priority
P1
Component
dmd
Product
D
Version
D2
Platform
All
OS
All
Creation time
2015-01-28T22:25:00Z
Last change time
2015-01-28T23:26:50Z
Assigned to
nobody
Creator
monkeyworks12

Comments

Comment #0 by monkeyworks12 — 2015-01-28T22:25:28Z
Currently, token strings *must* use curly brackets as delimiters. I think it would be a useful enhancement if they could use other kinds of brackets as delimiters as well. Ex:

string a1 = q[This is a token string];
string a2 = q(This is also a token string);
string a3 = q<Angle brackets are also valid for token strings>;

This is a very simple change that would aid in the creation of DSLs in D. One real-world example:

template crange(string spec)
{
    static processSpec(string spec)
    {
        import std.algorithm;
        import std.ascii;
        import std.conv;
        import std.functional;
        import std.range;
        import std.string;

        return spec.splitter(',').map!((s)
        {
            s = s.strip!(c => c.isWhite || c == '\'');
            auto start = s.front;
            auto end = s.retro.front;

            return start <= end
                ? iota(start, end + 1)
                    .map!(c => cast(dchar)c).array
                : iota(end, start + 1)
                    .retro
                    .map!(c => cast(dchar)c).array;
        }).join;
    }

    enum crange = processSpec(spec);
}

void main(string[] argv)
{
    import std.stdio;

    auto t1 = crange!q{ 'a'..'z', 'A'..'Z', '0'..'9' };
    writeln(t1);

    auto t2 = crange!q{ '9'..'0', 'Z'..'A', 'z'..'a' };
    writeln(t2);
}

This doesn't look too bad, but it would be nice if square brackets could be used instead to better mimic the Pascal feature, and better communicate to the user that an array will be produced. With this enhancement, the above code would become:

void main(string[] argv)
{
    import std.stdio;

    auto t1 = crange!q['a'..'z', 'A'..'Z', '0'..'9'];
    writeln(t1);

    auto t2 = crange!q['9'..'0', 'Z'..'A', 'z'..'a'];
    writeln(t2);
}

Which I subjectively believe looks better and is more readable. There is no ambiguity here lexer-wise as far as I know, as the q signifies that either a token string or a delimited string follows. It seems to me this change is as simple as a small change in the parser to accept (), <>, and [] brackets for token strings as well as {} brackets.
Comment #1 by bearophile_hugs — 2015-01-28T22:29:17Z
(In reply to monkeyworks12 from comment #0)
> auto t1 = crange!q{ 'a'..'z', 'A'..'Z', '0'..'9' };
> auto t1 = crange!q['a'..'z', 'A'..'Z', '0'..'9'];

They look about the same to me. I don't see the need for this syntax extension. If we want to introduce new syntax it has to introduce a bigger gain (like the proposed [1,2]s syntax for fixed-size arrays).
Comment #2 by monkeyworks12 — 2015-01-28T22:51:28Z
(In reply to bearophile_hugs from comment #1)
> (In reply to monkeyworks12 from comment #0)
> > auto t1 = crange!q{ 'a'..'z', 'A'..'Z', '0'..'9' };
> > auto t1 = crange!q['a'..'z', 'A'..'Z', '0'..'9'];
>
> They look about the same to me. I don't see the need for this syntax
> extension. If we want to introduce new syntax it has to introduce a bigger
> gain (like the proposed [1,2]s syntax for fixed-size arrays).

This is unrelated to my proposal, but can't you do something similar with templates?

template sa(T, T[] array)
if (isRValue!array)
{
    enum sa = array;
}

auto staticArr = sa![1, 2];
Comment #3 by monkeyworks12 — 2015-01-28T22:52:49Z
(In reply to monkeyworks12 from comment #2)
> template sa(T, T[] array)
> if (isRValue!array)
> {
>     enum sa = array;
> }
>
> auto staticArr = sa![1, 2];

`enum sa = array` should be `enum T[array.length] sa = array`.
Comment #4 by dlang-bugzilla — 2015-01-28T23:05:35Z
> There is no ambiguity here lexer-wise as far as I know, as the q signifies
> that either a token string or a delimited string follows.

This is incorrect:

enum q = 5;
enum arr(int x) = new int[x];
auto r = arr!q[3];

Same with q().

> Which I subjectively believe looks better and is more readable.

Even without ambiguity issues, I think this is not a strong enough argument to justify the added complexity for all the tools out there that have to process D source code (compilers, linters, formatters, syntax highlighters, etc.). I would say that D has enough kinds of string literals.
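To make the clash concrete, here is a small self-contained variation of the snippet above that can be checked at compile time. It is only a sketch: the template body is swapped for an array literal so the indexed value is easy to see, and the name `picked` is made up for illustration.

```d
import std.stdio;

enum q = 5;

// Illustrative stand-in for the arr template in the comment above:
// yields a four-element array derived from x.
template arr(int x)
{
    enum int[] arr = [x, x * 2, x * 3, x * 4];
}

// Today this parses as (arr!q)[3]: instantiate arr with the constant q,
// then index the result. If q[...] started a token string, the same
// characters would have to mean arr!("3") instead.
enum picked = arr!q[3];
static assert(picked == 20);

void main()
{
    writeln(picked);
}
```

Since `q` is an ordinary identifier, an expression such as `arr!q[3]` is already valid D, which is why a lexer could not treat `q[` as the start of a string literal without changing the meaning of existing code.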
Comment #5 by home — 2015-01-28T23:08:51Z
q[] and q() would clash with the use of q as an identifier. Encountering a myriad of delimiters used for the same thing would also be confusing.
Comment #6 by blah38621 — 2015-01-28T23:20:10Z
D already has string literals with arbitrary delimiters (which, incidentally, are a massive pain to parse). I don't see anything this would do that those string literals can't.
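For reference, the delimited strings mentioned here already accept bracket pairs as delimiters, as long as the brackets sit inside the double quotes. A minimal sketch follows; the names `tokenForm` and `delimitedForm` are made up for illustration and do not come from the report.

```d
import std.stdio;

// Token string: the body must lex as valid D tokens, and the
// surrounding braces are not part of the resulting string.
enum tokenForm = q{'a'..'z', 'A'..'Z'};

// Delimited string: q"[ ... ]" uses nesting square brackets as
// delimiters; (), <> and {} work the same way.
enum delimitedForm = q"['a'..'z', 'A'..'Z']";

// Both spellings produce the same string value.
static assert(tokenForm == delimitedForm);
static assert(delimitedForm == "'a'..'z', 'A'..'Z'");

void main()
{
    writeln(delimitedForm);
}
```

With this form, the example from comment #0 could presumably be written today as `crange!q"['a'..'z', 'A'..'Z']"`, at the cost of the extra quotes.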
Comment #7 by monkeyworks12 — 2015-01-28T23:22:33Z
(In reply to FG from comment #5)
> q[] and q() would clash with the use of q as an identifier. Encountering a
> myriad of delimiters used for the same thing would also be confusing.

Ah, you're right. I guess that by itself stops this in its tracks.
Comment #8 by monkeyworks12 — 2015-01-28T23:24:23Z
(In reply to Orvid King from comment #6)
> D already has string literals with arbitrary delimiters (which,
> incidentally, are a massive pain to parse) I don't see anything this would
> do that those string literals can't.

Yes, and I agree they are pretty bad. It would be nice to have something like Perl's delimited strings... Anyway, I'll close this.