Bug 3971 – Syntax & semantics for array assigns

Status
RESOLVED
Resolution
INVALID
Severity
major
Priority
P2
Component
dmd
Product
D
Version
D2
Platform
All
OS
All
Creation time
2010-03-15T14:55:00Z
Last change time
2014-02-15T02:46:07Z
Assigned to
nobody
Creator
bearophile_hugs

Comments

Comment #0 by bearophile_hugs — 2010-03-15T14:55:47Z
This is written in the page about arrays:
http://www.digitalmars.com/d/2.0/arrays.html

    s[] = t;      // the 3 elements of t[3] are copied into s[3]
    s[] = t[];    // the 3 elements of t[3] are copied into s[3]
    s[] = 3;      // same as s[0] = 3, s[1] = 3, s[2] = 3
    p[0..2] = 3;  // same as p[0] = 3, p[1] = 3

It's not good to have two different syntaxes to do the same thing, or two similar syntaxes with a different computational complexity. So I suggest modifying the array assign syntax this way:

a[] = b[];   static  dynamic
static       OK1     OK1
dynamic      OK1     OK1

a = b[];     static  dynamic
static       Err     Err
dynamic      Err     Err

a[] = b;     static  dynamic
static       Err     Err
dynamic      Err     Err

a = b;       static  dynamic
static       Err2    Err
dynamic      Err     OK2

int i; a=i;  static  dynamic
             Err     Err

int i;
a[] = i;     static  dynamic
             OK3     OK3

Key:
Err  = Syntax error
OK1  = Copies all items from one array to the other.
OK2  = Copies just the struct of the dynamic array, array body not copied.
OK3  = Copies the value to all the items of the array.
Err2 = Syntax error, because there is no reference to copy, better to keep the language tidy.

You can see that this too is disallowed:

    int a, b;
    a = b;

This breaks generic code, but it's not good when the same syntax can be O(1) (because the same syntax used on dynamic arrays is O(1)) or O(n), and the semantics are different too.

If you have comments please add them. I like that matrix because it contains all cases, so it's easy to see and design the situation, even if you don't agree with its contents. The accepted final version of that matrix can even be added back to the arrays page.
Comment #1 by bearophile_hugs — 2010-04-12T17:41:46Z
But be careful, in this code Tri c[] is seen as Tri[] c, and it doesn't compile:

    alias double[3] Tri;
    void main() {
        Tri a = [1, 2, 3];
        Tri b = [10, 20, 30];
        Tri c[] = a[] - b[]; // ERR
    }
Comment #2 by dfj1esp02 — 2010-04-14T11:07:25Z
What do "static", "dynamic", "a" and "b" mean? And those diagrams?
Comment #3 by bearophile_hugs — 2010-04-14T12:09:37Z
"a" and "b" are arrays. "dynamic" means dynamic array. "static" means stack-allocated fixed-sized array.
Comment #4 by dfj1esp02 — 2010-04-16T13:08:07Z
a = b; static dynamic static Err2 Err dynamic Err OK2 ^ As I understand, this disallows assignment of a static array to the dynamic one? Is this related to bug 3395?
Comment #5 by bearophile_hugs — 2010-04-18T16:27:47Z
This:

a = b;       static  dynamic
static       Err2    Err
dynamic      Err     OK2

Means:

    int[5] a, b;
    a = b; // Err2

    int[] c = new int[5];
    a = c; // Err
    c = a; // Err

    int[] d = new int[5];
    c = d; // OK

But:

    a[] = c[]; // OK
    c[] = a[]; // OK
Comment #6 by dfj1esp02 — 2010-04-19T14:11:17Z
Why is c=a; an error?
Comment #7 by bearophile_hugs — 2010-04-19T14:42:31Z
You are right, the c=a; case can be allowed, it can just copy the ptr and length of the static array inside the struct of the dynamic array (this is how D currently works). Thank you for spotting it. All this discussion looks academic because so far Walter seems uninterested in this enhancement request.
Comment #8 by clugdbug — 2010-04-19T17:30:28Z
(In reply to comment #7)
> All this discussion looks academic because so far Walter seems uninterested in
> this enhancement request.

Bearophile, please stop making absurd statements like that one. If Walter makes no comment on something, you can't conclude *anything* about his attitude to it. He's just extremely busy.

---

Almost all bugs and weird behaviour involving array operations happen because internally the compiler doesn't distinguish between x[] and x, where x is a dynamic array. This causes a multitude of problems, especially when multidimensional arrays are involved.

I agree with you that this syntax is problematic. I don't understand why it's currently permitted:

    int[3] s;
    int[3] t;
    s[] = t; // the 3 elements of t[3] are copied into s[3]
Comment #9 by dfj1esp02 — 2010-05-01T08:14:28Z
> internally the compiler doesn't distinguish between x[] and x, where x is a
> dynamic array.

Does this mean that array ops are a huge hack?
Comment #10 by clugdbug — 2010-05-01T22:08:28Z
(In reply to comment #9)
> > internally the compiler doesn't distinguish between x[] and x, where x is a
> > dynamic array.
> This means, that array ops are a huge hack?

No. According to the spec, it's not supposed to. x[] is exactly the same as x. The [] is only required for lvalues. So

    int [4] a, b, c;
    a[] = b + c; // should work

It's almost as if there's a []= operator.

At the moment, though, a[] = b+c; fails, and you need to write a[] = b[]+c[].
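
A minimal sketch of the form that the comment above says is required, with [] on every array operand. The helper name addArrays is hypothetical and only serves the illustration; the rejected form is left commented out because, as reported in this thread, the compiler does not accept it.

```d
// Hypothetical helper, not from the thread: element-wise sum of two
// fixed-size arrays written as an array operation.
int[4] addArrays(int[4] b, int[4] c)
{
    int[4] a;
    a[] = b[] + c[];   // [] on the destination and on each operand
    // a[] = b + c;    // the form reported as failing in this thread
    return a;
}

void main()
{
    int[4] b = [1, 2, 3, 4];
    int[4] c = [10, 20, 30, 40];
    assert(addArrays(b, c) == [11, 22, 33, 44]);
}
```

The static arrays are passed and returned by value here, so the function leaves its arguments untouched.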
Comment #11 by dfj1esp02 — 2010-05-04T09:24:59Z
Such behavior is very bug-prone: in the case of tag array it does matter whether you meant array op or array itself as value, see bug 3395 comment 2. If compiler can distinguish between x and x[] it can raise error, but judging from what you tell, compiler tries to assume.
Comment #12 by clugdbug — 2010-05-04T09:43:56Z
(In reply to comment #11)
> Such behavior is very bug-prone: in the case of tag array it does matter
> whether you meant array op or array itself as value, see bug 3395 comment 2. If
> compiler can distinguish between x and x[] it can raise error, but judging from
> what you tell, compiler tries to assume.

Walter has just clarified this in the newsgroup: it is intended to be mandatory to use [] inside array expressions. I have made a patch which enforces this.
Comment #13 by bearophile_hugs — 2010-08-11T08:39:21Z
While compiling this program:

    void main() {
        int[1] a1;
        int[1] a2[] = a1[];
    }

compatibility with C syntax produces this error message:

test.d(3): Error: cannot implicitly convert expression (a1[]) of type int[] to int[1u][]

See also bug 4580 as a way to solve this problem.
Comment #14 by bearophile_hugs — 2010-09-28T14:57:05Z
This is not an enhancement any more, because Walter has accepted something like this. Also, raising its priority a little because there is a risk of this bug becoming permanent in D2 code.
Comment #15 by denis.spir — 2010-12-06T01:26:24Z
(In reply to comment #7)
> You are right, the c=a; case can be allowed, it can just copy the ptr and
> length of the static array inside the struct of the dynamic array (this is how
> D currently works). Thank you for spotting it.

Is it really the way D currently works? Doesn't it dup the static array instead? I thought static ones were on the stack while dynamic ones were on the heap.

Denis
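
The question above can be checked with a short experiment. This is a sketch under the assumption of a current dmd with default settings; the function names assignmentAliases and dupCopies are hypothetical. It compares plain assignment of a static array to a dynamic array with an explicit .dup.

```d
// Hypothetical helper: after c = a, does c share a's storage?
bool assignmentAliases()
{
    int[3] a = [1, 2, 3];
    int[] c;
    c = a;        // implicit slice of a: only ptr and length are copied
    c[0] = 99;    // writes through to a's storage
    return c.ptr == a.ptr && a[0] == 99;
}

// Hypothetical helper: an explicit .dup makes an independent copy.
bool dupCopies()
{
    int[3] a = [1, 2, 3];
    int[] c = a.dup;   // new allocation, elements copied
    c[0] = 99;         // a is unaffected
    return c.ptr != a.ptr && a[0] == 1;
}

void main()
{
    assert(assignmentAliases());
    assert(dupCopies());
}
```

With plain assignment the dynamic array refers to the static array's own storage, so no duplication takes place unless .dup is written out.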
Comment #16 by smjg — 2010-12-06T03:22:53Z
(In reply to comment #0)
> a[] = b[];   static  dynamic
> static       OK1     OK1
> dynamic      OK1     OK1
>
> a = b[];     static  dynamic
> static       Err     Err
> dynamic      Err     Err
>
> a[] = b;     static  dynamic
> static       Err     Err
> dynamic      Err     Err
>
> a = b;       static  dynamic
> static       Err2    Err
> dynamic      Err     OK2

I'm not sure I like this. What, exactly, would be the semantic difference between the rvalue b and the rvalue b[]? Currently, they are the same, and changing this might be confusing.

> int i; a=i;  static  dynamic
>              Err     Err
>
> int i;
> a[] = i;     static  dynamic
>              OK3     OK3

This is how D behaves currently.

> Key:
> Err  = Syntax error
> OK1  = Copies all items from one array to the other.
> OK2  = Copies just the struct of the dynamic array, array body not copied.
> OK3  = Copies the value to all the items of the array.
> Err2 = Syntax error, because there is no reference to copy, better to keep
> the language tidy.

Making it a _syntax_ error cannot be done, because D's grammar is context-free by design.

>
> You can see that this too is disallowed:
> int a, b;
> a = b;

???
Comment #17 by k.hara.pg — 2011-11-02T00:04:27Z
With latest dmd (git master):
----
Rhs is an array, is it compilable?
a       / b             a=b     a[]=b   a=b[]   a[]=b[]
int[3u] / int[3u]       true    true    true    true
int[3u] / int[]         true    true    true    true
int[]   / int[3u]       true    true    true    true
int[]   / int[]         true    true    true    true

Rhs is a element, is it compilable?
a                       a=N     a[]=N   a[0..2]=N
int[3u]                 true    true    true
int[]                   false   true    true

Test code:
----
import std.stdio, std.typetuple;

void main()
{
    writeln("Rhs is an array, is it compilable?");
    writeln("a\t/ b\t\ta=b\ta[]=b\ta=b[]\ta[]=b[]");
    foreach (i, Lhs; TypeTuple!(int[3], int[]))
    foreach (j, Rhs; TypeTuple!(int[3], int[]))
    {
        writef("%s\t/ %s ", Lhs.stringof, Rhs.stringof);
        Lhs a = [0,0,0];
        Rhs b = [1,2,3];
        writef("\t%s", is(typeof({ a = b; })));
        writef("\t%s", is(typeof({ a[] = b; })));
        writef("\t%s", is(typeof({ a = b[]; })));
        writef("\t%s", is(typeof({ a[] = b[]; })));
        writeln();
    }

    writeln("\nRhs is a element, is it compilable?");
    writeln("a\t\t\ta=N\ta[]=N\ta[0..2]=N");
    foreach (Lhs; TypeTuple!(int[3], int[]))
    {
        writef("%s\t\t", Lhs.stringof);
        Lhs a = [0,0,0];
        writef("\t%s", is(typeof({ a = 9; })));
        writef("\t%s", is(typeof({ a[] = 9; })));
        writef("\t%s", is(typeof({ a[0..2] = 9; })));
        writeln();
    }
}
Comment #18 by bearophile_hugs — 2011-11-15T19:33:10Z
(In reply to comment #17)
> With latest dmd (git master):

Very nice. See:
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=149289
Comment #19 by bearophile_hugs — 2011-11-15T23:56:31Z
(In reply to comment #18)
> See:
>
> http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=149289

This is from that post:

This also means this is currently accepted:

    void main() {
        int[3] a;
        a = 1;
        assert(a == [1, 1, 1]);
    }

While this is not accepted:

    void main() {
        int[] b = new int[3];
        b = 1;
        assert(b == [1, 1, 1]); //Error: cannot implicitly convert expression (1) of type int to int[]
    }

I'd like D to require a[]=1 in that first case too.

I'd like the [] to be required every time an O(n) vector operation is done:
- for consistency with all other vector operations between two arrays, which require [];
- to avoid unwanted (and not easy to spot in the code) O(n) operations;
- to avoid bugs and confusion in D newbies who haven't memorized all the current special cases.

On the other hand Don says that [] is only required for lvalues.

I think this boils down to a new table like this:

Rhs is an array, is it compilable?
a       / b             a=b     a[]=b   a=b[]   a[]=b[]
int[3u] / int[3u]       FALSE   true    FALSE   true
int[3u] / int[]         FALSE   true    FALSE   true
int[]   / int[3u]       FALSE   true    FALSE   true
int[]   / int[]         true    true    true    true

Rhs is a element, is it compilable?
a                       a=N     a[]=N   a[0..2]=N
int[3u]                 FALSE   true    true
int[]                   false   true    true

Now if there's a [] on the left, then it's an O(n) vector operation (like a copy), otherwise it's O(1).

That also means:

    void main() {
        int[] a = new int[3];
        int[] b = new int[3];
        a = b; // OK, copies just array fat reference
    }

    void main() {
        int[3] a, b;
        a = b; // Not OK, hidden vector op
    }

I am not sure this new table is fully correct, but it's a start. Fixes of mistakes are welcome.

-----------------------

This is an alternative proposal. On the other hand this vector op syntax doesn't currently compile:

    void main() {
        int[3] a, b;
        a[] += b;
    }

So if array assign is seen as a normal vector op, then the [] is needed on the right too:

Rhs is an array, is it compilable?
a       / b             a=b     a[]=b   a=b[]   a[]=b[]
int[3u] / int[3u]       FALSE   FALSE   FALSE   true
int[3u] / int[]         FALSE   FALSE   FALSE   true
int[]   / int[3u]       FALSE   FALSE   FALSE   true
int[]   / int[]         true    FALSE   FALSE   true

Rhs is a element, is it compilable?
a                       a=N     a[]=N   a[0..2]=N
int[3u]                 FALSE   true    true
int[]                   false   true    true

Where the two cases with dynamic arrays are syntax errors to keep more symmetry:

    void main() {
        int[] a = new int[3];
        int[] b = new int[3];
        a[] = b; // error
        a = b[]; // error
    }
Comment #20 by timon.gehr — 2011-11-19T12:32:29Z
Also from that thread:

First thing:

    int[3] a=3; // kill it!!

Rest:

a[] is _just a shortcut_ for a[0..$]! Are you really suggesting to disallow slicing?

You are thinking too much in terms of syntax and not enough in terms of semantics.

There are basically two distinct things involved here:

1. static arrays and lvalue slices
2. dynamic arrays and rvalue slices

1 implies value semantics, 2 implies reference semantics, where value semantics overrides reference semantics.

Any other distinction is more or less arbitrary. As you pointed out, the main indicator of distinction is value vs reference semantics of the performed assignments.

We certainly agree on this:

Rhs is an array, is it compilable?
a       / b             a=b     a[]=b   a=b[]   a[]=b[]
int[3u] / int[3u]       ?       ?       ?       true
int[3u] / int[]         ?       ?       ?       true
int[]   / int[3u]       ?       ?       ?       true
int[]   / int[]         ?       ?       ?       true

Now, a dynamic array a is equivalent to a[], and a static array b is equivalent to an lvalue slice b[]=.

This gives the following equivalence classes of operations:

Rhs is an array, is it compilable?
a       / b             a=b     a[]=b   a=b[]   a[]=b[]
int[3u] / int[3u]       1       1       2       2
int[3u] / int[]         2       2       2       2
int[]   / int[3u]       3       1       4       2
int[]   / int[]         4       2       4       2

Any of the same class should behave the same.

Now, you suggest in both proposals to allow at least one of class 2 and at least one of class 4. Filling all those out delivers:

Rhs is an array, is it compilable?
a       / b             a=b     a[]=b   a=b[]   a[]=b[]
int[3u] / int[3u]       (1)     (1)     true    true
int[3u] / int[]         true    true    true    true
int[]   / int[3u]       (3)     (1)     true    true
int[]   / int[]         true    true    true    true

1 is "assign value to value". 3 is "assign value to reference".

The upper left corner should certainly be true, values of any mutable type should be able to be assigned to themselves. This leaves:

Rhs is an array, is it compilable?
a       / b             a=b     a[]=b   a=b[]   a[]=b[]
int[3u] / int[3u]       true    true    true    true
int[3u] / int[]         true    true    true    true
int[]   / int[3u]       (3)     true    true    true
int[]   / int[]         true    true    true    true

3 is the odd thing out. Now let's think about it, what should:

    int[] a;
    int[3] b;
    a=b;

do?

The answer is, there are two options.

1. implicitly slice b
2. copy b by value into a

One is as arbitrary as the other, so it should be disallowed in a sane design. Which leaves:

Rhs is an array, is it compilable?
a       / b             a=b     a[]=b   a=b[]   a[]=b[]
int[3u] / int[3u]       true    true    true    true
int[3u] / int[]         true    true    true    true
int[]   / int[3u]       FALSE   true    true    true
int[]   / int[]         true    true    true    true

Rhs is a element, is it compilable?
a                       a=N     a[]=N   a[0..2]=N
int[3u]                 FALSE   true    true
int[]                   false   true    true

And that is how it should be.
Comment #21 by clugdbug — 2011-11-19T13:52:04Z
(In reply to comment #20)
> Also from that thread:
>
> First thing:
>
> int[3] a=3; // kill it!!
>
> Rest:
>
> a[] is _just a shortcut_ for a[0..$]! Are you really suggesting to disallow
> slicing?

Currently it is just a shortcut, but that's a horrible waste of a symbol. [] is a potentially very useful piece of syntax, which is currently almost unused. It is currently used to resolve ambiguity in array operations. In all other cases, it's redundant.

Bearophile has seen how successful it is in array operations, and would like to see it used in the same way in simple element-wise assignment. That is, element wise assignment should never occur without a slice syntax. (The converse isn't true, block assignment sometimes occurs even when slices are present, see below).

The analysis avoids the confusing case: where the element type is an array.

    int [][3] x;
    int [] y;
    int [] z;

    x[] = z;
    y[] = z;

These look the same, but do very different things. The first one just copies the z pointer. The second one copies the elements in z.

y[] = z[]; makes things clearer.
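
The difference between the two look-alike assignments above can be observed directly. This is a sketch, assuming a current dmd; the function names and the sample values 7 and 8 are hypothetical and not from the thread.

```d
// Hypothetical helper: x's element type is int[], so x[] = z is a block
// assignment that stores a reference to z in every element of x.
bool blockAssignSharesData()
{
    int[][3] x;
    int[] z = [7, 8];
    x[] = z;               // no ints are copied
    return x[0].ptr == z.ptr && x[2].ptr == z.ptr;
}

// Hypothetical helper: y's element type is int, so the assignment copies
// the elements of z into y's own storage.
bool elementwiseCopies()
{
    int[] y = new int[2];
    int[] z = [7, 8];
    y[] = z[];             // element-wise copy; lengths must match
    return y == z && y.ptr != z.ptr;
}

void main()
{
    assert(blockAssignSharesData());
    assert(elementwiseCopies());
}
```

The second function uses the y[] = z[] spelling recommended in the comment above, which makes the element-wise copy visible at the call site.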
Comment #22 by bearophile_hugs — 2011-11-25T16:54:31Z
In DMD 2.057head this code still fails:

    void main() {
        int[3] a = [1, 2, 3];
        int[3] b = [10, 20, 30];
        auto c[] = a[] + b[]; // no identifier for declarator c[]
    }

    void main() pure nothrow {
        int[3] a = [1, 2, 2];
        int[3] b = [10, 20, 20];
        const c = a[] + b[]; // Error: Array operation a[] + b[] not implemented
    }
Comment #23 by yebblies — 2012-02-02T23:46:19Z
(In reply to comment #22)
> In DMD 2.057head this code fails still:
>
>
> void main() {
>     int[3] a = [1, 2, 3];
>     int[3] b = [10, 20, 30];
>     auto c[] = a[] + b[]; // no identifier for declarator c[]
> }
>
>
>
> void main() pure nothrow {
>     int[3] a = [1, 2, 2];
>     int[3] b = [10, 20, 20];
>     const c = a[] + b[]; // Error: Array operation a[] + b[] not implemented
> }

This is a different bug.

Are all the other cases fixed?

I know this is an old report, but the way cases are presented in the initial report make it a pain to test. The cases I've quoted above are much better.
Comment #24 by bearophile_hugs — 2012-02-05T05:52:16Z
(In reply to comment #23)
> Are all the other cases fixed?
>
> I know this is an old report, but the way cases are presented in the initial
> report make it a pain to test. The cases I've quoted above are much better.

I think that eventually this messy bug report will need to be closed, to be replaced by a new clean bug report :-)
Comment #25 by yebblies — 2012-02-05T05:55:42Z
Eventually? Why not now?
Comment #26 by bearophile_hugs — 2012-02-05T07:25:01Z
(In reply to comment #25)
> Eventually? Why not now?

Before closing this bug, a new issue needs to be written and opened, of course. And I don't want to write a new issue until people give me good answers about what the right behaviors are. See this thread where I have asked questions and shown two alternative proposals:
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=149289

I'd like this code to be refused at compile-time:

    void main() {
        int[3] a;
        a = 1;
        assert(a == [1, 1, 1]);
    }

Just as this is not accepted:

    void main() {
        int[] b = new int[3];
        b = 1;
        assert(b == [1, 1, 1]); //Error: cannot implicitly convert expression (1) of type int to int[]
    }

And this to be required:

    void main() {
        int[3] a;
        a[] = 1;
        assert(a == [1, 1, 1]);
    }

I'd like the [] to be required every time an O(n) vector operation is done:
- for consistency with all other vector operations between two arrays, which require [];
- to avoid unwanted (and not easy to spot in the code) O(n) operations;
- to avoid bugs and confusion in D newbies who haven't memorized all the current special cases.

But there are other less clear-cut situations. If O(n) vector ops require a [], then this too has to be a compile-time error (even though a and b are values):

    void main() {
        int[3] a, b;
        a = b; // Not OK, hidden vector op
    }

(Struct copies are O(n) operations, but their size is known at compile-time.)

While this code is OK:

    void main() {
        int[] a = new int[3];
        int[] b = new int[3];
        a = b; // OK, copies just an array fat reference
    }

Maybe two cases with dynamic arrays are better as compile-time syntax errors to keep more symmetry, I am not sure:

    void main() {
        int[] a = new int[3];
        int[] b = new int[3];
        a[] = b; // error
        a = b[]; // error
    }

It's not a good idea to open a new bug report before such questions have a good answer, because the new bug report risks quickly becoming almost as messy as this old one.
Comment #27 by timon.gehr — 2012-02-05T07:42:38Z
Every O(n) vector operation already requires [].

> But there are other less clear-cut situations. If O(n) vector ops require a [],
> then this too has to be a compile-time error (despite a and b are values):
>
>
> void main() {
>     int[3] a, b;
>     a = b; // Not OK, hidden vector op
> }

The "hidden" operation is O(1).

If what you want is differing behavior based on whether an operation is O(n) or O(1), the language already does what you want.
Comment #28 by yebblies — 2012-02-05T07:56:56Z
The problem with this bug report is that there are too many different issues. Most of those cases you just mentioned deserve their own bug report, and to be evaluated individually. Large and unclear bug reports are much harder to process than concise ones targeting a single issue.
Comment #29 by bearophile_hugs — 2012-02-05T10:00:25Z
(In reply to comment #27)
> The "hidden" operation is O(1).
>
> If what you want is differing behavior based on whether an operation is O(n) or
> O(1), the language already does what you want.

What I meant is that it's better to require [] when you perform a vector operation, like when you copy an int[1000] onto another int[1000], or when you add them, etc.

--------------------

(In reply to comment #28)
> The problem with this bug report is that there are too many different issues.

OK, I will try to split it up and then close this one, then.
Comment #30 by bearophile_hugs — 2012-02-05T11:14:19Z
I am closing this issue; I have split it into issue 7444 and issue 7445.

If you see something missing inside those two reports, then please add the missing stuff there or in a new bug report.