Bug 3395 – Ambiguous array operations

Status: RESOLVED
Resolution: WONTFIX
Severity: enhancement
Priority: P2
Component: dmd
Product: D
Version: D2
Platform: All
OS: All
Creation time: 2009-10-14T01:15:48Z
Last change time: 2018-05-16T08:43:42Z
Keywords: spec
Assigned to: No Owner
Creator: anonymous4
Blocks: 2573

Comments

Comment #0 by dfj1esp02 — 2009-10-14T01:15:48Z
These expressions are ambiguous:
---
a[].max(n);
a[1..4].max(n);
---
Does it mean calling the function on the slice or on each item in the slice?

A possible solution is to change the meaning of empty square brackets from a full slice to only a hint for an array operation, so that a[].max(n) is an array operation and a[1..4].max(n) is max(a[1..4],n). This also makes it possible to extend an array operation to a whole statement even if it's not an lvalue:
---
printf("%.4f, ",a[]);
becomes
foreach(v;a)printf("%.4f, ",v);
---
a[].max(n) and max(a[],n) then become the same and are unambiguous with other use cases.
Comment #1 by dfj1esp02 — 2009-10-14T01:31:43Z
This also has to do with type safety.
---
a[]=b[];
---
This expression is ambiguous. What was meant? Copy items from the b[] slice to the a[] slice, or assign the b[] slice to each item in the a[] slice?

Ambiguity resolution:
---
a[]=b[]; //copy items from b to a
a[]=b; //assign b slice to each item in a slice
a[]=b[0..$]; //ditto
---
The types for the operation must match, or an error will be issued.
---
T[] a,b;
a[]=b;
---
Currently this is accepted, but it should fail: the right-side expression in this assignment must be of type T (or T[] with an array op).
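For reference, a minimal sketch of how the bracketed and unbracketed forms behave for a plain int[] in D as it exists, not under the proposal above. The helper names copyElements and fillAll are hypothetical and exist only for this illustration.

```d
// Sketch only: copyElements and fillAll are hypothetical helper names.

/// Copies the elements of src into dst's existing storage.
/// The lengths must match, otherwise the copy fails at run time.
void copyElements(int[] dst, const int[] src)
{
    dst[] = src[];   // element-wise copy
}

/// Assigns one scalar value to every element of dst.
void fillAll(int[] dst, int value)
{
    dst[] = value;   // scalar assigned to each element
}

void main()
{
    int[] a = [0, 0, 0];
    int[] b = [1, 2, 3];

    copyElements(a, b);
    b[0] = 9;                  // a has its own storage, so it is unaffected
    assert(a == [1, 2, 3]);

    fillAll(a, 7);
    assert(a == [7, 7, 7]);

    a = b;                     // no brackets: a now refers to b's storage
    assert(a is b);
}
```

The sketch sticks to a scalar element type on purpose; the nested case (T being an array type itself) is exactly where the comment above sees the ambiguity.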
Comment #2 by smjg — 2011-11-29T13:31:26Z
(In reply to comment #0)
> These expressions are ambiguous:
> ---
> a[].max(n);
> a[1..4].max(n);
> ---
> Does it mean calling the function on the slice or on each item in the slice?

It means calling the function on the slice. Unless I'm mistaken, there isn't any D syntax at the moment that means calling the function on each element of the array.

> Possible solution is to change the meaning of empty square brackets from full
> slice to only a hint for array operation so that a[].max(n) is an array
> operation and a[1..4].max(n) is max(a[1..4],n).

This would get confusing. You might want to apply a function to the whole slice [1..4] or to each element of the slice. This applies whether the array-property sugar is being used or not.

Perhaps the best solution is to define [] applied to the function identifier itself to do an elementwise application.

So max(a, n) or a.max(n) would just call max(a, n) once.
And max[](a, n) or a.max[](n) would evaluate to an array of max(a[i], n).
And the same if a is replaced with a[], a[1..4] or some such in each case.

Of course, ambiguities can still occur in functions with multiple array parameters. Presumably the language would forbid it in these ambiguous cases, as it does already with ambiguous overload matching.
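As a point of comparison, elementwise application can be written with the standard library rather than new syntax. The following is a minimal sketch, assuming Phobos' std.algorithm.map and std.algorithm.max; the helper name maxEach is hypothetical and not part of any proposal in this report.

```d
import std.algorithm : map, max;
import std.array : array;

/// Hypothetical helper: evaluates max(a[i], n) for each element of a
/// and collects the results into a new array.
int[] maxEach(int[] a, int n)
{
    return a.map!(x => max(x, n)).array;
}

void main()
{
    int[] a = [1, 5, 3, 7];

    // Whole array, one result per element.
    assert(maxEach(a, 4) == [4, 5, 4, 7]);

    // The same elementwise application over a sub-slice.
    assert(maxEach(a[1 .. 3], 4) == [5, 4]);
}
```

Here the iteration is explicit in the call to map, so the slice expression only selects which elements take part.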
Comment #3 by clugdbug — 2011-11-30T00:30:12Z
(In reply to comment #2)
> (In reply to comment #0)
> > These expressions are ambiguous:
> > ---
> > a[].max(n);
> > a[1..4].max(n);
> > ---
> > Does it mean calling the function on the slice or on each item in the slice?
>
> It means calling the function on the slice. Unless I'm mistaken, there isn't
> any D syntax at the moment that means calling the function on each element of
> the array.

That's correct.

> > Possible solution is to change the meaning of empty square brackets from full
> > slice to only a hint for array operation so that a[].max(n) is an array
> > operation and a[1..4].max(n) is max(a[1..4],n).
> This would get confusing. You might want to apply a function to the whole
> slice [1..4] or to each element of the slice. This applies whether the
> array-property sugar is being used or not.
>
> Perhaps the best solution is to define [] applied to the function identifier
> itself to do an elementwise application.
>
> So max(a, n) or a.max(n) would just call max(a, n) once.
> And max[](a, n) or a.max[](n) would evaluate to an array of max(a[i], n).
> And the same if a is replaced with a[], a[1..4] or some such in each case.

That looks to me as if max is an array of some struct S which defines an opCall.

> Of course, ambiguities can still occur in functions with multiple array
> parameters. Presumably the language would forbid it in these ambiguous cases,
> as it does already with ambiguous overload matching.

Consider the case where we want y to be
[ max(x[2][0..$]), max(x[3][0..$]), ... ]

double [][20] x;
double [10] y;

Brainstorming a few possibilities:

y[] = max(x[2..12]); // (1) looks like scalar assignment
y[] = max[2..12](x); // (2)
y[] = max(x[2..12])[]; // (3)
y[] = max([] x[2..12]); // (4)
y[] = max([] x[2..12])[]; // (5) messy!

(2) does look like an opCall on an array called 'max'.
(3) looks the most intuitive to me. Not perfect though (I don't think we'd want y[] = max(x[2..12]); to compile and be a scalar).
(4) is an interesting possibility. Doesn't look great, but it seems to be a syntax hole. Ambiguous in the one-argument property case:
x.max([]) could be:
max([] x) or max(x, [])
where the [] is an empty array literal. I think that's solvable though.
Interestingly it's the case where (2) is cleanest: x.max[];

Can we put the [] _before_ the call?
y[] = [] max(x);
y[] = x.[]max;
Comment #4 by smjg — 2011-11-30T03:02:18Z
(In reply to comment #3)
> (In reply to comment #2)
> > (In reply to comment #0)
> > > These expressions are ambiguous:
> > > ---
> > > a[].max(n);
> > > a[1..4].max(n);
> > > ---
> > > Does it mean calling the function on the slice or on each item in the slice?
> >
> > It means calling the function on the slice. Unless I'm mistaken, there isn't
> > any D syntax at the moment that means calling the function on each element of
> > the array.
>
> That's correct.
>
> > > Possible solution is to change the meaning of empty square brackets from full
> > > slice to only a hint for array operation so that a[].max(n) is an array
> > > operation and a[1..4].max(n) is max(a[1..4],n).
>
> > This would get confusing. You might want to apply a function to the whole
> > slice [1..4] or to each element of the slice. This applies whether the
> > array-property sugar is being used or not.
> >
> > Perhaps the best solution is to define [] applied to the function identifier
> > itself to do an elementwise application.
> >
> > So max(a, n) or a.max(n) would just call max(a, n) once.
> > And max[](a, n) or a.max[](n) would evaluate to an array of max(a[i], n).
> > And the same if a is replaced with a[], a[1..4] or some such in each case.
>
> That looks to me as if max is an array of some struct S which defines an
> opCall.
>
> > Of course, ambiguities can still occur in functions with multiple array
> > parameters. Presumably the language would forbid it in these ambiguous cases,
> > as it does already with ambiguous overload matching.
>
> Consider the case where we want y to be
> [ max(x[2][0..$]), max(x[3][0..$]), ... ]
>
> double [][20] x;
> double [10] y;
>
> Brainstorming a few possibilities:
>
> y[] = max(x[2..12]); // (1) looks like scalar assignment
> y[] = max[2..12](x); // (2)
> y[] = max(x[2..12])[]; // (3)

That's ambiguous - maybe max is a function that returns an array or other type with an opSlice().

> Can we put the [] _before_ the call?
> y[] = [] max(x);
> y[] = x.[]max;

Would [](expr) be the empty array's opCall(expr) or the vectorisation of the function referenced by expr? And [].func be a vectorisation of the global function func or the empty array's .func method? (Are you envisaging that [] vectorises a whole subexpression or just the function whose name it immediately precedes?)

FWIW the other week I discovered C++11 variadic templates. I wonder if we can draw inspiration from the unpacking syntax here....
http://lanzkron.wordpress.com/2011/11/05/did-you-pack-that-yourself/
Comment #5 by clugdbug — 2011-11-30T23:03:41Z
> > Consider the case where we want y to be
> > [ max(x[2][0..$]), max(x[3][0..$]), ... ]
> >
> > double [][20] x;
> > double [10] y;
> >
> > Brainstorming a few possibilities:
> >
> > y[] = max(x[2..12]); // (1) looks like scalar assignment
> > y[] = max[2..12](x); // (2)
> > y[] = max(x[2..12])[]; // (3)
>
> That's ambiguous - maybe max is a function that returns an array or other type
> with an opSlice().

True. But unlike (1), it's still obvious that it's an element-by-element assignment. The nett effect is the same as if it were vectorized. Is that an ambiguity that matters?

> > Can we put the [] _before_ the call?
> > y[] = [] max(x);
> > y[] = x.[]max;
>
> Would [](expr) be the empty array's opCall(expr) or the vectorisation of the
> function referenced by expr? And [].func be a vectorisation of the global
> function func or the empty array's .func method? (Are you envisaging that []
> vectorises a whole subexpression or just the function whose name it immediately precedes?)

I was imagining just the function name. At least, I think it would need to have very high precedence: []a.b is the same as ([]a).b, rather than [](a.b).
Thus, [].func would be the empty array's .func method, since there is no function name before the dot. I think then if you wanted to vectorize .func, you'd do it as ".[]func".
I'm less sure about [](expr), but I think it would just be an opCall.

But I'm really just brainstorming. It's a wild idea. I haven't given any thought to whether it works with function literals or function pointers.

> FWIW the other week I discovered C++11 variadic templates. I wonder if we can
> draw inspiration from the unpacking syntax here....
> http://lanzkron.wordpress.com/2011/11/05/did-you-pack-that-yourself/

Yeah, that's interesting, it does look quite similar.
Comment #6 by dfj1esp02 — 2014-09-11T07:09:10Z
(In reply to Stewart Gordon from comment #4)
> > Brainstorming a few possibilities:
> >
> > y[] = max(x[2..12]); // (1) looks like scalar assignment
> > y[] = max[2..12](x); // (2)
> > y[] = max(x[2..12])[]; // (3)
>
> That's ambiguous - maybe max is a function that returns an array or other
> type with an opSlice().

This is exactly what happened: issue 11244

(In reply to Stewart Gordon from comment #2)
> (In reply to comment #0)
> > These expressions are ambiguous:
> > ---
> > a[].max(n);
> > a[1..4].max(n);
> > ---
> > Does it mean calling the function on the slice or on each item in the slice?
>
> It means calling the function on the slice. Unless I'm mistaken, there
> isn't any D syntax at the moment that means calling the function on each
> element of the array.

The problem is that iteration is not expressed in the syntax, and the compiler is free to interpret it in any way it feels like. So the idea is to give iteration distinct syntax:
x[*]=y[*]; //copy slice
x[*]=a; //assign all slice items
f(x[*]); //rewritten as foreach(a;x)f(a);
x[*]=f(y[*]); //foreach(i,ref a;x)a=f(y[i]);
x[*]=f(y[]); //foreach(ref a;x)a=f(y.opSlice());
x[][][][*]=a; //x.opSlice().opSlice().opSlice()[*]=a;
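The [*] form above is a proposal only and does not compile. As a rough sketch, the rewrites listed in the comments can be spelled out in existing D as follows; the function names twice and applyEach are hypothetical stand-ins for f and for the x[*]=f(y[*]) rewrite.

```d
/// Hypothetical per-element function standing in for f.
int twice(int v)
{
    return 2 * v;
}

/// Explicit loop matching the comment's rewrite of x[*]=f(y[*]):
/// foreach(i,ref a;x)a=f(y[i]);
void applyEach(int[] x, const int[] y)
{
    assert(x.length == y.length);
    foreach (i, ref a; x)
        a = twice(y[i]);
}

void main()
{
    int[] y = [1, 2, 3];
    auto x = new int[](y.length);

    applyEach(x, y);
    assert(x == [2, 4, 6]);

    x[] = 5;       // existing spelling of "assign all slice items"
    assert(x == [5, 5, 5]);

    x[] = y[];     // existing spelling of "copy slice"
    assert(x == [1, 2, 3]);
}
```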
Comment #7 by dmitry.olsh — 2018-05-16T08:43:42Z
> So the idea is to give iteration distinct syntax:
> x[*]=y[*]; //copy slice
> x[*]=a; //assign all slice items
> f(x[*]); //rewritten as foreach(a;x)f(a);
> x[*]=f(y[*]); //foreach(i,ref a;x)a=f(y[i]);
> x[*]=f(y[]); //foreach(ref a;x)a=f(y.opSlice());
> x[][][][*]=a; //x.opSlice().opSlice().opSlice()[*]=a;

This is magic, and it's for arrays only. I believe by now it's clear we want less of it, not more.

A solid DIP might get us something here but not much.