Short version: Slicing static arrays is unsafe and should not be done implicitly, because it becomes too easy to miss what's going on and screw it up. So, we really should remove the implicit slicing of static arrays from the language (obviously, by deprecating it first, and then actually removing it at some point in the future).
Long version: Slicing a static array is an inherently unsafe operation. It's taking the address of a variable (usually, a local variable). Unfortunately, thanks to issue# 8838, it's consider @safe, whereas taking the address of a local variable is considered @system (even though they're essentially the same operation). Once issue# 8838 has been fixed, then any slicing of static arrays - be it implicit or explicit - will be flagged as @system and therefore will require @trusted to be considered @safe. However, if @safe isn't being used or if there are multiple operations within a function which are @system, it's very easy for a programmer to not realize that a static array is being sliced. And such a function might be marked as @trusted due to the other @system operations in the function being fine whereas the implicit slicing is missed.
Being able to slice a static array to get a dynamic array is critical for a lot of code, but saving the programmer typing those two characters - [] - runs a high risk of making it non-obvious that the slicing is occurring, and given the high odds of operating on invalid memory when not being very careful about how you deal with a slice of a static array, I _really_ don't think that saving those two characters is worth the risk that it poses. We've made it too easy to do a very unsafe operation without even noticing that it's happening.
So, I think that we should deprecate the implicit slicing of static arrays and then later remove it outright. The eventual code breakage might be annoying, but it'll catch bugs - memory corruption bugs no less - and I think that it will be very much worth it.
Comment #1 by hsteoh — 2019-10-04T18:51:38Z
Explicit code example illustrating this problem:
-----
struct S {
int[] data;
this(int[] _data) { data = _data; }
}
S makeS() {
int[5] data = [ 1, 2, 3, 4, 5 ];
return S(data);
}
void func(S s) {
import std.stdio;
writeln("s.data = ", s.data);
}
void main() {
S s = makeS();
func(s);
}
-----
Expected output:
-----
s.data = [1, 2, 3, 4, 5]
-----
Actual output (YMMV, depends on details of stack implementation on your platform):
-----
s.data = [-580467872, 32764, 1617267003, 21973, 5]
-----
Comment #2 by razvan.nitu1305 — 2022-11-07T10:42:12Z
If you annotate the functions with @safe then you get:
test.d(9): Deprecation: reference to local variable `data` assigned to non-scope parameter `_data` calling `this`
I think that having the compiler automatically slice your static arrays is actually quite nice as you have to type less. The fact that you can escape a pointer to an expired stack frame is something that @safe and scope should deal with. Still, in system code this should be perfectly fine.
Closing this as WORKSFORME.
Comment #3 by hsteoh — 2022-11-07T17:06:55Z
(In reply to RazvanN from comment #2)
> If you annotate the functions with @safe then you get:
>
> test.d(9): Deprecation: reference to local variable `data` assigned to
> non-scope parameter `_data` calling `this`
That's good to know.
> I think that having the compiler automatically slice your static arrays is
> actually quite nice as you have to type less.
I disagree, the compiler implicitly slicing your static arrays for you has the same sort of issues as implicit constructions in C++: they tend to turn up where they are unexpected and cause problems.
> The fact that you can escape a
> pointer to an expired stack frame is something that @safe and scope should
> deal with. Still, in system code this should be perfectly fine.
How can escaping a pointer to an expired stack frame ever be fine?? That is never OK and should not be allowed in the language. At least, not implicitly. If you want to explicitly escape a reference to a local static array, you should be forced to explicitly slice it, not have the compiler silently insert it for you.
Comment #4 by razvan.nitu1305 — 2022-11-08T12:58:09Z
(In reply to hsteoh from comment #3)
> (In reply to RazvanN from comment #2)
> > If you annotate the functions with @safe then you get:
> >
> > test.d(9): Deprecation: reference to local variable `data` assigned to
> > non-scope parameter `_data` calling `this`
>
> That's good to know.
>
>
> > I think that having the compiler automatically slice your static arrays is
> > actually quite nice as you have to type less.
>
> I disagree, the compiler implicitly slicing your static arrays for you has
> the same sort of issues as implicit constructions in C++: they tend to turn
> up where they are unexpected and cause problems.
>
>
> > The fact that you can escape a
> > pointer to an expired stack frame is something that @safe and scope should
> > deal with. Still, in system code this should be perfectly fine.
>
> How can escaping a pointer to an expired stack frame ever be fine?? That is
> never OK and should not be allowed in the language. At least, not
> implicitly. If you want to explicitly escape a reference to a local static
> array, you should be forced to explicitly slice it, not have the compiler
> silently insert it for you.
What I meant was that there are situations where implicit slicing of static arrays is actually desirable. For example, the standard library can implement functions that take dynamic arrays and the user can conveniently use static arrays to call those functions. In most situations that will not lead to memory corruption and inexperienced users can have a much smoother experience.
Also, the plan is to have DIP1000 and @safe by default, so the situations where automatically slicing a dynamic array may lead to escaping of pointers to expired stack frames are going to get caught automatically by default.
If we go down the path to deprecate this behavior we are just going to cause annoyances for less experienced users for no apparent gain.
Comment #5 by hsteoh — 2022-11-15T16:45:59Z
I don't see what's so hard about adding `[]` to a static array before passing it to Phobos. Even in generic code where you could argue that the incoming type could be either static or dynamic array, you can still write `[]` and it would still work (a slice of a dynamic array is still the same dynamic array, there is no problem there).
This is what Andrei complained about, we bend over backwards to support use cases that weren't intended to be supported, and pretty soon we end up with a ridiculous amount of cruft that didn't even need to be there in the first place. Asking the user to write 2 extra characters is a much simpler and better solution than introducing silent ways to shoot themselves in the foot because they didn't realize what was happening under the hood. I have been bitten by this specific issue (implicit conversion of static array to dynamic) more than once, the headache in debugging the issue -- because it's implicit and therefore hard to find -- is so not worth the non-existent "cost" of writing two extra characters. I would have much preferred that the compiler refused to compile the code to bring my attention to the fact that I'm dealing with a static array, than to silently accept static arrays with unexpected semantics that lead to bugs down the road.
Comment #6 by dfj1esp02 — 2022-11-16T10:11:30Z
AIU this issue is inspired by std.digest design. You can always have error prone interface and if std.digest is too error prone for intended users, it's just wrong design and should have a safer interface: either take the result storage as an argument or return a heap allocated array or return a value type that will require explicit slicing.
Comment #7 by nick — 2024-07-20T21:19:47Z
> inexperienced users can have a much smoother experience.
It's a false sense of security, this problem is unfathomable to someone relatively new to D:
https://forum.dlang.org/post/[email protected]
It would be better for them to learn when slicing is needed, rather than only learn that when something breaks.
Comment #8 by kdevel — 2024-07-25T10:57:31Z
(In reply to Nick Treleaven from comment #7)
> https://forum.dlang.org/post/[email protected]
| string[3][string] lookup;
| string[] dynArray = ["d", "e", "f"];
| lookup["test"] = dynArray[0..$];
|
| This fails at runtime with RangeError. But if I change that last line to:
|
| lookup["test"] = dynArray[0..3];
|
| then it works.
Comment #9 by robert.schadek — 2024-12-13T18:47:30Z