← Back to index | Original Bugzilla link

Bug 8026 – Fix or disallow randomShuffle() on fixed-sized arrays

Status: RESOLVED
Resolution: FIXED
Severity: normal
Priority: P2
Component: phobos
Product: D
Version: D2
Platform: All
OS: All
Creation time: 2012-05-03T13:45:00Z
Last change time: 2012-05-22T15:29:22Z
Assigned to: nobody
Creator: bearophile_hugs

Comments

Comment #0 by bearophile_hugs — 2012-05-03T13:45:51Z

This comes after a report by Vidar Wahlberg: http://forum.dlang.org/thread/[email protected] Several functions of std.algorithm don't work with fixed-sized arrays, and require you to slice them first to turn them into ranges. But std.random.randomShuffle() accepts fixed-sized arrays as well: import std.stdio: writeln; import std.random: randomShuffle; void main() { int[6] data = [1, 2, 3, 4, 5, 6]; randomShuffle(data); writeln(data); } This prints the unshuffled array: [1, 2, 3, 4, 5, 6] This is bug-prone, and in my opinion it's not acceptable. I see two alternative solutions: 1) To make randomShuffle() properly support fixed-sized arrays, taking them by reference; 2) To make randomShuffle() refuse a fixed-sized array at compile-time.

Comment #1 by issues.dlang — 2012-05-03T14:07:29Z

Well, that's certainly weird. Range-based functions don't normally take static arrays, and I'd argue that they shouldn't, given the problems surrounding slicing static arrays (it's fine to do it, but you need to be aware of what you're doing) - though randomShuffle doesn't have the same problem as most range-based functions do with static arrays given that it's void. Still, I'd argue that it's probably better for it to require slicing like all the rest. It looks like the problem is that randomShuffle doesn't have a template constraint (obviously not good), so I very much doubt that it was ever intended to work with static arrays without slicing. uninformDistribution (which is right above randomShuffle) appears to have the same problem. I'd say that this is definitely a bug, not an enhancement. If I remember, I'll probably throw together a pull request for it tonight, since it should be a quick fix.

Comment #2 by issues.dlang — 2012-05-03T19:55:52Z

https://github.com/D-Programming-Language/phobos/pull/565

Comment #3 by canidae — 2012-05-11T13:27:59Z

(In reply to comment #1) > Well, that's certainly weird. Range-based functions don't normally take static > arrays, and I'd argue that they shouldn't, given the problems surrounding > slicing static arrays (it's fine to do it, but you need to be aware of what > you're doing) - though randomShuffle doesn't have the same problem as most > range-based functions do with static arrays given that it's void. Still, I'd > argue that it's probably better for it to require slicing like all the rest. Are the problems surrounding slicing static arrays easily explainable? From my point of view (fairly new to the language) it's confusing that you can pass a dynamic array to a Range-based function, while you'll need to slice a static array if you want to pass that data to a such function. To me it seems more user friendly if you could pass even static arrays to such method, but I presume there's a good reason why this is avoided. I haven't quite managed to figure out this yet, any pointers would be appreciated.

Comment #4 by issues.dlang — 2012-05-11T15:39:26Z

http://stackoverflow.com/questions/8873265/is-a-static-array-a-forward-range A static array is a value type. It owns its own memory. It's a container, not a range. A dynamic array is a reference type. It does _not_ own its own memory (the runtime does). It is _not_ a container. It _is_ a range. If I do int[] func() { auto arr = [1, 2, 3, 4, 5]; return find(arr, 3); } there is _zero_ difference between passing arr and passing arr[]. In either case, you're slicing the whole array. And the array being returned is a slice of arr ([3, 4, 5] to be exact) whose memory is on the heap, owned by the runtime. If I do int[] func() { int[5] arr = [1, 2, 3, 4, 5]; return find(arr[], 3); } I've now allocated a static array _on the stack_. Passing arr to a function would copy it (since it's a value type). Passing arr[] to a function would slice it, which means passing a reference to the static array. Now look at what func returns. It's returning a slice of arr, which is a _local variable_ _on the stack_. What you've done is the equivalent of int* func() { int i; return &i; } You've returned a reference/pointer to a local variable which no longer exists. Bad things will happen if you do this. So, the semantics of dealing with dynamic and static arrays are _very_ different. Slicing a static array in the wrong place can lead to major bugs, whereas slicing a dynamic array is perfectly safe. Now, as to range-based functions in general. They're templated functions. That means that they use IFTI (implicit function template instantiation) - i.e. it infers the type. So, if you have auto func(T)(T foo) { ... } int[] dArr; int[5] sArr; foo(dArr); foo(sArr); T is going to be inferred as the _exact type_ that you pass in. So, in the first case, func gets instantiated as auto func(int[] foo) { ... } whereas in the second, it gets instantiated as auto func(int[5] foo) { ... } int[] is a range, int[5] is not. So, if func is a range-based function, it should have a template constraint on it, and the static array will fail that constraint. auto func(T)(T foo) if(isInputRange!T) { ... } Static arrays are _not_ ranges and _cannot_ be ranges. At their must basic level (the input range), ranges must implement empty, front, and popFront, and static arrays _cannot_ implement popFront, because you cannot change their length. The _only_ way that a static array can be passed to a range-based function is if it's sliced so that you have a dynamic array (which _is_ a range). And as templates take the _exact_ type, no templated function will automatically slice a static array for you. The language _could_ be changed so that IFTI treated static arrays as dynamic arrays and automatically sliced them, but then you'd have problems creating templated functions that actually wanted to take static arrays rather than dynamic ones. You _could_ overload every range-based function with an overload specifically for dynamic arrays (with the idea that the static array would be sliced when it's passed to it as happens with non-templated functions which take dynamic arrays), but that would be a royal pain. It would also significantly increase the risk of using static arrays, because you'd end up returning ranges which reference the static array from whatever range-based function you called, and the fact that you're dealing with a static array could very quickly become buried (easily resulting in returning a range to a static array which then no longer exists, because it was a local variable). It's much better to require that the programmer explicitly slice the static array, because then they know that they're slicing it and then can know that they have to watch to make sure that no reference to that static array escapes.

Comment #5 by github-bugzilla — 2012-05-20T15:40:25Z

Commits pushed to master at https://github.com/D-Programming-Language/phobos https://github.com/D-Programming-Language/phobos/commit/76def7af856dfa898b78c0cf67228dac87dd4d83 Fix Issue# 8026. I added template constraints to all of the templates missing template constraints in std.random - including randomShuffle, per the bug. https://github.com/D-Programming-Language/phobos/commit/28bca62304c0ec21a9062b4dc4fcb64b80c1c5d8 Merge pull request #565 from jmdavis/8026 Fix issue 8026

Comment #6 by bearophile_hugs — 2012-05-22T15:29:22Z

I think few Phobos functions (like walkLength) should accept fixed-sized arrays too, but now randomShuffle() statically refuses fixed sized arrays and this is acceptable. So I close this bug report.