Bug 10876 – foreach and remove over associative array causes invalid data to appear
Status
RESOLVED
Resolution
DUPLICATE
Severity
major
Priority
P2
Component
dmd
Product
D
Version
D2
Platform
x86_64
OS
Linux
Creation time
2013-08-23T09:35:00Z
Last change time
2014-02-21T10:06:44Z
Assigned to
nobody
Creator
maximechevalierb
Comments
Comment #0 by maximechevalierb — 2013-08-23T09:35:40Z
This issue shows up using DMD64 D Compiler v2.063 on Linux.
I have a foreach loop in which I remove values from an associative array. The behavior is completely broken. Items which are not in the associative array (including the null value!) end up appearing in the iteration. The following code snippet from my compiler produces the issue:
writeln("associative array length: ", allocState.length);
writeln("keys length: ", allocState.keys.length);
// Remove dead values from the alloc state
foreach (value, allocSt; allocState)
{
writeln("key: ", value);
writeln((value in allocState)? true:false);
if (value is null)
writeln(allocState[null]);
if (liveInfo.liveAtEntry(value, block) is false)
allocState.remove(value);
}
The output is as follows:
associative array length: 4
keys length: 4
key: $22 = arg 6 "v"
true
key: $20 = arg 4 "o"
true
key: $8 = add_i32 $1, $21
true
key: $21 = arg 5 "i"
true
key: $7 = phi [call_reg(5662):0 => $14, call_cont(565C):0 => $12]
false
key: $12 = load_u32 $20, $11
false
key: $12 = load_u32 $20, $11
false
key: $1 = add_i32 32, $0
false
key: $1 = add_i32 32, $0
false
key: $0 = mul_i32 8, $7
false
key: $0 = mul_i32 8, $7
false
key: null
false
[email protected](1410): Range violation
As you can see, most of the keys listed are not in the associative array, and the null value appears as a key, out of nowhere, which causes a crash.
Comment #1 by andrej.mitrovich — 2013-08-23T09:36:54Z
This will be marked as invalid, you can't safely iterate over an associative array and remove its keys while iterating.
Comment #2 by hsteoh — 2013-08-23T14:11:55Z
Modifying a container that you're iterating over has undefined results, because you're iterating over a set of keys that changes from under you as you go. In the case of AAs, you may end up dereferencing invalid pointers.
The safe way to do it is to get an array of keys and iterate over that:
foreach (k; aa.keys) {
auto value = aa[k];
...
if (...)
aa.remove(k);
}
N.B. you can only use .keys, not .byKey, because the latter also amounts to walking the data structure while it is being modified, which will cause problems. Using .keys is OK because it creates an array of keys (a snapshot of the set of keys at that point in time) before starting to modify the data structure.
Comment #3 by andrej.mitrovich — 2013-08-23T14:16:26Z
(In reply to comment #2)
> N.B. you can only use .keys, not .byKey, because the latter also amounts to
> walking the data structure while it is being modified, which will cause
> problems. Using .keys is OK because it creates an array of keys (a snapshot of
> the set of keys at that point in time) before starting to modify the data
> structure.
Yeah, as I learned in Issue 10821.
Perhaps one could do an "!is null" check against the key if iterating via .byKey and removing at the same time? Well anyway, the safest bet is to use .keys.
Comment #4 by hsteoh — 2013-08-23T14:24:07Z
The problem with trying to iterate over a container that's being modified at the same time is that you can't guarantee much of anything. For example, you could insert !is null checks, but what happens if removing the key causes the AA to rehash itself (e.g., to optimize for space as the number of keys shrinks)? Then your subsequent iteration will be completely screwed up. Basically, such hacks can only work by relying on implementation details, which are not guaranteed by the spec. IOW, it's undefined behaviour.
Comment #6 by andrej.mitrovich — 2013-08-23T14:34:18Z
(In reply to comment #5)
> The bottomline is that modifying the container while you're iterating over it
> is a bad idea (unless the container was specifically designed to support it).
> It almost always doesn't do what you think it should do.
Btw, we should put this info up on dlang.org if it's not already there.
Comment #7 by dlang-bugzilla — 2013-12-17T15:56:25Z
Duplicate of issue 4179?
Comment #8 by hsteoh — 2013-12-17T16:58:47Z
*** This issue has been marked as a duplicate of issue 4179 ***