Bug 3929 – Interactions between LRU array cache, memory recycling
Status
RESOLVED
Resolution
FIXED
Severity
enhancement
Priority
P2
Component
druntime
Product
D
Version
D2
Platform
Other
OS
All
Creation time
2010-03-10T20:17:00Z
Last change time
2014-02-15T02:44:17Z
Assigned to
schveiguy
Creator
dsimcha
Comments
Comment #0 by dsimcha — 2010-03-10T20:17:24Z
See bug 3930 for one observable manifestation of this bug, though I'm filing this as a separate bug report because it's probably the root cause of a lot of general weird behavior I've noticed in 2.041.
It seems that, now that Steve's LRU patch is in, when GC.free() is called on an array the BlkInfo stays in the LRU cache, leading to subtle bugs. The patch seems to rely on the reference from the cache preventing the array from being GC'd. This is also bad in that it prevents memory from being reclaimed. A proper fix would be for the cache to be cleared whenever the GC runs or GC.free() is called.
Comment #1 by dsimcha — 2010-03-10T20:30:47Z
Clarification: The LRU cache could avoid retaining an otherwise GC-able array by casting the pointer to size_t or something and storing the BlkInfo struct somewhere that isn't scanned conservatively, or maybe by XORing it with some arbitrary pattern.
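A minimal sketch of that idea (GC.BlkInfo is the real core.memory struct; the entry type, names, and pattern here are all hypothetical):

    import core.memory : GC;

    enum size_t DISGUISE = 0xDEAD_BEEF; // arbitrary XOR pattern

    // Hypothetical cache entry: the block's base pointer is stored XORed
    // with the pattern, so a conservative scan of the cache sees only a
    // meaningless integer and the block itself remains collectable.
    struct DisguisedEntry
    {
        size_t hiddenBase; // (cast(size_t) base) ^ DISGUISE
        size_t size;
        uint   attr;
    }

    DisguisedEntry hide(GC.BlkInfo info)
    {
        return DisguisedEntry((cast(size_t) info.base) ^ DISGUISE,
                              info.size, info.attr);
    }

    GC.BlkInfo reveal(DisguisedEntry e)
    {
        return GC.BlkInfo(cast(void*) (e.hiddenBase ^ DISGUISE),
                          e.size, e.attr);
    }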
Comment #2 by schveiguy — 2010-03-11T03:43:16Z
Hm... first off, the LRU cache is only used for appending. So in order for the LRU cache to mistakenly access unallocated memory, you must be appending to an invalid array.
Second, the LRU cache stores BlkInfos, which should never change except in one case: when an array larger than half a page is appended to, it can grow in place by adding more pages to the block. This means a stale BlkInfo could identify a memory block as being larger or smaller than it actually is. I would expect AA nodes to be small enough not to trigger this problem.
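(For reference, the BlkInfo here is the struct returned by core.memory's GC.query: a base pointer, a size, and the attribute bits. A cached entry going stale in the way just described could be detected like this -- illustration only:)

    import core.memory : GC;

    // A cached BlkInfo is stale when the GC's current answer for the same
    // block no longer matches, e.g. the size changed because a >half-page
    // block grew in place by acquiring more pages.
    bool isStale(GC.BlkInfo cached)
    {
        auto current = GC.query(cached.base);
        return current.base != cached.base || current.size != cached.size;
    }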
If you clear the cache on a GC run, you remove the cache's usefulness, since a GC run occurs only when new memory is requested. If you clear it on GC.free, you have to stop all threads. I don't think either of these solutions will work, but I also don't think the LRU cache is causing problems.
Can we try disabling the cache to rule it out? I suspect there is another issue with AAs related to the appending fix, but I think it has to do with the fact that AAs allocate arrays without using the runtime functions for arrays (they call GC.malloc directly and build the array structure manually).
The only trouble is, the only place I see where arrays are appended (by adding to the length) is in the rebalance function. This seems to coincide with another bug report (bug 3898), but the array starts out as a static array, so unless the stack frame is heap allocated (which it should *not* be), the append patch should work properly.
I will create some patches to disable the cache and the whole array stomping fix, and see if either of those makes the problem go away. If not, then we have to look elsewhere.
Comment #3 by schveiguy — 2010-03-11T05:11:55Z
See my fix as identified in bug 3930.
Comment #4 by dsimcha — 2010-03-11T16:41:52Z
What about the issue of unnecessary GC retention of (possibly large) arrays?
Comment #5 by schveiguy — 2010-03-11T18:08:26Z
The cache actually depends on that retention. If the block described by a cached BlkInfo is freed and then reallocated as smaller blocks, an append may use the stale BlkInfo and overwrite data that is not allocated to the array.
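To make the hazard concrete, a deliberately broken sequence (illustration only; whether it actually stomps anything depends on the allocator recycling the block):

    import core.memory : GC;

    void hazard()
    {
        auto a = new int[256];
        GC.free(a.ptr);         // block freed, but a cached BlkInfo may still
                                // describe it as a's block
        auto b = new ubyte[64]; // the allocator may recycle part of the old block
        a ~= 42;                // an append that trusts the stale cache entry can
                                // grow "in place" right over b's data
    }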
I agree the situation is not ideal. Do you have any ideas on how to fix it? One possibility is to allow manual clearing of the cache if you want to tune the application.
Comment #7 by dsimcha — 2010-03-11T19:07:16Z
Well, there are really two issues here: What happens when GC.free() gets called and what happens when the GC collects. As much as people (Andrei comes to mind) hate it from a theoretical purity point of view, I believe it's absolutely necessary to be able to GC.free() a large array while the GC sucks as bad as it currently does.
For the GC collection case I still don't understand what's wrong with clearing the LRU. If I understand how this stuff works correctly, the information is also stored at the end of every block, so on the next append the cache will be repopulated. It will only cost one non-cached lookup per array per GC collection.
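The miss-and-repopulate path would look roughly like this (a hypothetical sketch of a per-thread cache; eight slots assumed, and GC.query is the real core.memory call):

    import core.memory : GC;

    // On a miss (e.g. after a collection cleared the cache) the append pays
    // one GC.query, then re-inserts the entry for subsequent appends.
    GC.BlkInfo lookup(void* p, ref GC.BlkInfo[8] cache, ref size_t used)
    {
        foreach (i; 0 .. used)
        {
            auto base = cast(ubyte*) cache[i].base;
            if (base <= cast(ubyte*) p && cast(ubyte*) p < base + cache[i].size)
                return cache[i];          // hit: no global GC lookup needed
        }
        auto info = GC.query(p);          // the one non-cached lookup
        if (used < cache.length)
            cache[used++] = info;         // repopulate for later appends
        return info;
    }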
For the GC.free() case you raise a very good point about thread safety. I really don't have a good answer for it. Calling free() doesn't have to be cheap, but stopping the world is a little too expensive.
Comment #8 by schveiguy — 2010-03-12T04:39:55Z
(In reply to comment #7)
> Well, there are really two issues here: What happens when GC.free() gets
> called and what happens when the GC collects. As much as people (Andrei comes
> to mind) hate it from a theoretical purity point of view, I believe it's
> absolutely necessary to be able to GC.free() a large array while the GC sucks
> as bad as it currently does.
GC.free, I don't agree with. Deleting an array, yes. The issue is that the GC is unaware of the runtime's array features; it's just a mechanism to allocate and free memory. It's the same issue as in C++, where you should never call C's free() on something you allocated with new. Deleting an array calls a runtime function that is in the perfect place -- where all my other fixes are.
One thing I have thought of which should help somewhat is to mark arrays with a flag in the BlkInfo attributes, thereby disallowing in-place appending to a memory block that was not allocated via the arraynew routines. My biggest worry (indirectly identified while hunting the nasty bug we just fixed) is that someone will try to append to a stack-allocated item, but because the function is a closure, the item is actually heap allocated and the append succeeds. It also gets rid of a bizarre consequence of requiring padding for class allocations.
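(A rough sketch of that check; note that current druntime does have an attribute bit for exactly this purpose, GC.BlkAttr.APPENDABLE:)

    import core.memory : GC;

    // Refuse in-place appends to any block that was not allocated through
    // the array routines, i.e. whose APPENDABLE attribute bit is not set.
    bool canAppendInPlace(void* p)
    {
        auto info = GC.query(p);
        return info.base !is null
            && (info.attr & GC.BlkAttr.APPENDABLE) != 0;
    }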
> For the GC collection case I still don't understand what's wrong with clearing
> the LRU. If I understand how this stuff works correctly, the information is
> also stored at the end of every block, so on the next append the cache will be
> repopulated. It will only cost one non-cached lookup per array per GC
> collection.
I just don't like how it would affect array-append performance across threads in a strangely coupled way. If you have 15 threads all appending, as soon as one triggers a collection cycle, they are all affected. That has the potential to degrade appends on 8 x 15 = 120 arrays (eight cache entries per thread). While the degradation would not be hugely noticeable, any *avoidable* degradation should be discouraged.
What I would be willing to look at is having the collection cycle *selectively* remove freed blocks from the cache, and leave allocated blocks alone. Since the GC already has to stop the world, this shouldn't be too much of an extra burden.
What I don't know is how it will deal with threading issues. I think I can make it safe by checking that the entire BlkInfo matches before erasing; a block shouldn't be being collected *and* inserted into the cache at the same time. Does that make sense?
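Something along these lines (hypothetical names; run during the sweep, while the world is stopped):

    import core.memory : GC;

    // Evict a cache entry only when the *entire* BlkInfo matches the block
    // being freed, so an entry freshly inserted for a recycled block at the
    // same address is left alone.
    void evictFreed(ref GC.BlkInfo[8] cache, ref size_t used, GC.BlkInfo freed)
    {
        foreach (i; 0 .. used)
        {
            if (cache[i].base == freed.base
                && cache[i].size == freed.size
                && cache[i].attr == freed.attr)
            {
                cache[i] = cache[--used]; // swap-remove; LRU order is a detail here
                break;
            }
        }
    }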
> For the GC.free() case you raise a very good point about thread safety. I
> really don't have a good answer for it. Calling free() doesn't have to be
> cheap, but stopping the world is a little too expensive.
Clearing all the caches on every free is not an option. Removing the associated block from the cache on an array delete is a good option, and should work well enough.
BTW, I'm changing this to enhancement, because that's what it really is.
Comment #9 by schveiguy — 2011-01-10T06:59:47Z
Fixed in http://www.dsource.org/projects/druntime/changeset/496
Note that only destroying an array via delete arr will clear it from the cache. Freeing with GC.free does not remove the data from the cache.
If it becomes necessary to use GC.free to free an array because delete is deprecated, then we can revisit this problem.