Bug 3523 – [GC] Fiber is not garbage collected properly

Status
ASSIGNED
Severity
normal
Priority
P3
Component
druntime
Product
D
Version
D2
Platform
Other
OS
Linux
Creation time
2009-11-18T19:50:33Z
Last change time
2024-12-07T13:30:48Z
Keywords
bootcamp
Assigned to
Sean Kelly
Creator
Witold Baryluk
Moved to GitHub: dmd#17191 →

Comments

Comment #0 by witold.baryluk+d — 2009-11-18T19:50:33Z
Program below leaks memory. import core.thread; import std.stdio; lass DerivedFiber : Fiber { this() { super( &run ); } private void run() { Fiber.yield(); } } import core.memory : GC; void main() { foreach (i ; 1 .. 1000000) { Fiber derived = new DerivedFiber(); derived.call(); GC.collect(); // doesn't work } } Manual destruction of fiber works: delete derived; } and then it doesn't leek. chaning Fiber to "scope" also works (just like delete befor "}" I know it have something to do with similarity of Fiber to thread, that Fiber have own pool of root variables and own stack. But if Fiber is not running, and it is not accesible by any reference from any Thread, then it is imposible to resume its operation by call(), so also it's root variables and stack is not avaible, so it can (and should?) be garbage collected. I have code which creates few fibers, do milions call/yield operations, then destroy fibers, and recreat new ones, for and essentially repeats. I was hoping this would not leak memory. Unfortonetly it is. I this is intended behaviour it should be documented somewhere.
Comment #1 by witold.baryluk+d — 2009-11-18T19:56:25Z
If one will not call derived.call() (so leaving Fiber in TERM state, and never running it at all) it will be properly collected. Adding after derived.call(), a derived.reset() to make it back to TERM state, doesn't help (still it is not collected). Adding second derived.call(), after first one, will make collect() to work correctly. So i can assume it run() method terminates correctly and underlaying stack is destroyed, object is properly destructed in colletion phase. Essentially my problem is because my Fibers doesn't terminate at all they "yield" infinitly (saving some auxilary data in fields of some objects, so this data can be used outside of yield, essentiall in thread which called call), but i want to terminate them automatically (when they are not referenced by any thread or other Fiber) if needed.
Comment #2 by sean — 2009-11-19T07:28:26Z
Hm... that's tricky. The fiber implementation needs to hold a reference to the fiber on its stack for context switching, and that's the reference that is keeping the fiber alive. I'll play with the stack pointers a bit and see if things work if I exclude that reference from the GC scan.
Comment #3 by witold.baryluk+d — 2009-11-24T16:05:22Z
I "solved" my problem by changin one of my base clases from: abstract class AGenerator : Fiber { protected: this(void delegate() dg) { super(dg); } } To: abstract class AGenerator { private: Fiber x; protected: this(void delegate() dg) { x = new Fiber(dg); } ~this() { delete x; } public: void call() { x.call(); } void yield() { x.yield(); } Fiber.State state() { return x.state; } } And it magically started working correctly (Fibers are properly destroyed).
Comment #4 by witold.baryluk+d — 2009-11-25T05:53:40Z
No, sorry i made mistake. Even after change it is not garbage collected. Which is normal, because this Fiber is running some method from this object (namly method named void iter() ), so this object (AGenerator) is still referenced by Fiber, and it's destructor can't be called. So probably only way to call destructor will be to separate this into two classes, which is referenced by delegate from class A, and second which is used only for calling from outside of fiber.
Comment #5 by witold.baryluk+d — 2009-11-25T06:11:41Z
Ok, i now i solved it using kind of hack: /** This class is written because Fiber's are not correctly garbage collected */ class GenWrap(T : AGenerator, T2) { private T x; /// T derives from AGenerator which derives from Fiber public: this(T x_) { x = x_; } ~this() { delete x; } T2 getNext() { return x.getNext(); } T o() { return x; } // don't assign return value to any variable which can live longer than this object } This is hack, because it can destroy Fibers which are still referenced somewhere. So All my direct usages of variables of type T, must be changed to use GenWrap.o(), to be sure that delete x inside destructor is safe.
Comment #6 by code — 2012-02-14T05:44:18Z
I think this works now, doesn't it?
Comment #7 by witold.baryluk+d — 2012-02-14T10:20:42Z
(In reply to comment #6) > I think this works now, doesn't it? If you think so, I will test it shortly and report back.
Comment #8 by witold.baryluk+d — 2012-02-14T20:53:05Z
Still same problem in 2.052. :( Will check tomorrow 2.057, or 2.058 if it will be released.
Comment #9 by witold.baryluk+d — 2012-02-14T21:12:50Z
(In reply to comment #8) > Still same problem in 2.052. :( > > Will check tomorrow 2.057, or 2.058 if it will be released. Well, it looks 2.058 is already available. I installed it (again 32-bit Linux), and program leaks - it starts at about 14MB of virtual memory usage, 4MB or RAM usage, and both grows about 1MB per second, after minute giving about 100MB, and constantly growing. So problem is still present.
Comment #10 by code — 2012-02-15T04:15:06Z
>Hm... that's tricky. The fiber implementation needs to hold a reference to the >fiber on its stack for context switching, and that's the reference that is >keeping the fiber alive. I'll play with the stack pointers a bit and see if >things work if I exclude that reference from the GC scan. It's simple to fix I think. Don't add the fiber context to the global context list but GCscan all nested contexts instead. Now you only need to push/pop a reference to the Fiber object for call/yield. It's like swapping ownership. When the fiber is halted the Fiber object owns it's stack, when the fiber is being executed the stack owns the Fiber object.
Comment #11 by witold.baryluk+d — 2012-02-15T08:09:20Z
(In reply to comment #10) > >Hm... that's tricky. The fiber implementation needs to hold a reference to the > >fiber on its stack for context switching, and that's the reference that is > >keeping the fiber alive. I'll play with the stack pointers a bit and see if > >things work if I exclude that reference from the GC scan. > > It's simple to fix I think. Don't add the fiber context to the global context > list but GCscan all nested contexts instead. Now you only need to push/pop a > reference to the Fiber object for call/yield. Well I agree. My reasoning is - if fiber is not currently executed and is not referenced by anything from live set (all live threads, and all executing fibers, and all threads and fibers referenced by them recursively), then there is no way to resume execution of fiber (or terminate it manually, or change it), and it's GC job to terminate and delete it. > It's like swapping ownership. When the fiber is halted the Fiber object owns > it's stack, when the fiber is being executed the stack owns the Fiber object. It will probably introduce small additional overhead to Fiber context switching and execution, but should not be much. I will be happy to test any fix which resolves this problem.
Comment #12 by alex — 2012-10-09T18:54:02Z
What is the status of this today?
Comment #13 by code — 2012-10-16T18:06:40Z
>When the fiber is halted the Fiber object owns it's stack This doesn't work out with our current GC mechanisms and the required stack scans. The other solution I can think of is to keep the inner stack free of any references to the enclosing fiber. This would imply that all Fiber bookkeeping must be done using asm or on the outer stack to fully avoid pointer leakage.
Comment #14 by code — 2013-12-13T15:53:42Z
Seems like this was responsible for a performance regression in vibe.d. https://github.com/rejectedsoftware/vibe.d/commit/1488d5c7426ff7acc1d0bceef5667066354c1191
Comment #15 by schveiguy — 2018-04-24T20:43:57Z
This is pretty old, but I wonder if it's still broken? From this recent thread, it may already be fixed, as peppering in GC collections seems to solve the problem: https://forum.dlang.org/post/[email protected] I was coming to file a bug on this, but this seems (at least) to be close to explaining what is happening. Note that we have a separate issue with Fibers and GC. The GC only associates a small block with the Fiber -- the Fiber object itself, whereas the stack is huge comparatively. In 32-bit windows, a fiber stack is default 32k. This means that 10,000 Fibers probably takes up 1.2MB in the GC (assuming 128 bytes per fiber), but the stack space is 320MB. There is a lack of correlation here between memory pressure and when the memory should be cleaned up. Would it be possible to deallocate the stack when the Fiber terminates vs. when it's GC'd? That would at least help with Fibers that are completed and just left for garbage.
Comment #16 by witold.baryluk+d — 2020-05-15T16:19:56Z
Steven, yes this is still an issue with current compiler and runtime. The issue is that yes, it probably is possible to exclude references from one fiber to itself, but there might be weird cases where there are complex chains of fibers pointing to each other, and whole chains isn't accessible from anywhere else, or even from each fiber directly, but it is hard to tell if it is a garbage or not. But if all elements in the chain are in TERM state, or only leaves in the graph are in TERM state, that should be doable. But i have no idea if GC can compute such condition efficiently.
Comment #17 by robert.schadek — 2024-12-07T13:30:48Z
THIS ISSUE HAS BEEN MOVED TO GITHUB https://github.com/dlang/dmd/issues/17191 DO NOT COMMENT HERE ANYMORE, NOBODY WILL SEE IT, THIS ISSUE HAS BEEN MOVED TO GITHUB