On Sep 2, 2007 5:31 AM, Gordon Henriksen <gordonhenriksen at mac.com> wrote:> Hi Talin, > > On Sep 2, 2007, at 04:54, Talin wrote: > > > I've been looking through the documentation (http://llvm.org/docs/ > > GarbageCollection.html) on how to implement a garbage collector for > > LLVM and there's a couple of things that I don't quite understand. > > Specifically, it says that when a stack variable goes out of scope, > > you're supposed to assign a null value to it to indicate that the > > value is no longer live. > > > > What I don't quite understand is how the GC can efficiently use > > this information. > > The GC intrinsics exist for the type of collector which depends on > metadata compiled alongside the function. At runtime, such a > collector walks the dynamic call stack and merges that with the > static metadata to identify dynamically live roots. Like "zero cost > exception handling" or debug info, collecting this kind of metadata > requires backend code generator support, which again is the reason > these intrinsics exist.So, if I understand what you are saying correctly, the "llvm.gcroot" intrinsic isn't actually an "instruction" in the sense I normally think of as an action to be taken; It's more of a marker which is used by the code generator to create a data structure that describes the stack layout. Is that correct? If that's the case, then my next question is: How do the LLVM garbage collection intrinsics work in a multi-threaded environment? If there's a data structure that describes the layout of the stack, and you have multiple stacks, what happens? BTW, I'm also noticing that the current SemiSpace example code seems quite incomplete. -- -- Talin
On 2007-09-03, at 23:14, Talin wrote:> On Sep 2, 2007 5:31 AM, Gordon Henriksen <gordonhenriksen at mac.com> > wrote: > >> On Sep 2, 2007, at 04:54, Talin wrote: >> >>> I've been looking through the documentation (http://llvm.org/docs/ >>> GarbageCollection.html) on how to implement a garbage collector >>> for LLVM and there's a couple of things that I don't quite >>> understand. Specifically, it says that when a stack variable goes >>> out of scope, you're supposed to assign a null value to it to >>> indicate that the value is no longer live. >>> >>> What I don't quite understand is how the GC can efficiently use >>> this information. >> >> The GC intrinsics exist for the type of collector which depends on >> metadata compiled alongside the function. At runtime, such a >> collector walks the dynamic call stack and merges that with the >> static metadata to identify dynamically live roots. Like "zero >> cost exception handling" or debug info, collecting this kind of >> metadata requires backend code generator support, which again is >> the reason these intrinsics exist. > > So, if I understand what you are saying correctly, the > "llvm.gcroot" intrinsic isn't actually an "instruction" in the > sense I normally think of as an action to be taken; It's more of a > marker which is used by the code generator to create a data > structure that describes the stack layout. Is that correct?That's correct; the gcroot intrinsic is an annotation of an alloca. This is not without precedent. The llvm.noalias intrinsic is an annotation of an SSA pointer value.> If that's the case, then my next question is: How do the LLVM > garbage collection intrinsics work in a multi-threaded environment? > If there's a data structure that describes the layout of the stack, > and you have multiple stacks, what happens?The intrinsics are entirely neutral to collector implementation, and thus to threading. They could easily be used to implement reference counting, for instance, which may or may not be implemented in a threadsafe manner. However, as with your algorithm, reference counting does not require code generator support, and so would not justify the introduction of the intrinsics. But to elaborate upon the described class of collectors: The emitted GC tables do not describe the dynamic call stack; rather, they are static data compiled into the executable. Only when cross-referenced with the dynamic call stack do they identify the roots. The garbage collectors for .NET and Java work in this manner. Here's an example of the sort of data these tables might contain. (I generated this using the -print-gc feature which I have added to llc locally; the llc in Subversion does cannot print this data because it does not collect it.) This test case has a simple function with 2 roots and 1 call. GC roots for fun: 0 8[sp] 1 4[sp] GC safe points for fun: label 1: post-call, live = { 0, 1 } Provided this information, the collector can easily identify the live roots within fun's stack frame at runtime—so long as fun is paused at the safe point, which it must be if it is not the topmost stack frame. Walking the stack; identifying the current safe point and stack pointer; stopping concurrent threads at safe points; and actually tracing the heap—all left as exercises to the interested reader. :) Here's a resource you might find interesting for further reading: http://www.cs.kent.ac.uk/people/staff/rej/gcbib/gcbib.html> BTW, I'm also noticing that the current SemiSpace example code > seems quite incomplete.I observed that myself. — Gordon -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20070904/192681cb/attachment.html>
Gordon Henriksen wrote:> The intrinsics are entirely neutral to collector implementation, and > thus to threading. They could easily be used to implement reference > counting, for instance, which may or may not be implemented in a > threadsafe manner. However, as with your algorithm, reference counting > does not require code generator support, and so would not justify the > introduction of the intrinsics.Well, I might rethink my algorithm based on what I learn here. :)> But to elaborate upon the described class of collectors: The emitted > GC tables do not describe the dynamic call stack; rather, they are > static data compiled into the executable. Only when cross-referenced > with the dynamic call stack do they identify the roots. The garbage > collectors for .NET and Java work in this manner. > > Here's an example of the sort of data these tables might contain. (I > generated this using the -print-gc feature which I have added to llc > locally; the llc in Subversion does cannot print this data because it > does not collect it.) This test case has a simple function with 2 > roots and 1 call. > > GC roots for fun: > 0 8[sp] > 1 4[sp] > GC safe points for fun: > label 1: post-call, live = { 0, 1 } > > Provided this information, the collector can easily identify the live > roots within fun's stack frame at runtime—so long as fun is paused at > the safe point, which it must be if it is not the topmost stack frame.How do you get the info for the non-topmost stack frames in this case? Am I going to need to implement my own version of llvm_cg_walk_gcroots() ? My current mad scheme (which I'm not sure will work) for safe points in a multi-threaded environment uses a global POSIX reader/writer lock. Each running thread keeps a read lock on the global lock when it's running. The old-generation collector attempts to acquire a write-lock when it needs to do a collection. At some semi-regular interval, the application threads will need to momentarily release the lock, telling the global collector its OK to run. (This is only for the old generation - young generation collections are handled per-thread, hopefully. Although cross-thread references make that complex...) The advantage of this scheme is that it doesn't require preemptive thread freezing, icky pagefault-based write barriers or other platform-specific hacks, the disadvantage is that all threads have to wait if some thread doesn't want to release the read lock in a timely manner. The hard part will be analyzing loops in the generated IR to decide whether a call to pulse the lock needs to be inserted in the loop body or not. Obviously, dropping and reacquiring a lock is expensive, so if the loop is provably short, I don't want to do that. So it sounds like I should be passing the pointer to the stack root list at the same time I release the lock? I supposed I could store it in the TLD which I guess wouldn't be too expensive.> Walking the stack; identifying the current safe point and stack > pointer; stopping concurrent threads at safe points; and actually > tracing the heap—all left as exercises to the interested reader. :) > > Here's a resource you might find interesting for further reading: > > http://www.cs.kent.ac.uk/people/staff/rej/gcbib/gcbib.htmlCool. I have tons of papers on garbage collection already, though :) But I will add it to the list.>> BTW, I'm also noticing that the current SemiSpace example code seems >> quite incomplete. > > I observed that myself. > > — Gordon > ------------------------------------------------------------------------ > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >