Couple of questions:

1. void llvm_gc_write(void *V, void *ObjPtr, void **FieldPtr)

I haven't seen an adequate explanation of these, but I'm guessing:

void *V: value being written to the field
void *ObjPtr: current value of the field (ie. ObjPtr == *FieldPtr upon entry to llvm_gc_write)
void **FieldPtr: address of the field being written

2. The current semispace collector includes some code which says it should be in a code-generator library. If I were to write a collector, should I also include this code for the time being? Or will this soon be refactored into an external interface?

3. void %llvm.gcroot(<ty>** %ptrloc, <ty2>* %metadata)

I don't see an implementation of the llvm.gcroot intrinsic in the semispace collector, so is it implemented elsewhere? Semispace has a function with the same signature, but it's not in the public GC interface (llvm.cs.uiuc.edu/cvsweb/cvsweb.cgi/llvm/runtime/GC/GCInterface.h). Or is this simply because the llvm_cg_walk_gcroots callback hasn't been refactored as an external interface (as per #2 above)?

That's it for now. :-)

Sandro
On Mon, 27 Feb 2006, Sandro Magi wrote:
> Couple of questions:

Ok, it's been a long while since I've worked on the GC stuff, but I'll try to help :)

> 1. void llvm_gc_write(void *V, void *ObjPtr, void **FieldPtr)
>
> I haven't seen an adequate explanation of these, but I'm guessing:
> void *V: value being written to the field
> void *ObjPtr: current value of the field (ie. ObjPtr == *FieldPtr
> upon entry to llvm_gc_write)
> void **FieldPtr: address of the field being written

Close: ObjPtr is designed to be a pointer to the head of an object. For example, to codegen:

    Obj->Field = Ptr;

you'd want to compile it as:

    llvm_gc_write(Ptr, Obj, &Obj->Field);

This is used by the GC if it keeps bits in the object header or other things. If the GC you plan to use doesn't need this info, you don't need to provide it, obviously. :)

> 2. The current semispace collector includes some code which says it
> should be in a code-generator library. If I were to write a collector,
> should I also include this code for the time being?

Yes, I would suggest including it.

> Or will this soon be refactored into an external interface?

Right now, no one is pushing the GC interfaces forward. Contributions to help are welcome!

> 3. void %llvm.gcroot(<ty>** %ptrloc, <ty2>* %metadata)
>
> I don't see an implementation of the llvm.gcroot intrinsic in the
> semispace collector, so is it implemented elsewhere? Semispace has a
> function with the same signature, but it's not in the public GC
> interface ( llvm.cs.uiuc.edu/cvsweb/cvsweb.cgi/llvm/runtime/GC/GCInterface.h).
> Or is this simply because the llvm_cg_walk_gcroots callback hasn't
> been refactored as an external interface (as per #2 above)?

The llvm.gcroot intrinsic is turned into magic in the code generator that clients aren't supposed to know about. The public interface to this magic is the llvm_cg_walk_gcroots API function, which provides the dynamic values of the roots.

-Chris

--
nondot.org/sabre
llvm.org
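To make the role of ObjPtr concrete, here is a minimal sketch in C of a write barrier that keeps a bit in the object header. Only the llvm_gc_write signature comes from the thread; the ObjHeader layout, the HDR_DIRTY flag, the Node type, and the set_next helper are all hypothetical and are not taken from the semispace collector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical object header; a real collector may keep different bits. */
typedef struct ObjHeader {
    uint32_t flags;
} ObjHeader;

enum { HDR_DIRTY = 1 };  /* assumed flag: object was written since the last GC */

/* Hypothetical heap object: header first, then one pointer field. */
typedef struct Node {
    ObjHeader hdr;
    void *next;
} Node;

/* V:        value being stored
   ObjPtr:   head of the object that contains the field
   FieldPtr: address of the field inside that object */
void llvm_gc_write(void *V, void *ObjPtr, void **FieldPtr) {
    assert(FieldPtr != NULL);
    if (ObjPtr != NULL) {
        /* The header sits at the head of the object, which is why the
           barrier receives ObjPtr and not just the field address. */
        ((ObjHeader *)ObjPtr)->flags |= HDR_DIRTY;
    }
    *FieldPtr = V;
}

/* What a front-end might emit for `obj->next = value;`. */
void set_next(Node *obj, Node *value) {
    llvm_gc_write(value, obj, &obj->next);
}
```

A collector that keeps nothing in the header could ignore ObjPtr entirely and reduce the barrier to the plain store on the last line.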
On 2/27/06, Chris Lattner <sabre at nondot.org> wrote:
> > 1. void llvm_gc_write(void *V, void *ObjPtr, void **FieldPtr)
> >
> > I haven't seen an adequate explanation of these, but I'm guessing:
> > void *V: value being written to the field
> > void *ObjPtr: current value of the field (ie. ObjPtr == *FieldPtr
> > upon entry to llvm_gc_write)
> > void **FieldPtr: address of the field being written
>
> Close: ObjPtr is designed to be a pointer to the head of an object. For
> example, to codegen:
>
> Obj->Field = Ptr;
>
> you'd want to compile it as:
>
> llvm_gc_write(Ptr, Obj, &Obj->Field);
>
> This is used by the GC if it keeps bits in the object header or other
> things. If the GC you plan to use doesn't need this info, you don't need
> to provide it obviously. :)

Makes sense. A few more questions:

1. Any generic LLVM GC must be customised for a given front-end, correct? In other words, it requires front-end information regarding the layout of pointers within a block in order to fully trace the reference graph. Or am I missing something? Is semispace a drop-in for any front-end that utilizes the llvm.gc* intrinsics?

1a. Semispace isn't actually complete/working is it? Its llvm_gc_collect() implementation doesn't actually seem to do anything, ie. no copying, no space switching, etc. :-)

2. Are there any GC tests available that I can use to test a GC implementation?

3. The description for llvm.gcroot (llvm.cs.uiuc.edu/docs/GarbageCollection.html#roots) indicates it's used to identify roots on the stack. A front-end can also use it to identify global static data can it not?

Sandro
I've written a reference-counting garbage collector for LLVM, but I'm still unclear on a few things.

The following piece of code that appears on llvm.cs.uiuc.edu/docs/GarbageCollection.html is unclear:

    ;; As the pointer goes out of scope, store a null value into
    ;; it, to indicate that the value is no longer live.
    store %Object* null, %Object** %X
    ...

How exactly does this indicate X is no longer live? Is this internal code-generator logic/magic?

The problem I'm currently unsure of, is how roots would affect refcounts. Should the gcread/gcwrite not be used with stack refs, etc?

1. If root refs are NOT included in the count, then objects of refcount 0 must be tracked in a list of scheduled deletions, but will be continually deferred until the root goes out of scope (the deletion list is filtered during a collection by the roots callback).

2. If root refs ARE included in the count, then this deferral overhead is avoided, at the expense of more refcount increment/decrement costs (on entry and exit from each function).

I'm wondering which would be preferable. The collector is currently written assuming #1 is the case, as this is what the docs seemed to imply. Shall I just post the code?

Sandro
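For readers following along, option #1 can be sketched roughly as below. This is not the collector from this message, which was never posted in the thread; it is a minimal illustration under stated assumptions. The RcObj layout, the fixed-size zct and roots tables, and the rc_release / rc_collect names are all hypothetical, and the roots array merely stands in for whatever the code generator's roots callback would report.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted object; only heap-to-heap references are counted. */
typedef struct RcObj {
    unsigned refcount;  /* references held by other heap objects */
    int rooted;         /* scratch bit, set while scanning roots */
    int in_zct;         /* already queued in the deletion list? */
    int freed;          /* stands in for actually releasing memory */
} RcObj;

enum { MAX_ZCT = 16, MAX_ROOTS = 16 };

RcObj *zct[MAX_ZCT];       /* scheduled deletions: objects whose count hit 0 */
size_t zct_len;
RcObj **roots[MAX_ROOTS];  /* stand-in for the roots the callback would visit */
size_t roots_len;

/* Drop one heap reference; a zero count only schedules the deletion. */
void rc_release(RcObj *o) {
    assert(o != NULL && o->refcount > 0);
    if (--o->refcount == 0 && !o->in_zct) {
        assert(zct_len < (size_t)MAX_ZCT);  /* sketch: fixed-size table */
        o->in_zct = 1;
        zct[zct_len++] = o;
    }
}

/* Free scheduled objects that no root refers to; keep the rest deferred.
   Returns the number of objects freed. */
size_t rc_collect(void) {
    size_t kept = 0, freed = 0;
    for (size_t i = 0; i < roots_len; i++)      /* filter step: flag rooted objects */
        if (*roots[i] != NULL)
            (*roots[i])->rooted = 1;
    for (size_t i = 0; i < zct_len; i++) {
        RcObj *o = zct[i];
        if (o->refcount > 0) {
            o->in_zct = 0;      /* re-referenced from the heap: no longer a candidate */
        } else if (o->rooted) {
            zct[kept++] = o;    /* a root still points here: defer again */
        } else {
            o->in_zct = 0;
            o->freed = 1;       /* a real collector would free the object here */
            freed++;
        }
    }
    zct_len = kept;
    for (size_t i = 0; i < roots_len; i++)      /* clear the scratch bits */
        if (*roots[i] != NULL)
            (*roots[i])->rooted = 0;
    return freed;
}
```

Under option #2 the roots table and the filter step would disappear, and rc_release could free immediately at zero, at the cost of an increment and decrement for every stack reference.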
Again, sorry for the delay. :(

On Thu, 9 Mar 2006, Sandro Magi wrote:
> I've written a reference-counting garbage collector for LLVM, but I'm
> still unclear on a few things.

Cool!

> The following piece of code that appears on
> llvm.cs.uiuc.edu/docs/GarbageCollection.html is unclear:
>
> ;; As the pointer goes out of scope, store a null value into
> ;; it, to indicate that the value is no longer live.
> store %Object* null, %Object** %X
> ...
>
> How exactly does this indicate X is no longer live? Is this internal
> code-generator logic/magic?

No, this just prevents the GC from accidentally thinking that *X is live through that pointer. The collector cannot distinguish pointers that are out of scope from those that aren't, so this lets it consider all pointers to be in scope, without causing any "dead" pointers to mark objects.

> The problem I'm currently unsure of, is how roots would affect
> refcounts. Should the gcread/gcwrite not be used with stack refs, etc?
>
> 1. If root refs are NOT included in the count, then objects of
> refcount 0 must be tracked in a list of scheduled deletions, but will
> be continually deferred until the root goes out of scope (the deletion
> list is filtered during a collection by the roots callback).
>
> 2. If root refs ARE included in the count, then this deferral overhead
> is avoided, at the expense of more refcount increment/decrement costs
> (on entry and exit from each function).
>
> I'm wondering which would be preferable. The collector is currently
> written assuming #1 is the case, as this is what the docs seemed to
> imply. Shall I just post the code?

I would suggest doing experiments to determine which has the higher overhead. Refcounting as a whole is expensive, so you need to find a design that fits your constraints/application.

-Chris

--
nondot.org/sabre
llvm.org
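The point about nulling out-of-scope roots can be illustrated with a small C sketch. Everything here is an assumption made for illustration: the Obj type, the mark_root callback, and walk_frame_roots, which only stands in for the code generator's root walk and is not the real llvm_cg_walk_gcroots. The callback shape (root slot address plus metadata) is modeled loosely on the llvm.gcroot arguments quoted earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical heap object with a mark bit. */
typedef struct Obj {
    int marked;
} Obj;

/* Assumed callback shape: address of the root slot plus its metadata. */
void mark_root(void **Root, void *Meta) {
    (void)Meta;                         /* metadata is unused in this sketch */
    assert(Root != NULL);
    if (*Root != NULL)                  /* a nulled slot marks nothing */
        ((Obj *)*Root)->marked = 1;
}

/* Stand-in for the root walk over one frame: every registered slot is
   visited, whether or not its variable is still in scope. */
void walk_frame_roots(void **slots[], size_t n,
                      void (*FP)(void **Root, void *Meta)) {
    for (size_t i = 0; i < n; i++)
        FP(slots[i], NULL);
}
```

Because the walker visits every slot unconditionally, the only way a front-end can keep a dead variable from retaining its object is to store null into the slot first, which is what the `store %Object* null, %Object** %X` line in the docs does.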