thr3ads.net - llvm dev - [LLVMdev] Garbage collection questions [Feb 2006]

Sandro Magi

2006-Feb-27 15:55 UTC

[LLVMdev] Garbage collection questions

Couple of questions:

1. void llvm_gc_write(void *V, void *ObjPtr, void **FieldPtr)

I haven't seen an adequate explanation of these, but I'm guessing:
  void *V: value being written to the field
  void *ObjPtr: current value of the field (ie. ObjPtr == *FieldPtr
upon entry to llvm_gc_write)
  void **FieldPtr: address of the field being written

2. The current semispace collector includes some code which says it
should be in a code-generator library. If I were to write a collector,
should I also include this code for the time being? Or will this soon
be refactored into an external interface?

3. void %llvm.gcroot(<ty>** %ptrloc, <ty2>* %metadata)

I don't see an implementation of the llvm.gcroot intrinsic in the
semispace collector, so is it implemented elsewhere? Semispace has a
function with the same signature, but it's not in the public GC
interface (
http://llvm.cs.uiuc.edu/cvsweb/cvsweb.cgi/llvm/runtime/GC/GCInterface.h).
Or is this simply because the llvm_cg_walk_gcroots callback hasn't
been refactored as an external interface (as per  #2 above)?

That's it for now. :-)

Sandro

Chris Lattner

2006-Feb-27 22:39 UTC

On Mon, 27 Feb 2006, Sandro Magi wrote:
> Couple of questions:
Ok, it's been a long while since I've worked on the GC stuff, but I'll try
to help :)
> 1. void llvm_gc_write(void *V, void *ObjPtr, void **FieldPtr)
>
> I haven't seen an adequate explanation of these, but I'm guessing:
>  void *V: value being written to the field
>  void *ObjPtr: current value of the field (ie. ObjPtr == *FieldPtr
> upon entry to llvm_gc_write)
>  void **FieldPtr: address of the field being written
Close: ObjPtr is designed to be a pointer to the head of an object.  For 
example, to codegen:

Obj->Field = Ptr;

you'd want to compile it as:

llvm_gc_write(Ptr, Obj, &Obj->Field);

This is used by the GC if it keeps bits in the object header or other 
things.  If the GC you plan to use doesn't need this info, you don't need
to provide it, obviously. :)
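
To make the role of ObjPtr concrete, here is a minimal sketch in C of a write barrier that records a flag in the object header. The Object layout, the DIRTY_BIT flag, and the store_field helper are assumptions made up for this illustration; only the llvm_gc_write parameter order comes from the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object layout: a header word followed by one pointer field. */
typedef struct Object {
    unsigned header;   /* bit 0 is a "dirty" flag (an assumption of this sketch) */
    void *field;
} Object;

enum { DIRTY_BIT = 1 };

/* Same parameter order as in the thread: value, object head, field address. */
void llvm_gc_write(void *V, void *ObjPtr, void **FieldPtr) {
    assert(FieldPtr != NULL);
    if (ObjPtr != NULL) {
        /* The barrier can reach the header only because the head is passed in. */
        ((Object *)ObjPtr)->header |= DIRTY_BIT;
    }
    *FieldPtr = V;     /* perform the actual store */
}

/* What a front end might emit for `obj->field = ptr;` */
void store_field(Object *obj, void *ptr) {
    llvm_gc_write(ptr, obj, &obj->field);
}
```

A collector that keeps no per-object state could ignore ObjPtr and reduce the barrier to the plain store.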
> 2. The current semispace collector includes some code which says it
> should be in a code-generator library. If I were to write a collector,
> should I also include this code for the time being?
Yes, I would suggest including it.
> Or will this soon be refactored into an external interface?
Right now, no one is pushing the GC interfaces forward.  Contributions to 
help are welcome!
> 3. void %llvm.gcroot(<ty>** %ptrloc, <ty2>* %metadata)
>
> I don't see an implementation of the llvm.gcroot intrinsic in the
> semispace collector, so is it implemented elsewhere? Semispace has a
> function with the same signature, but it's not in the public GC
> interface (
> http://llvm.cs.uiuc.edu/cvsweb/cvsweb.cgi/llvm/runtime/GC/GCInterface.h).
> Or is this simply because the llvm_cg_walk_gcroots callback hasn't
> been refactored as an external interface (as per  #2 above)?
The llvm.gcroot intrinsic is turned into magic in the code generator that 
clients aren't supposed to know about.  The public interface to this magic 
is the llvm_cg_walk_gcroots API function, which provides the dynamic 
values of the roots.
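
As a rough sketch of how a collector might consume that interface, the C code below mocks the walker with a hand-rolled root table. The callback signature (root slot address plus metadata pointer) is an assumption here, and register_root, the table, and count_live_roots are invented stand-ins; in a real build the code generator supplies the walker.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in root table: in a real build the code generator owns this data. */
#define MAX_ROOTS 8
static void **root_slots[MAX_ROOTS];
static void *root_meta[MAX_ROOTS];
static size_t root_count = 0;

/* Hypothetical stand-in for the effect of llvm.gcroot: remember the slot. */
void register_root(void **slot, void *meta) {
    assert(root_count < MAX_ROOTS);
    root_slots[root_count] = slot;
    root_meta[root_count] = meta;
    root_count++;
}

/* Mock walker: calls FP once per registered root slot. */
void llvm_cg_walk_gcroots(void (*FP)(void **Root, void *Meta)) {
    for (size_t i = 0; i < root_count; i++)
        FP(root_slots[i], root_meta[i]);
}

/* Collector side: count the roots that currently hold a pointer. */
static size_t live_roots;
static void count_live(void **Root, void *Meta) {
    (void)Meta;
    if (*Root != NULL)
        live_roots++;
}

size_t count_live_roots(void) {
    live_roots = 0;
    llvm_cg_walk_gcroots(count_live);
    return live_roots;
}
```

Because the callback receives the address of each slot rather than its value, a moving collector could also update the root in place after relocating the object.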

-Chris

-- 
http://nondot.org/sabre/
http://llvm.org/

Sandro Magi

2006-Feb-27 23:58 UTC

On 2/27/06, Chris Lattner <sabre at nondot.org> wrote:
> > 1. void llvm_gc_write(void *V, void *ObjPtr, void **FieldPtr)
> >
> > I haven't seen an adequate explanation of these, but I'm guessing:
> >  void *V: value being written to the field
> >  void *ObjPtr: current value of the field (ie. ObjPtr == *FieldPtr
> > upon entry to llvm_gc_write)
> >  void **FieldPtr: address of the field being written
>
> Close: ObjPtr is designed to be a pointer to the head of an object.  For
> example, to codegen:
>
> Obj->Field = Ptr;
>
> you'd want to compile it as:
>
> llvm_gc_write(Ptr, Obj, &Obj->Field);
>
> This is used by the GC if it keeps bits in the object header or other
> things.  If the GC you plan to use doesn't need this info, you don't need
> to provide it obviously. :)
Makes sense. A few more questions:

1. Any generic LLVM GC must be customised for a given front-end,
correct? In other words, it requires front-end information regarding
the layout of pointers within a block in order to fully trace the
reference graph. Or am I missing something? Is semispace a drop-in for
any front-end that utilizes the llvm.gc* intrinsics?

  1a. Semispace isn't actually complete/working is it? Its
llvm_gc_collect() implementation doesn't actually seem to do anything,
ie. no copying, no space switching, etc. :-)

2. Are there any GC tests available that I can use to test a GC implementation?

3. The description for llvm.gcroot
(http://llvm.cs.uiuc.edu/docs/GarbageCollection.html#roots) indicates
it's used to identify roots on the stack. A front-end can also use it
to identify global static data can it not?

Sandro

Sandro Magi

2006-Mar-09 21:18 UTC

[LLVMdev] Re: Garbage collection questions

I've written a reference-counting garbage collector for LLVM, but I'm
still unclear on a few things.

The following piece of code that appears on
http://llvm.cs.uiuc.edu/docs/GarbageCollection.html is unclear:

   ;; As the pointer goes out of scope, store a null value into
   ;; it, to indicate that the value is no longer live.
   store %Object* null, %Object** %X
   ...

How exactly does this indicate X is no longer live? Is this internal
code-generator logic/magic?

The problem I'm currently unsure of, is how roots would affect
refcounts. Should the gcread/gcwrite not be used with stack refs, etc?

1. If root refs are NOT included in the count, then objects of
refcount 0 must be tracked in a list of scheduled deletions, but will
be continually deferred until the root goes out of scope (the deletion
list is filtered during a collection by the roots callback).

2. If root refs ARE included in the count, then this deferral overhead
is avoided, at the expense of more refcount increment/decrement costs
(on entry and exit from each function).

I'm wondering which would be preferable. The collector is currently
written assuming #1 is the case, as this is what the docs seemed to
imply. Shall I just post the code?
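
Option 1 above can be sketched in C roughly as follows. This is only an illustration of deferred reference counting under stated assumptions, not the collector described in this message: the Object layout, the fixed-size table, and the names rc_release and rc_collect are all hypothetical, and the roots are passed as an array rather than obtained through the roots callback.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Object {
    unsigned refcount;  /* counts heap references only, not roots */
    int in_zct;         /* already queued for deferred deletion? */
    int freed;          /* stand-in for real deallocation */
} Object;

#define ZCT_MAX 16
static Object *zct[ZCT_MAX];  /* objects whose count has dropped to zero */
static size_t zct_len = 0;

/* A heap reference was dropped: at zero, defer deletion instead of freeing. */
void rc_release(Object *o) {
    assert(o->refcount > 0);
    if (--o->refcount == 0 && !o->in_zct) {
        assert(zct_len < ZCT_MAX);
        zct[zct_len++] = o;
        o->in_zct = 1;
    }
}

/* Collection: free queued objects unless a root slot still points at them. */
void rc_collect(Object **roots[], size_t nroots) {
    size_t kept = 0;
    for (size_t i = 0; i < zct_len; i++) {
        Object *o = zct[i];
        int rooted = 0;
        for (size_t r = 0; r < nroots; r++)
            if (*roots[r] == o)
                rooted = 1;
        if (o->refcount > 0) {
            o->in_zct = 0;       /* re-referenced from the heap since queued */
        } else if (rooted) {
            zct[kept++] = o;     /* still deferred */
        } else {
            o->in_zct = 0;
            o->freed = 1;
        }
    }
    zct_len = kept;
}
```

Under option 2 the table and the filtering pass disappear, and the cost moves to an increment and decrement for each root on function entry and exit.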

Sandro

Chris Lattner

2006-Mar-14 19:27 UTC

Again, sorry for the delay. :(

On Thu, 9 Mar 2006, Sandro Magi wrote:
> I've written a reference-counting garbage collector for LLVM, but I'm
> still unclear on a few things.
Cool!
> The following piece of code that appears on
> http://llvm.cs.uiuc.edu/docs/GarbageCollection.html is unclear:
>
>   ;; As the pointer goes out of scope, store a null value into
>   ;; it, to indicate that the value is no longer live.
>   store %Object* null, %Object** %X
>   ...
>
> How exactly does this indicate X is no longer live? Is this internal
> code-generator logic/magic?
No, this just prevents the GC from accidentally thinking that *X is live 
through that pointer.  The collector cannot distinguish pointers that are 
out of scope from those that aren't, so this lets it consider all pointers 
to be in scope, without causing any "dead" pointers to mark objects.
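
A small C sketch of that effect, assuming a hypothetical Object type with a mark flag and a root set given as an array of slot addresses (neither is part of the LLVM interface): the marker visits every slot, and a slot that was nulled simply marks nothing.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Object { int marked; } Object;

/* Mark every object referenced directly from a root slot.
   All slots are treated as in scope; null slots mark nothing. */
void mark_roots(Object **roots[], size_t n) {
    for (size_t i = 0; i < n; i++) {
        assert(roots[i] != NULL);
        Object *o = *roots[i];
        if (o != NULL)
            o->marked = 1;
    }
}
```

If the front end skipped the null store, the stale pointer in the slot would keep its object marked until the frame was popped.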
> The problem I'm currently unsure of, is how roots would affect
> refcounts. Should the gcread/gcwrite not be used with stack refs, etc?
>
> 1. If root refs are NOT included in the count, then objects of
> refcount 0 must be tracked in a list of scheduled deletions, but will
> be continually deferred until the root goes out of scope (the deletion
> list is filtered during a collection by the roots callback).
>
> 2. If root refs ARE included in the count, then this deferral overhead
> is avoided, at the expense of more refcount increment/decrement costs
> (on entry and exit from each function).
>
> I'm wondering which would be preferable. The collector is currently
> written assuming #1 is the case, as this is what the docs seemed to
> imply. Shall I just post the code?
I would suggest doing experiments to determine which has the higher 
overhead.  Refcounting as a whole is expensive, so you need to find a 
design that fits your constraints/application.

-Chris

-- 
http://nondot.org/sabre/
http://llvm.org/
