thr3ads.net - llvm dev - [LLVMdev] More Garbage Collection Questions [Sep 2007]

If this information is useful, please help other people find it:
Share via:

Talin

2007-Sep-15 22:01 UTC

[LLVMdev] More Garbage Collection Questions

I'm still (slowly) working on the project of creating a concurrent 
garbage collector that works with LLVM. I want to ask a little bit more 
about object tags and write barriers and so on.

Let's start with the assumption that a particular language does not use 
per-object type tags. The code generator knows the types of all objects, 
however, and can pass along that type information to llvm.gcroot. And 
since the type of each pointer field is also known, the type information 
can be determined for each traced object, as long as there is no 
polymorphism.

However, what I don't understand is how this works in the presence of 
write barriers. The llvm_gc_write intrinsic takes no type information, 
so there's no way to determine the type of the object except with a tag. 
So I'm not sure what exactly the write barrier can accomplish.

Note that in a real language, most objects would have type tags, but 
there would be a significant number of objects that did not. For 
example, a 'string' class might have a fixed-length header with a type 
tag, which in turn points to a flat array of characters with no tag. The 
character array would not be visible to the outside world, but it would 
still need to be visible to the collector.

-- Talin

Gordon Henriksen

2007-Sep-15 22:49 UTC

head link

[LLVMdev] More Garbage Collection Questions

On 2007-09-15, at 18:01, Talin wrote:
> I'm still (slowly) working on the project of creating a concurrent  
> garbage collector that works with LLVM. I want to ask a little bit  
> more about object tags and write barriers and so on.
>
> Let's start with the assumption that a particular language does not  
> use per-object type tags. The code generator knows the types of all  
> objects, however, and can pass along that type information to  
> llvm.gcroot. And since the type of each pointer field is also  
> known, the type information can be determined for each traced  
> object, as long as there is no polymorphism.
>
> However, what I don't understand is how this works in the presence  
> of write barriers. The llvm_gc_write intrinsic takes no type  
> information, so there's no way to determine the type of the object  
> except with a tag. So I'm not sure what exactly the write barrier  
> can accomplish.
>
> Note that in a real language, most objects would have type tags,  
> but there would be a significant number of objects that did not.  
> For example, a 'string' class might have a fixed-length header with
> a type tag, which in turn points to a flat array of characters with  
> no tag. The character array would not be visible to the outside  
> world, but it would still need to be visible to the collector.
Talin,

Can you be more specific the algorithm for which you need type  
metadata in a write barrier? No algorithms I am aware of perform any  
tracing from a write barrier.

Write barriers are commonly used to record references from old- 
generation objects to new-generation ones, either by recording the  
referencing object, the referencing field, or using a card table. For  
these purposes, the addresses are sufficient.

In concurrent collectors—by which I mean those that run  
asynchronously of the mutator—write barriers may be used to mark the  
written object pointer. Even byte arrays as you describe require an  
object header in which to store mark bits, else the collector cannot  
know whether the object has been marked (mark-sweep collectors) or  
copied (copying collectors).

— Gordon

Talin

2007-Sep-16 03:55 UTC

head link

[LLVMdev] More Garbage Collection Questions

Gordon Henriksen wrote:> Can you be more specific the algorithm for which you need type  
> metadata in a write barrier? No algorithms I am aware of perform any  
> tracing from a write barrier.
>   This one does:

http://citeseer.ist.psu.edu/cache/papers/cs2/442/http:zSzzSzwww.cs.technion.ac.ilzSz~erezzSzPaperszSzms-sliding-views.pdf/an-on-the-fly.pdf> Write barriers are commonly used to record references from old- 
> generation objects to new-generation ones, either by recording the  
> referencing object, the referencing field, or using a card table. For  
> these purposes, the addresses are sufficient.
>   In the paper cited above, the write barrier checks to see if the object 
being mutated has been traced; If it hasn't, then it records the values 
of all pointers contained in the object into a buffer - for which you 
need the location of the pointers.> In concurrent collectors—by which I mean those that run  
> asynchronously of the mutator—write barriers may be used to mark the  
> written object pointer. Even byte arrays as you describe require an  
> object header in which to store mark bits, else the collector cannot  
> know whether the object has been marked (mark-sweep collectors) or  
> copied (copying collectors).
>   In my design, the mark bits are maintained by the allocator as part of 
the allocation header of each memory block.

In other words, I'm trying to maintain a strict separation between the 
allocator/collector and the actual objects stored in the heap. The 
allocator/collector "owns" all of the bits that are before the start 
address of the object, while the language-specific type system owns all 
of the bits after the start address. As a result the collector isn't 
tied to a specific language.

-- Talin

Seemingly Similar Threads

Search for more apparently analagous threads

llvm dev - Sep 2007 - [LLVMdev] More Garbage Collection Questions

[LLVMdev] More Garbage Collection Questions

[LLVMdev] More Garbage Collection Questions

[LLVMdev] More Garbage Collection Questions

Seemingly Similar Threads