On Thursday 26 February 2009 17:25:56 Chris Lattner wrote:
> In my ideal world, this would be:
>
> 1. Subsystems [with clean interfaces] for thread management,
> finalization, object model interactions, etc.
> 2. Within different high-level designs (e.g. copying, mark/sweep, etc)
> there can be replaceable policy components etc.
> 3. A couple of actual GC implementations built on top of #1/2.
> Ideally there would only be a couple of high-level collectors that can
> be parameterized by replacing subsystems and policies.
> 4. A very simple language implementation that uses the facilities, on
> the order of complexity as the kaleidoscope tutorial.
>
> As far as I know, there is nothing that prevents this from happening
> today, we just need leadership in the area to drive it. To avoid the
> "ivory tower" problem, I'd strongly recommend starting with a simple
> GC and language and get the whole thing working top to bottom. From
> there, the various pieces can be generalized out etc. This ensures
> that there is always a *problem being solved* and something that works
> and is testable.

I fear that the IR generator and GC are too tightly coupled.

For example, the IR I am generating shares pointers read from the heap even
across function calls. That is built on the assumption that the pointers are
immutable and, therefore, that the GC is non-moving. The generated code is
extremely efficient, even though I have not yet enabled LLVM's optimizations,
precisely because of all this shared immutable data.

If you wanted to add a copying GC to my VM, you would probably have to replace
every reuse of such an IR register with code that reloads the pointer,
generating a lot of redundant loads that would greatly degrade performance,
and then rely upon LLVM's optimization passes to clean it up again. However,
I doubt they have enough information to recover all of the lost performance.
So there is a fundamental conflict here, where a simple GC design decision has
a drastic effect on the IR generator.

Although it is theoretically possible to parameterize the IR generator
sufficiently to account for all possible combinations of GC designs, I suspect
the result would be a mess. Consequently, perhaps it would be better to
consider IR generation and the GC as a single entity and, instead, factor them
both out using a common high-level representation not dissimilar to JVM or CLR
bytecode in terms of functionality, but much more closely related to LLVM's IR?

-- 
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e
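To make the coupling concrete, here is a rough sketch of the two IR shapes
being contrasted, written in LLVM 2.x-style assembly. The functions
@do_something and @use and the field argument are hypothetical; this is
illustrative only and is not taken from HLVM.

    ; Non-moving GC: a pointer loaded from the heap can be cached in an
    ; SSA register and reused across calls.
    define void @nonmoving(i8** %field) {
    entry:
      %p = load i8** %field          ; GC pointer loaded once
      call void @do_something()
      call void @use(i8* %p)         ; still valid: objects never move
      ret void
    }

    ; Copying GC: every call is a potential collection, so the pointer is
    ; spilled to a root slot (which the collector must be told about by some
    ; mechanism) and reloaded afterwards.
    define void @copying(i8** %field) {
    entry:
      %slot = alloca i8*             ; root slot visible to the collector
      %p = load i8** %field
      store i8* %p, i8** %slot
      call void @do_something()
      %p2 = load i8** %slot          ; reload: the object may have moved
      call void @use(i8* %p2)
      ret void
    }

    declare void @do_something()
    declare void @use(i8*)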
On Feb 26, 2009, at 6:17 PM, Jon Harrop wrote:
> Although it is theoretically possible to parameterize the IR generator
> sufficiently to account for all possible combinations of GC designs I
> suspect the result would be a mess. Consequently, perhaps it would be
> better to consider IR generation and the GC as a single entity and,
> instead, factor them both out using a common high-level representation
> not dissimilar to JVM or CLR bytecode in terms of functionality but much
> more closely related to LLVM's IR?

This falls into the category of "future work to improve codegen". Tagging
pointers as "magic for GC" is trivial now: just mark one LLVM "address space"
as being a GC heap, apply magic to that address space, and the problem is
solved. Let's get the basics working well first, though :)

-Chris
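A minimal sketch of what that might look like at the IR level. Address space 1
and the type and function names are arbitrary choices for illustration: any
pointer into the GC heap carries a distinct address space, so optimization
passes and a collector plugin can recognise GC references from the type alone.

    %obj = type { i32, i8 addrspace(1)* }   ; hypothetical object: header word + one GC reference

    ; Load the GC reference out of the object. The addrspace(1) annotation
    ; travels with the value, marking it as a pointer into the GC heap.
    define i8 addrspace(1)* @get_ref(%obj* %o) {
    entry:
      %fp = getelementptr %obj* %o, i32 0, i32 1
      %r  = load i8 addrspace(1)** %fp
      ret i8 addrspace(1)* %r
    }

In a real setup the object itself would presumably also live in address
space 1; it is kept in the default space here only to keep the types readable.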
Jon Harrop wrote:
> On Thursday 26 February 2009 17:25:56 Chris Lattner wrote:
>> In my ideal world, this would be:
>> [...]
>
> I fear that the IR generator and GC are too tightly coupled.
>
> [...]
>
> Although it is theoretically possible to parameterize the IR generator
> sufficiently to account for all possible combinations of GC designs I
> suspect the result would be a mess. Consequently, perhaps it would be
> better to consider IR generation and the GC as a single entity and,
> instead, factor them both out using a common high-level representation
> not dissimilar to JVM or CLR bytecode in terms of functionality but much
> more closely related to LLVM's IR?

IMHO, it would be better if support for GC was dropped from llvm altogether.
I say this having written a copying GC for my VM toolkit, which also uses
llvm to do its JIT compilation. And it works just fine! I have simply avoided
the intrinsics.

The problem with llvm is that to write a GC using the llvm intrinsics, you
have to mess around with the code-gen part of llvm.

When I add a generational collector to my toolkit in the future, it will be
easy to specify write barriers directly in the IR. Modifying code-gen to
handle the intrinsics is a task I would rather avoid.

Mark.
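For reference, this is roughly what specifying the write barrier directly in
the IR, with no GC intrinsics, looks like. The @remember runtime hook is a
hypothetical name; the point is that the front-end simply emits an ordinary
call before the store, so no code-gen support is required.

    declare void @remember(i8**)   ; hypothetical runtime hook: records the written slot

    ; Store a reference into a heap field, preceded by an explicit
    ; generational write barrier expressed as plain IR.
    define void @write_field(i8** %field, i8* %val) {
    entry:
      call void @remember(i8** %field)
      store i8* %val, i8** %field
      ret void
    }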
Mark Shannon wrote:
> Jon Harrop wrote:
>> [...]
>
> IMHO, it would be better if support for GC was dropped from llvm
> altogether. I say this having written a copying GC for my VM toolkit,
> which also uses llvm to do its JIT compilation. And it works just fine!
>
> I have simply avoided the intrinsics.
>
> The problem with the llvm is that to write a GC using the llvm
> intrinsics, you have to mess around with the code-gen part of llvm.
>
> When I want to add a generational collector to my toolkit in the future,
> it is easy to specify write-barriers in the IR. Modifying code-gen to
> handle the intrinsics is a task I would rather avoid.

People need very different things for GC. All we need for Java is the ability
to dump all live object pointers onto the stack and to generate a bitmap that
describes which words on the stack are object pointers.
Also, the optimizer has to be taught that while objects might move during a
collection, this will never cause a valid object pointer to become NULL, nor
will it change the contents of any non-reference fields. I don't think that
this is an enormous task.

Andrew.
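A rough sketch of the IR shape that requirement implies, using the existing
llvm.gcroot intrinsic. The "mygc" strategy name, @new_object and
@maybe_collect are hypothetical: every live object pointer sits in a marked
stack slot at each call, and a collector plugin can then emit a per-call-site
bitmap; for this frame, with roots in slots 0 and 1, the map would be 0b11.

    declare void @llvm.gcroot(i8**, i8*)
    declare i8* @new_object()        ; hypothetical allocator
    declare void @maybe_collect()    ; hypothetical safepoint

    define void @java_like() gc "mygc" {
    entry:
      %r0 = alloca i8*               ; stack slot 0: holds an object pointer
      %r1 = alloca i8*               ; stack slot 1: holds an object pointer
      call void @llvm.gcroot(i8** %r0, i8* null)
      call void @llvm.gcroot(i8** %r1, i8* null)
      store i8* null, i8** %r0
      store i8* null, i8** %r1
      %a = call i8* @new_object()
      store i8* %a, i8** %r0
      %b = call i8* @new_object()
      store i8* %b, i8** %r1
      call void @maybe_collect()     ; at this call site the frame bitmap is 0b11
      ret void
    }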
Jon Harrop wrote:
> On Thursday 26 February 2009 17:25:56 Chris Lattner wrote:
>> [...]
>
> I fear that the IR generator and GC are too tightly coupled.
>
> For example, the IR I am generating shares pointers read from the heap
> even across function calls. That is built on the assumption that the
> pointers are immutable and, therefore, that the GC is non-moving. The
> generated code is extremely efficient even though I have not even enabled
> LLVM's optimizations yet precisely because of all this shared immutable
> data.
>
> [...]

Most copying collector designs that I have seen rely on explicit coordination
from the mutator, which means that object addresses won't change at any
arbitrary time, but only during a "sync point". It's up to the mutator to call
"sync" at fairly regular intervals (although calls to allocate memory for an
object are implicitly sync points as well). During that call, the heap may get
re-arranged, but between sync points pointer values are stable. So you can
still share pointer values most of the time.

-- Talin
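A small sketch of that pattern, with all names hypothetical and @work assumed
to be non-allocating: within one iteration the loaded pointer can be shared
freely, but it is reloaded after each explicit sync, which is the only place
the collector is allowed to move objects.

    declare void @sync()       ; hypothetical: collection may only happen inside this call
    declare void @work(i8*)    ; hypothetical, non-allocating

    define void @loop(i8** %field, i32 %n) {
    entry:
      br label %head
    head:
      %i = phi i32 [ 0, %entry ], [ %i.next, %head ]
      %p = load i8** %field          ; stable until the next sync point
      call void @work(i8* %p)
      call void @sync()              ; objects may move only here
      %i.next = add i32 %i, 1
      %done = icmp eq i32 %i.next, %n
      br i1 %done, label %exit, label %head
    exit:
      ret void
    }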
On Feb 26, 2009, at 21:17, Jon Harrop wrote:
> For example, the IR I am generating shares pointers read from the
> heap even across function calls. That is built on the assumption
> that the pointers are immutable and, therefore, that the GC is
> non-moving. [...]
>
> If you wanted to add a copying GC to my VM you would probably
> replace every lookup of the IR register with a lookup of the code to
> reload it, generating a lot of redundant loads that would greatly
> degrade performance so you would rely upon LLVM's optimization
> passes to clean it up again. However, I bet they do not have enough
> information to recover all of the lost performance. So there is a
> fundamental conflict here where a simple GC design decision has a
> drastic effect on the IR generator.

I agree this could be better. I think it would be prudent of you, being aware
of this problem, to structure your compiler so as to limit the number of
pieces of code which would be affected when you switch to a copying collector.
I think such structure in the back-end of your compiler addresses the same
needs as your "slightly less low-level virtual machine" would.

In order to address it thoroughly, the LLVM GC infrastructure needs to track
register roots so that it doesn't need to conservatively reload from the stack
so frequently. This would likely entail a change to the IR (either a 'GC'
address space as Chris suggests, a new intrinsic to 'tag' a value as a GC
pointer, or even a change to the Type hierarchy)--another reason to isolate
GC-handling code in your compiler.

— Gordon
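For context, this is roughly what the current intrinsics force a front-end to
emit, here using the built-in "shadow-stack" strategy with hypothetical
@do_something and @use functions: the root lives in memory, so after every
call the value is conservatively reloaded from its slot. Register-root
tracking would let %p stay live in a register across the call instead.

    declare void @llvm.gcroot(i8**, i8*)
    declare void @do_something()
    declare void @use(i8*)

    define void @conservative(i8** %field) gc "shadow-stack" {
    entry:
      %root = alloca i8*
      call void @llvm.gcroot(i8** %root, i8* null)
      %p = load i8** %field
      store i8* %p, i8** %root
      call void @do_something()
      %p.reloaded = load i8** %root   ; forced reload from the stack slot
      call void @use(i8* %p.reloaded)
      ret void
    }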
On Friday 27 February 2009 18:42:13 Gordon Henriksen wrote:
> I agree this could be better. I think it would be prudent of you,
> being aware of this problem, to structure your compiler so as to limit
> the number of pieces of code which would be affected when you switch
> to a copying collector.

I think that would make my VM a lot more complicated for no clear practical
gain.

> In order to address it thoroughly, the LLVM GC infrastructure needs to
> track register roots so that it doesn't need to conservatively reload
> from the stack so frequently.

F# suffers because null references are typeless on .NET, but you want to use
typed null references to represent the empty list, the None option type
constructor and many other things. The F# team cannot fix their VM, so they
were forced to settle on an inefficient implementation of lists in a language
where lists are ubiquitous, and an implementation of the option type that is
efficient but does not pretty print correctly due to insufficient type
information.

I can and did fix my VM by representing reference types as an aggregate
containing a pointer to shared data (the type) and a pointer to individual
data (the value). Thus you can have a typed null pointer by having a pointer
to a valid type and a null pointer for the individual data. However, that
means my roots are now aggregate values. So your proposed GC infrastructure
for LLVM would need to be able to mark aggregate values as roots as well as
individual values.

Moreover, even if we went through all this work of improving LLVM to support
this GC infrastructure and altering my VM to traverse the system stack instead
of my current shadow stack, I am not even convinced it would be a significant
improvement over what I already have.

> This would likely entail a change to the
> IR (either a 'GC' address space as Chris suggests, a new intrinsic to
> 'tag' a value as a GC pointer, or even a change to the Type
> hierarchy)

Can you elaborate on changing the Type hierarchy?

> --another reason to isolate GC-handling code in your compiler.

I just cannot see how the GC-handling code can be isolated in such a way that
the result is better than just having separate HLVMs for separate design
decisions.

-- 
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e
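For reference, the representation described here might look something like the
following in IR; the type and function names are hypothetical. Each reference
is a two-word first-class struct, so a "typed null" is a valid type pointer
paired with a null data pointer, and a GC root is the whole pair rather than a
single pointer.

    %typeinfo = type opaque                 ; shared, per-type data
    %ref = type { %typeinfo*, i8* }         ; HLVM-style reference: { type, value }

    ; Build the typed null of a given type: valid type pointer, null data.
    define %ref @null_of(%typeinfo* %t) {
    entry:
      %r0 = insertvalue %ref undef, %typeinfo* %t, 0
      %r1 = insertvalue %ref %r0, i8* null, 1
      ret %ref %r1
    }

    ; Test whether a reference is null: only the data pointer matters,
    ; the type pointer stays valid.
    define i1 @is_null(%ref %r) {
    entry:
      %data = extractvalue %ref %r, 1
      %isnull = icmp eq i8* %data, null
      ret i1 %isnull
    }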