On Friday 27 February 2009 18:42:13 Gordon Henriksen wrote:> I agree this could be better. I think it would be prudent of you, > being aware of this problem, to structure your compiler so as to limit > the number of pieces of code which would be effected when you switch > to a copying collector.I think that would make my VM a lot more complicated for no clear practical gain.> I order to address it thoroughly, the LLVM GC infrastructure needs to > track register roots so that it doesn't need to conservatively reload > from the stack so frequently.F# suffers because null references are typeless on .NET but you want to use typed null references to represent the empty list and the None option type constructor and many other things. The F# team cannot fix their VM so they were forced to settle on an inefficient implementation of lists in a language where lists are ubiquitous and an implementation of the option type that is efficient but does not pretty print correctly due to insufficient type information. I can and did fix my VM by representing reference types as an aggregate containing a pointer to shared data (the type) and a pointer to individual data (the value). Thus you can have a typed null pointer by having a pointer to a valid type and a null pointer for the individual data. However, that means my roots are now aggregate values. So your proposed GC infrastructure for LLVM would need to be able to mark aggregate values as roots as well as individual values. Moreover, even if we went through all this work of improving LLVM to support this GC infrastructure and altering my VM to traverse the system stack instead of my current shadow stack, I am not even convinced it would be a significant improvement over what I already have.> This would likely entail a change to the > IR (either a 'GC' address space as Chris suggests, a new intrinsic to > 'tag' a value as a GC pointer, or even a change to the Type > hierarchy)Can you elaborate on a changing of the type hierarchy?> --another reason to isolate GC-handling code in your compiler.I just cannot see how the GC-handling code can be isolated in such a way that the result is better than just having separate HLVMs for separate design decisions. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e
On 2009-02-27, at 19:58, Jon Harrop wrote:> On Friday 27 February 2009 18:42:13 Gordon Henriksen wrote: >> I agree this could be better. I think it would be prudent of you, >> being aware of this problem, to structure your compiler so as to >> limit >> the number of pieces of code which would be effected when you switch >> to a copying collector. > > I think that would make my VM a lot more complicated for no clear > practical > gain. > >> I order to address it thoroughly, the LLVM GC infrastructure needs to >> track register roots so that it doesn't need to conservatively reload >> from the stack so frequently. > > F# suffers because null references are typeless on .NET but you want > to use > typed null references to represent the empty list and the None > option type > constructor and many other things. The F# team cannot fix their VM > so they > were forced to settle on an inefficient implementation of lists in a > language > where lists are ubiquitous and an implementation of the option type > that is > efficient but does not pretty print correctly due to insufficient type > information. > > I can and did fix my VM by representing reference types as an > aggregate > containing a pointer to shared data (the type) and a pointer to > individual > data (the value). Thus you can have a typed null pointer by having a > pointer > to a valid type and a null pointer for the individual data. > > However, that means my roots are now aggregate values. So your > proposed GC > infrastructure for LLVM would need to be able to mark aggregate > values as > roots as well as individual values. Moreover, even if we went > through all > this work of improving LLVM to support this GC infrastructure and > altering my > VM to traverse the system stack instead of my current shadow stack, > I am not > even convinced it would be a significant improvement over what I > already > have.I agree you likely won't see any significant overhead reduction vs. your current implementation unless you adopt a copying collector.>> This would likely entail a change to the >> IR (either a 'GC' address space as Chris suggests, a new intrinsic to >> 'tag' a value as a GC pointer, or even a change to the Type >> hierarchy) > > Can you elaborate on a changing of the type hierarchy?I've considered introducing a separate GCPointer type.>> --another reason to isolate GC-handling code in your compiler. > > I just cannot see how the GC-handling code can be isolated in such a > way that > the result is better than just having separate HLVMs for separate > design > decisions.I simply suggest you group functions to 'emit read barrier', 'allocate GC root', and soforth together in your compiler such that they can be easily changed or updated in one place if it proves necessary. — Gordon
On Saturday 28 February 2009 02:52:21 Gordon Henriksen wrote:> On 2009-02-27, at 19:58, Jon Harrop wrote: > > However, that means my roots are now aggregate values. So your > > proposed GC > > infrastructure for LLVM would need to be able to mark aggregate > > values as > > roots as well as individual values. Moreover, even if we went > > through all > > this work of improving LLVM to support this GC infrastructure and > > altering my > > VM to traverse the system stack instead of my current shadow stack, > > I am not > > even convinced it would be a significant improvement over what I > > already > > have. > > I agree you likely won't see any significant overhead reduction vs. > your current implementation unless you adopt a copying collector.I was saying that I do not believe I will see a significant improvement if I do adopt a copying collector.> > Can you elaborate on a changing of the type hierarchy? > > I've considered introducing a separate GCPointer type.Would that be able to handle aggregate GC reference types?> > I just cannot see how the GC-handling code can be isolated in such a > > way that > > the result is better than just having separate HLVMs for separate > > design > > decisions. > > I simply suggest you group functions to 'emit read barrier', 'allocate > GC root', and soforth together in your compiler such that they can be > easily changed or updated in one place if it proves necessary.I already have that but it is only a tiny part of supporting different GC designs. For example, it knows nothing of sync points. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e