Mark Shannon wrote:
> Jon Harrop wrote:
>> On Thursday 26 February 2009 17:25:56 Chris Lattner wrote:
>>> In my ideal world, this would be:
>>>
>>> 1. Subsystems [with clean interfaces] for thread management,
>>> finalization, object model interactions, etc.
>>> 2. Within different high-level designs (e.g. copying, mark/sweep, etc)
>>> there can be replaceable policy components etc.
>>> 3. A couple of actual GC implementations built on top of #1/2.
>>> Ideally there would only be a couple of high-level collectors that can
>>> be parameterized by replacing subsystems and policies.
>>> 4. A very simple language implementation that uses the facilities, on
>>> the order of complexity as the kaleidoscope tutorial.
>>>
>>> As far as I know, there is nothing that prevents this from happening
>>> today, we just need leadership in the area to drive it. To avoid the
>>> "ivory tower" problem, I'd strongly recommend starting with a simple
>>> GC and language and get the whole thing working top to bottom. From
>>> there, the various pieces can be generalized out etc. This ensures
>>> that there is always a *problem being solved* and something that works
>>> and is testable.
>> I fear that the IR generator and GC are too tightly coupled.
>>
>> For example, the IR I am generating shares pointers read from the heap even
>> across function calls. That is built on the assumption that the pointers are
>> immutable and, therefore, that the GC is non-moving. The generated code is
>> extremely efficient even though I have not even enabled LLVM's optimizations
>> yet precisely because of all this shared immutable data.
>>
>> If you wanted to add a copying GC to my VM you would probably replace every
>> lookup of the IR register with a lookup of the code to reload it, generating
>> a lot of redundant loads that would greatly degrade performance so you would
>> rely upon LLVM's optimization passes to clean it up again. However, I bet
>> they do not have enough information to recover all of the lost performance.
>> So there is a fundamental conflict here where a simple GC design decision has
>> a drastic effect on the IR generator.
>>
>> Although it is theoretically possible to parameterize the IR generator
>> sufficiently to account for all possible combinations of GC designs I suspect
>> the result would be a mess. Consequently, perhaps it would be better to
>> consider IR generation and the GC as a single entity and, instead, factor
>> them both out using a common high-level representation not dissimilar to JVM
>> or CLR bytecode in terms of functionality but much more closely related to
>> LLVM's IR?
>>
>
> IMHO, it would be better if support for GC was dropped from llvm
> altogether. I say this having written a copying GC for my VM toolkit,
> which also uses llvm to do its JIT compilation. And it works just fine!
>
> I have simply avoided the intrinsics.
>
> The problem with the llvm is that to write a GC using the llvm
> intrinsics, you have to mess around with the code-gen part of llvm.
>
> When I want to add a generational collector to my toolkit in the future,
> it is easy to specify write-barriers in the IR. Modifying code-gen to
> handle the intrinsics is a task I would rather avoid.

People need very different things for GC. All we need for Java is the ability to dump all live object pointers into the stack, generate a bitmap that describes which words on the stack are object pointers. Also, the optimizer has to be taught that while objects might move during a collection, this will never cause a valid object pointer to become NULL nor will it change the contents of any non-reference fields.

I don't think that this is an enormous task.

Andrew.
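As a rough illustration of the bitmap idea above (not code from this thread), a collector could walk one stack frame as sketched below. The names `scan_frame`, `root_visitor`, `forwarding` and `forward_slot` are hypothetical, and the one-bit-per-word layout is an assumption. The visitor receives the address of each slot, so a moving collector can overwrite the pointer in place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: receives the address of each slot holding an object pointer. */
typedef void (*root_visitor)(void **slot, void *ctx);

/* Visit every frame word whose bit is set in `bitmap` (bit i describes frame[i]).
 * Returns the number of slots visited. */
static size_t scan_frame(void **frame, size_t nwords, const uint8_t *bitmap,
                         root_visitor visit, void *ctx)
{
    size_t found = 0;
    assert(frame != NULL && bitmap != NULL && visit != NULL);
    for (size_t i = 0; i < nwords; i++) {
        if (bitmap[i / 8] & (1u << (i % 8))) {
            visit(&frame[i], ctx);
            found++;
        }
    }
    return found;
}

/* Illustrative forwarding record: one object that has moved from `from` to `to`. */
struct forwarding {
    void *from;
    void *to;
};

/* Example visitor: rewrite the slot if it points at the moved object. */
static void forward_slot(void **slot, void *ctx)
{
    const struct forwarding *fwd = ctx;
    if (*slot == fwd->from)
        *slot = fwd->to;
}
```

Words whose bit is clear are never touched, so an integer that happens to look like an address is left alone.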
Mark Shannon
2009-Feb-27 11:38 UTC
[LLVMdev] Why LLVM should NOT have garbage collection intrinsics
Hi,

I realise this might be a bit controversial ;)

Suppose I am writing a VM (such as VMKit), or a VM toolkit, and I want to add a generational GC.

If I want to use the llvm.gcwrite intrinsic for my write barrier then I need to write a GC and then implement for each and *every* backend the gcwrite intrinsic for my write barrier.

Now, if I don't use the intrinsic, I need to write my write barrier *once* in llvm IR.

All I need is a nop intrinsic and ensure that all objects collectable by the GC are reachable from some global variable. This ensures that the optimisation phases know that they cannot rely on memory objects not moving at GC safe points.

I have a *copying* collector that works with llvm JITted code, so I know that this works :)

In fact, this leads to a more general point: ANY intrinsic that is not guaranteed to be implemented by ALL backends is useless, since a front-end that uses llvm to target multiple architectures MUST avoid them.

Mark.
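To make the "write barrier written once, as ordinary code" idea concrete, here is a minimal card-marking sketch in C. It is not Mark's barrier, which the thread does not show. The heap array, the 512-byte card size and the names `write_barrier`, `heap_words` and `card_table` are all assumptions chosen for illustration. A front-end would emit the equivalent loads, shifts and stores as plain IR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CARD_SHIFT 9        /* assumed card size: 512 bytes */
#define HEAP_WORDS 4096     /* assumed toy heap of pointer-sized slots */
#define NUM_CARDS ((HEAP_WORDS * sizeof(void *)) >> CARD_SHIFT)

static void *heap_words[HEAP_WORDS];   /* stand-in for the managed heap */
static uint8_t card_table[NUM_CARDS];  /* one dirty byte per card */

/* Store `new_value` into a heap field and mark the field's card dirty,
 * so a generational collector can later rescan only dirty cards. */
static void write_barrier(void **field, void *new_value)
{
    size_t offset = (size_t)((char *)field - (char *)heap_words);
    assert(offset < sizeof heap_words);   /* field must lie inside the toy heap */
    *field = new_value;
    card_table[offset >> CARD_SHIFT] = 1;
}
```

Nothing here is target-specific, which is the portability point being argued. Whether the optimizer can still reason well about such a barrier is a separate question.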
Gordon Henriksen
2009-Feb-27 16:22 UTC
[LLVMdev] Why LLVM should NOT have garbage collection intrinsics
Hi Mark,

I don't think anyone will dispute that it's easier to hack up a shadow stack (or plug into a conservative collector) to get up and running with GC. That is absolutely the route to go if portability trumps performance.

If you review the mailing list history, I think you'll also find that developers who do care about performance have been disappointed with the impact of using a shadow stack, either managed with LLVM intrinsics or by hand. Even the current state of LLVM GC (static stack maps) is a significant performance improvement, but it absolutely does require support from the code generator. Return addresses must be mapped to stack maps, and only the code generator knows where return addresses lie and how the stack frame is laid out.

The ultimate end goal is to support schemes with still-lower execution overhead. The next step for LLVM GC would be elimination of the reload penalty for using GC intrinsics with a copying collector. This, again, requires that the code generator perform bookkeeping for GC pointers.

On Feb 27, 2009, at 06:38, Mark Shannon wrote:
> If I want to use the llvm.gcwrite intrinsic for my write barrier
> then I need to write a GC and then implement for each and *every*
> backend the gcwrite intrinsic for my write barrier.

I'm not sure where such vociferous concern on this subject arises. All the extant collector plugins I'm aware of operate in conjunction with the target-independent framework and require exactly zero code within each target backend.

-- Gordon
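For readers unfamiliar with static stack maps, the lookup described above (return address to frame description) might look roughly like the sketch below. The struct layout and the names `stack_map_entry` and `find_stack_map` are hypothetical and are not LLVM's actual data structures. The sketch assumes the code generator has emitted a table sorted by return address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stack map record, one per call site (safe point). */
struct stack_map_entry {
    uintptr_t return_address;  /* address following the call instruction */
    uint32_t  frame_words;     /* size of the frame in words */
    uint32_t  live_bitmap;     /* bit i set: frame word i holds a GC pointer */
};

/* Binary search over a table sorted by return_address.
 * Returns NULL when the address is not a known safe point. */
static const struct stack_map_entry *
find_stack_map(const struct stack_map_entry *table, size_t n,
               uintptr_t return_address)
{
    size_t lo = 0, hi = n;
    assert(table != NULL || n == 0);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (table[mid].return_address == return_address)
            return &table[mid];
        if (table[mid].return_address < return_address)
            lo = mid + 1;
        else
            hi = mid;
    }
    return NULL;
}
```

During a collection, the runtime would walk the stack frame by frame and look up each saved return address this way. The table's contents can only come from the code generator, which is the dependency Gordon describes.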