Andrew Trick
2013-Oct-24 21:56 UTC
[LLVMdev] Interfacing llvm with a precise, relocating GC
On Oct 24, 2013, at 2:50 PM, Rafael Espíndola <rafael.espindola at gmail.com> wrote:

> On 24 October 2013 17:32, Sanjoy Das <sanjoy at azulsystems.com> wrote:
>> Hello llvm-dev!
>>
>> My colleagues and I are currently evaluating llvm's suitability as a
>> JIT compiler interfacing with a precise, relocating garbage collector.
>> While we couldn't find code or writeups that deal with the issues
>> specific to this design goal, it is entirely possible that we may have
>> missed something; we would appreciate references to relevant code or
>> writeups that people on this list may be aware of.
>
> This would be hard. Currently what we have support for is a non-moving
> GC where all the roots are in memory. Adding support for a non-moving
> GC with register roots would not be too hard and might be possible to
> reuse some of the recent stackmap work.
>
> For a moving GC you would probably have to change how we represent
> pointer arithmetic in the selection dag and MI. It would be quite a
> big change. CCing Andy and Patrick since they might have an idea of
> how much work that would be and what the costs and benefits for LLVM
> are.

100% agreement.

> Also to note is that there are plans to move away from selection dag,
> so it might be good to sync this work with whatever we end up using
> instead.

FYI: when this was talked about, I heard mention that GEPs should be lowered early in the IR->MI pipeline. I didn't hear any ideas that would make derived pointer tracking easier.

-Andy
Hi Rafael, Andrew,

Thank you for the prompt reply.

One approach we've been considering involves representing the constraint "pointers to heap objects are invalidated at every safepoint" somehow in the IR itself. So, if %a and %b are values the GC is interested in, the safepoint at the back-edge of a loop might look like:

  ; <label>: body
    %a = phi i32 [ %a.relocated, %body ] [ %a.initial_value, %pred ]
    %b = phi i32 [ %b.relocated, %body ] [ %b.initial_value, %pred ]
    ;; Use %a and %b

    ;; The safepoint starts here
    %a.relocated = @llvm.gc_relocate(%a)
    %b.relocated = @llvm.gc_relocate(%b)
    br %body

This allows us to not bother with relocating derived pointers pointing inside %a and %b, since it is semantically incorrect for llvm to reuse them in the next iteration of the loop. We lower gc_relocate to a pseudo opcode which is lowered into nothing after register allocation.

The problem is, of course, the performance penalty. Does it make sense to get the register allocator to "see" the gc_relocate instruction as a copy so that %a and %a.relocated get the same register / slot? Will that violate the intended semantics of gc_relocate (for example, after assigning the same register / slot to %a and %a.relocated, are there passes that will try to cache derived pointers across loop iterations)?

Thanks,
-- Sanjoy
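For readers who want to see the loop above as well-formed IR, here is a minimal sketch of the same shape. It rests on several assumptions that are not in the original message: it uses the opaque `ptr` type of recent LLVM releases instead of the `i32` in the post, it models the proposed intrinsic as an ordinary external function `@gc_relocate` (no such intrinsic existed when this was written), and `@may_trigger_gc` and `@use` are hypothetical runtime functions added only so the loop has a safepoint and an exit.

```llvm
; Sketch only: @gc_relocate, @may_trigger_gc and @use are hypothetical
; stand-ins, not real LLVM intrinsics or runtime entry points.
declare ptr @gc_relocate(ptr)
declare void @may_trigger_gc()
declare i1 @use(ptr)

define void @loop(ptr %a.initial_value) {
pred:
  br label %body

body:
  ; %a is either the incoming value or the value relocated at the safepoint.
  %a = phi ptr [ %a.relocated, %body ], [ %a.initial_value, %pred ]
  ; Derived pointer into %a; its last use is before the safepoint.
  %a.field = getelementptr i8, ptr %a, i64 8
  %continue = call i1 @use(ptr %a.field)

  ; The safepoint starts here.
  call void @may_trigger_gc()
  %a.relocated = call ptr @gc_relocate(ptr %a)
  br i1 %continue, label %body, label %exit

exit:
  ret void
}
```

Because `%a.field` is recomputed from the phi on every iteration, nothing derived from the old `%a` is carried across the back-edge, which is the property the proposal relies on.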
Philip Reames
2013-Oct-25 19:23 UTC
[LLVMdev] Interfacing llvm with a precise, relocating GC
On 10/24/13 3:42 PM, Sanjoy Das wrote:

> Hi Rafael, Andrew,
>
> Thank you for the prompt reply.
>
> One approach we've been considering involves representing the
> constraint "pointers to heap objects are invalidated at every
> safepoint" somehow in the IR itself.

To say this differently, every heap pointer either needs to be remapped in place or invalidated and reloaded on next use. In principle, you could use a "trap on access to unmapped page" style read barrier which wouldn't require this invariant, but that scheme is undesirable for other reasons (performance, unbounded pinning, etc.).

> So, if %a and %b are values the
> GC is interested in, the safepoint at the back-edge of a loop might
> look like:
>
> ; <label>: body
>   %a = phi i32 [ %a.relocated, %body ] [ %a.initial_value, %pred ]
>   %b = phi i32 [ %b.relocated, %body ] [ %b.initial_value, %pred ]
>   ;; Use %a and %b
>
>   ;; The safepoint starts here
>   %a.relocated = @llvm.gc_relocate(%a)
>   %b.relocated = @llvm.gc_relocate(%b)
>   br %body
>
> This allows us to not bother with relocating derived pointers pointing
> inside %a and %b, since it is semantically incorrect for llvm to reuse
> them in the next iteration of the loop. We lower gc_relocate to a
> pseudo opcode which is lowered into nothing after register allocation.
>
> The problem is, of course, the performance penalty. Does it make
> sense to get the register allocator to "see" the gc_relocate instruction
> as a copy so that they get the same register / slot? Will that
> violate the intended semantics of gc_relocate (for example, after
> assigning the same register / slot to %a and %a.relocated, are there
> passes that will try to cache derived pointers across loop
> iterations)?

Not a direct response, but building on the idea...

It seems like there might be an entire spectrum of interesting designs here. An initial correct, but slow, implementation might introduce explicit redefinitions during the initial construction of the IR. Alternatively, a separate pass could add these explicit relocations before a problematic point in the pipeline if the set of "interesting pointers" could be reliably identified. This would potentially allow incremental performance improvement over time with incremental effort.

I haven't yet spent the time to establish whether the set of "interesting pointers" could be identified reliably, but I suspect it probably could.

Philip
On Thu, Oct 24, 2013 at 6:42 PM, Sanjoy Das <sanjoy at azulsystems.com> wrote:

> Hi Rafael, Andrew,
>
> Thank you for the prompt reply.
>
> One approach we've been considering involves representing the
> constraint "pointers to heap objects are invalidated at every
> safepoint" somehow in the IR itself. So, if %a and %b are values the
> GC is interested in, the safepoint at the back-edge of a loop might
> look like:
>
> ; <label>: body
>   %a = phi i32 [ %a.relocated, %body ] [ %a.initial_value, %pred ]
>   %b = phi i32 [ %b.relocated, %body ] [ %b.initial_value, %pred ]
>   ;; Use %a and %b
>
>   ;; The safepoint starts here
>   %a.relocated = @llvm.gc_relocate(%a)
>   %b.relocated = @llvm.gc_relocate(%b)
>   br %body
>
> This allows us to not bother with relocating derived pointers pointing
> inside %a and %b, since it is semantically incorrect for llvm to reuse
> them in the next iteration of the loop.

This is the right general idea, but you can already express this constraint in LLVM as it exists today, by using llvm.gcroot(). As you noted, this also solves the interior-pointer problem by making it the front end's job to convey to LLVM when it would/would not be safe to cache interior pointers across loop iterations. The invariant that a front end must maintain is that any pointer which is live across a given potential GC point must be reloaded from its root after a (relocating) GC might have occurred.

This falls naturally out of the viewpoint that %a is not an object pointer; it's the name of an object pointer's value at a given point in time. So, of course, whenever that object pointer's value might change, there must be a new name.

The fact that the mutable memory associated with a gcroot() is allocated on the stack (rather than, say, a machine register) is an implementation detail; fixing it doesn't require altering the (conceptual) interface for LLVM's existing GC support, AFAICT.

> We lower gc_relocate to a
> pseudo opcode which is lowered into nothing after register allocation.
>
> The problem is, of course, the performance penalty. Does it make
> sense to get the register allocator to "see" the gc_relocate instruction
> as a copy so that they get the same register / slot? Will that
> violate the intended semantics of gc_relocate (for example, after
> assigning the same register / slot to %a and %a.relocated, are there
> passes that will try to cache derived pointers across loop
> iterations)?
>
> Thanks,
> -- Sanjoy
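The reload-from-root invariant described in this reply can be sketched in IR roughly as follows. This is an illustration under stated assumptions, not code from the thread: it uses the opaque `ptr` syntax of recent LLVM releases (releases from the time of this thread would spell the types as `i8**` and `i8*`), the "shadow-stack" strategy name is chosen only as a built-in example, and `@may_trigger_gc` and `@use` are hypothetical runtime functions.

```llvm
; Sketch only: @may_trigger_gc and @use are hypothetical runtime functions.
declare void @llvm.gcroot(ptr %ptrloc, ptr %metadata)
declare void @may_trigger_gc()
declare i1 @use(ptr)

define void @loop(ptr %a.initial_value) gc "shadow-stack" {
entry:
  ; The root slot is an alloca registered in the entry block.
  %a.root = alloca ptr
  call void @llvm.gcroot(ptr %a.root, ptr null)
  store ptr %a.initial_value, ptr %a.root
  br label %body

body:
  ; Reload from the root: a new SSA name for the possibly moved object.
  %a = load ptr, ptr %a.root
  ; Interior pointer; it is only used before the potential GC point.
  %a.field = getelementptr i8, ptr %a, i64 8
  %continue = call i1 @use(ptr %a.field)
  ; Potential GC point: the collector may rewrite the slot %a.root.
  call void @may_trigger_gc()
  br i1 %continue, label %body, label %exit

exit:
  ret void
}
```

Here the front end never keeps `%a` or `%a.field` live across the call that may collect; each iteration gets a fresh name by loading from the root slot, which is the "new name whenever the value might change" view taken above.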