On Fri, Apr 1, 2011 at 1:58 AM, Jay Foad <jay.foad at gmail.com> wrote:

> On 30 March 2011 19:08, Talin <viridia at gmail.com> wrote:
> > llvm.gc.declare(alloca, meta). This intrinsic marks an alloca as a garbage
> > collection root. It can occur anywhere within a function, and lasts either
> > until the end of the function, or until a matching call to
> > llvm.gc.undeclare().
> > llvm.gc.undeclare(alloca). This intrinsic unmarks an alloca, so that it is
> > no longer considered a root from that point onward.
>
> Hi Talin,
>
> What changes to code generation would be necessary to support this?

I can only describe this in abstract terms, since I know very little about
LLVM's code generation. (I am primarily a user of LLVM, not a developer of
it, although I have made a few minor contributions.)

> Is there any intention of supporting a collector that has static stack
> maps for each function, i.e. a table telling you, for each point in
> the code, where all the roots are on the stack (similar to unwind info
> for exception handling)? If so, I think it's a bit dodgy to use

That is already supported in the current LLVM. The changes I am proposing
are merely an extension of what we have now, allowing frontends to emit more
efficient code.

> intrinsic function calls to mark the start/end of the lifetime of a GC
> root, because function calls are inherently dynamic. For example, you
> can't represent this code with a static stack map:
>
>   if (cond) {
>     llvm.gc.declare(foo, bar);
>   }
>   ...
>   // foo may or may not be a root here
>   ...
>   if (cond) { // same condition as above
>     llvm.gc.undeclare(foo);
>   }

You would need to do the same as what is done today: move the declare
outside of the condition, and initialize the variable to a null state, such
that the garbage collector will know to ignore the variable. In the
if-block, you then overwrite the variable with actual data.
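As a rough illustration of that pattern, here is a minimal sketch in plain C
rather than LLVM IR. The names root_slot, count_live_roots and
live_roots_at_safe_point are hypothetical and exist only for this example;
they are not LLVM APIs, and a real collector would walk a stack map instead
of being handed the slots directly.

```c
#include <stddef.h>

/* Hypothetical stand-in for a stack root slot; not an LLVM API. */
typedef struct {
    void *ptr; /* NULL is the "null state": the collector ignores the slot */
} root_slot;

/* Hypothetical collector hook: counts the slots that hold a live pointer. */
static size_t count_live_roots(const root_slot *slots, size_t n) {
    size_t live = 0;
    for (size_t i = 0; i < n; i++) {
        if (slots[i].ptr != NULL) {
            live++;
        }
    }
    return live;
}

/* Models the example: the root is declared unconditionally in a null state
 * and only overwritten with actual data inside the if-block. Returns the
 * number of live roots the collector would see at a later safe point. */
static size_t live_roots_at_safe_point(int cond) {
    static int object;          /* stands in for a heap object */
    root_slot foo = { NULL };   /* declare outside the condition */
    if (cond) {
        foo.ptr = &object;      /* overwrite with actual data */
    }
    return count_live_roots(&foo, 1);
}
```

In this sketch the slot is always part of the frame, so a static map can
always list it; whether it currently matters is decided by its contents, not
by which branch was taken.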

The difference is that in today's LLVM, the variable declaration has to be
in the first block, and lasts for the entire function - so you have to
initialize all of your stack variables to a null state in the first block.
By extending the notation to allow stack roots to have a limited lifetime,
we can avoid having to initialize the stack root unless we actually enter
the block where it is defined.

I should mention that the declare/undeclare intrinsics are the least
important part of this proposal. The important part is the ability to
declare SSA values as roots - that is what will make a world of difference
to folks like myself that are writing frontends that use garbage collection.

> Even if you're careful not to generate code like this in your front
> end, how can you guarantee that LLVM optimisation passes won't
> introduce it?
>
> The old llvm.dbg.region.start and llvm.dbg.region.end had the same
> kind of problem.
>
> Thanks,
> Jay.

--
-- Talin

>>   if (cond) {
>>     llvm.gc.declare(foo, bar);
>>   }
>>   ...
>>   // foo may or may not be a root here
>>   ...
>>   if (cond) { // same condition as above
>>     llvm.gc.undeclare(foo);
>>   }
>>
> You would need to do the same as what is done today: Move the declare
> outside of the condition, and initialize the variable to a null state, such
> that the garbage collector will know to ignore the variable. In the
> if-block, you then overwrite the variable with actual data.

I think there's a real problem here. The code in LLVM that creates the
static stack map will want the llvm.gc.declare / llvm.gc.undeclare
calls to be "well-formed" in some sense: nicely paired and nested, with
each declare and undeclare being at the same depth inside any loops or
"if"s in the CFG. But consider:

1. Your front end generates well-formed llvm.gc.* calls.
2. The LLVM optimisers kick in and do jump threading, tail merging and
   whatnot.
3. The code that creates the static stack map finds a complete mess.

(And by this point I think it would be too late to do any
transformations like you suggest above: "Move the declare outside of
the condition ...")

I think it's a good idea to have information in the IR about the
lifetime of GC roots, but I think intrinsic calls are the wrong
representation for that information.

This is very similar to the problem of representing lexical scopes in
debug info. The llvm.dbg.region.* intrinsics were the wrong way of
doing it, because of the problems I mentioned above. Now we use
metadata attached to each instruction to say what scope it is in,
which is much better, because it is robust against optimisation
passes.

Jay.

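One way to make "well-formed" concrete is sketched below in plain C. This is
only an assumed model for illustration, not how LLVM builds stack maps: each
path into a program point is a list of declare/undeclare events, the set of
declared roots is a bitmask, and a point can have a single static stack-map
entry only if every incoming path agrees on that set. The names event,
roots_after and join_is_static are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two intrinsics as events on a path. */
typedef enum { EV_DECLARE, EV_UNDECLARE } event_kind;

typedef struct {
    event_kind kind;
    unsigned root;  /* index of the root; assumed to be below 32 */
} event;

/* Replays the events along one path and returns the set of declared
 * roots as a bitmask (bit i set means root i is declared). */
static unsigned roots_after(const event *path, size_t n) {
    unsigned mask = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned bit = 1u << path[i].root;
        if (path[i].kind == EV_DECLARE) {
            mask |= bit;
        } else {
            mask &= ~bit;
        }
    }
    return mask;
}

/* A join point can be described by one static entry only if all
 * incoming paths arrive with the same set of declared roots. */
static bool join_is_static(const unsigned *incoming, size_t n) {
    for (size_t i = 1; i < n; i++) {
        if (incoming[i] != incoming[0]) {
            return false;
        }
    }
    return true;
}
```

In the quoted example, the path through the first "if" arrives with foo
declared and the path around it arrives with nothing declared, so the check
fails at the point marked "foo may or may not be a root here".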
Reid Kleckner
2011-Apr-01 19:52 UTC
[LLVMdev] Proposal for improving llvm.gcroot (summarized)
On Fri, Apr 1, 2011 at 2:17 PM, Jay Foad <jay.foad at gmail.com> wrote:

> This is very similar to the problem of representing lexical scopes in
> debug info. The llvm.dbg.region.* intrinsics were the wrong way of
> doing it, because of the problems I mentioned above. Now we use
> metadata attached to each instruction to say what scope it is in,
> which is much better, because it is robust against optimisation
> passes.

Of course, using metadata isn't acceptable for gc because it can be
dropped, and adding something like it for gc wouldn't be acceptable to
people writing optimizations.

Reid