thr3ads.net - llvm dev - [LLVMdev] Proposal for improving llvm.gcroot (summarized) [Apr 2011]

Talin

2011-Apr-01 17:34 UTC

[LLVMdev] Proposal for improving llvm.gcroot (summarized)

On Fri, Apr 1, 2011 at 1:58 AM, Jay Foad <jay.foad at gmail.com> wrote:
> On 30 March 2011 19:08, Talin <viridia at gmail.com> wrote:
> > llvm.gc.declare(alloca, meta). This intrinsic marks an alloca as a garbage
> > collection root. It can occur anywhere within a function, and lasts either
> > until the end of the function, or until a matching call to
> > llvm.gc.undeclare().
> > llvm.gc.undeclare(alloca). This intrinsic unmarks an alloca, so that it is
> > no longer considered a root from that point onward.
>
> Hi Talin,
>
> What changes to code generation would be necessary to support this?
>
I can only describe this in abstract terms, since I know very little about
LLVM's code generation. (I am primarily a user of LLVM, not a developer of
it, although I have made a few minor contributions.)

> Is there any intention of supporting a collector that has static stack
> maps for each function, i.e. a table telling you, for each point in
> the code, where all the roots are on the stack (similar to unwind info
> for exception handling)? If so, I think it's a bit dodgy to use
>
That is already supported in the current LLVM. The changes I am proposing
are merely an extension of what we have now, allowing frontends to emit more
efficient code.

> intrinsic function calls to mark the start/end of the lifetime of a GC
> root, because function calls are inherently dynamic. For example, you
> can't represent this code with a static stack map:
>
> if (cond) {
>  llvm.gc.declare(foo, bar);
> }
> ...
> // foo may or may not be a root here
> ...
> if (cond) { // same condition as above
>  llvm.gc.undeclare(foo);
> }
>
You would need to do the same as what is done today: Move the declare
outside of the condition, and initialize the variable to a null state, such
that the garbage collector will know to ignore the variable. In the
if-block, you then overwrite the variable with actual data.

The difference is that in today's LLVM, the variable declaration has to
be in the first block, and lasts for the entire function - so you have to
initialize all of your stack variables to a null state in the first block.
By extending the notation to allow stack roots to have a limited lifetime,
we can avoid having to initialize the stack root unless we actually enter
the block where it is defined.
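
A minimal sketch of the difference, written as a toy Python model rather than real LLVM IR. The frame is just a dict of root slots, and `scan_roots`, `run_current` and `run_proposed` are hypothetical names made up for illustration; the dict insert and delete only stand in for the proposed declare/undeclare intrinsics.

```python
def scan_roots(slots):
    """Objects a collector would trace: every registered slot that is not null."""
    return [obj for obj in slots.values() if obj is not None]


def run_current(cond):
    # Today's scheme: the root is registered and nulled in the entry block.
    slots = {"foo": None}
    seen = [scan_roots(slots)]          # a collection before the if-block
    if cond:
        slots["foo"] = "heap object"    # overwrite the null with real data
        seen.append(scan_roots(slots))  # a collection inside the if-block
    return seen


def run_proposed(cond):
    # Proposed scheme: nothing is registered until the block is entered.
    slots = {}
    seen = [scan_roots(slots)]
    if cond:
        slots["foo"] = "heap object"    # stands in for llvm.gc.declare
        seen.append(scan_roots(slots))
        del slots["foo"]                # stands in for llvm.gc.undeclare
    return seen
```

In this model both versions show the collector the same roots at each collection point; the only difference is that the second one does no null store on the path that never enters the block.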

I should mention that the declare/undeclare intrinsics are the least
important part of this proposal. The important part is the ability to
declare SSA values as roots - that is what will make a world of difference
to folks like myself that are writing frontends that use garbage collection.

> Even if you're careful not to generate code like this in your front
> end, how can you guarantee that LLVM optimisation passes won't
> introduce it?
>
> The old llvm.dbg.region.start and llvm.dbg.region.end had the same
> kind of problem.
>
> Thanks,
> Jay.
>


-- 
-- Talin

Jay Foad

2011-Apr-01 18:17 UTC

>> if (cond) {
>>  llvm.gc.declare(foo, bar);
>> }
>> ...
>> // foo may or may not be a root here
>> ...
>> if (cond) { // same condition as above
>>  llvm.gc.undeclare(foo);
>> }
>>
> You would need to do the same as what is done today: Move the declare
> outside of the condition, and initialize the variable to a null state, such
> that the garbage collector will know to ignore the variable. In the
> if-block, you then overwrite the variable with actual data.

I think there's a real problem here. The code in LLVM that creates the
static stack map will want the llvm.gc.declare / llvm.gc.undeclare
calls to be "well-formed" in some sense: nicely paired and nested,
with each declare and undeclare being at the same depth inside any
loops or "if"s in the CFG. But consider:

1. Your front end generates well-formed llvm.gc.* calls.
2. The LLVM optimisers kick in and do jump threading, tail merging and whatnot.
3. The code that creates the static stack map finds a complete mess.
(And by this point I think it would be too late to do any
transformations like you suggest above: "Move the declare outside of
the condition ...")
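
To make the "complete mess" concrete, here is a rough sketch of the kind of check a stack-map builder might run, assuming a toy CFG encoded as a Python dict. Nothing here is LLVM code: `transfer`, `root_states_at_entry` and the block names are hypothetical. The analysis is a plain forward dataflow that records, for each block, every possible answer to "is foo a root on entry?". A block with two answers cannot be described by a single static table entry.

```python
def transfer(effect, is_root):
    """Apply one block's effect to the 'foo is a root' flag."""
    if effect == "declare":
        return True
    if effect == "undeclare":
        return False
    return is_root  # block does not touch the root


def root_states_at_entry(cfg, entry):
    """cfg maps block name -> (effect, successors). Returns block -> set of flags."""
    states = {name: set() for name in cfg}
    states[entry].add(False)  # nothing is a root when the function starts
    worklist = [entry]
    while worklist:
        name = worklist.pop()
        effect, succs = cfg[name]
        outs = {transfer(effect, s) for s in states[name]}
        for succ in succs:
            if not outs <= states[succ]:
                states[succ] |= outs
                worklist.append(succ)
    return states


# The conditional declare/undeclare example quoted above.
conditional = {
    "entry": (None, ["then1", "mid"]),
    "then1": ("declare", ["mid"]),
    "mid": (None, ["then2", "exit"]),
    "then2": ("undeclare", ["exit"]),
    "exit": (None, []),
}

# Declare hoisted into the entry block, as Talin suggests.
hoisted = {
    "entry": ("declare", ["then1", "mid"]),
    "then1": (None, ["mid"]),
    "mid": (None, ["exit"]),
    "exit": (None, []),
}


def ambiguous_blocks(cfg):
    states = root_states_at_entry(cfg, "entry")
    return sorted(name for name, flags in states.items() if len(flags) > 1)
```

With these inputs, `ambiguous_blocks(conditional)` reports `mid`, `then2` and `exit`, while `ambiguous_blocks(hoisted)` is empty. The analysis is deliberately path-insensitive, like a static table: it cannot know that both "if"s test the same condition.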

I think it's a good idea to have information in the IR about the
lifetime of GC roots, but I think intrinsic calls are the wrong
representation for that information.

This is very similar to the problem of representing lexical scopes in
debug info. The llvm.dbg.region.* intrinsics were the wrong way of
doing it, because of the problems I mentioned above. Now we use
metadata attached to each instruction to say what scope it is in,
which is much better, because it is robust against optimisation
passes.
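
A toy illustration of that robustness argument, under the assumption that a pass may move an instruction without knowing what the marker calls mean. The instruction names, the `"inner"` scope label and both helper functions are invented for this sketch and do not correspond to any LLVM API.

```python
def scopes_from_markers(instrs):
    """Scope is implied by position between region.start and region.end."""
    scope, result = None, {}
    for ins in instrs:
        if ins == "region.start":
            scope = "inner"
        elif ins == "region.end":
            scope = None
        else:
            result[ins] = scope
    return result


def scopes_from_tags(instrs):
    """Scope travels with each instruction as a (name, scope) pair."""
    return {name: scope for name, scope in instrs}


# Marker style, before and after 'b' is moved above the start marker.
markers_before = ["a", "region.start", "b", "c", "region.end", "d"]
markers_after = ["a", "b", "region.start", "c", "region.end", "d"]

# Tag style, with the same reordering applied.
tags_before = [("a", None), ("b", "inner"), ("c", "inner"), ("d", None)]
tags_after = [("b", "inner"), ("a", None), ("c", "inner"), ("d", None)]
```

In the marker version `b` silently falls out of its scope once it is moved; in the tagged version the reordering changes nothing about which scope `b` belongs to.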

Jay.

Reid Kleckner

2011-Apr-01 19:52 UTC

On Fri, Apr 1, 2011 at 2:17 PM, Jay Foad <jay.foad at gmail.com> wrote:
> This is very similar to the problem of representing lexical scopes in
> debug info. The llvm.dbg.region.* intrinsics were the wrong way of
> doing it, because of the problems I mentioned above. Now we use
> metadata attached to each instruction to say what scope it is in,
> which is much better, because it is robust against optimisation
> passes.

Of course, using metadata isn't acceptable for gc because it can be
dropped, and adding something like it for gc wouldn't be acceptable to
people writing optimizations.

Reid
