thr3ads.net - llvm dev - [llvm-dev] Status of Garbage Collection with Statepoints in LLVM [Mar 2016]

Martin Kustermann via llvm-dev

2016-Mar-03 18:42 UTC

[llvm-dev] Status of Garbage Collection with Statepoints in LLVM

Hello LLVM community,

We have been experimenting with using LLVM IR as a target for a managed
(dynamically typed) language via an AOT compiler (including a backend for
ARM). One main challenge is getting the garbage collection right: We would
like to be able to implement a moving collector. This requires us to a)
find a precise set of root pointers and b) be able to rewrite those
pointers after objects moved.

LLVM seems to provide two mechanisms for doing this: via "gcroot" and via
"statepoints". The [statepoints] documentation indicates the first option,
namely "gcroot", is only viable for conservative collectors and statepoints
might eventually replace gcroot. So we wanted to try out the statepoints
approach.

Though it turned out that:
  * the pass for inserting statepoints is hard-coded to only work with
samples and CLR (see [placesafepoints])
  * the pass for rewriting statepoints is hard-coded to only work with
samples and CLR (see [rewritestatepoints])
  * the only backend supporting statepoints right now seems to be 64-bit
intel (see [backend-x64])

Since the ARM backend (and e.g. 32-bit intel) doesn't seem to have support
for lowering statepoints-using IR, we were rather disappointed.

We are now experimenting with keeping a shadow stack which contains all the
managed pointers, but since the IR we emit contains reads/writes to this
shadow stack (and the shadow stack escapes on calls), the performance
suffers significantly, since LLVM can't perform a lot of the optimizations
it could otherwise do (IIRC the statepoints approach doesn't have this
problem, since the statepoints can be inserted *after* optimization passes
were run).
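
The shadow-stack approach described above can be sketched as follows. This
is only an illustration, not the code from the project in question: the
names Object, ShadowStack and collect are hypothetical, and the "collector"
is a toy that copies every rooted object into a fresh to-space. It shows the
two requirements from earlier in the message: a) the roots are found
precisely because every managed pointer lives in a slot, and b) the slots
are rewritten after objects move.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical managed object; real object layouts will differ.
struct Object {
    int payload;
    Object* forwarded;  // new address once the collector has moved this object
    explicit Object(int p) : payload(p), forwarded(nullptr) {}
};

// Every live managed pointer is kept in a slot here instead of a register,
// so the collector can find and update all of them.
class ShadowStack {
public:
    // Store a root and return the index of its slot.
    std::size_t push(Object* obj) {
        slots_.push_back(obj);
        return slots_.size() - 1;
    }
    // Drop the most recent `count` slots, e.g. on function exit.
    void pop(std::size_t count) {
        assert(count <= slots_.size());
        slots_.resize(slots_.size() - count);
    }
    Object*& slot(std::size_t index) { return slots_.at(index); }
    std::size_t size() const { return slots_.size(); }

    // Hand each non-null slot to the visitor by reference so it can be rewritten.
    template <typename Visitor>
    void visitRoots(Visitor visit) {
        for (Object*& root : slots_) {
            if (root != nullptr) visit(root);
        }
    }

private:
    std::vector<Object*> slots_;
};

// Toy moving collector. Assumes the rooted objects do not live in `toSpace`.
inline void collect(ShadowStack& roots, std::vector<Object>& toSpace) {
    toSpace.clear();
    toSpace.reserve(roots.size());  // no reallocation, so new addresses stay valid
    roots.visitRoots([&](Object*& root) {
        if (root->forwarded == nullptr) {       // first visit: move the object
            toSpace.emplace_back(root->payload);
            root->forwarded = &toSpace.back();
        }
        root = root->forwarded;                 // rewrite the root to the new address
    });
}
```

After a call, every value loaded from a slot before the call has to be
reloaded, because collect may have changed it. That forced store and reload
through memory that escapes is what keeps the optimizer from holding managed
pointers in registers.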

Is there any timeline for the statepoints support in LLVM?
Is there a list of things that currently work / don't work with safepoints?
What is the recommended approach for moving GCs when using LLVM (what are
others doing)?

Thanks in advance,
Martin

Sidenote: It would be beneficial for users if the [statepoints]
documentation would highlight the current status and limitations.

[statepoints] http://llvm.org/docs/Statepoints.html
[placesafepoints]
https://github.com/llvm-mirror/llvm/blob/a40ba754c3f765768d441b9b4b534da917f8ad3c/lib/Transforms/Scalar/PlaceSafepoints.cpp#L441
[rewritestatepoints]
https://github.com/llvm-mirror/llvm/blob/a40ba754c3f765768d441b9b4b534da917f8ad3c/lib/Transforms/Scalar/RewriteStatepointsForGC.cpp#L2286
[backend-x64]
https://github.com/llvm-mirror/llvm/blob/a40ba754c3f765768d441b9b4b534da917f8ad3c/lib/Target/X86/X86MCInstLower.cpp#L841

Philip Reames via llvm-dev

2016-Mar-03 23:41 UTC

On 03/03/2016 10:42 AM, Martin Kustermann via llvm-dev wrote:
> Hello LLVM community,
>
> We have been experimenting with using LLVM IR as a target for a
> managed (dynamically typed) language via an AOT compiler (including a
> backend for ARM). One main challenge is getting the garbage collection
> right: We would like to be able to implement a moving collector. This
> requires us to a) find a precise set of root pointers and b) be able
> to rewrite those pointers after objects moved.

Welcome to the community!  Glad to have other interested users involved.

>
> LLVM seems to provide two mechanisms for doing this: via "gcroot" and
> via "statepoints". The [statepoints] documentation indicates the first
> option, namely "gcroot", is only viable for conservative collectors
> and statepoints might eventually replace gcroot. So we wanted to try
> out the statepoints approach.

This is still true.  I'd be very cautious about relying on the
correctness of gcroot in its current form.  If you do decide you want
to pursue that option, I'll have some suggestions on how to improve the
situation in the backend, but I don't really recommend this.

>
> Though it turned out that:
>   * the pass for inserting statepoints is hard-coded to only work with 
> samples and CLR (see [placesafepoints])
>   * the pass for rewriting statepoints is hard-coded to only work with 
> samples and CLR (see [rewritestatepoints])

In both cases, you are going to need to introduce your own GCStrategy
type.  We don't have a good way to ask questions about the GCStrategy 
instance from IR transformation passes yet - it's on my long term todo, 
but got stalled due to some infrastructure issues - so we had to match 
names in a couple of places.  I think you found both of them.
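
To make the name matching concrete, here is a minimal sketch of the kind of
check involved. It is a plain C++ model with no LLVM headers: the function
name usesStatepointGC is made up for illustration, and the two strategy
names are an assumption based on the revision linked in the original
message, so check the source before relying on them.

```cpp
#include <array>
#include <cassert>
#include <string>

// Model of a hard-coded check: a function is only processed if the name of
// its GC strategy appears in a fixed list. The names are assumptions taken
// from the linked revision ("samples and CLR").
inline bool usesStatepointGC(const std::string& gcName) {
    static const std::array<const char*, 2> kKnownNames = {
        {"statepoint-example", "coreclr"}};
    for (const char* known : kKnownNames) {
        if (gcName == known) return true;
    }
    return false;
}
```

Under this model a function tagged with any other strategy name, for
example usesStatepointGC("my-language-gc"), returns false and is skipped,
which matches the behaviour reported in the original message. That is why a
custom strategy currently also requires touching the places where the names
are matched.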

More generally, can I back up and ask an important question?  Do you 
have to support deoptimization (i.e. osr side exits) in any form? If you 
do, you'll probably want to avoid the PlaceSafepoints utility pass.  If 
you need to support this case, let me know and we can share some code 
which hasn't made it upstream yet.

>   * the only backend supporting statepoints right now seems to be
> 64-bit intel (see [backend-x64])
>
> Since the ARM backend (and e.g. 32-bit intel) doesn't seem to have 
> support for lowering statepoints-using IR, we were rather disappointed.

This is explicitly documented:
http://llvm.org/docs/Statepoints.html#supported-architectures

Adding support for ARM (32?, 64?) shouldn't be too complicated.  If you 
search for STATEPOINT in lib/Target/X86, you'll see there are only a 
small handful of places which need architectural support. Most of the 
complexity is in the generic CodeGen parts.

Adding support for 32 bit x86 should be even easier.  If I'm reading the 
code correctly, it looks like the only issue is in generating the right 
call sequence.

>
> We are now experimenting with keeping a shadow stack which contains 
> all the managed pointers, but since the IR we emit contains 
> reads/write to this shadow stack (and the shadow stack escapes on 
> calls), the performance suffers significantly, since LLVM can't 
> perform a lot of the optimizations it could otherwise do (IIRC the 
> statepoints approach doesn't have this problem, since the statepoints 
> can be inserted *after* optimization passes were run).

That's pretty much the entire idea behind the late rewriting model. I'll
comment just for clarity that statepoints *could* be inserted early as 
well, but I don't recommend it.

One thing I want to ask: have you implemented inlining?  One thing we 
found was that the relative importance of how we represented safepoints 
dropped substantially once we got aggressive inlining in place.  
Essentially, all of our hot safepoints disappeared or became inliner 
bugs.  :)

> Is there any timeline for the statepoints support in LLVM?

Not explicitly.  This is directly driven by those of us using and
contributing to them.  As we find problems in our use cases, we fix them.

Just to give some context, we (Azul) have reached what we believe to be 
a stable state and are mostly focused on (non-gc related) performance 
issues.  Not all of our changes have made it upstream - specifically, 
the gc-pointer distinction and exception handling mentioned in the list 
above - but most of them have.  On the platform we care about (x86-64) 
and the configurations we use (early poll insert, late rewriting), 
things appear stable.

> Is there a list of things that currently work / don't work with
> safepoints?

There wasn't a public list.  Rather than replying with one here, I've
added the start of such a list to the statepoint documentation.

http://llvm.org/docs/Statepoints.html#problem-areas-and-active-work

If you have questions on any of these, please ask.

> What is the recommended approach for moving GCs when using LLVM (what
> are others doing)?

Currently, we (Azul) and the CoreCLR llilc team are the only folks I
know of using LLVM with a relocating GC.  Both of us are using statepoints.

>
> Thanks in advance,
> Martin
>
> Sidenote: It would be beneficial for users if the [statepoints]
> documentation would highlight the current status and limitations.

If you have suggestions for documentation fixes, please let me know.
I'm happy to either review changes or make the changes myself if you
point out problems.

>
> [statepoints] http://llvm.org/docs/Statepoints.html
> [placesafepoints]
> https://github.com/llvm-mirror/llvm/blob/a40ba754c3f765768d441b9b4b534da917f8ad3c/lib/Transforms/Scalar/PlaceSafepoints.cpp#L441
> [rewritestatepoints]
> https://github.com/llvm-mirror/llvm/blob/a40ba754c3f765768d441b9b4b534da917f8ad3c/lib/Transforms/Scalar/RewriteStatepointsForGC.cpp#L2286
> [backend-x64]
> https://github.com/llvm-mirror/llvm/blob/a40ba754c3f765768d441b9b4b534da917f8ad3c/lib/Target/X86/X86MCInstLower.cpp#L841

Philip

p.s. We're happy to talk on the phone or in person about these topics as 
well.  Having a higher bandwidth conversation can be quite helpful.  Let 
me know if you're interested in arranging such a meeting.  Or, if you're
local to the bay area, consider coming to one of the socials.  Sanjoy 
and I both generally attend and either of us can answer further 
questions you might have.

Sanjoy Das via llvm-dev

2016-Mar-04 01:02 UTC

Hi Martin,

Philip covered all of it very well, I'll just add one minor comment:

> More generally, can I back up and ask an important question?  Do you have to
> support deoptimization (i.e. osr side exits) in any form? If you do, you'll
> probably want to avoid the PlaceSafepoints utility pass.  If you need to

PlaceSafepoints is inadequate only if you have asynchronous
invalidation -- i.e. thread X is spinning in a long running loop while
thread Y loaded a class that makes the code running in thread X
invalid, so thread X needs to be polling for deopt safepoints.  If all
your invalidation events are synchronous then deopt bundles +
PlaceSafepoints should be enough for both deoptimization and precise
relocating GC.

If you don't need to *poll* for safepoints at all (perhaps the entire
VM is fully single threaded, so you safepoint only on allocation),
then PlaceSafepoints is not needed at all, and you can directly run
RewriteStatepointsForGC.
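
The three cases above can be summarised as a small decision function. This
is only a sketch of the reasoning in this message: the enum Plan, the
function choosePlan and its two boolean inputs are hypothetical names and
not part of any LLVM API.

```cpp
#include <cassert>

// Which safepoint-related steps a frontend would need, per the cases above.
enum class Plan {
    RewriteOnly,                 // run RewriteStatepointsForGC directly
    PlaceSafepointsThenRewrite,  // PlaceSafepoints, then RewriteStatepointsForGC
    CustomPollsThenRewrite       // PlaceSafepoints is inadequate; place polls yourself
};

// needsPolling: running code must poll for safepoints (e.g. other threads can
//               request a collection while this thread is in a long loop).
// asyncInvalidation: another thread can invalidate code that is currently
//               running, so that code must also poll for deopt safepoints.
inline Plan choosePlan(bool needsPolling, bool asyncInvalidation) {
    if (asyncInvalidation) {
        // Asynchronous invalidation implies polling, whatever needsPolling says.
        return Plan::CustomPollsThenRewrite;
    }
    if (needsPolling) {
        return Plan::PlaceSafepointsThenRewrite;
    }
    return Plan::RewriteOnly;
}
```

For example, a fully single-threaded VM that safepoints only on allocation
corresponds to choosePlan(false, false), which yields Plan::RewriteOnly.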

-- Sanjoy
