thr3ads.net - llvm dev - [llvm-dev] RFC: Strong GC References in LLVM [Jul 2016]

If this information is useful, please help other people find it:
Share via:

Daniel Berlin via llvm-dev

2016-Jul-15 18:24 UTC

[llvm-dev] RFC: Strong GC References in LLVM

On Thu, Jul 14, 2016 at 4:48 PM, Sanjoy Das <sanjoy at
playingwithpointers.com>
wrote:
> Hi all,
>
> It looks like the key controversial point is the bit about the extra
> control dependence on loads and stores[0].  Generally the consensus is
> that (please chime if you think otherwise) it is not reasonable to
> make the safety (or semantics) of a load instruction depend on the
> type it is loading.  Introducing such a thing would involve changing
> the IR semantics in a fundamental way, and therefore has a high bar
> for acceptance.
>
> Here is a summary of the alternative solutions that were proposed here
> and on IRC (thanks Chandler, Andy, Eli!):
>
>  1. Model loads and stores of GC references as intrinsic calls: add
>     llvm.gc_load, llvm.gc_store intrinsics, and optimize them as loads
>     and stores whenever appropriate and legal.
>
>  2. Introduce a flag on load and stores that either
>       a. Denotes a "gc_safety" control dependence.
>       b. Denotes a "blackbox_safety" control dependence.  In this
case
>          we will probably have some optional metadata on loads and
>          stores to indicate that the control dependence is actually on
>          GC safety.
>
>     As a starting point, LLVM will conservatively not speculate such
>     loads and stores; and will leave open the potential to upstream
>     logic that will have a more precise sense of when these loads and
>     stores are safe to speculate.
>
I think you need to define what you mean by control dependence here.  If
you mean speculation, you should say speculation :)
As you describe below, it is not enough to simply not speculate them. You
also are saying you don't want to change the conditions on which they
execute.
That is very different from speculation.

>  3. Introduce a general way to denote control dependence on loads and
>     stores.  This can be helpful to LLVM in general, and will let us
>     basically implement a more precise version of (2).
>
>
>
> [0]: I may have (incorrectly) mentioned otherwise on IRC, but we need
> to model safety properties of stores as well, to avoid transforms
> like:
>
>   %x = malloc()  ;; known thread local
>   if (cond_0) {
>     store GCREF %val to %x
>   }
>   if (cond_1) {
>     store i64 %val to %x
>   }
>
> to
>
>   %x = malloc()  ;; known thread local
>   if (cond_0 || cond_1) {
>     store GCREF %val to %x  ;; This "speculative" store is bad
>     if (cond_1) {
>       store i64 %val to %x
>     }
>   }
>

FWIW: This raises one of the same issues we have now with may-throw, which
is that, if all you have is a flag on the instruction, now you have to look
at every instruction in every block to know whether a *CFG* transform is
correct.

That means any pass that wants to just touch the CFG can't do so without
also looking at the instruction stream.  It will also make a bunch of
things currently O(N), O(N^2) (see the sets of patches fixing may-throw
places, and extrapolate to more places).

This is theoretically fixable in each pass by taking a single pass over the
instruction stream and marking which blocks have these instructions, etc,
and then using that info.

But we shouldn't have to do that in each pass,  especially if they just
want to make CFG manipulations.

The TL;DR  I would really like to see this also made a BB level flag that
says whether the block contains instructions with unknown control
dependences (or really, 1 bb level flag for each attribute you introduce) .
 I don't think this is terribly difficult, and at the very least, will keep
us from having to look at every instruction in the block in the common case
that there is nothing in the block to worry about ;)

Also note that these CFG transforms will also now need post-dominance info
in a bunch of cases to sanely determine if they are changing the control
dependence structure.

Let me ask another question:

Given

  %x = malloc()  ;; known thread local
  if (cond_0) {
    store GCREF %val to %x
  }
  if (cond_1) {
    store i64 %val to %x
  }
Assume i can prove cond0 and cond1 the same.

I change this to:
  %x = malloc()  ;; known thread local
  if (cond_0) {
    store i64 %val to %x
    store GCREF %val to %x
  }

Is this okay to happen?
Note that i did not actually change the control dependence of it by the
normal definition of control dependence.

But you can end up "hoisting" or "sinking" instructions by
moving every
other instruction :)

Is that okay, or are they really barriers as well (like may throw), in
which case, they probably need some real representation in control flow if
you want to make most algorithms O(N) (you can get away with the bb level
flags if you are willing to make some algorithms N^2 in cases where these
things exist).
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160715/dfc1973b/attachment.html>

Sanjoy Das via llvm-dev

2016-Jul-15 19:21 UTC

head link

[llvm-dev] RFC: Strong GC References in LLVM

Hi Daniel,

Daniel Berlin wrote:
 >          As a starting point, LLVM will conservatively not speculate such
 >          loads and stores; and will leave open the potential to upstream
 >          logic that will have a more precise sense of when these 
loads and
 >          stores are safe to speculate.
 >
 >
 > I think you need to define what you mean by control dependence here.  If
 > you mean speculation, you should say speculation :)

Apologies for being non-specific -- this is really just "don't
speculate".

 > As you describe below, it is not enough to simply not speculate them.

I'm not sure where I said that?

 > You also are saying you don't want to change the conditions on which
 > they execute.
 > That is very different from speculation.

If I implied that somehow then I (or the example) was wrong. :)

We can't speculate these instructions (without special knowledge of
the GC and the Java type system), and that's it.

 > FWIW: This raises one of the same issues we have now with may-throw,
 > which is that, if all you have is a flag on the instruction, now you
 > have to look at every instruction in every block to know whether a *CFG*
 > transform is correct.
 >
 > That means any pass that wants to just touch the CFG can't do so
without
 > also looking at the instruction stream.  It will also make a bunch of
 > things currently O(N), O(N^2) (see the sets of patches fixing may-throw
 > places, and extrapolate to more places).

As I said, I'm only proposing a "don't speculate" flag, so
this does
not (?) apply.

However, I didn't quite understand your point about may-throw -- how
is may-throw different from a generic side-effect (volatile store,
syscall etc.)?  All of those can't be hoisted or sunk -- we have to
make sure that they execute in semantically the same conditions that
they did in the original program.

 > This is theoretically fixable in each pass by taking a single pass over
 > the instruction stream and marking which blocks have these instructions,
 > etc, and then using that info.
 >
 > But we shouldn't have to do that in each pass,  especially if they
just
 > want to make CFG manipulations.
 >
 > The TL;DR  I would really like to see this also made a BB level flag
 > that says whether the block contains instructions with unknown control
 > dependences (or really, 1 bb level flag for each attribute you 
introduce) .
 >   I don't think this is terribly difficult, and at the very least,
will
 > keep us from having to look at every instruction in the block in the
 > common case that there is nothing in the block to worry about ;)
 >
 > Also note that these CFG transforms will also now need post-dominance
 > info in a bunch of cases to sanely determine if they are changing the
 > control dependence structure.
 >
 > Let me ask another question:
 >
 > Given
 >
 >    %x = malloc()  ;; known thread local
 >    if (cond_0) {
 >      store GCREF %val to %x
 >    }
 >    if (cond_1) {
 >      store i64 %val to %x
 >    }
 > Assume i can prove cond0 and cond1 the same.
 >
 >
 > I change this to:
 >    %x = malloc()  ;; known thread local
 >    if (cond_0) {
 >      store i64 %val to %x
 >      store GCREF %val to %x
 >    }
 >
 > Is this okay to happen?

Yes.  The only restriction is you can't issue a GCREF load or store
that wouldn't have been issued in the original program (even if the
location is thread local, in case of stores).

-- Sanjoy

 > Note that i did not actually change the control dependence of it by the
 > normal definition of control dependence.
 >
 > But you can end up "hoisting" or "sinking"
instructions by moving every
 > other instruction :)
 >
 > Is that okay, or are they really barriers as well (like may throw), in
 > which case, they probably need some real representation in control flow
 > if you want to make most algorithms O(N) (you can get away with the bb
 > level flags if you are willing to make some algorithms N^2 in cases
 > where these things exist).

Daniel Berlin via llvm-dev

2016-Jul-15 19:36 UTC

head link

[llvm-dev] RFC: Strong GC References in LLVM

On Fri, Jul 15, 2016 at 12:21 PM, Sanjoy Das <sanjoy at
playingwithpointers.com> wrote:
> Hi Daniel,
>
> Daniel Berlin wrote:
> >          As a starting point, LLVM will conservatively not speculate
such
> >          loads and stores; and will leave open the potential to
upstream
> >          logic that will have a more precise sense of when these loads
> and
> >          stores are safe to speculate.
> >
> >
> > I think you need to define what you mean by control dependence here. 
If
> > you mean speculation, you should say speculation :)
>
> Apologies for being non-specific -- this is really just "don't
> speculate".
>
> > As you describe below, it is not enough to simply not speculate them.
>
> I'm not sure where I said that?


>
>
> > You also are saying you don't want to change the conditions on
which
> > they execute.
> > That is very different from speculation.
>
> If I implied that somehow then I (or the example) was wrong. :)
>
:)

>
> We can't speculate these instructions (without special knowledge of
> the GC and the Java type system), and that's it.
Okey.

>
>
> > FWIW: This raises one of the same issues we have now with may-throw,
> > which is that, if all you have is a flag on the instruction, now you
> > have to look at every instruction in every block to know whether a
*CFG*
> > transform is correct.
> >
> > That means any pass that wants to just touch the CFG can't do so
without
> > also looking at the instruction stream.  It will also make a bunch of
> > things currently O(N), O(N^2) (see the sets of patches fixing
may-throw
> > places, and extrapolate to more places).
>
> As I said, I'm only proposing a "don't speculate" flag,
so this does
> not (?) apply.
>
As long as it applies only to the instructions, and they do not act as
"barriers" to hoisting/sinking, then yes, it should not apply.

(In theory it still means things have to look at instructions, but they had
to look at them anyway at that point :P)
>
> However, I didn't quite understand your point about may-throw -- how
> is may-throw different from a generic side-effect (volatile store,
> syscall etc.)?  All of those can't be hoisted or sunk -- we have to
> make sure that they execute in semantically the same conditions that
> they did in the original program.
>
> may-throw is, AFAIK, worse. They act as barriers to sinking *otherthings*. You cannot sink a store past a may-throw, or hoist a load above
them.  You can't optimize stores across them either:

See:
[PATCH] D21007: DSE: Don't remove stores made live by a call which unwinds.
for the latter

[llvm] r270828 - [MergedLoadStoreMotion] Don't transform across may-throw
calls
for the former.

"It is unsafe to hoist a load before a function call which may throw, the
throw might prevent a pointer dereference.

Likewise, it is unsafe to sink a store after a call which may throw.
The caller might be able to observe the difference."

This then leads to the problem i mentioned - because the may-throwness is
not expressed at the bb level (or in the CFG, by having the call end the
block, or at the least, a fake abnormal CFG edge), everything has to go
checking every instruction along the entire path they want to hoist,
whereas hoisting is normally just a simple dataflow problem with BB level
properties :)

>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160715/a94f4e87/attachment.html>

Apparently Analagous Threads

Search for more maybe matching threads

llvm dev - Jul 2016 - RFC: Strong GC References in LLVM

[llvm-dev] RFC: Strong GC References in LLVM

[llvm-dev] RFC: Strong GC References in LLVM

[llvm-dev] RFC: Strong GC References in LLVM

Apparently Analagous Threads