thr3ads.net - llvm dev - [LLVMdev] Optimization hints for "constant" loads [Oct 2014]

If this information is useful, please help other people find it:
Share via:

Sanjoy Das

2014-Oct-21 07:29 UTC

[LLVMdev] Optimization hints for "constant" loads

> I've never realy understood how the llvm.invariant intrinsics could be
> put into practice. There is the problem that "end" can occur
anywhere
> as you suggested fixing with a flag.
I was under this impression too, but after re-reading the language
reference I'm not so sure -- it says about invariant.start: "This
intrinsic indicates that
until an llvm.invariant.end that uses the return value, the referenced
memory location is constant and unchanging.".  I think this implies
that if the result of an `llvm.invariant.start` doesn't escape, we can
safely conclude that there can be no matching `llvm.invariant.end`.

I'm still a little hazy on all of the use cases you want to support,
but one problem with llvm.safe_cast is, as you note, stores to the
pointer passed to safe_cast will not be forwarded to loads from the
pointer it returns.  I think this issue can be solved by modeling
`llvm.safe_cast` not as a transformation on a pointer, but as a check.
Specifically, I imagine a readonly intrinsic `llvm.assert_immutable`
that "asserts" that the pointer passed to it is immutable, and unwinds
if that fact isn't true (these are only imaginary semantics -- we know
the intrinsic will never actually unwind).  This means
`llvm.assert_immutable` can't be marked nounwind (otherwise it would
just get optimized away); but since it doesn't need to unwind to the
same frame, a CallInst is sufficient (an InvokeInst would've increased
the CFG's complexity).

So

  %p = unaliased address
  store 42, %p
  %a = llvm.safe_cast %p ; just made this up
  %val = load %a !invariant

becomes

  %p = unaliased address
  store 42, %p
  call void @llvm.assert_immutable(%p)
  %val = load %p !invariant

AFAICT, "load %p" can not, in general, be re-ordered to before the
call to @llvm.assert_immutable because we don't know what condition it
uses to decide if it should unwind.  There are cases where I think
such a re-ordering would be legal (an unused %val is one example,
another example is where the only use of %val is a comparison with
undef) but if I understand correctly, re-ordering a load with the call
to @llvm.assert_immutable can only be a performance loss -- it will
change dominance in a way that we'll "forget" that a load was
immutable.  And I think in most of the cases that are interesting to
optimize, the re-ordering will not be legal.

It is still legal to value forward the store though:

  %p = unaliased address
  store 42, %p
  call void @llvm.assert_immutable(%p)
  %val = 42

because *if* assert_immutable does not unwind, %val will see the
dominating store, because assert_immutable is readonly.

Again,

  %p1 = ...
  %p2 = llvm.safe_cast %p1
  %a = select i1 %is_immutable, %p1, %p2
  load %a !invariant

could become

  %p1 = ...
  if (not %is_immutable) {
    call llvm.assert_immutable(%p1)
  }
  load %p1

or, if we wish to reduce control flow, we could define
`llvm.assert_immutable` to be a no-op on null:

  %p1 = ...
  %a = select i1 %is_immutable, null, %p1,
  call llvm.assert_immutable(%a)
  load %p1 !invariant


`llvm.assert_immutable` does with control dependencies what
`llvm.safe_cast` does with data dependencies.
> llvm.invariant is inherently control dependent, which doesn't really
> allow the optimizer to hoist the invariant loads the way we want it
> to.
I'm curious about why you think so -- do you have examples that
demonstrate this?  And is this a flaw in how llvm implements llvm.invariant
or something fundamental about control dependencies?

-- Sanjoy

Andrew Trick

2014-Oct-21 17:58 UTC

head link

[LLVMdev] Optimization hints for "constant" loads

> On Oct 21, 2014, at 12:29 AM, Sanjoy Das <sanjoy at
playingwithpointers.com> wrote:
> 
>> I've never realy understood how the llvm.invariant intrinsics could
be
>> put into practice. There is the problem that "end" can occur
anywhere
>> as you suggested fixing with a flag.
> 
> I was under this impression too, but after re-reading the language
> reference I'm not so sure -- it says about invariant.start: "This
> intrinsic indicates that
> until an llvm.invariant.end that uses the return value, the referenced
> memory location is constant and unchanging.".  I think this implies
> that if the result of an `llvm.invariant.start` doesn't escape, we can
> safely conclude that there can be no matching `llvm.invariant.end`.
> 
> I'm still a little hazy on all of the use cases you want to support,
> but one problem with llvm.safe_cast is, as you note, stores to the
> pointer passed to safe_cast will not be forwarded to loads from the
> pointer it returns.  I think this issue can be solved by modeling
> `llvm.safe_cast` not as a transformation on a pointer, but as a check.
> Specifically, I imagine a readonly intrinsic `llvm.assert_immutable`
> that "asserts" that the pointer passed to it is immutable, and
unwinds
> if that fact isn't true (these are only imaginary semantics -- we know
> the intrinsic will never actually unwind).  This means
> `llvm.assert_immutable` can't be marked nounwind (otherwise it would
> just get optimized away); but since it doesn't need to unwind to the
> same frame, a CallInst is sufficient (an InvokeInst would've increased
> the CFG's complexity).
> 
> So
> 
>  %p = unaliased address
>  store 42, %p
>  %a = llvm.safe_cast %p ; just made this up
>  %val = load %a !invariant
> 
> becomes
> 
>  %p = unaliased address
>  store 42, %p
>  call void @llvm.assert_immutable(%p)
>  %val = load %p !invariant
> 
> AFAICT, "load %p" can not, in general, be re-ordered to before
the
> call to @llvm.assert_immutable because we don't know what condition it
> uses to decide if it should unwind.  There are cases where I think
> such a re-ordering would be legal (an unused %val is one example,
> another example is where the only use of %val is a comparison with
> undef) but if I understand correctly, re-ordering a load with the call
> to @llvm.assert_immutable can only be a performance loss -- it will
> change dominance in a way that we'll "forget" that a load was
> immutable.  And I think in most of the cases that are interesting to
> optimize, the re-ordering will not be legal.
> 
> It is still legal to value forward the store though:
> 
>  %p = unaliased address
>  store 42, %p
>  call void @llvm.assert_immutable(%p)
>  %val = 42
> 
> because *if* assert_immutable does not unwind, %val will see the
> dominating store, because assert_immutable is readonly.
> 
> Again,
> 
>  %p1 = ...
>  %p2 = llvm.safe_cast %p1
>  %a = select i1 %is_immutable, %p1, %p2
>  load %a !invariant
> 
> could become
> 
>  %p1 = ...
>  if (not %is_immutable) {
>    call llvm.assert_immutable(%p1)
>  }
>  load %p1
> 
> or, if we wish to reduce control flow, we could define
> `llvm.assert_immutable` to be a no-op on null:
> 
>  %p1 = ...
>  %a = select i1 %is_immutable, null, %p1,
>  call llvm.assert_immutable(%a)
>  load %p1 !invariant
Just so you know, I am not a big fan of adding cast nodes into the use-def
chain. It has to be a last resort. I proposed it this time because I don't
see another way to fully express the semantics, and in many cases the casts are
limited to the initialization code. I think the optimizer could be taught to
store->load forward across them with some effort.

AFAICT The only difference between assert_immutable and invariant.start is that
assert_immutable "maythrow", and we don't need to find all uses of
the pointer.

assert_immutable will handle your use case safely as long as you don't mark
it "readonly" (loads can move above readonly calls).
http://llvm.1065342.n5.nabble.com/Does-nounwind-have-semantics-td59631.html

The problems that we will face with assert_immutable are:

- Because it is maythrow and is not readonly, there is a dependence between
assert_immutable and *all* subsequent loads. This could seriously impact
optimization.

- Since the invariant load is not actually marked invariant, it can't be
reordered with other stores and nonreadonly calls.

- It does not allow us in the future to speculatively hoist certain invariant
loads above maythrow calls and branches.

So, I'm ok with assert_immutable as a solution to some subset of problems.
But it is incomplete and suboptimal. I suspect that introducing non-readonly
calls will do more harm than the benefit you get from marking some loads
invariant.

Also note that this could get interesting when we introduce TBAA on calls:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-October/066214.html

We would like to hoist loads above calls that have non-interfering TBAA tags,
but also may throw (patchpoints and stackmaps are marked maythrow).

Say we declare that loads can be hoisted above maythrow calls with TBAA. If TBAA
is dropped it becomes conservative. Then you could give your assert_immutable
calls TBAA that only interferes with the array length load. That would solve
some of your optimization issues, but only really works well if most of your
loads and calls have precise TBAA tags.

The more aggressive solution I was proposing was to mark loads that are guarded
with explicit safety checks as "maySpeculate". If we do that we
definitely need a way to create a data dependence on a cast - this approach is
incompatible with assert_immutable.
> `llvm.assert_immutable` does with control dependencies what
> `llvm.safe_cast` does with data dependencies.
> 
>> llvm.invariant is inherently control dependent, which doesn't
really
>> allow the optimizer to hoist the invariant loads the way we want it
>> to.
> 
> I'm curious about why you think so -- do you have examples that
> demonstrate this?  And is this a flaw in how llvm implements llvm.invariant
> or something fundamental about control dependencies?
I am saying exactly what you said just above the quote: llvm.invariant and
llvm.assert_immutable guarantee safety by relying on implicit control
dependence. That doesn't really give us full freedom to optimize the guarded
invariant loads. They still can't be optimized across other non-readonly
calls or hoisted above branches or out of loops. This is fundamental to the
representation - it's not an implementation detail.

-Andy
> 
> -- Sanjoy

Sanjoy Das

2014-Oct-21 18:46 UTC

head link

[LLVMdev] Optimization hints for "constant" loads

Thank you for the explanation, I think I misunderstood your solution
initially.  Is it accurate to say: by making the definition of the
source pointer of an !invariant load control (or data) dependent on
some initialization event, you can effectively separate the
initialization event (where the location is not invariant) and the
rest of the execution (where the location is invariant).

This approach looks similar to Ruby's "freeze" method:
http://ruby-doc.org/core-2.1.3/Object.html#method-i-freeze

What is the aliasing relationship between a non-invariant load from
the result of a safe_cast and a load from (or store to) the argument
we pass to it?  Is it sensible to make them MustAlias?  I'm trying to
think if a frontend needs to be careful about using the return value
of check_cast only for !invariant loads (or some other restriction),
or is it legal (or desirable) to mix and match both kinds of loads
from both kinds of pointers.

-- Sanjoy

llvm dev - Oct 2014 - [LLVMdev] Optimization hints for "constant" loads

[LLVMdev] Optimization hints for "constant" loads

[LLVMdev] Optimization hints for "constant" loads

[LLVMdev] Optimization hints for "constant" loads