thr3ads.net - llvm dev - [LLVMdev] Implementing devirtualization [Dec 2011]

If this information is useful, please help other people find it:
Share via:

John McCall

2011-Dec-09 19:14 UTC

[LLVMdev] Implementing devirtualization

On Dec 8, 2011, at 10:03 PM, Nick Lewycky wrote:> Noalias returns, nocapture, SCC refinement, linkonce_odr and 
> available_externally were added with the goal of making devirtualization 
> in LLVM happen, but as orthogonal independent optimizations. I think 
> LLVM should continue with this design. If you want to implement a single 
> substantial optimization pass, I would suggest FSIPSCCP as the largest 
> thing you should write.
This is a lot of work that is going to be completely foiled by the presence
of almost any opaque call at all.

What's needed here is a language-independent way to exploit
language-specific guarantees like C++ [basic.life]p7, i.e. that certain
provenances of pointer guarantee that certain members have known,
immutable values.  There are analogous guarantees basically everywhere;
for example, Java arrays have a length, ADTs in System F have a
discriminator, etc.

I would suggest an intrinsic like
  declare i8* @llvm.bless.known.memory(i8*, …) nounwind readnone
where the … is a sequence of offset/value pairs.  The load peephole is
then quite obvious.

An interesting extra case from [basic.life]p7 is that we can also state
that 'const' fields are invariant after the "blessing point",
although
(unlike ivars) we can't necessarily assign a fixed value to them right
then.

John.

Nick Lewycky

2011-Dec-10 20:26 UTC

head link

[LLVMdev] Implementing devirtualization

John McCall wrote:> On Dec 8, 2011, at 10:03 PM, Nick Lewycky wrote:
>> Noalias returns, nocapture, SCC refinement, linkonce_odr and
>> available_externally were added with the goal of making
devirtualization
>> in LLVM happen, but as orthogonal independent optimizations. I think
>> LLVM should continue with this design. If you want to implement a
single
>> substantial optimization pass, I would suggest FSIPSCCP as the largest
>> thing you should write.
>
> This is a lot of work that is going to be completely foiled by the presence
> of almost any opaque call at all.
Yes, but it's still useful.

Also, anything based on knowing the type hierarchy could be foiled by 
new derivations in other translation units, or that show up with dlopen.
> What's needed here is a language-independent way to exploit
> language-specific guarantees like C++ [basic.life]p7, i.e. that certain
> provenances of pointer guarantee that certain members have known,
> immutable values.  There are analogous guarantees basically everywhere;
> for example, Java arrays have a length, ADTs in System F have a
> discriminator, etc.
>
> I would suggest an intrinsic like
>    declare i8* @llvm.bless.known.memory(i8*, …) nounwind readnone
> where the … is a sequence of offset/value pairs.  The load peephole is
> then quite obvious.
>
> An interesting extra case from [basic.life]p7 is that we can also state
> that 'const' fields are invariant after the "blessing
point", although
> (unlike ivars) we can't necessarily assign a fixed value to them right
> then.
There's two things going on in your proposal, one is taking advantage of 
the C++ guarantee that the vptr (or const fields) can't change after 
construction is complete, and the other is noting down what we know the 
vptr is equal to. I'm separating these because we can more often solve 
the fact that the vptr didn't change in a function regardless of what 
its callees do, than we can know what the vptr is on entry to the function.

You don't know what the vptr really is unless you can track from the 
point where the object was constructed. Without that, the program could 
add a new derived type in a plugin, then pass it to your function. 
That's why I'm advocating constant propagation as the fix here, just 
start at the point of allocation and propagate outwards.

As for immutability of the vptr and const fields, that's what I was 
aiming for with the invariant intrinsics:
   http://llvm.org/docs/LangRef.html#int_invariant_start
but it's generally considered that the design of these intrinsics isn't 
quite right. (A secondary goal of those intrinsics is to allow LICM to 
hoist things out, then in the event of register pressure have the 
backend reload by loading through the invariant pointer, saving us a 
spill onto the stack.)

Nick

John McCall

2011-Dec-11 11:18 UTC

head link

[LLVMdev] Implementing devirtualization

On Dec 10, 2011, at 12:26 PM, Nick Lewycky wrote:> John McCall wrote:
>> On Dec 8, 2011, at 10:03 PM, Nick Lewycky wrote:
>>> Noalias returns, nocapture, SCC refinement, linkonce_odr and
>>> available_externally were added with the goal of making
devirtualization
>>> in LLVM happen, but as orthogonal independent optimizations. I
think
>>> LLVM should continue with this design. If you want to implement a
single
>>> substantial optimization pass, I would suggest FSIPSCCP as the
largest
>>> thing you should write.
>> 
>> This is a lot of work that is going to be completely foiled by the
presence
>> of almost any opaque call at all.
> 
> Yes, but it's still useful.
Sure, there are generally applications for any general optimization you can
suggest.  I'm just saying that FSIPSCCP is not really a very compelling way
to do devirtualization.
> Also, anything based on knowing the type hierarchy could be foiled by new
derivations in other translation units, or that show up with dlopen.
I am not proposing anything that requires full-program knowledge of class
hierarchies.  If that's the idea, we are going to actually have to have
full-program knowledge somehow, which I don't remember being one of the
many, many appositions in FSIPSCCP, either. :)

Both of our proposals obviously only work when we can statically see the
construction point of an object in some way.  However, using a generic memory
optimization would require us to be able to see both the actual store to the
vtable field and the entire intervening history of that memory to verify that
there are no subsequent stores.  That analysis is likely prohibitively expensive
even where possible, and it will frequently *not* be possible:
  Example #1:  I have a constructor which is not defined in this translation
unit.  You are doomed.
  Example #2:  I pass the address of a mutable global variable to a function
which performs a virtual call on it.  You must prove that literally no code
(except possibly a global constructor) can ever store to that vtable.
  Example #3:  I construct an object, call a global function foo(), and then do
a virtual call on my object.  You must either prove that foo() cannot possibly
have a handle to the object or hope it's defined in this translation unit.

Language guarantees are *really, really useful*.  I understand the desire to
improve optimizations that don't require language-specific annotations, but
I am not sure it is very practical.

John.

Apparently Analagous Threads

Search for more seemingly similar threads

llvm dev - Dec 2011 - [LLVMdev] Implementing devirtualization

[LLVMdev] Implementing devirtualization

[LLVMdev] Implementing devirtualization

[LLVMdev] Implementing devirtualization

Apparently Analagous Threads