thr3ads.net - llvm dev - [LLVMdev] Implementing devirtualization [Dec 2011]

If this information is useful, please help other people find it:
Share via:

John McCall

2011-Dec-11 11:18 UTC

[LLVMdev] Implementing devirtualization

On Dec 10, 2011, at 12:26 PM, Nick Lewycky wrote:> John McCall wrote:
>> On Dec 8, 2011, at 10:03 PM, Nick Lewycky wrote:
>>> Noalias returns, nocapture, SCC refinement, linkonce_odr and
>>> available_externally were added with the goal of making
devirtualization
>>> in LLVM happen, but as orthogonal independent optimizations. I
think
>>> LLVM should continue with this design. If you want to implement a
single
>>> substantial optimization pass, I would suggest FSIPSCCP as the
largest
>>> thing you should write.
>> 
>> This is a lot of work that is going to be completely foiled by the
presence
>> of almost any opaque call at all.
> 
> Yes, but it's still useful.
Sure, there are generally applications for any general optimization you can
suggest.  I'm just saying that FSIPSCCP is not really a very compelling way
to do devirtualization.
> Also, anything based on knowing the type hierarchy could be foiled by new
derivations in other translation units, or that show up with dlopen.
I am not proposing anything that requires full-program knowledge of class
hierarchies.  If that's the idea, we are going to actually have to have
full-program knowledge somehow, which I don't remember being one of the
many, many appositions in FSIPSCCP, either. :)

Both of our proposals obviously only work when we can statically see the
construction point of an object in some way.  However, using a generic memory
optimization would require us to be able to see both the actual store to the
vtable field and the entire intervening history of that memory to verify that
there are no subsequent stores.  That analysis is likely prohibitively expensive
even where possible, and it will frequently *not* be possible:
  Example #1:  I have a constructor which is not defined in this translation
unit.  You are doomed.
  Example #2:  I pass the address of a mutable global variable to a function
which performs a virtual call on it.  You must prove that literally no code
(except possibly a global constructor) can ever store to that vtable.
  Example #3:  I construct an object, call a global function foo(), and then do
a virtual call on my object.  You must either prove that foo() cannot possibly
have a handle to the object or hope it's defined in this translation unit.

Language guarantees are *really, really useful*.  I understand the desire to
improve optimizations that don't require language-specific annotations, but
I am not sure it is very practical.

John.

Nick Lewycky

2011-Dec-13 02:16 UTC

head link

[LLVMdev] Implementing devirtualization

On 11 December 2011 03:18, John McCall <rjmccall at apple.com>
wrote:> On Dec 10, 2011, at 12:26 PM, Nick Lewycky wrote:
>> John McCall wrote:
>>> On Dec 8, 2011, at 10:03 PM, Nick Lewycky wrote:
>>>> Noalias returns, nocapture, SCC refinement, linkonce_odr and
>>>> available_externally were added with the goal of making
devirtualization
>>>> in LLVM happen, but as orthogonal independent optimizations. I
think
>>>> LLVM should continue with this design. If you want to implement
a single
>>>> substantial optimization pass, I would suggest FSIPSCCP as the
largest
>>>> thing you should write.
>>>
>>> This is a lot of work that is going to be completely foiled by the
presence
>>> of almost any opaque call at all.
>>
>> Yes, but it's still useful.
>
> Sure, there are generally applications for any general optimization you can
suggest.  I'm just saying that FSIPSCCP is not really a very compelling way
to do devirtualization.
>
>> Also, anything based on knowing the type hierarchy could be foiled by
new derivations in other translation units, or that show up with dlopen.
>
> I am not proposing anything that requires full-program knowledge of class
hierarchies.  If that's the idea, we are going to actually have to have
full-program knowledge somehow, which I don't remember being one of the
many, many appositions in FSIPSCCP, either. :)
>
> Both of our proposals obviously only work when we can statically see the
construction point of an object in some way.  However, using a generic memory
optimization would require us to be able to see both the actual store to the
vtable field and the entire intervening history of that memory to verify that
there are no subsequent stores.  That analysis is likely prohibitively expensive
even where possible, and it will frequently *not* be possible:
>  Example #1:  I have a constructor which is not defined in this translation
unit.  You are doomed.
>  Example #2:  I pass the address of a mutable global variable to a function
which performs a virtual call on it.  You must prove that literally no code
(except possibly a global constructor) can ever store to that vtable.
>  Example #3:  I construct an object, call a global function foo(), and then
do a virtual call on my object.  You must either prove that foo() cannot
possibly have a handle to the object or hope it's defined in this
translation unit.
You didn't elaborate on how the llvm.bless markers get there. Is your
idea to put them at the point of allocation and then let llvm
propagate them around to functions that get the pointer? If so, then
it sounds very similar to fsipsccp plus immutable markers.
> Language guarantees are *really, really useful*.  I understand the desire
to improve optimizations that don't require language-specific annotations,
but I am not sure it is very practical.
I don't disagree at all. I really do want to optimize based on the
program's behaviour as much as possible, falling back on additional
language-provided guarantees only when necessary. Taking advantage of
the immutable nature of certain fields is something that I agree we
probably want to do.

Nick

Chris Lattner

2011-Dec-13 07:43 UTC

head link

[LLVMdev] Implementing devirtualization

On Dec 12, 2011, at 6:16 PM, Nick Lewycky wrote:>> Language guarantees are *really, really useful*.  I understand the
desire to improve optimizations that don't require language-specific
annotations, but I am not sure it is very practical.
> 
> I don't disagree at all. I really do want to optimize based on the
> program's behaviour as much as possible, falling back on additional
> language-provided guarantees only when necessary.
I think that this is a great general approach, particularly given that devirt is
just a special case of other very common patterns (e.g. default constructors
usually initialize all fields).  This could be implemented with relatively
straight-forward improvements to our AA/memdep infrastructure to allow it to
forward values from calls to loads.

The only thing I'm worried about is how expensive the analysis for this
would be.  This could be handled with some simple summary functions stuff along
the lines of global mod/ref (with a threshold to bail out in insane cases).

Once the general infrastructure exists, we could add intrinsics or something
else for languages that know stuff to communicate it to the optimizer.

-Chris

Possibly Parallel Threads

Search for more possibly parallel threads

llvm dev - Dec 2011 - [LLVMdev] Implementing devirtualization

[LLVMdev] Implementing devirtualization

[LLVMdev] Implementing devirtualization

[LLVMdev] Implementing devirtualization

Possibly Parallel Threads