When I used -std-compile-opts -disable-inlining, my transform didn't happen. I think that in your test, inlining UseCallback into foo automatically made the function pointer a constant, which turned it into a direct call that was then inlined.

If UseCallback is too big to inline and uses the callback parameter inside a loop, this transform is potentially valuable, particularly if UseCallback is called multiple times with the same callback parameter.

Interestingly, when I had foo call UseCallback multiple times with *only* callback1, it yanked the function pointer parameter out of UseCallback and turned the whole thing into a direct call (I'm guessing dead argument elimination came into play here). But as soon as I added a call to UseCallback with callback2 to the mix, it went back to performing no indirect call elimination at all.

On Fri, Jun 4, 2010 at 11:11 AM, Duncan Sands <baldrick at free.fr> wrote:
> Hi Kenneth,
>
>> By that I mean an optimization pass (or a combination of them) that turns:
> ...
>> With that transform in place, lots of inlining becomes possible, and
>> direct function calls replace indirect function calls if inlining
>> isn't appropriate. If this transform is combined with argpromotion
>> and scalarrepl, it can be used for devirtualization of C++ virtual
>> function calls.
>>
>> There seems to be an awful lot of C++ code out there that uses
>> templates to perform this same optimization in source code.
>
> yes, LLVM does this. For example, running your example through the LLVM
> optimizers gives:
>
> define void @foo() nounwind readnone {
> entry:
>   ret void
> }
>
> As you can see, the indirect function calls were resolved into direct
> function calls and inlined.
>
> I don't know which passes take care of this however.
>
> Ciao,
>
> Duncan.
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
It should be relatively simple to write a pass that turns each call that has constant argument(s) into a call to a specialized version of the callee. To devirtualize C++ calls it needs to be smarter, since the argument is not a constant, but a pointer to a struct that points to a constant. However, the tricky parts are:

1) Knowing when to perform specialization. If the call was not inlined, the function is probably big. Getting this wrong will generate *a lot* of code for a very small (if not negative) speed gain.

2) Sharing specializations between different call sites that pass the same constants.

Getting 1) right is crucial but hard. The easy cases are already handled by inlining and dead argument elimination. If good profiling information is available, it can be used to estimate the speed/space trade-off (specialize calls from hot code).

Eugene

On Fri, Jun 4, 2010 at 6:29 PM, Kenneth Uildriks <kennethuil at gmail.com> wrote:
> When I used -std-compile-opts -disable-inlining, my transform didn't
> happen. I think in your test, the inline of UseCallback into foo
> automatically made the function pointer into a constant, which turned
> it into a direct call that was then inlined.
>
> If UseCallback is too big to inline and uses the callback parameter
> inside a loop, this transform is potentially valuable, particularly if
> UseCallback is called multiple times with the same callback parameter.
>
> Interestingly, when I had foo call UseCallback multiple times with
> *only* callback1, it yanked the function pointer parameter out of
> UseCallback and turned the thing into a direct call. (I'm guessing
> dead argument elimination came into play here) But as soon as I added
> a call to UseCallback with callback2 to the mix, it went back to not
> making any indirect call elimination.
>
> On Fri, Jun 4, 2010 at 11:11 AM, Duncan Sands <baldrick at free.fr> wrote:
>> Hi Kenneth,
>>
>>> By that I mean an optimization pass (or a combination of them) that turns:
>> ...
On Fri, Jun 4, 2010 at 1:35 PM, Eugene Toder <eltoder at gmail.com> wrote:
> It should be relatively simple to write a pass that turns each call
> that has constant argument(s) into a call to specialized version of
> the callee. To devirtualize C++ calls it needs to be smarter, since
> the argument is not a constant, but a pointer to a struct that points
> to a constant. However, the trick here is
> 1) Knowing when to perform specialization. If the call was not inlined
> the function is probably big. Getting this wrong will generate *a lot*
> of code for very small (if not negative) speed gain.
> 2) Sharing of specializations from different call sites that have the
> same constants.
> Getting 1) right is crucial but hard. Easy cases are already taken by
> inline and dead argument elimination. If some good profiling
> information is available it can be used for speed/space trade off
> estimation (specialize calls from hot code).

As the number of call sites using the same constant grows, inlining gets more expensive while specializing does not: the cost of specializing grows only with the number of unique constant combinations specialized. So cases where you'd want to specialize but not inline shouldn't be all that uncommon, and different cost calculations are needed to set the threshold.

I didn't see the partial specialization pass in the docs, but I'll take a look at it now.
Hi,

> 1) Knowing when to perform specialization. If the call was not inlined
> the function is probably big. Getting this wrong will generate *a lot*
> of code for very small (if not negative) speed gain.

Could you elaborate on why just having (lots of) more code in the final executable would incur a performance _penalty_? I was thinking of something similar, but for type specializations of functions in a dynamically-typed language, so that the frontend creates more than one function for each function in the source code.

> 2) Sharing of specializations from different call sites that have the
> same constants.
> Getting 1) right is crucial but hard. Easy cases are already taken by
> inline and dead argument elimination. If some good profiling
> information is available it can be used for speed/space trade off
> estimation (specialize calls from hot code).
>
> Eugene

Cornelius