thr3ads.net - llvm dev - [LLVMdev] How to use BlockFrequency in inter-procedural context? [Dec 2014]

If this information is useful, please help other people find it:
Share via:

Das, Dibyendu

2014-Dec-27 06:24 UTC

[LLVMdev] How to use BlockFrequency in inter-procedural context?

David

Is this true for static heuristics as well ?

If the bb freqs are scaled wrt to the entry block freq and a) use such scaled
freqs for the bb's that have calls b) propagate this info topologically over
the call graph, how representative will be the info if one just wants to use in
a comparative sense ?

-Dibyendu

Sent from my Windows Phone
________________________________
From: Xinliang David Li<mailto:xinliangli at gmail.com>
Sent: ‎12/‎27/‎2014 10:05 AM
To: ibaev at codeaurora.org<mailto:ibaev at codeaurora.org>
Cc: llvmdev<mailto:llvmdev at cs.uiuc.edu>
Subject: Re: [LLVMdev] How to use BlockFrequency in inter-procedural context?


On Fri, Dec 26, 2014 at 7:12 PM, <ibaev at codeaurora.org<mailto:ibaev at
codeaurora.org>> wrote:
The BlockFrequency analysis has been useful for machine block placement,
register allocation and other function-level optimizations. How could we
extend it for use in an inter-procedural (whole-program) context? For
example, if we would like to compare the hotness of two call sites in
different functions, or calculate the hotness of two global variables
referenced in multiple functions.

BlockFrequency can not be used for inter-procedural analysis.


If the ratio of a block BB frequency to the entry block frequency is the
expected number of times the block BB will execute per entry to the
function (according to LLVM Block Frequency Terminology page), would the
multiplication of that ratio to the profile count of the function be a
reasonable approximation of BB total execution count?

The answer depends on how BB frequency is computed. If BB frequency is directly
scaled from BB profile count (Execution count), then the answer is
'yes'. In the current implementation, the answer is 'no'.  LLVM
only keeps branch probability information from the profile data, and BB
frequency is computed from branch probabilities:  There are known limitations of
the frequency propagation algorithm to make the resulting frequency
'distorted'. To name a few: loop scale is capped at 4096; branch weight
is incremented by 1 (leading to issues such as computed loop trip count as low
as half of the actual count); limitation of handling irreducible loops etc. In
fact, frequency can not be reliably be used for comparison across different loop
nests.

We have plans to address those issues to improve PGO.

thanks,

David





Thanks,
Ivan



_______________________________________________
LLVM Developers mailing list
LLVMdev at cs.uiuc.edu<mailto:LLVMdev at cs.uiuc.edu>        
http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20141227/70806756/attachment.html>

Xinliang David Li

2014-Dec-27 08:22 UTC

head link

[LLVMdev] How to use BlockFrequency in inter-procedural context?

On Fri, Dec 26, 2014 at 10:24 PM, Das, Dibyendu <Dibyendu.Das at amd.com>
wrote:
>  David
>
> Is this true for static heuristics as well ?
>
> If the bb freqs are scaled wrt to the entry block freq and a) use such
> scaled freqs for the bb's that have calls b) propagate this info
> topologically over the call graph, how representative will be the info if
> one just wants to use in a comparative sense ?
>
>The main issue with inter-procedural propagation of static frequencies is
the existence of indirect calls -- their targets are usually not resolvable
statically, not to mention their frequency distributions. Another issue is
with loops -- static heuristics tend to estimate loop trip count very
conservatively, and it favors BBs in deep loop nests more. The global
'hotness' info computed can be misleading and harmful.

David


> -Dibyendu
>
> Sent from my Windows Phone
>  ------------------------------
> From: Xinliang David Li <xinliangli at gmail.com>
> Sent: ‎12/‎27/‎2014 10:05 AM
> To: ibaev at codeaurora.org
> Cc: llvmdev <llvmdev at cs.uiuc.edu>
> Subject: Re: [LLVMdev] How to use BlockFrequency in inter-procedural
> context?
>
>
> On Fri, Dec 26, 2014 at 7:12 PM, <ibaev at codeaurora.org> wrote:
>>
>> The BlockFrequency analysis has been useful for machine block
placement,
>> register allocation and other function-level optimizations. How could
we
>> extend it for use in an inter-procedural (whole-program) context? For
>> example, if we would like to compare the hotness of two call sites in
>> different functions, or calculate the hotness of two global variables
>> referenced in multiple functions.
>>
>
>  BlockFrequency can not be used for inter-procedural analysis.
>
>
>>
>> If the ratio of a block BB frequency to the entry block frequency is
the
>> expected number of times the block BB will execute per entry to the
>> function (according to LLVM Block Frequency Terminology page), would
the
>> multiplication of that ratio to the profile count of the function be a
>> reasonable approximation of BB total execution count?
>>
>
>  The answer depends on how BB frequency is computed. If BB frequency is
> directly scaled from BB profile count (Execution count), then the answer is
> 'yes'. In the current implementation, the answer is 'no'. 
LLVM only keeps
> branch probability information from the profile data, and BB frequency is
> computed from branch probabilities:  There are known limitations of the
> frequency propagation algorithm to make the resulting frequency
> 'distorted'. To name a few: loop scale is capped at 4096; branch
weight is
> incremented by 1 (leading to issues such as computed loop trip count as low
> as half of the actual count); limitation of handling irreducible loops etc.
> In fact, frequency can not be reliably be used for comparison across
> different loop nests.
>
>  We have plans to address those issues to improve PGO.
>
>  thanks,
>
>  David
>
>
>
>
>
>>
>> Thanks,
>> Ivan
>>
>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20141227/f53eb6f5/attachment.html>

Das, Dibyendu

2014-Dec-27 08:34 UTC

head link

[LLVMdev] How to use BlockFrequency in inter-procedural context?

Thanks David for the insight. I was thinking whether we can use some of the
learning algorithms like Support Vector Machine to estimate freqs in a non-pgo
setup. And these can use the pgo-based results for supervised learning.

Sent from my Windows Phone
________________________________
From: Xinliang David Li<mailto:xinliangli at gmail.com>
Sent: ‎12/‎27/‎2014 1:53 PM
To: Das, Dibyendu<mailto:Dibyendu.Das at amd.com>
Cc: ibaev at codeaurora.org<mailto:ibaev at codeaurora.org>;
llvmdev<mailto:llvmdev at cs.uiuc.edu>
Subject: Re: [LLVMdev] How to use BlockFrequency in inter-procedural context?

On Fri, Dec 26, 2014 at 10:24 PM, Das, Dibyendu <Dibyendu.Das at
amd.com<mailto:Dibyendu.Das at amd.com>> wrote:
David

Is this true for static heuristics as well ?

If the bb freqs are scaled wrt to the entry block freq and a) use such scaled
freqs for the bb's that have calls b) propagate this info topologically over
the call graph, how representative will be the info if one just wants to use in
a comparative sense ?

The main issue with inter-procedural propagation of static frequencies is the
existence of indirect calls -- their targets are usually not resolvable
statically, not to mention their frequency distributions. Another issue is with
loops -- static heuristics tend to estimate loop trip count very conservatively,
and it favors BBs in deep loop nests more. The global 'hotness' info
computed can be misleading and harmful.

David

-Dibyendu

Sent from my Windows Phone
________________________________
From: Xinliang David Li<mailto:xinliangli at gmail.com>
Sent: ‎12/‎27/‎2014 10:05 AM
To: ibaev at codeaurora.org<mailto:ibaev at codeaurora.org>
Cc: llvmdev<mailto:llvmdev at cs.uiuc.edu>
Subject: Re: [LLVMdev] How to use BlockFrequency in inter-procedural context?

On Fri, Dec 26, 2014 at 7:12 PM, <ibaev at codeaurora.org<mailto:ibaev at
codeaurora.org>> wrote:
The BlockFrequency analysis has been useful for machine block placement,
register allocation and other function-level optimizations. How could we
extend it for use in an inter-procedural (whole-program) context? For
example, if we would like to compare the hotness of two call sites in
different functions, or calculate the hotness of two global variables
referenced in multiple functions.

BlockFrequency can not be used for inter-procedural analysis.

If the ratio of a block BB frequency to the entry block frequency is the
expected number of times the block BB will execute per entry to the
function (according to LLVM Block Frequency Terminology page), would the
multiplication of that ratio to the profile count of the function be a
reasonable approximation of BB total execution count?

The answer depends on how BB frequency is computed. If BB frequency is directly
scaled from BB profile count (Execution count), then the answer is
'yes'. In the current implementation, the answer is 'no'.  LLVM
only keeps branch probability information from the profile data, and BB
frequency is computed from branch probabilities:  There are known limitations of
the frequency propagation algorithm to make the resulting frequency
'distorted'. To name a few: loop scale is capped at 4096; branch weight
is incremented by 1 (leading to issues such as computed loop trip count as low
as half of the actual count); limitation of handling irreducible loops etc. In
fact, frequency can not be reliably be used for comparison across different loop
nests.

We have plans to address those issues to improve PGO.

thanks,

David

Thanks,
Ivan

_______________________________________________
LLVM Developers mailing list
LLVMdev at cs.uiuc.edu<mailto:LLVMdev at cs.uiuc.edu>        
http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20141227/147d0f11/attachment.html>

llvm dev - Dec 2014 - [LLVMdev] How to use BlockFrequency in inter-procedural context?

[LLVMdev] How to use BlockFrequency in inter-procedural context?

[LLVMdev] How to use BlockFrequency in inter-procedural context?

[LLVMdev] How to use BlockFrequency in inter-procedural context?