thr3ads.net - llvm dev - [llvm-dev] [RFC] Profile guided section layout [Jun 2017]


Michael Spencer via llvm-dev

2017-Jun-15 17:55 UTC

[llvm-dev] [RFC] Profile guided section layout

On Thu, Jun 15, 2017 at 10:08 AM, Tobias Edler von Koch <
tobias at codeaurora.org> wrote:
> Hi Michael,
>
> This is cool stuff, thanks for sharing!
>
> On 06/15/2017 11:51 AM, Michael Spencer via llvm-dev wrote:
>
>> The first is a new LLVM pass which uses branch frequency info to get
>> counts for each call instruction and then adds a module flags metadata
>> table of function -> function edges along with their counts.
>>
>> The second takes the module flags metadata and writes it into a
>> .note.llvm.callgraph section in the object file. This currently just dumps
>> it as text, but could save space by reusing the string table.
>>
> Have you considered reading the profile in the linker and extracting that
> information directly from the profile? The profile should contain call
> sites and their sample counts and you could match these up with relocations
> (calls) in the section?

I did this using IR PGO instead of sample PGO so the profile data can only
be applied in the same place in the pipeline it is generated. Even for
sample based this would be complicated as the linker would actually need to
generate machine basic blocks from sections to be able to accurately match
sample counts to relocations, as there may be cold calls in hot functions.

It may be useful however for the linker to directly accept an externally
generated call graph profile. The current approach can actually do this by
embedding it into an extra object file.
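
As a rough illustration of what consuming an externally generated call graph profile could look like, here is a minimal Python sketch. The text format (one `caller callee count` line per edge), the names `parse_edges` and `order_sections`, and the greedy hottest-edge-first clustering are all assumptions made for illustration; this is not the section format or the layout algorithm from the patch.

```python
from collections import defaultdict


def parse_edges(text):
    """Parse lines of the hypothetical form 'caller callee count'."""
    edges = defaultdict(int)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        caller, callee, count = line.split()
        edges[(caller, callee)] += int(count)
    return dict(edges)


def order_sections(edges):
    """Greedy layout: walk edges from hottest to coldest and append the
    callee's cluster to the caller's cluster."""
    cluster_of = {}
    for caller, callee in edges:
        for name in (caller, callee):
            cluster_of.setdefault(name, [name])

    # Hottest edges first; ties are broken by name so the result is stable.
    hottest_first = sorted(edges.items(), key=lambda kv: (-kv[1], kv[0]))
    for (caller, callee), _count in hottest_first:
        head, tail = cluster_of[caller], cluster_of[callee]
        if head is tail:
            continue  # already placed together (or a self call)
        head.extend(tail)
        for name in tail:
            cluster_of[name] = head

    # Emit each surviving cluster once, in order of first appearance.
    seen, layout = set(), []
    for cluster in cluster_of.values():
        if id(cluster) not in seen:
            seen.add(id(cluster))
            layout.extend(cluster)
    return layout


profile = """
main parse 10
parse lex 900
main report 1
"""
layout = order_sections(parse_edges(profile))
```

With this input, `parse` and `lex` are joined first because their edge is the hottest, and `report` ends up last. A real linker would also have to consider section sizes and page boundaries, which this sketch ignores.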

>
>
>> It doesn't currently work for LTO as the llvm pass needs to be run after
>> all inlining decisions have been made and LTO codegen has to be done with
>> -ffunction-sections.
>>
> So this is just an implementation issue, right? You can make LTO run with
> -ffunction-sections (by setting TargetOptions.FunctionSections=true) and
> insert your pass in the appropriate place in the pipeline.
>
Yeah, just an implementation issue. Just need to build the pass pipeline
differently for LTO and add a way to do -ffunction-sections in lld.

- Michael Spencer

>
> Thanks,
> Tobias
>
> --
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> a Linux Foundation Collaborative Project.
>

Teresa Johnson via llvm-dev

2017-Jun-15 18:05 UTC

[llvm-dev] [RFC] Profile guided section layout

Great! Cc'ing Sriraman who implemented something like this several years
back on our gcc google branch. We may want to utilize your work in the
gold-plugin as well.


On Thu, Jun 15, 2017 at 10:55 AM, Michael Spencer via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> On Thu, Jun 15, 2017 at 10:08 AM, Tobias Edler von Koch <
> tobias at codeaurora.org> wrote:
>
>> Hi Michael,
>>
>> This is cool stuff, thanks for sharing!
>>
>> On 06/15/2017 11:51 AM, Michael Spencer via llvm-dev wrote:
>>
>>> The first is a new LLVM pass which uses branch frequency info to get
>>> counts for each call instruction and then adds a module flags metadata
>>> table of function -> function edges along with their counts.
>>>
>>> The second takes the module flags metadata and writes it into a
>>> .note.llvm.callgraph section in the object file. This currently just dumps
>>> it as text, but could save space by reusing the string table.
>>>
>> Have you considered reading the profile in the linker and extracting that
>> information directly from the profile? The profile should contain call
>> sites and their sample counts and you could match these up with relocations
>> (calls) in the section?
>
>
> I did this using IR PGO instead of sample PGO so the profile data can only
> be applied in the same place in the pipeline it is generated. Even for
> sample based this would be complicated as the linker would actually need to
> generate machine basic blocks from sections to be able to accurately match
> sample counts to relocations, as there may be cold calls in hot functions.
>
> It may be useful however for the linker to directly accept an externally
> generated call graph profile. The current approach can actually do this by
> embedding it into an extra object file.
>
Also, doing this in the LLVM IR is best IMO because we can share the work
between different linkers.

>
>
>>
>>
>>> It doesn't currently work for LTO as the llvm pass needs to be run after
>>> all inlining decisions have been made and LTO codegen has to be done with
>>> -ffunction-sections.
>>>
>> So this is just an implementation issue, right? You can make LTO run with
>> -ffunction-sections (by setting TargetOptions.FunctionSections=true) and
>> insert your pass in the appropriate place in the pipeline.
>>
>
> Yeah, just an implementation issue. Just need to build the pass pipeline
> differently for LTO and add a way to do -ffunction-sections in lld.
>
Yeah, I implemented support for passing that down in the gold-plugin
(r284140). I guess you need to do something similar for lld.
>
> - Michael Spencer
>
>
>>
>> Thanks,
>> Tobias
>>
>> --
>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
>> a Linux Foundation Collaborative Project.
>>
>>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>

-- 
Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413

Xinliang David Li via llvm-dev

2017-Jun-15 18:09 UTC

[llvm-dev] [RFC] Profile guided section layout

On Thu, Jun 15, 2017 at 10:55 AM, Michael Spencer via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> On Thu, Jun 15, 2017 at 10:08 AM, Tobias Edler von Koch <
> tobias at codeaurora.org> wrote:
>
>> Hi Michael,
>>
>> This is cool stuff, thanks for sharing!
>>
>> On 06/15/2017 11:51 AM, Michael Spencer via llvm-dev wrote:
>>
>>> The first is a new LLVM pass which uses branch frequency info to get
>>> counts for each call instruction and then adds a module flags metadata
>>> table of function -> function edges along with their counts.
>>>
>>> The second takes the module flags metadata and writes it into a
>>> .note.llvm.callgraph section in the object file. This currently just dumps
>>> it as text, but could save space by reusing the string table.
>>>
>> Have you considered reading the profile in the linker and extracting that
>> information directly from the profile? The profile should contain call
>> sites and their sample counts and you could match these up with relocations
>> (calls) in the section?
>
The main reason is that IPO transformations such as inlining and cloning
will change the hotness of functions, so the original profile cannot be
used directly for the purpose of function layout. There is similar support
in the Gold plugin for Google GCC.
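
To make the point about inlining concrete, here is a small Python sketch of how one inlining decision reshapes a call graph. The function name `inline_call`, the example edge counts, and the proportional-scaling assumption (the callee's outgoing counts are split according to how many of its entries came from the inlined call site) are illustrative assumptions, not how LLVM updates profile data. Recursion is ignored.

```python
def inline_call(edges, caller, callee):
    """Return edge counts after inlining all calls from caller to callee.

    Assumption: the callee's outgoing call counts are attributed to the
    caller in proportion to the share of callee entries that came from it.
    """
    entries = sum(c for (_, dst), c in edges.items() if dst == callee)
    inlined = edges.get((caller, callee), 0)
    if inlined == 0 or entries == 0:
        return dict(edges)

    result = {}
    for (src, dst), count in edges.items():
        if (src, dst) == (caller, callee):
            continue  # the call site itself disappears
        if src == callee:
            # Integer split of the callee's outgoing count.
            moved = count * inlined // entries
            result[(callee, dst)] = result.get((callee, dst), 0) + count - moved
            result[(caller, dst)] = result.get((caller, dst), 0) + moved
        else:
            result[(src, dst)] = result.get((src, dst), 0) + count
    # Drop edges that no longer carry any count.
    return {edge: c for edge, c in result.items() if c > 0}


before = {
    ("main", "helper"): 90,
    ("other", "helper"): 10,
    ("helper", "leaf"): 200,
}
after = inline_call(before, "main", "helper")
```

Before inlining, `helper` looks hot and `main -> leaf` does not exist. Afterwards most of the weight sits on `main -> leaf`, and the out-of-line copy of `helper` is comparatively cold, which is why a layout computed from the pre-inlining profile can be misleading.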

David



>
> I did this using IR PGO instead of sample PGO so the profile data can only
> be applied in the same place in the pipeline it is generated. Even for
> sample based this would be complicated as the linker would actually need to
> generate machine basic blocks from sections to be able to accurately match
> sample counts to relocations, as there may be cold calls in hot functions.
>
> It may be useful however for the linker to directly accept an externally
> generated call graph profile. The current approach can actually do this by
> embedding it into an extra object file.
>
>
>>
>>
>>> It doesn't currently work for LTO as the llvm pass needs to be run after
>>> all inlining decisions have been made and LTO codegen has to be done with
>>> -ffunction-sections.
>>>
>> So this is just an implementation issue, right? You can make LTO run with
>> -ffunction-sections (by setting TargetOptions.FunctionSections=true) and
>> insert your pass in the appropriate place in the pipeline.
>>
>
> Yeah, just an implementation issue. Just need to build the pass pipeline
> differently for LTO and add a way to do -ffunction-sections in lld.
>
> - Michael Spencer
>
>
>>
>> Thanks,
>> Tobias
>>
>> --
>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
>> a Linux Foundation Collaborative Project.
>>
>>
>

Sean Silva via llvm-dev

2017-Jun-15 21:30 UTC

[llvm-dev] [RFC] Profile guided section layout

On Thu, Jun 15, 2017 at 11:09 AM, Xinliang David Li via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
>
>
> On Thu, Jun 15, 2017 at 10:55 AM, Michael Spencer via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> On Thu, Jun 15, 2017 at 10:08 AM, Tobias Edler von Koch <
>> tobias at codeaurora.org> wrote:
>>
>>> Hi Michael,
>>>
>>> This is cool stuff, thanks for sharing!
>>>
>>> On 06/15/2017 11:51 AM, Michael Spencer via llvm-dev wrote:
>>>
>>>> The first is a new LLVM pass which uses branch frequency info to get
>>>> counts for each call instruction and then adds a module flags metadata
>>>> table of function -> function edges along with their counts.
>>>>
>>>> The second takes the module flags metadata and writes it into a
>>>> .note.llvm.callgraph section in the object file. This currently just dumps
>>>> it as text, but could save space by reusing the string table.
>>>>
>>> Have you considered reading the profile in the linker and extracting
>>> that information directly from the profile? The profile should contain call
>>> sites and their sample counts and you could match these up with relocations
>>> (calls) in the section?
>>
>>
> The main reason is that IPO transformations such as inlining and cloning
> will change the hotness of functions, so the original profile cannot be
> used directly for the purpose of function layout. There is similar support
> in the Gold plugin for Google GCC.
>
Will this cause issues with ThinLTO? E.g. the ThinLTO backends are doing
inlining of imported functions. Do we have a mechanism for those decisions
to be reflected in a global profile for the linker to look at?

In theory the ThinLTO backends can keep a history of their IPO decisions in
metadata or something and emit a section for the linker to aggregate and
reconstruct an accurate global profile, but that seems relatively invasive.
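
The aggregation half of that idea is simple to sketch. The Python below assumes, hypothetically, that each backend emits its own post-inlining table of `(caller, callee) -> count` edges and that the linker only needs to sum them. The name `merge_call_graphs` and the sample tables are made up for illustration. The hard part, recording IPO decisions accurately in each backend, is not shown.

```python
from collections import Counter


def merge_call_graphs(per_object_edges):
    """Sum edge counts from several per-object call graph tables."""
    total = Counter()
    for edges in per_object_edges:
        total.update(edges)  # Counter.update adds counts for matching keys
    return dict(total)


# Hypothetical tables, one per ThinLTO backend output.
object_a = {("main", "parse"): 10, ("parse", "lex"): 900}
object_b = {("parse", "lex"): 100, ("report", "fmt"): 3}

merged = merge_call_graphs([object_a, object_b])
```

Here the `parse -> lex` edge appears in both objects (for example because `parse` was imported and inlined in a second module), so its counts are added together in the merged view.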

-- Sean Silva

>
> David
>
>
>
>
>>
>> I did this using IR PGO instead of sample PGO so the profile data can
>> only be applied in the same place in the pipeline it is generated. Even for
>> sample based this would be complicated as the linker would actually need to
>> generate machine basic blocks from sections to be able to accurately match
>> sample counts to relocations, as there may be cold calls in hot functions.
>>
>> It may be useful however for the linker to directly accept an externally
>> generated call graph profile. The current approach can actually do this by
>> embedding it into an extra object file.
>>
>>
>>>
>>>
>>>> It doesn't currently work for LTO as the llvm pass needs to be run after
>>>> all inlining decisions have been made and LTO codegen has to be done with
>>>> -ffunction-sections.
>>>>
>>> So this is just an implementation issue, right? You can make LTO run
>>> with -ffunction-sections (by setting TargetOptions.FunctionSections=true)
>>> and insert your pass in the appropriate place in the pipeline.
>>>
>>
>> Yeah, just an implementation issue. Just need to build the pass pipeline
>> differently for LTO and add a way to do -ffunction-sections in lld.
>>
>> - Michael Spencer
>>
>>
>>>
>>> Thanks,
>>> Tobias
>>>
>>> --
>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
>>> a Linux Foundation Collaborative Project.
>>>
>>>
>>
