Sean Silva via llvm-dev
2017-Jun-15 21:30 UTC
[llvm-dev] [RFC] Profile guided section layout
On Thu, Jun 15, 2017 at 11:09 AM, Xinliang David Li via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> On Thu, Jun 15, 2017 at 10:55 AM, Michael Spencer via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> On Thu, Jun 15, 2017 at 10:08 AM, Tobias Edler von Koch <tobias at codeaurora.org> wrote:
>>> Hi Michael,
>>>
>>> This is cool stuff, thanks for sharing!
>>>
>>> On 06/15/2017 11:51 AM, Michael Spencer via llvm-dev wrote:
>>>> The first is a new llvm pass which uses branch frequency info to get
>>>> counts for each call instruction and then adds a module flags metadata
>>>> table of function -> function edges along with their counts.
>>>>
>>>> The second takes the module flags metadata and writes it into a
>>>> .note.llvm.callgraph section in the object file. This currently just dumps
>>>> it as text, but could save space by reusing the string table.
>>>
>>> Have you considered reading the profile in the linker and extracting
>>> that information directly from the profile? The profile should contain call
>>> sites and their sample counts and you could match these up with relocations
>>> (calls) in the section?
>
> The main reason is that IPO transformations such as inlining and cloning
> will change the hotness of functions, so the original profile cannot be
> used directly for the purpose of function layout. There is similar support
> in the Gold plugin for Google GCC.

Will this cause issues with ThinLTO? E.g. the ThinLTO backends are doing inlining of imported functions. Do we have a mechanism for those decisions to be reflected in a global profile for the linker to look at?

In theory the ThinLTO backends can keep a history of their IPO decisions in metadata or something and emit a section for the linker to aggregate and reconstruct an accurate global profile, but that seems relatively invasive.

-- Sean Silva

> David
>
>> I did this using IR PGO instead of sample PGO so the profile data can
>> only be applied in the same place in the pipeline it is generated. Even for
>> sample based this would be complicated as the linker would actually need to
>> generate machine basic blocks from sections to be able to accurately match
>> sample counts to relocations, as there may be cold calls in hot functions.
>>
>> It may be useful however for the linker to directly accept an externally
>> generated call graph profile. The current approach can actually do this by
>> embedding it into an extra object file.
>>
>>>> It doesn't currently work for LTO as the llvm pass needs to be run after
>>>> all inlining decisions have been made and LTO codegen has to be done with
>>>> -ffunction-sections.
>>>
>>> So this is just an implementation issue, right? You can make LTO run
>>> with -ffunction-sections (by setting TargetOptions.FunctionSections=true)
>>> and insert your pass in the appropriate place in the pipeline.
>>
>> Yeah, just an implementation issue. Just need to build the pass pipeline
>> differently for LTO and add a way to do -ffunction-sections in lld.
>>
>> - Michael Spencer
>>
>>> Thanks,
>>> Tobias
>>>
>>> --
>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
>>> a Linux Foundation Collaborative Project.
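Editorial illustration of the concern raised in this message: the sketch below is not code from the RFC or from LLVM. It assumes a hypothetical in-memory form of the call graph table (a mapping from `(caller, callee)` to a count), and the helper names `merge_edges` and `inline_call` are invented for the example. It shows how a linker-side aggregation of per-object edge tables might look, and why an inlining decision made later invalidates the aggregated counts. The inlining model is a deliberate simplification: it assumes no recursion and that a callee's outgoing calls scale with how often it is entered.

```python
from collections import Counter


def merge_edges(tables):
    """Sum call-edge counts from several per-object tables into one table."""
    merged = Counter()
    for table in tables:
        merged.update(table)  # Counter.update adds counts for matching keys
    return merged


def inline_call(edges, caller, callee):
    """Return a new edge table as if every caller -> callee call were inlined.

    Assumptions (illustrative only): no recursion, and the callee's outgoing
    calls are proportional to the number of times it is entered.
    """
    inlined = edges.get((caller, callee), 0)
    total_in = sum(n for (_, dst), n in edges.items() if dst == callee)
    result = Counter(edges)
    if inlined == 0 or total_in == 0:
        return result
    # The direct call no longer exists once the body is inlined.
    del result[(caller, callee)]
    # Re-attribute the inlined share of the callee's outgoing calls.
    for (src, dst), n in edges.items():
        if src == callee:
            moved = n * inlined // total_in
            result[(src, dst)] -= moved
            result[(caller, dst)] += moved
    return result


# Two hypothetical object files, each contributing its own edge table.
objects = [
    {("main", "parse"): 60, ("parse", "lex"): 500},
    {("main", "parse"): 40, ("other", "parse"): 300, ("parse", "lex"): 300},
]
merged = merge_edges(objects)
after = inline_call(merged, "main", "parse")
```

In this toy example the `main -> parse` edge disappears and a quarter of the `parse -> lex` count moves to a new `main -> lex` edge, so a layout computed from the pre-inlining table would place `parse` next to `main` for a call that no longer exists. That is the mismatch a ThinLTO backend would have to report back for the linker to see an accurate global profile.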
Xinliang David Li via llvm-dev
2017-Jun-15 21:33 UTC
[llvm-dev] [RFC] Profile guided section layout
On Thu, Jun 15, 2017 at 2:30 PM, Sean Silva <chisophugis at gmail.com> wrote:
> Will this cause issues with ThinLTO? E.g. the thinlto backends are doing
> inlining of imported functions. Do we have a mechanism for those decisions
> to be reflected in a global profile for the linker to look at?
>
> In theory the thinlto backends can keep a history of their IPO decisions
> in metadata or something and emit a section for the linker to aggregate and
> reconstruct an accurate global profile, but that seems relatively invasive.

Yes, it will cause problems; this is also a known issue with GCC's LIPO. We have an intern working on that problem :)

David
Sean Silva via llvm-dev
2017-Jun-15 21:39 UTC
[llvm-dev] [RFC] Profile guided section layout
On Thu, Jun 15, 2017 at 2:33 PM, Xinliang David Li <xinliangli at gmail.com> wrote:
> Yes, it will cause problems; this is also a known issue with GCC's LIPO.
> We have an intern working on that problem :)

Nice! What approach is being used to solve it?

-- Sean Silva