Duncan P. N. Exon Smith
2015-May-27 18:13 UTC
[LLVMdev] Capabilities of Clang's PGO (e.g. improving code density)
> On 2015 May 27, at 07:42, Diego Novillo <dnovillo at google.com> wrote:
>
> On Tue, May 26, 2015 at 11:47 PM, Lee Hunt <leehu at exchange.microsoft.com> wrote:
>
>> For example, from reading different pages on how Clang PGO, it’s unclear if
>> it does “block reordering” (i.e. moving unexecuted code blocks to a distant
>> code page, leaving only ‘hot’ executed code packed together for greater code
>> density). I find mention of “hot arc” optimization (-fprofile-arcs) , but
>> I’m unclear if this is the same thing. Does Clang PGO do block reordering?
>
> A small clarification. Clang itself does not implement any
> optimizations. Clang limits itself to generate LLVM IR. The
> annotated IR is then used by some LLVM optimizers to guide decisions.
> At this time, there are few optimization passes that use the profile
> information: block reordering and register allocation (to avoid
> spilling on cold paths).
>
> There are no other significant transformations that use profiling
> information. We are working on that. Notably, we'd like to add
> profiling-based decisions to the inliner

Just a quick note about the inliner. Although the inliner itself doesn't know how to use the profile, clang's IRGen has been modified to add an 'inlinehint' attribute to hot functions and the 'cold' attribute to cold functions. Indirectly, PGO does affect the inliner. (We'll remove this once the inliner does the right thing on its own.)

> , loop optimizers and the
> vectorizer.
>
> Diego.
Philip Reames
2015-May-28 16:56 UTC
[LLVMdev] Capabilities of Clang's PGO (e.g. improving code density)
On 05/27/2015 11:13 AM, Duncan P. N. Exon Smith wrote:
> Just a quick note about the inliner. Although the inliner itself
> doesn't know how to use the profile, clang's IRGen has been modified
> to add an 'inlinehint' attribute to hot functions and the 'cold'
> attribute to cold functions. Indirectly, PGO does affect the
> inliner. (We'll remove this once the inliner does the right thing on
> its own.)

OT: Can you give me a pointer to the clang code involved? I wasn't aware of this.
Duncan P. N. Exon Smith
2015-May-28 18:05 UTC
[LLVMdev] Capabilities of Clang's PGO (e.g. improving code density)
> On 2015-May-28, at 09:56, Philip Reames <listmail at philipreames.com> wrote:
>
> OT: Can you give me a pointer to the clang code involved? I wasn't aware of this.

Have a look at `CodeGenPGO::applyFunctionAttributes()` around line 760 of lib/CodeGen/CodeGenPGO.cpp.
Teresa Johnson
2015-May-28 18:08 UTC
[LLVMdev] Capabilities of Clang's PGO (e.g. improving code density)
On Thu, May 28, 2015 at 9:56 AM, Philip Reames <listmail at philipreames.com> wrote:
> OT: Can you give me a pointer to the clang code involved? I wasn't aware of
> this.

It is set in CodeGenPGO::applyFunctionAttributes in clang/lib/CodeGen/CodeGenPGO.cpp. Note that it uses the function entry count to determine hotness. This means that functions entered infrequently but containing very hot loops would be marked cold. Perhaps this works because the attribute is only used for inlining and is presumably a stand-in for call edge hotness. The MaxFunctionCount for the profile is also the max of all the function entry counts (set during profile writing).

Teresa

--
Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413
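To make the entry-count heuristic described above concrete, here is a minimal sketch. It is not the Clang source: the names `InlineAttr`, `classifyByEntryCount`, `kHotFraction` and `kColdFraction` are hypothetical, and the two threshold values are illustrative assumptions, since the thread does not state them. The only parts taken from the thread are that hotness is judged from the function entry count relative to the profile's maximum function entry count, and that the result is an 'inlinehint' or 'cold' attribute.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the attribute a function would receive.
enum class InlineAttr { None, InlineHint, Cold };

// Illustrative thresholds only (assumptions, not taken from the thread).
constexpr double kHotFraction = 0.30;   // at or above this share of the max: hot
constexpr double kColdFraction = 0.01;  // at or below this share of the max: cold

// Classify a function using only its entry count, relative to the largest
// entry count of any function in the profile.
InlineAttr classifyByEntryCount(std::uint64_t entryCount,
                                std::uint64_t maxFunctionCount) {
  // maxFunctionCount is the max over all entry counts, so it bounds entryCount.
  assert(entryCount <= maxFunctionCount);
  if (maxFunctionCount == 0)
    return InlineAttr::None;  // empty profile: nothing to decide

  const double ratio = static_cast<double>(entryCount) /
                       static_cast<double>(maxFunctionCount);
  if (ratio >= kHotFraction)
    return InlineAttr::InlineHint;
  if (ratio <= kColdFraction)
    return InlineAttr::Cold;
  return InlineAttr::None;
}
```

Under these assumed thresholds, `classifyByEntryCount(1, 1000000)` yields `InlineAttr::Cold` no matter how many iterations the function's loops execute, because loop counts never enter the calculation. That is the limitation noted above: the entry count approximates call edge hotness, not how hot the function body is.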