aditya kumar via llvm-dev
2020-Aug-12 17:24 UTC
[llvm-dev] [RFC] Machine Function Splitter - Split out cold blocks from machine functions using profile data
Just chiming in about the outliner stuff. (In general, I think it's desirable to have multiple options for how early/late a pass runs.)

I'm wondering whether MachineOutliner can be augmented to add MachineFunctionSplitter functionality as well, for example if the analysis part of MachineOutliner can allow single basic block outlining with some cost models.

Aditya Kumar
Compiler Engineer
https://bitsimplify.com
Snehasish Kumar via llvm-dev
2020-Aug-12 18:55 UTC
[llvm-dev] [RFC] Machine Function Splitter - Split out cold blocks from machine functions using profile data
On Wed, Aug 12, 2020 at 10:24 AM aditya kumar <hiraditya at gmail.com> wrote:

> Just chiming in about the outliner stuff. (In general, I think it's
> desirable to have multiple options for how early/late a pass runs.)
>
> I'm wondering if MachineOutliner can be augmented to add
> MachineFunctionSplitter functionalities as well. If the analysis part of
> MachineOutliner can allow single basic block outlining with some cost
> models.

The MachineOutliner and MachineFunctionSplitter target orthogonal use cases. Namely, the MachineOutliner optimizes for binary size while the MachineFunctionSplitter optimizes for performance. Attempting to reconcile the differences to fully address the opportunity along both dimensions doesn't seem like a fruitful goal.

Furthermore, the key to better performance is not only the timing of the MachineFunctionSplitter pass but also the extraction methodology, i.e. using basic block sections. Basic block sections is a nascent feature and needs more widespread use and rigorous testing before being incorporated into a mature, more widely used pass. Today, we only use basic block sections for x86 ELF targets.

There is an interesting follow-on though -- can we use basic block sections as the extraction methodology in MachineOutliner, thus lowering the overhead of outlining? This is something we can revisit once basic block sections is more mature.
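To make the idea of splitting cold blocks using profile data concrete, here is a minimal sketch. It is not the MachineFunctionSplitter implementation and uses no LLVM APIs: `Block`, `SplitResult`, `splitColdBlocks` and the `coldThreshold` parameter are hypothetical names, and keeping the entry block in the hot part is an assumption made for the illustration.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a machine basic block annotated with profile data.
struct Block {
    std::string name;
    std::uint64_t profileCount;  // execution count taken from the profile
    bool isEntry;                // true for the function's entry block
};

// The two partitions of one function after splitting.
struct SplitResult {
    std::vector<Block> hot;   // stays in the original function body
    std::vector<Block> cold;  // would be placed in a separate section
};

// Partition blocks by profile count. A block is treated as cold when its
// count is at or below coldThreshold. The entry block is always kept hot
// (an assumption of this sketch).
SplitResult splitColdBlocks(const std::vector<Block>& blocks,
                            std::uint64_t coldThreshold) {
    SplitResult result;
    for (const Block& block : blocks) {
        if (!block.isEntry && block.profileCount <= coldThreshold) {
            result.cold.push_back(block);
        } else {
            result.hot.push_back(block);
        }
    }
    return result;
}
```

Note that every cold block here still belongs to the one function it came from; nothing is shared across functions, which is the main contrast with outlining for size.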
Sriraman Tallam via llvm-dev
2020-Aug-12 19:18 UTC
[llvm-dev] [RFC] Machine Function Splitter - Split out cold blocks from machine functions using profile data
On Wed, Aug 12, 2020 at 11:56 AM Snehasish Kumar via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> The MachineOutliner and MachineFunctionSplitter target orthogonal use
> cases. Namely, the MachineOutliner optimizes for binary size while the
> MachineFunctionSplitter optimizes for performance. Attempting to reconcile
> the differences to fully address the opportunity along both dimensions
> doesn't seem like a fruitful goal. Furthermore, the key to better
> performance is not only the timing of the MachineFunctionSplitter pass but
> also the extraction methodology, i.e. using basic block sections. Basic
> block sections is a nascent feature and needs more widespread use and
> rigorous testing before being incorporated with a mature, more widely used
> pass. Today, we only use basic block sections for x86 ELF targets.

I agree with this. The only thing common between MachineOutliner (MO) and MachineFunctionSplitter (MFS) is that they move code out of the function. Otherwise, they do very different things. MO uses powerful analyses to find basic blocks that could be shared across multiple functions and outlines those. MFS just moves cold basic blocks out of the function. MO could be tweaked independently for performance, for example by not outlining a hot basic block for locality reasons.

Even the mechanism to move basic blocks out of the function is very different for MFS and MO. MO uses call-ret semantics, whereas MFS uses basic block sections, which just use jump instructions.
Please see below for why it is going to be hard for MO to use basic block sections.

> There is an interesting follow on though -- can we use basicblock sections
> as the extraction methodology in MachineOutliner, thus lowering the
> overhead of outlining? This is something we can revisit once basic block
> sections is more mature.

Having MO use basic block sections is not straightforward and is going to take some work. MFS can use basic block sections trivially because the split basic block still belongs to only one function. However, with MO, when a basic block is shared, the call-return semantic is the easier choice when outlining, as the outlined code needs to know where to return to. Basic block sections do not have that power, so we would need to inject code to remember where to return to, mimicking call semantics.
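The difference between the two transfer mechanisms can be illustrated with a toy model. This is only a sketch under stated assumptions: it records the labels a program would visit rather than generating code, and `Trace`, `runSplitFunction`, `runSharedOutlined` and all label names are hypothetical.

```cpp
#include <string>
#include <vector>

// A trace is the ordered list of labels that execution passes through.
using Trace = std::vector<std::string>;

// Split-style transfer (jumps only). The cold block belongs to exactly one
// function, so the label it jumps back to is fixed when the code is emitted.
Trace runSplitFunction(bool takeColdPath) {
    Trace trace{"f.entry"};
    if (takeColdPath) {
        trace.push_back("f.cold");  // reached by a plain jump out of f
    }
    trace.push_back("f.exit");      // f.cold always jumps back to this label
    return trace;
}

// Outline-style transfer (call/return). The shared body is reached from
// several callers, so the place to continue is only known at run time and
// has to be passed along, which is what a call's return address provides.
Trace runSharedOutlined(const std::string& caller) {
    Trace trace{caller + ".entry"};
    const std::string returnTo = caller + ".after_call";  // models the return address
    trace.push_back("outlined.body");
    trace.push_back(returnTo);      // the return goes wherever the caller said
    return trace;
}
```

In this model, `runSplitFunction` needs no extra state because its continuation is a constant, while `runSharedOutlined` cannot work without `returnTo`. Using jump-only sections for a shared block would mean injecting code that carries the equivalent of `returnTo` by hand.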