thr3ads.net - llvm dev - [llvm-dev] [RFC] Machine Function Splitter - Split out cold blocks from machine functions using profile data [Aug 2020]

aditya kumar via llvm-dev

2020-Aug-12 17:24 UTC

[llvm-dev] [RFC] Machine Function Splitter - Split out cold blocks from machine functions using profile data

> Just chiming in about the outliner stuff. (In general, I think it's
> desirable to have multiple options for how early/late a pass runs.)

I'm wondering if MachineOutliner can be augmented to add
MachineFunctionSplitter functionality as well, perhaps by letting the
analysis part of MachineOutliner allow single basic block outlining with
some cost model.

Aditya Kumar
Compiler Engineer
https://bitsimplify.com

Snehasish Kumar via llvm-dev

2020-Aug-12 18:55 UTC

On Wed, Aug 12, 2020 at 10:24 AM aditya kumar <hiraditya at gmail.com> wrote:
> > Just chiming in about the outliner stuff. (In general, I think it's
> > desirable to have multiple options for how early/late a pass runs.)
>
> I'm wondering if MachineOutliner can be augmented to add
> MachineFunctionSplitter functionality as well, perhaps by letting the
> analysis part of MachineOutliner allow single basic block outlining with
> some cost model.
>
The MachineOutliner and MachineFunctionSplitter target orthogonal use
cases. Namely, the MachineOutliner optimizes for binary size while the
MachineFunctionSplitter optimizes for performance. Attempting to reconcile
the differences to fully address the opportunity along both dimensions
doesn't seem like a fruitful goal. Furthermore, the key to better
performance is not only the timing of the MachineFunctionSplitter pass but
also the extraction methodology, i.e. using basic block sections. Basic
block sections is a nascent feature and needs more widespread use and
rigorous testing before being incorporated into a mature, more widely used
pass. Today, we only use basic block sections for x86 ELF targets.

There is an interesting follow-on though -- can we use basic block sections
as the extraction methodology in MachineOutliner, thus lowering the
overhead of outlining? This is something we can revisit once basic block
sections is more mature.
>
> Aditya Kumar
> Compiler Engineer
> https://bitsimplify.com
>

Sriraman Tallam via llvm-dev

2020-Aug-12 19:18 UTC

On Wed, Aug 12, 2020 at 11:56 AM Snehasish Kumar via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> On Wed, Aug 12, 2020 at 10:24 AM aditya kumar <hiraditya at gmail.com> wrote:
>
>> > Just chiming in about the outliner stuff. (In general, I think it's
>> > desirable to have multiple options for how early/late a pass runs.)
>>
>> I'm wondering if MachineOutliner can be augmented to add
>> MachineFunctionSplitter functionality as well, perhaps by letting the
>> analysis part of MachineOutliner allow single basic block outlining with
>> some cost model.
>>
>
> The MachineOutliner and MachineFunctionSplitter target orthogonal use
> cases. Namely, the MachineOutliner optimizes for binary size while the
> MachineFunctionSplitter optimizes for performance. Attempting to reconcile
> the differences to fully address the opportunity along both dimensions
> doesn't seem like a fruitful goal. Furthermore, the key to better
> performance is not only the timing of the MachineFunctionSplitter pass but
> also the extraction methodology, i.e. using basic block sections. Basic
> block sections is a nascent feature and needs more widespread use and
> rigorous testing before being incorporated into a mature, more widely used
> pass. Today, we only use basic block sections for x86 ELF targets.
>
I agree with this.  The only thing MachineOutliner (MO) and
MachineFunctionSplitter (MFS) have in common is that they move code out of
the function.  Otherwise, they do very different things.  MO uses powerful
analyses to find basic blocks that could be shared across multiple
functions and outlines those.  MFS just moves cold basic blocks out of the
function.  MO could be tweaked independently for performance, for example
by not outlining a hot basic block for locality reasons.
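
To make the contrast concrete, here is a toy sketch in Python. It is not
LLVM code and does not reflect how either pass is implemented; the function
names, the `cold_threshold` parameter, and the representation of a block as
a tuple of instruction strings are all assumptions made for illustration.
The first helper picks blocks the way MFS is described above (per function,
by profile count); the second picks them the way MO is described above
(identical code that appears in more than one function).

```python
from collections import Counter


def split_cold_blocks(blocks, profile_counts, cold_threshold=0):
    """MFS-style choice (toy): partition ONE function's blocks into hot/cold.

    A block is treated as cold when its profile count is at or below
    cold_threshold. Blocks missing from the profile count as never executed.
    """
    hot, cold = [], []
    for name in blocks:
        if profile_counts.get(name, 0) <= cold_threshold:
            cold.append(name)
        else:
            hot.append(name)
    return hot, cold


def shared_block_candidates(functions):
    """MO-style choice (toy): block bodies that occur in more than one function.

    functions maps a function name to {block name: tuple of instructions}.
    """
    seen = Counter()
    for body_by_block in functions.values():
        # Count each distinct body once per function.
        for body in set(body_by_block.values()):
            seen[body] += 1
    return {body for body, count in seen.items() if count > 1}


hot, cold = split_cold_blocks(
    ["entry", "loop", "error"],
    {"entry": 100, "loop": 1000, "error": 0},
)

functions = {
    "f": {"a": ("mov", "add"), "b": ("ret",)},
    "g": {"x": ("mov", "add"), "y": ("call",)},
}
shared = shared_block_candidates(functions)
```

The first helper never looks outside the function it is given, while the
second only produces a result by comparing functions against each other,
which is the difference in scope being pointed out here.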

Even the mechanism to move basic blocks out of the function is very
different for MFS and MO.  MO uses call-ret semantics, whereas MFS uses
basic block sections, which just use jump instructions.  Please see below
for why it is going to be hard for MO to use basic block sections.
>
> There is an interesting follow-on though -- can we use basic block sections
> as the extraction methodology in MachineOutliner, thus lowering the
> overhead of outlining? This is something we can revisit once basic block
> sections is more mature.
>
It is going to take some work to have MO use basic block sections, and it
is not straightforward.  MFS can use basic block sections trivially because
the split basic block still belongs to only one function.  However, with
MO, when a basic block is shared, the call-return semantic is the easier
choice when outlining, as the shared block needs to know where to return
to.  Basic block sections do not have that power, so we would need to
inject code to remember where to return to, mimicking call semantics.
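
A small Python sketch of that point, using a made-up toy machine rather
than anything from LLVM: the opcode names (`emit`, `jmp`, `call`, `ret`,
`halt`), the labels, and the `run` helper are all hypothetical.  A block
split out of a single function can end in a plain jump because it has
exactly one place to continue.  A block shared by `f` and `g` cannot
hard-code its continuation, so the toy machine has to record a return
address for it.

```python
def run(program, start):
    """Execute a toy program and return the list of emitted values.

    program maps a label to a list of instructions. "call" pushes the
    address of the next instruction; "ret" pops it and continues there.
    """
    out, return_stack = [], []
    label, pc = start, 0
    while True:
        op, *args = program[label][pc]
        pc += 1
        if op == "emit":
            out.append(args[0])
        elif op == "jmp":
            label, pc = args[0], 0
        elif op == "call":
            return_stack.append((label, pc))
            label, pc = args[0], 0
        elif op == "ret":
            label, pc = return_stack.pop()
        elif op == "halt":
            return out


# Splitting: the cold block belongs to f only, so a fixed jump back suffices.
split_program = {
    "f": [("emit", "f.hot"), ("jmp", "f.cold")],
    "f.cold": [("emit", "f.cold"), ("jmp", "f.tail")],
    "f.tail": [("emit", "f.tail"), ("halt",)],
}

# Outlining: "shared" is reached from both f and g, so it must return
# to whichever caller entered it.
outlined_program = {
    "f": [("call", "shared"), ("emit", "f.after"), ("halt",)],
    "g": [("call", "shared"), ("emit", "g.after"), ("halt",)],
    "shared": [("emit", "shared"), ("ret",)],
}

split_trace = run(split_program, "f")
f_trace = run(outlined_program, "f")
g_trace = run(outlined_program, "g")
```

If `shared` ended in a `jmp` instead of a `ret`, it could name only one of
`f` or `g` as its continuation, which is the extra bookkeeping a jump-only
scheme would have to emulate by hand.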

>
>> Aditya Kumar
>> Compiler Engineer
>> https://bitsimplify.com
>>
