thr3ads.net - llvm dev - [llvm-dev] [RFC] Add IR level interprocedural outliner for code size. [Aug 2017]

If this information is useful, please help other people find it:
Share via:

Andrey Bokhanko via llvm-dev

2017-Aug-01 08:17 UTC

[llvm-dev] [RFC] Add IR level interprocedural outliner for code size.

...and adding $0.02 to the "IR outline + inline + MIR outline" idea,
my gut
feeling (yes, only a "feeling" -- and one coming from my gut, not
head!) is
that inlining correcting wrong IR outlining decisions with MIR outlining
correcting wrong inlining decisions is absolutely unrealistic and a
heuristics nightmare at best.

Inliner's heuristics are already complex enough and not 100% bulletproof;
if we add IR outliner heuristics to the mix -- and then just a little bit
of MIR outliner heuristics (which are more precise but, as demonstrated
above, not 100% precise as well) on top... you can imagine.

Yours,
Andrey


On Tue, Aug 1, 2017 at 10:07 AM, Andrey Bokhanko <andreybokhanko at
gmail.com>
wrote:
> All,
>
> +1 to what Mehdi said.
>
> It's a fair concern to question whatever we need yet another Outlining
> pass. I believe this concern has been cleared by River -- both with
> theoretical arguments and practical data (benchmark numbers).
>
> Jessica's pipeline proposal is completely orthogonal. It's not fair
to
> request River to implement / fit into what she suggested. Sure, it's a
> valid topic to discuss -- but yet completely orthogonal one. If anything,
> accepting River's implementation would enable us to do experiments /
> developments like pipeline changes of this ilk!
>
> Yours,
> Andrey
> ==> Compiler Architect
> NXP
>
>
> On Tue, Aug 1, 2017 at 7:38 AM, Mehdi AMINI via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>>
>>
>> 2017-07-28 21:58 GMT-07:00 Chris Bieneman via llvm-dev <
>> llvm-dev at lists.llvm.org>:
>>
>>> Apologies for delayed joining of this discussion, but I had a few
notes
>>> from this thread that I really wanted to chime in about.
>>>
>>> River,
>>>
>>> I don't mean to put you on the spot, but I do want to start on
a
>>> semantic issue. In several places in the thread you used the words
"we" and
>>> "our" to imply that you're not alone in writing this
(which is totally
>>> fine), but your initial thread presented this as entirely your own
work.
>>> So, when you said things like "we feel there's an
advantage to being at the
>>> IR level", can you please clarify who is "we"?
>>>
>>> Given that there are a number of disagreements and opinions
floating
>>> around I think it benefits us all to speak clearly about who is
taking what
>>> stances.
>>>
>>> One particular disagreement that I think very much needs to be
revisited
>>> in this thread was Jessica's proposal of a pipeline of:
>>>
>>>    1. IR outline
>>>    2. Inline
>>>    3. MIR outline
>>>
>>> In your response to that proposal you dismissed it out of hand with
>>> "feelings" but not data. Given that the proposal came
from Jessica (a
>>> community member with significant relevant experience in
outlining), and it
>>> was also recognized as interesting by Eric Christopher (a long-time
member
>>> of the community with wide reaching expertise), I think dismissing
it may
>>> have been a little premature.
>>>
>>
>> It isn't clear to me how much the *exact* pipeline and ordering of
passes
>> is relevant to consider if "having an outliner at the IR
level" is a good
>> idea.
>>
>>
>>
>>> I also want to visit a few procedural notes.
>>>
>>> Mehdi commented on the thread that it wouldn't be fair to ask
for a
>>> comparative study because the MIR outliner didn't have one.
While I don't
>>> think anyone is asking for a comparative study, I want to point out
that I
>>> think it is completely fair.
>>>
>> If a new contributor approached the community with a new SROA pass and
>>> wanted to land it in-tree it would be appropriate to ask for a
comparative
>>> analysis against the existing pass. How is this different?
>>>
>>
>> It seems quite different to me because there is no outliner at the IR
>> level and so they don't provide the same functionality. The
"Why at the IR
>> level" section of the original email combined with the performance
numbers
>> seems largely enough to me to explain why it isn't redundant to the
>> Machine-level outliner.
>> I'd consider this work for inclusion upstream purely on its
technical
>> merit at this point.
>> Discussing inclusion as part of any of the default pipeline is a
>> different story.
>>
>> Similarly last year, the IR-level PGO was also implemented even though
we
>> already had a PGO implementation, because 1) it provided a generic
>> solutions for other frontend (just like here it could be said that it
>> provides a generic solution for targets) and 2) it supported cases that
>> FE-PGO didn't (especially around better counter-context using
pre-inlining
>> and such).
>>
>>
>>
>>>
>>> Adding a new IR outliner is a different situation from when the MIR
one
>>> was added. When the MIR outliner was introduced there was no
in-tree
>>> analog.
>>>
>>
>> We still usually discuss design extensively. Skipping the IR-level
option
>> didn't seem obvious to me, to say the least. And it wasn't
really much
>> discussed/considered extensively upstream.
>> If the idea is that implementing a concept at the machine level may
>> preclude a future implementation at the IR level, it means we should be
*a
>> lot* more picky before accepting such contribution.
>> In this case, if I had anticipated any push-back on an IR-level
>> implementation only based on the fact that we have now a Machine-level
one,
>> I'd likely have pushed back on the machine-level one.
>>
>>
>>
>>> When someone comes to the community with something that has no
existing
>>> in-tree analog it isn't fair to necessarily ask them to
implement it
>>> multiple different ways to prove their solution is the best.
>>>
>>
>> It may or may not be fair, but there is a tradeoff in how much effort
we
>> would require them to convince the community that this is *the* right
way
>> to go, depending on what it implies for future approaches.
>>
>> --
>> Mehdi
>>
>>
>>> However, as a community, we do still exercise the right to reject
>>> contributions we disagree with, and we frequently request changes
to the
>>> implementation (as is shown every time someone tries to add SPIR-V
support).
>>>
>>> In the LLVM community we have a long history of approaching large
>>> contributions (especially ones from new contributors) with scrutiny
and
>>> discussion. It would be a disservice to the project to forget that.
>>>
>>> River, as a last note. I see that you've started uploading
patches to
>>> Phabricator, and I know you're relatively new to the community.
When
>>> uploading patches it helps to include appropriate reviewers so that
the
>>> right people see the patches as they come in. To that end can you
please
>>> include Jessica as a reviewer? Given her relevant domain experience
I think
>>> her feedback on the patches will be very valuable.
>>>
>>> Thank you,
>>> -Chris
>>>
>>> On Jul 26, 2017, at 1:52 PM, River Riddle via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>> Hey Sanjoy,
>>>
>>> On Wed, Jul 26, 2017 at 1:41 PM, Sanjoy Das via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>> Hi,
>>>>
>>>> On Wed, Jul 26, 2017 at 12:54 PM, Sean Silva <chisophugis at
gmail.com>
>>>> wrote:
>>>> > The way I interpret Quentin's statement is something
like:
>>>> >
>>>> > - Inlining turns an interprocedural problem into an
intraprocedural
>>>> problem
>>>> > - Outlining turns an intraprocedural problem into an
interprocedural
>>>> problem
>>>> >
>>>> > Insofar as our intraprocedural analyses and
transformations are
>>>> strictly
>>>> > more powerful than interprocedural, then there is a
precise sense in
>>>> which
>>>> > inlining exposes optimization opportunities while
outlining does not.
>>>>
>>>> While I think our intra-proc optimizations are *generally* more
>>>> powerful, I don't think they are *always* more powerful. 
For
>>>> instance, LICM (today) won't hoist full regions but it can
hoist
>>>> single function calls.  If we can extract out a region into a
>>>> readnone+nounwind function call then LICM will hoist it to the
>>>> preheader if the safety checks pass.
>>>>
>>>> > Actually, for his internship last summer River wrote a
profile-guided
>>>> > outliner / partial inliner (it didn't try to do
deduplication; so it
>>>> was
>>>> > more like PartialInliner.cpp). IIRC he found that
LLVM's
>>>> interprocedural
>>>> > analyses were so bad that there were pretty adverse
effects from many
>>>> of the
>>>> > outlining decisions. E.g. if you outline from the left
side of a
>>>> diamond,
>>>> > that side basically becomes a black box to most LLVM
analyses and
>>>> forces
>>>> > downstream dataflow meet points to give an overly
conservative
>>>> result, even
>>>> > though our standard intraprocedural analyses would have
happily dug
>>>> through
>>>> > the left side of the diamond if the code had not been
outlined.
>>>> >
>>>> > Also, River's patch (the one in this thread) does
parameterized
>>>> outlining.
>>>> > For example, two sequences containing stores can be
outlined even if
>>>> the
>>>> > corresponding stores have different pointers. The pointer
to be
>>>> loaded from
>>>> > is passed as a parameter to the outlined function. In that
sense, the
>>>> > outlined function's behavior becomes a conservative
approximation of
>>>> both
>>>> > which in principle loses precision.
>>>>
>>>> Can we outline only once we've already done all of these
optimizations
>>>> that outlining would block?
>>>>
>>>
>>>   The outliner is able to run at any point in the interprocedural
>>> pipeline. There are currently two locations: Early outlining(pre
inliner)
>>> and late outlining(practically the last pass to run). It is
configured to
>>> run either Early+Late, or just Late.
>>>
>>>
>>>> > I like your EarlyCSE example and it is interesting that
combined with
>>>> > functionattrs it can make a "cheap" pass get a
transformation that an
>>>> > "expensive" pass would otherwise be needed. Are
there any cases where
>>>> we
>>>> > only have the "cheap" pass and thus the
outlining would be essential
>>>> for our
>>>> > optimization pipeline to get the optimization right?
>>>> >
>>>> > The case that comes to mind for me is cases where we have
some cutoff
>>>> of
>>>> > search depth. Reducing a sequence to a single call (+
functionattr
>>>> > inference) can essentially summarize the sequence and
effectively
>>>> increase
>>>> > search depth, which might give more results. That seems
like a bit of
>>>> a weak
>>>> > example though.
>>>>
>>>> I don't know if River's patch outlines entire control
flow regions at
>>>> a time, but if it does then we could use cheap basic block
scanning
>>>> analyses for things that would normally require CFG-level
analysis.
>>>>
>>>
>>>   The current patch currently just supports outlining from within a
>>> single block. Although, I had a working prototype for Region based
>>> outlining, I kept it from this patch for simplicity. So its
entirely
>>> possible to add that kind of functionality because I've already
tried.
>>> Thanks,
>>>   River Riddle
>>>
>>>
>>>>
>>>> -- Sanjoy
>>>>
>>>> >
>>>> > -- Sean Silva
>>>> >
>>>> > On Wed, Jul 26, 2017 at 12:07 PM, Sanjoy Das via llvm-dev
>>>> > <llvm-dev at lists.llvm.org> wrote:
>>>> >>
>>>> >> Hi,
>>>> >>
>>>> >> On Wed, Jul 26, 2017 at 10:10 AM, Quentin Colombet via
llvm-dev
>>>> >> <llvm-dev at lists.llvm.org> wrote:
>>>> >> > No, I mean in terms of enabling other
optimizations in the
>>>> pipeline like
>>>> >> > vectorizer. Outliner does not expose any of that.
>>>> >>
>>>> >> I have not made a lot of effort to understand the full
discussion
>>>> here (so
>>>> >> what
>>>> >> I say below may be off-base), but I think there are
some cases where
>>>> >> outlining
>>>> >> (especially working with function-attrs) can make
optimization
>>>> easier.
>>>> >>
>>>> >> It can help transforms that duplicate code (like loop
unrolling and
>>>> >> inlining) be
>>>> >> more profitable -- I'm thinking of cases where
unrolling/inlining
>>>> would
>>>> >> have to
>>>> >> duplicate a lot of code, but after outlining would
require
>>>> duplicating
>>>> >> only a
>>>> >> few call instructions.
>>>> >>
>>>> >>
>>>> >> It can help EarlyCSE do things that require GVN today:
>>>> >>
>>>> >> void foo() {
>>>> >>   ... complex computation that computes func()
>>>> >>   ... complex computation that computes func()
>>>> >> }
>>>> >>
>>>> >> outlining=>
>>>> >>
>>>> >> int func() { ... }
>>>> >>
>>>> >> void foo() {
>>>> >>   int x = func();
>>>> >>   int y = func();
>>>> >> }
>>>> >>
>>>> >> functionattrs=>
>>>> >>
>>>> >> int func() readonly { ... }
>>>> >>
>>>> >> void foo(int a, int b) {
>>>> >>   int x = func();
>>>> >>   int y = func();
>>>> >> }
>>>> >>
>>>> >> earlycse=>
>>>> >>
>>>> >> int func(int t) readnone { ... }
>>>> >>
>>>> >> void foo(int a, int b) {
>>>> >>   int x = func(a);
>>>> >>   int y = x;
>>>> >> }
>>>> >>
>>>> >> GVN will catch this, but EarlyCSE is (at least
supposed to be!)
>>>> cheaper.
>>>> >>
>>>> >>
>>>> >> Once we have an analysis that can prove that certain
functions can't
>>>> trap,
>>>> >> outlining can allow LICM etc. to speculate entire
outlined regions
>>>> out of
>>>> >> loops.
>>>> >>
>>>> >>
>>>> >> Generally, I think outlining exposes information that
certain
>>>> regions of
>>>> >> the
>>>> >> program are doing identical things.  We should expect
to get some
>>>> mileage
>>>> >> out of
>>>> >> this information.
>>>> >>
>>>> >> -- Sanjoy
>>>> >> _______________________________________________
>>>> >> LLVM Developers mailing list
>>>> >> llvm-dev at lists.llvm.org
>>>> >>
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>> >
>>>> >
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>>
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170801/49ddd482/attachment.html>

Sean Silva via llvm-dev

2017-Aug-01 09:53 UTC

head link

[llvm-dev] [RFC] Add IR level interprocedural outliner for code size.

On Aug 1, 2017 1:18 AM, "Andrey Bokhanko via llvm-dev" <
llvm-dev at lists.llvm.org> wrote:

...and adding $0.02 to the "IR outline + inline + MIR outline" idea,
my gut
feeling (yes, only a "feeling" -- and one coming from my gut, not
head!) is
that inlining correcting wrong IR outlining decisions with MIR outlining
correcting wrong inlining decisions is absolutely unrealistic and a
heuristics nightmare at best.

Inliner's heuristics are already complex enough and not 100% bulletproof;
if we add IR outliner heuristics to the mix -- and then just a little bit
of MIR outliner heuristics (which are more precise but, as demonstrated
above, not 100% precise as well)


The post-RA MachineOutliner can in principle have a perfect cost model (and
it's already pretty close I think).

However, the post-RA MachineOutliner would nonetheless have a hard time
undoing an inlining decision because it would have to get lucky with the
register assignment matching at different inlining sites. (same applies to
scheduling decisions)

I can't think of any plausible way to avoid that sort of problem. A pre-RA
MIR outliner could avoid needing to get lucky on the register assignment at
least, but any string-based algorithm will inherently be thwarted by
different scheduling.

-- Sean Silva


on top... you can imagine.

Yours,
Andrey


On Tue, Aug 1, 2017 at 10:07 AM, Andrey Bokhanko <andreybokhanko at
gmail.com>
wrote:
> All,
>
> +1 to what Mehdi said.
>
> It's a fair concern to question whatever we need yet another Outlining
> pass. I believe this concern has been cleared by River -- both with
> theoretical arguments and practical data (benchmark numbers).
>
> Jessica's pipeline proposal is completely orthogonal. It's not fair
to
> request River to implement / fit into what she suggested. Sure, it's a
> valid topic to discuss -- but yet completely orthogonal one. If anything,
> accepting River's implementation would enable us to do experiments /
> developments like pipeline changes of this ilk!
>
> Yours,
> Andrey
> ==> Compiler Architect
> NXP
>
>
> On Tue, Aug 1, 2017 at 7:38 AM, Mehdi AMINI via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>>
>>
>> 2017-07-28 21:58 GMT-07:00 Chris Bieneman via llvm-dev <
>> llvm-dev at lists.llvm.org>:
>>
>>> Apologies for delayed joining of this discussion, but I had a few
notes
>>> from this thread that I really wanted to chime in about.
>>>
>>> River,
>>>
>>> I don't mean to put you on the spot, but I do want to start on
a
>>> semantic issue. In several places in the thread you used the words
"we" and
>>> "our" to imply that you're not alone in writing this
(which is totally
>>> fine), but your initial thread presented this as entirely your own
work.
>>> So, when you said things like "we feel there's an
advantage to being at the
>>> IR level", can you please clarify who is "we"?
>>>
>>> Given that there are a number of disagreements and opinions
floating
>>> around I think it benefits us all to speak clearly about who is
taking what
>>> stances.
>>>
>>> One particular disagreement that I think very much needs to be
revisited
>>> in this thread was Jessica's proposal of a pipeline of:
>>>
>>>    1. IR outline
>>>    2. Inline
>>>    3. MIR outline
>>>
>>> In your response to that proposal you dismissed it out of hand with
>>> "feelings" but not data. Given that the proposal came
from Jessica (a
>>> community member with significant relevant experience in
outlining), and it
>>> was also recognized as interesting by Eric Christopher (a long-time
member
>>> of the community with wide reaching expertise), I think dismissing
it may
>>> have been a little premature.
>>>
>>
>> It isn't clear to me how much the *exact* pipeline and ordering of
passes
>> is relevant to consider if "having an outliner at the IR
level" is a good
>> idea.
>>
>>
>>
>>> I also want to visit a few procedural notes.
>>>
>>> Mehdi commented on the thread that it wouldn't be fair to ask
for a
>>> comparative study because the MIR outliner didn't have one.
While I don't
>>> think anyone is asking for a comparative study, I want to point out
that I
>>> think it is completely fair.
>>>
>> If a new contributor approached the community with a new SROA pass and
>>> wanted to land it in-tree it would be appropriate to ask for a
comparative
>>> analysis against the existing pass. How is this different?
>>>
>>
>> It seems quite different to me because there is no outliner at the IR
>> level and so they don't provide the same functionality. The
"Why at the IR
>> level" section of the original email combined with the performance
numbers
>> seems largely enough to me to explain why it isn't redundant to the
>> Machine-level outliner.
>> I'd consider this work for inclusion upstream purely on its
technical
>> merit at this point.
>> Discussing inclusion as part of any of the default pipeline is a
>> different story.
>>
>> Similarly last year, the IR-level PGO was also implemented even though
we
>> already had a PGO implementation, because 1) it provided a generic
>> solutions for other frontend (just like here it could be said that it
>> provides a generic solution for targets) and 2) it supported cases that
>> FE-PGO didn't (especially around better counter-context using
pre-inlining
>> and such).
>>
>>
>>
>>>
>>> Adding a new IR outliner is a different situation from when the MIR
one
>>> was added. When the MIR outliner was introduced there was no
in-tree
>>> analog.
>>>
>>
>> We still usually discuss design extensively. Skipping the IR-level
option
>> didn't seem obvious to me, to say the least. And it wasn't
really much
>> discussed/considered extensively upstream.
>> If the idea is that implementing a concept at the machine level may
>> preclude a future implementation at the IR level, it means we should be
*a
>> lot* more picky before accepting such contribution.
>> In this case, if I had anticipated any push-back on an IR-level
>> implementation only based on the fact that we have now a Machine-level
one,
>> I'd likely have pushed back on the machine-level one.
>>
>>
>>
>>> When someone comes to the community with something that has no
existing
>>> in-tree analog it isn't fair to necessarily ask them to
implement it
>>> multiple different ways to prove their solution is the best.
>>>
>>
>> It may or may not be fair, but there is a tradeoff in how much effort
we
>> would require them to convince the community that this is *the* right
way
>> to go, depending on what it implies for future approaches.
>>
>> --
>> Mehdi
>>
>>
>>> However, as a community, we do still exercise the right to reject
>>> contributions we disagree with, and we frequently request changes
to the
>>> implementation (as is shown every time someone tries to add SPIR-V
support).
>>>
>>> In the LLVM community we have a long history of approaching large
>>> contributions (especially ones from new contributors) with scrutiny
and
>>> discussion. It would be a disservice to the project to forget that.
>>>
>>> River, as a last note. I see that you've started uploading
patches to
>>> Phabricator, and I know you're relatively new to the community.
When
>>> uploading patches it helps to include appropriate reviewers so that
the
>>> right people see the patches as they come in. To that end can you
please
>>> include Jessica as a reviewer? Given her relevant domain experience
I think
>>> her feedback on the patches will be very valuable.
>>>
>>> Thank you,
>>> -Chris
>>>
>>> On Jul 26, 2017, at 1:52 PM, River Riddle via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>> Hey Sanjoy,
>>>
>>> On Wed, Jul 26, 2017 at 1:41 PM, Sanjoy Das via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>> Hi,
>>>>
>>>> On Wed, Jul 26, 2017 at 12:54 PM, Sean Silva <chisophugis at
gmail.com>
>>>> wrote:
>>>> > The way I interpret Quentin's statement is something
like:
>>>> >
>>>> > - Inlining turns an interprocedural problem into an
intraprocedural
>>>> problem
>>>> > - Outlining turns an intraprocedural problem into an
interprocedural
>>>> problem
>>>> >
>>>> > Insofar as our intraprocedural analyses and
transformations are
>>>> strictly
>>>> > more powerful than interprocedural, then there is a
precise sense in
>>>> which
>>>> > inlining exposes optimization opportunities while
outlining does not.
>>>>
>>>> While I think our intra-proc optimizations are *generally* more
>>>> powerful, I don't think they are *always* more powerful. 
For
>>>> instance, LICM (today) won't hoist full regions but it can
hoist
>>>> single function calls.  If we can extract out a region into a
>>>> readnone+nounwind function call then LICM will hoist it to the
>>>> preheader if the safety checks pass.
>>>>
>>>> > Actually, for his internship last summer River wrote a
profile-guided
>>>> > outliner / partial inliner (it didn't try to do
deduplication; so it
>>>> was
>>>> > more like PartialInliner.cpp). IIRC he found that
LLVM's
>>>> interprocedural
>>>> > analyses were so bad that there were pretty adverse
effects from many
>>>> of the
>>>> > outlining decisions. E.g. if you outline from the left
side of a
>>>> diamond,
>>>> > that side basically becomes a black box to most LLVM
analyses and
>>>> forces
>>>> > downstream dataflow meet points to give an overly
conservative
>>>> result, even
>>>> > though our standard intraprocedural analyses would have
happily dug
>>>> through
>>>> > the left side of the diamond if the code had not been
outlined.
>>>> >
>>>> > Also, River's patch (the one in this thread) does
parameterized
>>>> outlining.
>>>> > For example, two sequences containing stores can be
outlined even if
>>>> the
>>>> > corresponding stores have different pointers. The pointer
to be
>>>> loaded from
>>>> > is passed as a parameter to the outlined function. In that
sense, the
>>>> > outlined function's behavior becomes a conservative
approximation of
>>>> both
>>>> > which in principle loses precision.
>>>>
>>>> Can we outline only once we've already done all of these
optimizations
>>>> that outlining would block?
>>>>
>>>
>>>   The outliner is able to run at any point in the interprocedural
>>> pipeline. There are currently two locations: Early outlining(pre
inliner)
>>> and late outlining(practically the last pass to run). It is
configured to
>>> run either Early+Late, or just Late.
>>>
>>>
>>>> > I like your EarlyCSE example and it is interesting that
combined with
>>>> > functionattrs it can make a "cheap" pass get a
transformation that an
>>>> > "expensive" pass would otherwise be needed. Are
there any cases where
>>>> we
>>>> > only have the "cheap" pass and thus the
outlining would be essential
>>>> for our
>>>> > optimization pipeline to get the optimization right?
>>>> >
>>>> > The case that comes to mind for me is cases where we have
some cutoff
>>>> of
>>>> > search depth. Reducing a sequence to a single call (+
functionattr
>>>> > inference) can essentially summarize the sequence and
effectively
>>>> increase
>>>> > search depth, which might give more results. That seems
like a bit of
>>>> a weak
>>>> > example though.
>>>>
>>>> I don't know if River's patch outlines entire control
flow regions at
>>>> a time, but if it does then we could use cheap basic block
scanning
>>>> analyses for things that would normally require CFG-level
analysis.
>>>>
>>>
>>>   The current patch currently just supports outlining from within a
>>> single block. Although, I had a working prototype for Region based
>>> outlining, I kept it from this patch for simplicity. So its
entirely
>>> possible to add that kind of functionality because I've already
tried.
>>> Thanks,
>>>   River Riddle
>>>
>>>
>>>>
>>>> -- Sanjoy
>>>>
>>>> >
>>>> > -- Sean Silva
>>>> >
>>>> > On Wed, Jul 26, 2017 at 12:07 PM, Sanjoy Das via llvm-dev
>>>> > <llvm-dev at lists.llvm.org> wrote:
>>>> >>
>>>> >> Hi,
>>>> >>
>>>> >> On Wed, Jul 26, 2017 at 10:10 AM, Quentin Colombet via
llvm-dev
>>>> >> <llvm-dev at lists.llvm.org> wrote:
>>>> >> > No, I mean in terms of enabling other
optimizations in the
>>>> pipeline like
>>>> >> > vectorizer. Outliner does not expose any of that.
>>>> >>
>>>> >> I have not made a lot of effort to understand the full
discussion
>>>> here (so
>>>> >> what
>>>> >> I say below may be off-base), but I think there are
some cases where
>>>> >> outlining
>>>> >> (especially working with function-attrs) can make
optimization
>>>> easier.
>>>> >>
>>>> >> It can help transforms that duplicate code (like loop
unrolling and
>>>> >> inlining) be
>>>> >> more profitable -- I'm thinking of cases where
unrolling/inlining
>>>> would
>>>> >> have to
>>>> >> duplicate a lot of code, but after outlining would
require
>>>> duplicating
>>>> >> only a
>>>> >> few call instructions.
>>>> >>
>>>> >>
>>>> >> It can help EarlyCSE do things that require GVN today:
>>>> >>
>>>> >> void foo() {
>>>> >>   ... complex computation that computes func()
>>>> >>   ... complex computation that computes func()
>>>> >> }
>>>> >>
>>>> >> outlining=>
>>>> >>
>>>> >> int func() { ... }
>>>> >>
>>>> >> void foo() {
>>>> >>   int x = func();
>>>> >>   int y = func();
>>>> >> }
>>>> >>
>>>> >> functionattrs=>
>>>> >>
>>>> >> int func() readonly { ... }
>>>> >>
>>>> >> void foo(int a, int b) {
>>>> >>   int x = func();
>>>> >>   int y = func();
>>>> >> }
>>>> >>
>>>> >> earlycse=>
>>>> >>
>>>> >> int func(int t) readnone { ... }
>>>> >>
>>>> >> void foo(int a, int b) {
>>>> >>   int x = func(a);
>>>> >>   int y = x;
>>>> >> }
>>>> >>
>>>> >> GVN will catch this, but EarlyCSE is (at least
supposed to be!)
>>>> cheaper.
>>>> >>
>>>> >>
>>>> >> Once we have an analysis that can prove that certain
functions can't
>>>> trap,
>>>> >> outlining can allow LICM etc. to speculate entire
outlined regions
>>>> out of
>>>> >> loops.
>>>> >>
>>>> >>
>>>> >> Generally, I think outlining exposes information that
certain
>>>> regions of
>>>> >> the
>>>> >> program are doing identical things.  We should expect
to get some
>>>> mileage
>>>> >> out of
>>>> >> this information.
>>>> >>
>>>> >> -- Sanjoy
>>>> >> _______________________________________________
>>>> >> LLVM Developers mailing list
>>>> >> llvm-dev at lists.llvm.org
>>>> >>
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>> >
>>>> >
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>>
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>>
>
_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170801/d80774e6/attachment-0001.html>

Sean Silva via llvm-dev

2017-Aug-01 10:16 UTC

head link

[llvm-dev] [RFC] Add IR level interprocedural outliner for code size.

On Aug 1, 2017 2:53 AM, "Sean Silva" <chisophugis at gmail.com>
wrote:



On Aug 1, 2017 1:18 AM, "Andrey Bokhanko via llvm-dev" <
llvm-dev at lists.llvm.org> wrote:

...and adding $0.02 to the "IR outline + inline + MIR outline" idea,
my gut
feeling (yes, only a "feeling" -- and one coming from my gut, not
head!) is
that inlining correcting wrong IR outlining decisions with MIR outlining
correcting wrong inlining decisions is absolutely unrealistic and a
heuristics nightmare at best.

Inliner's heuristics are already complex enough and not 100% bulletproof;
if we add IR outliner heuristics to the mix -- and then just a little bit
of MIR outliner heuristics (which are more precise but, as demonstrated
above, not 100% precise as well)


The post-RA MachineOutliner can in principle have a perfect cost model (and
it's already pretty close I think).

However, the post-RA MachineOutliner would nonetheless have a hard time
undoing an inlining decision because it would have to get lucky with the
register assignment matching at different inlining sites. (same applies to
scheduling decisions)

I can't think of any plausible way to avoid that sort of problem. A pre-RA
MIR outliner could avoid needing to get lucky on the register assignment at
least, but any string-based algorithm will inherently be thwarted by
different scheduling.


And I think that at least with Selection DAG we don't really have a notion
of "pre-sched" MIR. I'm not super familiar with backend stuff
though. I
assume GlobalIsel must have pre-sched MIR available.

-- Sean Silva


-- Sean Silva


on top... you can imagine.

Yours,
Andrey


On Tue, Aug 1, 2017 at 10:07 AM, Andrey Bokhanko <andreybokhanko at
gmail.com>
wrote:
> All,
>
> +1 to what Mehdi said.
>
> It's a fair concern to question whatever we need yet another Outlining
> pass. I believe this concern has been cleared by River -- both with
> theoretical arguments and practical data (benchmark numbers).
>
> Jessica's pipeline proposal is completely orthogonal. It's not fair
to
> request River to implement / fit into what she suggested. Sure, it's a
> valid topic to discuss -- but yet completely orthogonal one. If anything,
> accepting River's implementation would enable us to do experiments /
> developments like pipeline changes of this ilk!
>
> Yours,
> Andrey
> ==> Compiler Architect
> NXP
>
>
> On Tue, Aug 1, 2017 at 7:38 AM, Mehdi AMINI via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>>
>>
>> 2017-07-28 21:58 GMT-07:00 Chris Bieneman via llvm-dev <
>> llvm-dev at lists.llvm.org>:
>>
>>> Apologies for delayed joining of this discussion, but I had a few
notes
>>> from this thread that I really wanted to chime in about.
>>>
>>> River,
>>>
>>> I don't mean to put you on the spot, but I do want to start on
a
>>> semantic issue. In several places in the thread you used the words
"we" and
>>> "our" to imply that you're not alone in writing this
(which is totally
>>> fine), but your initial thread presented this as entirely your own
work.
>>> So, when you said things like "we feel there's an
advantage to being at the
>>> IR level", can you please clarify who is "we"?
>>>
>>> Given that there are a number of disagreements and opinions
floating
>>> around I think it benefits us all to speak clearly about who is
taking what
>>> stances.
>>>
>>> One particular disagreement that I think very much needs to be
revisited
>>> in this thread was Jessica's proposal of a pipeline of:
>>>
>>>    1. IR outline
>>>    2. Inline
>>>    3. MIR outline
>>>
>>> In your response to that proposal you dismissed it out of hand with
>>> "feelings" but not data. Given that the proposal came
from Jessica (a
>>> community member with significant relevant experience in
outlining), and it
>>> was also recognized as interesting by Eric Christopher (a long-time
member
>>> of the community with wide reaching expertise), I think dismissing
it may
>>> have been a little premature.
>>>
>>
>> It isn't clear to me how much the *exact* pipeline and ordering of
passes
>> is relevant to consider if "having an outliner at the IR
level" is a good
>> idea.
>>
>>
>>
>>> I also want to visit a few procedural notes.
>>>
>>> Mehdi commented on the thread that it wouldn't be fair to ask
for a
>>> comparative study because the MIR outliner didn't have one.
While I don't
>>> think anyone is asking for a comparative study, I want to point out
that I
>>> think it is completely fair.
>>>
>> If a new contributor approached the community with a new SROA pass and
>>> wanted to land it in-tree it would be appropriate to ask for a
comparative
>>> analysis against the existing pass. How is this different?
>>>
>>
>> It seems quite different to me because there is no outliner at the IR
>> level and so they don't provide the same functionality. The
"Why at the IR
>> level" section of the original email combined with the performance
numbers
>> seems largely enough to me to explain why it isn't redundant to the
>> Machine-level outliner.
>> I'd consider this work for inclusion upstream purely on its
technical
>> merit at this point.
>> Discussing inclusion as part of any of the default pipeline is a
>> different story.
>>
>> Similarly last year, the IR-level PGO was also implemented even though
we
>> already had a PGO implementation, because 1) it provided a generic
>> solutions for other frontend (just like here it could be said that it
>> provides a generic solution for targets) and 2) it supported cases that
>> FE-PGO didn't (especially around better counter-context using
pre-inlining
>> and such).
>>
>>
>>
>>>
>>> Adding a new IR outliner is a different situation from when the MIR
one
>>> was added. When the MIR outliner was introduced there was no
in-tree
>>> analog.
>>>
>>
>> We still usually discuss design extensively. Skipping the IR-level
option
>> didn't seem obvious to me, to say the least. And it wasn't
really much
>> discussed/considered extensively upstream.
>> If the idea is that implementing a concept at the machine level may
>> preclude a future implementation at the IR level, it means we should be
*a
>> lot* more picky before accepting such contribution.
>> In this case, if I had anticipated any push-back on an IR-level
>> implementation only based on the fact that we have now a Machine-level
one,
>> I'd likely have pushed back on the machine-level one.
>>
>>
>>
>>> When someone comes to the community with something that has no
existing
>>> in-tree analog it isn't fair to necessarily ask them to
implement it
>>> multiple different ways to prove their solution is the best.
>>>
>>
>> It may or may not be fair, but there is a tradeoff in how much effort
we
>> would require them to convince the community that this is *the* right
way
>> to go, depending on what it implies for future approaches.
>>
>> --
>> Mehdi
>>
>>
>>> However, as a community, we do still exercise the right to reject
>>> contributions we disagree with, and we frequently request changes
to the
>>> implementation (as is shown every time someone tries to add SPIR-V
support).
>>>
>>> In the LLVM community we have a long history of approaching large
>>> contributions (especially ones from new contributors) with scrutiny
and
>>> discussion. It would be a disservice to the project to forget that.
>>>
>>> River, as a last note. I see that you've started uploading
patches to
>>> Phabricator, and I know you're relatively new to the community.
When
>>> uploading patches it helps to include appropriate reviewers so that
the
>>> right people see the patches as they come in. To that end can you
please
>>> include Jessica as a reviewer? Given her relevant domain experience
I think
>>> her feedback on the patches will be very valuable.
>>>
>>> Thank you,
>>> -Chris
>>>
>>> On Jul 26, 2017, at 1:52 PM, River Riddle via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>> Hey Sanjoy,
>>>
>>> On Wed, Jul 26, 2017 at 1:41 PM, Sanjoy Das via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>> Hi,
>>>>
>>>> On Wed, Jul 26, 2017 at 12:54 PM, Sean Silva <chisophugis at
gmail.com>
>>>> wrote:
>>>> > The way I interpret Quentin's statement is something
like:
>>>> >
>>>> > - Inlining turns an interprocedural problem into an
intraprocedural
>>>> problem
>>>> > - Outlining turns an intraprocedural problem into an
interprocedural
>>>> problem
>>>> >
>>>> > Insofar as our intraprocedural analyses and
transformations are
>>>> strictly
>>>> > more powerful than interprocedural, then there is a
precise sense in
>>>> which
>>>> > inlining exposes optimization opportunities while
outlining does not.
>>>>
>>>> While I think our intra-proc optimizations are *generally* more
>>>> powerful, I don't think they are *always* more powerful. 
For
>>>> instance, LICM (today) won't hoist full regions but it can
hoist
>>>> single function calls.  If we can extract out a region into a
>>>> readnone+nounwind function call then LICM will hoist it to the
>>>> preheader if the safety checks pass.
>>>>
>>>> > Actually, for his internship last summer River wrote a
profile-guided
>>>> > outliner / partial inliner (it didn't try to do
deduplication; so it
>>>> was
>>>> > more like PartialInliner.cpp). IIRC he found that
LLVM's
>>>> interprocedural
>>>> > analyses were so bad that there were pretty adverse
effects from many
>>>> of the
>>>> > outlining decisions. E.g. if you outline from the left
side of a
>>>> diamond,
>>>> > that side basically becomes a black box to most LLVM
analyses and
>>>> forces
>>>> > downstream dataflow meet points to give an overly
conservative
>>>> result, even
>>>> > though our standard intraprocedural analyses would have
happily dug
>>>> through
>>>> > the left side of the diamond if the code had not been
outlined.
>>>> >
>>>> > Also, River's patch (the one in this thread) does
parameterized
>>>> outlining.
>>>> > For example, two sequences containing stores can be
outlined even if
>>>> the
>>>> > corresponding stores have different pointers. The pointer
to be
>>>> loaded from
>>>> > is passed as a parameter to the outlined function. In that
sense, the
>>>> > outlined function's behavior becomes a conservative
approximation of
>>>> both
>>>> > which in principle loses precision.
>>>>
>>>> Can we outline only once we've already done all of these
optimizations
>>>> that outlining would block?
>>>>
>>>
>>>   The outliner is able to run at any point in the interprocedural
>>> pipeline. There are currently two locations: Early outlining(pre
inliner)
>>> and late outlining(practically the last pass to run). It is
configured to
>>> run either Early+Late, or just Late.
>>>
>>>
>>>> > I like your EarlyCSE example and it is interesting that
combined with
>>>> > functionattrs it can make a "cheap" pass get a
transformation that an
>>>> > "expensive" pass would otherwise be needed. Are
there any cases where
>>>> we
>>>> > only have the "cheap" pass and thus the
outlining would be essential
>>>> for our
>>>> > optimization pipeline to get the optimization right?
>>>> >
>>>> > The case that comes to mind for me is cases where we have
some cutoff
>>>> of
>>>> > search depth. Reducing a sequence to a single call (+
functionattr
>>>> > inference) can essentially summarize the sequence and
effectively
>>>> increase
>>>> > search depth, which might give more results. That seems
like a bit of
>>>> a weak
>>>> > example though.
>>>>
>>>> I don't know if River's patch outlines entire control
flow regions at
>>>> a time, but if it does then we could use cheap basic block
scanning
>>>> analyses for things that would normally require CFG-level
analysis.
>>>>
>>>
>>>   The current patch currently just supports outlining from within a
>>> single block. Although, I had a working prototype for Region based
>>> outlining, I kept it from this patch for simplicity. So its
entirely
>>> possible to add that kind of functionality because I've already
tried.
>>> Thanks,
>>>   River Riddle
>>>
>>>
>>>>
>>>> -- Sanjoy
>>>>
>>>> >
>>>> > -- Sean Silva
>>>> >
>>>> > On Wed, Jul 26, 2017 at 12:07 PM, Sanjoy Das via llvm-dev
>>>> > <llvm-dev at lists.llvm.org> wrote:
>>>> >>
>>>> >> Hi,
>>>> >>
>>>> >> On Wed, Jul 26, 2017 at 10:10 AM, Quentin Colombet via
llvm-dev
>>>> >> <llvm-dev at lists.llvm.org> wrote:
>>>> >> > No, I mean in terms of enabling other
optimizations in the
>>>> pipeline like
>>>> >> > vectorizer. Outliner does not expose any of that.
>>>> >>
>>>> >> I have not made a lot of effort to understand the full
discussion
>>>> here (so
>>>> >> what
>>>> >> I say below may be off-base), but I think there are
some cases where
>>>> >> outlining
>>>> >> (especially working with function-attrs) can make
optimization
>>>> easier.
>>>> >>
>>>> >> It can help transforms that duplicate code (like loop
unrolling and
>>>> >> inlining) be
>>>> >> more profitable -- I'm thinking of cases where
unrolling/inlining
>>>> would
>>>> >> have to
>>>> >> duplicate a lot of code, but after outlining would
require
>>>> duplicating
>>>> >> only a
>>>> >> few call instructions.
>>>> >>
>>>> >>
>>>> >> It can help EarlyCSE do things that require GVN today:
>>>> >>
>>>> >> void foo() {
>>>> >>   ... complex computation that computes func()
>>>> >>   ... complex computation that computes func()
>>>> >> }
>>>> >>
>>>> >> outlining=>
>>>> >>
>>>> >> int func() { ... }
>>>> >>
>>>> >> void foo() {
>>>> >>   int x = func();
>>>> >>   int y = func();
>>>> >> }
>>>> >>
>>>> >> functionattrs=>
>>>> >>
>>>> >> int func() readonly { ... }
>>>> >>
>>>> >> void foo(int a, int b) {
>>>> >>   int x = func();
>>>> >>   int y = func();
>>>> >> }
>>>> >>
>>>> >> earlycse=>
>>>> >>
>>>> >> int func(int t) readnone { ... }
>>>> >>
>>>> >> void foo(int a, int b) {
>>>> >>   int x = func(a);
>>>> >>   int y = x;
>>>> >> }
>>>> >>
>>>> >> GVN will catch this, but EarlyCSE is (at least
supposed to be!)
>>>> cheaper.
>>>> >>
>>>> >>
>>>> >> Once we have an analysis that can prove that certain
functions can't
>>>> trap,
>>>> >> outlining can allow LICM etc. to speculate entire
outlined regions
>>>> out of
>>>> >> loops.
>>>> >>
>>>> >>
>>>> >> Generally, I think outlining exposes information that
certain
>>>> regions of
>>>> >> the
>>>> >> program are doing identical things.  We should expect
to get some
>>>> mileage
>>>> >> out of
>>>> >> this information.
>>>> >>
>>>> >> -- Sanjoy
>>>> >> _______________________________________________
>>>> >> LLVM Developers mailing list
>>>> >> llvm-dev at lists.llvm.org
>>>> >>
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>> >
>>>> >
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>>
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>>
>
_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170801/a272cb15/attachment.html>

Jessica Paquette via llvm-dev

2017-Aug-01 16:15 UTC

head link

[llvm-dev] [RFC] Add IR level interprocedural outliner for code size.

I think I agree with this sentiment (and the rather heated response to this
suggestion says that I ought to as well!)

The later pass would have to have a strong guarantee to “undo” bad decisions,
which, given that we have to use heuristics to make code size decisions, is very
difficult. I don’t know if I’d say *impossible*, but it would definitely force
us to work in a more restrictive space. I was mostly just thinking of how the
two separate technologies might work together in practice.

For example, in “Function outlining and partial inlining” by Peng Zhao and J. N.
Amaral, they found that you could get partial inlining for free by enabling
inlining and outlining at the IR-level. What I think would be cool to explore is
whether or not there would be a similar relationship between IR-level outlining
and MIR-level outlining. Of course, this is just a suggestion!

(Paper for anyone interested:
https://www.researchgate.net/profile/Jose_Amaral3/publication/221306447_Function_Outlining_and_Partial_Inlining/links/00463522754960618b000000.pdf
<https://www.researchgate.net/profile/Jose_Amaral3/publication/221306447_Function_Outlining_and_Partial_Inlining/links/00463522754960618b000000.pdf>)

- Jessica
> On Aug 1, 2017, at 1:17 AM, Andrey Bokhanko via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
> 
> ...and adding $0.02 to the "IR outline + inline + MIR outline"
idea, my gut feeling (yes, only a "feeling" -- and one coming from my
gut, not head!) is that inlining correcting wrong IR outlining decisions with
MIR outlining correcting wrong inlining decisions is absolutely unrealistic and
a heuristics nightmare at best.
> 
> Inliner's heuristics are already complex enough and not 100%
bulletproof; if we add IR outliner heuristics to the mix -- and then just a
little bit of MIR outliner heuristics (which are more precise but, as
demonstrated above, not 100% precise as well) on top... you can imagine.
> 
> Yours,
> Andrey
> 
> 
> On Tue, Aug 1, 2017 at 10:07 AM, Andrey Bokhanko <andreybokhanko at
gmail.com <mailto:andreybokhanko at gmail.com>> wrote:
> All,
> 
> +1 to what Mehdi said.
> 
> It's a fair concern to question whatever we need yet another Outlining
pass. I believe this concern has been cleared by River -- both with theoretical
arguments and practical data (benchmark numbers).
> 
> Jessica's pipeline proposal is completely orthogonal. It's not fair
to request River to implement / fit into what she suggested. Sure, it's a
valid topic to discuss -- but yet completely orthogonal one. If anything,
accepting River's implementation would enable us to do experiments /
developments like pipeline changes of this ilk!
> 
> Yours,
> Andrey
> ==> Compiler Architect
> NXP
> 
> 
> On Tue, Aug 1, 2017 at 7:38 AM, Mehdi AMINI via llvm-dev <llvm-dev at
lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
> 
> 
> 2017-07-28 21:58 GMT-07:00 Chris Bieneman via llvm-dev <llvm-dev at
lists.llvm.org <mailto:llvm-dev at lists.llvm.org>>:
> Apologies for delayed joining of this discussion, but I had a few notes
from this thread that I really wanted to chime in about.
> 
> River,
> 
> I don't mean to put you on the spot, but I do want to start on a
semantic issue. In several places in the thread you used the words
"we" and "our" to imply that you're not alone in writing
this (which is totally fine), but your initial thread presented this as entirely
your own work. So, when you said things like "we feel there's an
advantage to being at the IR level", can you please clarify who is
"we"?
> 
> Given that there are a number of disagreements and opinions floating around
I think it benefits us all to speak clearly about who is taking what stances.
> 
> One particular disagreement that I think very much needs to be revisited in
this thread was Jessica's proposal of a pipeline of:
> IR outline
> Inline
> MIR outline
> In your response to that proposal you dismissed it out of hand with
"feelings" but not data. Given that the proposal came from Jessica (a
community member with significant relevant experience in outlining), and it was
also recognized as interesting by Eric Christopher (a long-time member of the
community with wide reaching expertise), I think dismissing it may have been a
little premature.
> 
> It isn't clear to me how much the *exact* pipeline and ordering of
passes is relevant to consider if "having an outliner at the IR level"
is a good idea.
> 
>  
> I also want to visit a few procedural notes.
> 
> Mehdi commented on the thread that it wouldn't be fair to ask for a
comparative study because the MIR outliner didn't have one. While I
don't think anyone is asking for a comparative study, I want to point out
that I think it is completely fair.
> If a new contributor approached the community with a new SROA pass and
wanted to land it in-tree it would be appropriate to ask for a comparative
analysis against the existing pass. How is this different?
> 
> It seems quite different to me because there is no outliner at the IR level
and so they don't provide the same functionality. The "Why at the IR
level" section of the original email combined with the performance numbers
seems largely enough to me to explain why it isn't redundant to the
Machine-level outliner.
> I'd consider this work for inclusion upstream purely on its technical
merit at this point.
> Discussing inclusion as part of any of the default pipeline is a different
story.
> 
> Similarly last year, the IR-level PGO was also implemented even though we
already had a PGO implementation, because 1) it provided a generic solutions for
other frontend (just like here it could be said that it provides a generic
solution for targets) and 2) it supported cases that FE-PGO didn't
(especially around better counter-context using pre-inlining and such).
> 
>  
> 
> Adding a new IR outliner is a different situation from when the MIR one was
added. When the MIR outliner was introduced there was no in-tree analog.
> 
> We still usually discuss design extensively. Skipping the IR-level option
didn't seem obvious to me, to say the least. And it wasn't really much
discussed/considered extensively upstream.
> If the idea is that implementing a concept at the machine level may
preclude a future implementation at the IR level, it means we should be *a lot*
more picky before accepting such contribution.
> In this case, if I had anticipated any push-back on an IR-level
implementation only based on the fact that we have now a Machine-level one,
I'd likely have pushed back on the machine-level one.
> 
>  
> When someone comes to the community with something that has no existing
in-tree analog it isn't fair to necessarily ask them to implement it
multiple different ways to prove their solution is the best.
> 
> It may or may not be fair, but there is a tradeoff in how much effort we
would require them to convince the community that this is *the* right way to go,
depending on what it implies for future approaches.
> 
> -- 
> Mehdi
>  
> However, as a community, we do still exercise the right to reject
contributions we disagree with, and we frequently request changes to the
implementation (as is shown every time someone tries to add SPIR-V support).
> 
> In the LLVM community we have a long history of approaching large
contributions (especially ones from new contributors) with scrutiny and
discussion. It would be a disservice to the project to forget that.
> 
> River, as a last note. I see that you've started uploading patches to
Phabricator, and I know you're relatively new to the community. When
uploading patches it helps to include appropriate reviewers so that the right
people see the patches as they come in. To that end can you please include
Jessica as a reviewer? Given her relevant domain experience I think her feedback
on the patches will be very valuable.
> 
> Thank you,
> -Chris
> 
>> On Jul 26, 2017, at 1:52 PM, River Riddle via llvm-dev <llvm-dev at
lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>> 
>> Hey Sanjoy,
>>   
>> On Wed, Jul 26, 2017 at 1:41 PM, Sanjoy Das via llvm-dev <llvm-dev
at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>> Hi,
>> 
>> On Wed, Jul 26, 2017 at 12:54 PM, Sean Silva <chisophugis at
gmail.com <mailto:chisophugis at gmail.com>> wrote:
>> > The way I interpret Quentin's statement is something like:
>> >
>> > - Inlining turns an interprocedural problem into an
intraprocedural problem
>> > - Outlining turns an intraprocedural problem into an
interprocedural problem
>> >
>> > Insofar as our intraprocedural analyses and transformations are
strictly
>> > more powerful than interprocedural, then there is a precise sense
in which
>> > inlining exposes optimization opportunities while outlining does
not.
>> 
>> While I think our intra-proc optimizations are *generally* more
>> powerful, I don't think they are *always* more powerful.  For
>> instance, LICM (today) won't hoist full regions but it can hoist
>> single function calls.  If we can extract out a region into a
>> readnone+nounwind function call then LICM will hoist it to the
>> preheader if the safety checks pass.
>> 
>> > Actually, for his internship last summer River wrote a
profile-guided
>> > outliner / partial inliner (it didn't try to do deduplication;
so it was
>> > more like PartialInliner.cpp). IIRC he found that LLVM's
interprocedural
>> > analyses were so bad that there were pretty adverse effects from
many of the
>> > outlining decisions. E.g. if you outline from the left side of a
diamond,
>> > that side basically becomes a black box to most LLVM analyses and
forces
>> > downstream dataflow meet points to give an overly conservative
result, even
>> > though our standard intraprocedural analyses would have happily
dug through
>> > the left side of the diamond if the code had not been outlined.
>> >
>> > Also, River's patch (the one in this thread) does
parameterized outlining.
>> > For example, two sequences containing stores can be outlined even
if the
>> > corresponding stores have different pointers. The pointer to be
loaded from
>> > is passed as a parameter to the outlined function. In that sense,
the
>> > outlined function's behavior becomes a conservative
approximation of both
>> > which in principle loses precision.
>> 
>> Can we outline only once we've already done all of these
optimizations
>> that outlining would block?
>>  
>>   The outliner is able to run at any point in the interprocedural
pipeline. There are currently two locations: Early outlining(pre inliner) and
late outlining(practically the last pass to run). It is configured to run either
Early+Late, or just Late.
>> 
>> 
>> > I like your EarlyCSE example and it is interesting that combined
with
>> > functionattrs it can make a "cheap" pass get a
transformation that an
>> > "expensive" pass would otherwise be needed. Are there
any cases where we
>> > only have the "cheap" pass and thus the outlining would
be essential for our
>> > optimization pipeline to get the optimization right?
>> >
>> > The case that comes to mind for me is cases where we have some
cutoff of
>> > search depth. Reducing a sequence to a single call (+ functionattr
>> > inference) can essentially summarize the sequence and effectively
increase
>> > search depth, which might give more results. That seems like a bit
of a weak
>> > example though.
>> 
>> I don't know if River's patch outlines entire control flow
regions at
>> a time, but if it does then we could use cheap basic block scanning
>> analyses for things that would normally require CFG-level analysis.
>> 
>>   The current patch currently just supports outlining from within a
single block. Although, I had a working prototype for Region based outlining, I
kept it from this patch for simplicity. So its entirely possible to add that
kind of functionality because I've already tried.
>> Thanks,
>>   River Riddle
>>  
>> 
>> -- Sanjoy
>> 
>> >
>> > -- Sean Silva
>> >
>> > On Wed, Jul 26, 2017 at 12:07 PM, Sanjoy Das via llvm-dev
>> > <llvm-dev at lists.llvm.org <mailto:llvm-dev at
lists.llvm.org>> wrote:
>> >>
>> >> Hi,
>> >>
>> >> On Wed, Jul 26, 2017 at 10:10 AM, Quentin Colombet via
llvm-dev
>> >> <llvm-dev at lists.llvm.org <mailto:llvm-dev at
lists.llvm.org>> wrote:
>> >> > No, I mean in terms of enabling other optimizations in
the pipeline like
>> >> > vectorizer. Outliner does not expose any of that.
>> >>
>> >> I have not made a lot of effort to understand the full
discussion here (so
>> >> what
>> >> I say below may be off-base), but I think there are some cases
where
>> >> outlining
>> >> (especially working with function-attrs) can make optimization
easier.
>> >>
>> >> It can help transforms that duplicate code (like loop
unrolling and
>> >> inlining) be
>> >> more profitable -- I'm thinking of cases where
unrolling/inlining would
>> >> have to
>> >> duplicate a lot of code, but after outlining would require
duplicating
>> >> only a
>> >> few call instructions.
>> >>
>> >>
>> >> It can help EarlyCSE do things that require GVN today:
>> >>
>> >> void foo() {
>> >>   ... complex computation that computes func()
>> >>   ... complex computation that computes func()
>> >> }
>> >>
>> >> outlining=>
>> >>
>> >> int func() { ... }
>> >>
>> >> void foo() {
>> >>   int x = func();
>> >>   int y = func();
>> >> }
>> >>
>> >> functionattrs=>
>> >>
>> >> int func() readonly { ... }
>> >>
>> >> void foo(int a, int b) {
>> >>   int x = func();
>> >>   int y = func();
>> >> }
>> >>
>> >> earlycse=>
>> >>
>> >> int func(int t) readnone { ... }
>> >>
>> >> void foo(int a, int b) {
>> >>   int x = func(a);
>> >>   int y = x;
>> >> }
>> >>
>> >> GVN will catch this, but EarlyCSE is (at least supposed to
be!) cheaper.
>> >>
>> >>
>> >> Once we have an analysis that can prove that certain functions
can't trap,
>> >> outlining can allow LICM etc. to speculate entire outlined
regions out of
>> >> loops.
>> >>
>> >>
>> >> Generally, I think outlining exposes information that certain
regions of
>> >> the
>> >> program are doing identical things.  We should expect to get
some mileage
>> >> out of
>> >> this information.
>> >>
>> >> -- Sanjoy
>> >> _______________________________________________
>> >> LLVM Developers mailing list
>> >> llvm-dev at lists.llvm.org <mailto:llvm-dev at
lists.llvm.org>
>> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
<http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
>> >
>> >
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
<http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
>> 
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
<http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
> 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
<http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
> 
> 
> 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
<http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
> 
> 
> 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170801/5a1cbb0f/attachment.html>

River Riddle via llvm-dev

2017-Aug-01 17:36 UTC

head link

[llvm-dev] [RFC] Add IR level interprocedural outliner for code size.

Hey Jessica,
  I was interested in the same relationship you are. Last summer i
implemented many variations of the algorithm that is outlined in the paper
you've provided, even including hot switch splitting.

When I implemented this last year there were several road blocks: Inliner
not always playing nicely, the inliner didn't preserve pgo data at that
point, aa was not very forgiving about the loss of information, etc. Now
that some of these problems have been improved/fixed it would be
interesting to review what can be gained from this type of transformation.

   It's also important to note that the two outlining transformations would
use completely different algorithms, as they are satisfying different
benefit functions.
 The algorithm used for enabling partial inlining is pgo driven, making it
very context, or occurrence, sensitive. You are looking at subsections of
the graph to identify cold sections that could increase performance if
outlined. There isn't a concept for more than once occurrence because the
benefit function is very sensitive to the pgo data of that one instance of
the subgraph. Given that you are only considering one instance you don't
have to worry about trying to solve graph isomorphism, which substantially
increases the amount of candidates you can consider. This type of outlining
also interacts much better with the inliner because all of the cost
modeling for both is contained within the context of a single call.

So effectively we have two different types of outliners, which I'm assuming
is part of the reason why Dan B brought up the semantics of what to call
these transformations in the original MachineOutliner RFC. Having
implemented both, I can attest to there not really being a lot of shared
code. Providing pgo data to the IR outliner will somewhat emulate this
behavior, but the amount of benefit to performance is not likely to be as
high. I'm sensing that this will hold even if/when we extend IR outlining
to handle regions as well.

We kind of have an outliner of this type in the form of the partial inliner
in LLVM. It currently only cares about enabling more inlining but that
could definitely be changed or extended depending on the performance
benefits that we could get.

Thanks,
  River Riddle

On Tue, Aug 1, 2017 at 9:15 AM, Jessica Paquette via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> I think I agree with this sentiment (and the rather heated response to
> this suggestion says that I ought to as well!)
>
> The later pass would have to have a strong guarantee to “undo” bad
> decisions, which, given that we have to use heuristics to make code size
> decisions, is very difficult. I don’t know if I’d say *impossible*, but it
> would definitely force us to work in a more restrictive space. I was mostly
> just thinking of how the two separate technologies might work together in
> practice.
>
> For example, in “Function outlining and partial inlining” by Peng Zhao and
> J. N. Amaral, they found that you could get partial inlining for free by
> enabling inlining and outlining at the IR-level. What I think would be cool
> to explore is whether or not there would be a similar relationship between
> IR-level outlining and MIR-level outlining. Of course, this is just a
> suggestion!
>
> (Paper for anyone interested: https://www.researchgate.net/
> profile/Jose_Amaral3/publication/221306447_Function_Outlining_and_
> Partial_Inlining/links/00463522754960618b000000.pdf)
>
> - Jessica
>
>
> On Aug 1, 2017, at 1:17 AM, Andrey Bokhanko via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> ...and adding $0.02 to the "IR outline + inline + MIR outline"
idea, my
> gut feeling (yes, only a "feeling" -- and one coming from my gut,
not
> head!) is that inlining correcting wrong IR outlining decisions with MIR
> outlining correcting wrong inlining decisions is absolutely unrealistic and
> a heuristics nightmare at best.
>
> Inliner's heuristics are already complex enough and not 100%
bulletproof;
> if we add IR outliner heuristics to the mix -- and then just a little bit
> of MIR outliner heuristics (which are more precise but, as demonstrated
> above, not 100% precise as well) on top... you can imagine.
>
> Yours,
> Andrey
>
>
> On Tue, Aug 1, 2017 at 10:07 AM, Andrey Bokhanko <andreybokhanko at
gmail.com
> > wrote:
>
>> All,
>>
>> +1 to what Mehdi said.
>>
>> It's a fair concern to question whatever we need yet another
Outlining
>> pass. I believe this concern has been cleared by River -- both with
>> theoretical arguments and practical data (benchmark numbers).
>>
>> Jessica's pipeline proposal is completely orthogonal. It's not
fair to
>> request River to implement / fit into what she suggested. Sure,
it's a
>> valid topic to discuss -- but yet completely orthogonal one. If
anything,
>> accepting River's implementation would enable us to do experiments
/
>> developments like pipeline changes of this ilk!
>>
>> Yours,
>> Andrey
>> ==>> Compiler Architect
>> NXP
>>
>>
>> On Tue, Aug 1, 2017 at 7:38 AM, Mehdi AMINI via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>>
>>>
>>> 2017-07-28 21:58 GMT-07:00 Chris Bieneman via llvm-dev <
>>> llvm-dev at lists.llvm.org>:
>>>
>>>> Apologies for delayed joining of this discussion, but I had a
few notes
>>>> from this thread that I really wanted to chime in about.
>>>>
>>>> River,
>>>>
>>>> I don't mean to put you on the spot, but I do want to start
on a
>>>> semantic issue. In several places in the thread you used the
words "we" and
>>>> "our" to imply that you're not alone in writing
this (which is totally
>>>> fine), but your initial thread presented this as entirely your
own work.
>>>> So, when you said things like "we feel there's an
advantage to being at the
>>>> IR level", can you please clarify who is "we"?
>>>>
>>>> Given that there are a number of disagreements and opinions
floating
>>>> around I think it benefits us all to speak clearly about who is
taking what
>>>> stances.
>>>>
>>>> One particular disagreement that I think very much needs to be
>>>> revisited in this thread was Jessica's proposal of a
pipeline of:
>>>>
>>>>    1. IR outline
>>>>    2. Inline
>>>>    3. MIR outline
>>>>
>>>> In your response to that proposal you dismissed it out of hand
with
>>>> "feelings" but not data. Given that the proposal came
from Jessica (a
>>>> community member with significant relevant experience in
outlining), and it
>>>> was also recognized as interesting by Eric Christopher (a
long-time member
>>>> of the community with wide reaching expertise), I think
dismissing it may
>>>> have been a little premature.
>>>>
>>>
>>> It isn't clear to me how much the *exact* pipeline and ordering
of
>>> passes is relevant to consider if "having an outliner at the
IR level" is a
>>> good idea.
>>>
>>>
>>>
>>>> I also want to visit a few procedural notes.
>>>>
>>>> Mehdi commented on the thread that it wouldn't be fair to
ask for a
>>>> comparative study because the MIR outliner didn't have one.
While I don't
>>>> think anyone is asking for a comparative study, I want to point
out that I
>>>> think it is completely fair.
>>>>
>>> If a new contributor approached the community with a new SROA pass
and
>>>> wanted to land it in-tree it would be appropriate to ask for a
comparative
>>>> analysis against the existing pass. How is this different?
>>>>
>>>
>>> It seems quite different to me because there is no outliner at the
IR
>>> level and so they don't provide the same functionality. The
"Why at the IR
>>> level" section of the original email combined with the
performance numbers
>>> seems largely enough to me to explain why it isn't redundant to
the
>>> Machine-level outliner.
>>> I'd consider this work for inclusion upstream purely on its
technical
>>> merit at this point.
>>> Discussing inclusion as part of any of the default pipeline is a
>>> different story.
>>>
>>> Similarly last year, the IR-level PGO was also implemented even
though
>>> we already had a PGO implementation, because 1) it provided a
generic
>>> solutions for other frontend (just like here it could be said that
it
>>> provides a generic solution for targets) and 2) it supported cases
that
>>> FE-PGO didn't (especially around better counter-context using
pre-inlining
>>> and such).
>>>
>>>
>>>
>>>>
>>>> Adding a new IR outliner is a different situation from when the
MIR one
>>>> was added. When the MIR outliner was introduced there was no
in-tree
>>>> analog.
>>>>
>>>
>>> We still usually discuss design extensively. Skipping the IR-level
>>> option didn't seem obvious to me, to say the least. And it
wasn't really
>>> much discussed/considered extensively upstream.
>>> If the idea is that implementing a concept at the machine level may
>>> preclude a future implementation at the IR level, it means we
should be *a
>>> lot* more picky before accepting such contribution.
>>> In this case, if I had anticipated any push-back on an IR-level
>>> implementation only based on the fact that we have now a
Machine-level one,
>>> I'd likely have pushed back on the machine-level one.
>>>
>>>
>>>
>>>> When someone comes to the community with something that has no
existing
>>>> in-tree analog it isn't fair to necessarily ask them to
implement it
>>>> multiple different ways to prove their solution is the best.
>>>>
>>>
>>> It may or may not be fair, but there is a tradeoff in how much
effort we
>>> would require them to convince the community that this is *the*
right way
>>> to go, depending on what it implies for future approaches.
>>>
>>> --
>>> Mehdi
>>>
>>>
>>>> However, as a community, we do still exercise the right to
reject
>>>> contributions we disagree with, and we frequently request
changes to the
>>>> implementation (as is shown every time someone tries to add
SPIR-V support).
>>>>
>>>> In the LLVM community we have a long history of approaching
large
>>>> contributions (especially ones from new contributors) with
scrutiny and
>>>> discussion. It would be a disservice to the project to forget
that.
>>>>
>>>> River, as a last note. I see that you've started uploading
patches to
>>>> Phabricator, and I know you're relatively new to the
community. When
>>>> uploading patches it helps to include appropriate reviewers so
that the
>>>> right people see the patches as they come in. To that end can
you please
>>>> include Jessica as a reviewer? Given her relevant domain
experience I think
>>>> her feedback on the patches will be very valuable.
>>>>
>>>> Thank you,
>>>> -Chris
>>>>
>>>> On Jul 26, 2017, at 1:52 PM, River Riddle via llvm-dev <
>>>> llvm-dev at lists.llvm.org> wrote:
>>>>
>>>> Hey Sanjoy,
>>>>
>>>> On Wed, Jul 26, 2017 at 1:41 PM, Sanjoy Das via llvm-dev <
>>>> llvm-dev at lists.llvm.org> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> On Wed, Jul 26, 2017 at 12:54 PM, Sean Silva
<chisophugis at gmail.com>
>>>>> wrote:
>>>>> > The way I interpret Quentin's statement is
something like:
>>>>> >
>>>>> > - Inlining turns an interprocedural problem into an
intraprocedural
>>>>> problem
>>>>> > - Outlining turns an intraprocedural problem into an
interprocedural
>>>>> problem
>>>>> >
>>>>> > Insofar as our intraprocedural analyses and
transformations are
>>>>> strictly
>>>>> > more powerful than interprocedural, then there is a
precise sense in
>>>>> which
>>>>> > inlining exposes optimization opportunities while
outlining does not.
>>>>>
>>>>> While I think our intra-proc optimizations are *generally*
more
>>>>> powerful, I don't think they are *always* more
powerful.  For
>>>>> instance, LICM (today) won't hoist full regions but it
can hoist
>>>>> single function calls.  If we can extract out a region into
a
>>>>> readnone+nounwind function call then LICM will hoist it to
the
>>>>> preheader if the safety checks pass.
>>>>>
>>>>> > Actually, for his internship last summer River wrote a
profile-guided
>>>>> > outliner / partial inliner (it didn't try to do
deduplication; so it
>>>>> was
>>>>> > more like PartialInliner.cpp). IIRC he found that
LLVM's
>>>>> interprocedural
>>>>> > analyses were so bad that there were pretty adverse
effects from
>>>>> many of the
>>>>> > outlining decisions. E.g. if you outline from the left
side of a
>>>>> diamond,
>>>>> > that side basically becomes a black box to most LLVM
analyses and
>>>>> forces
>>>>> > downstream dataflow meet points to give an overly
conservative
>>>>> result, even
>>>>> > though our standard intraprocedural analyses would
have happily dug
>>>>> through
>>>>> > the left side of the diamond if the code had not been
outlined.
>>>>> >
>>>>> > Also, River's patch (the one in this thread) does
parameterized
>>>>> outlining.
>>>>> > For example, two sequences containing stores can be
outlined even if
>>>>> the
>>>>> > corresponding stores have different pointers. The
pointer to be
>>>>> loaded from
>>>>> > is passed as a parameter to the outlined function. In
that sense, the
>>>>> > outlined function's behavior becomes a
conservative approximation of
>>>>> both
>>>>> > which in principle loses precision.
>>>>>
>>>>> Can we outline only once we've already done all of
these optimizations
>>>>> that outlining would block?
>>>>>
>>>>
>>>>   The outliner is able to run at any point in the
interprocedural
>>>> pipeline. There are currently two locations: Early
outlining(pre inliner)
>>>> and late outlining(practically the last pass to run). It is
configured to
>>>> run either Early+Late, or just Late.
>>>>
>>>>
>>>>> > I like your EarlyCSE example and it is interesting
that combined with
>>>>> > functionattrs it can make a "cheap" pass get
a transformation that an
>>>>> > "expensive" pass would otherwise be needed.
Are there any cases
>>>>> where we
>>>>> > only have the "cheap" pass and thus the
outlining would be essential
>>>>> for our
>>>>> > optimization pipeline to get the optimization right?
>>>>> >
>>>>> > The case that comes to mind for me is cases where we
have some
>>>>> cutoff of
>>>>> > search depth. Reducing a sequence to a single call (+
functionattr
>>>>> > inference) can essentially summarize the sequence and
effectively
>>>>> increase
>>>>> > search depth, which might give more results. That
seems like a bit
>>>>> of a weak
>>>>> > example though.
>>>>>
>>>>> I don't know if River's patch outlines entire
control flow regions at
>>>>> a time, but if it does then we could use cheap basic block
scanning
>>>>> analyses for things that would normally require CFG-level
analysis.
>>>>>
>>>>
>>>>   The current patch currently just supports outlining from
within a
>>>> single block. Although, I had a working prototype for Region
based
>>>> outlining, I kept it from this patch for simplicity. So its
entirely
>>>> possible to add that kind of functionality because I've
already tried.
>>>> Thanks,
>>>>   River Riddle
>>>>
>>>>
>>>>>
>>>>> -- Sanjoy
>>>>>
>>>>> >
>>>>> > -- Sean Silva
>>>>> >
>>>>> > On Wed, Jul 26, 2017 at 12:07 PM, Sanjoy Das via
llvm-dev
>>>>> > <llvm-dev at lists.llvm.org> wrote:
>>>>> >>
>>>>> >> Hi,
>>>>> >>
>>>>> >> On Wed, Jul 26, 2017 at 10:10 AM, Quentin Colombet
via llvm-dev
>>>>> >> <llvm-dev at lists.llvm.org> wrote:
>>>>> >> > No, I mean in terms of enabling other
optimizations in the
>>>>> pipeline like
>>>>> >> > vectorizer. Outliner does not expose any of
that.
>>>>> >>
>>>>> >> I have not made a lot of effort to understand the
full discussion
>>>>> here (so
>>>>> >> what
>>>>> >> I say below may be off-base), but I think there
are some cases where
>>>>> >> outlining
>>>>> >> (especially working with function-attrs) can make
optimization
>>>>> easier.
>>>>> >>
>>>>> >> It can help transforms that duplicate code (like
loop unrolling and
>>>>> >> inlining) be
>>>>> >> more profitable -- I'm thinking of cases where
unrolling/inlining
>>>>> would
>>>>> >> have to
>>>>> >> duplicate a lot of code, but after outlining would
require
>>>>> duplicating
>>>>> >> only a
>>>>> >> few call instructions.
>>>>> >>
>>>>> >>
>>>>> >> It can help EarlyCSE do things that require GVN
today:
>>>>> >>
>>>>> >> void foo() {
>>>>> >>   ... complex computation that computes func()
>>>>> >>   ... complex computation that computes func()
>>>>> >> }
>>>>> >>
>>>>> >> outlining=>
>>>>> >>
>>>>> >> int func() { ... }
>>>>> >>
>>>>> >> void foo() {
>>>>> >>   int x = func();
>>>>> >>   int y = func();
>>>>> >> }
>>>>> >>
>>>>> >> functionattrs=>
>>>>> >>
>>>>> >> int func() readonly { ... }
>>>>> >>
>>>>> >> void foo(int a, int b) {
>>>>> >>   int x = func();
>>>>> >>   int y = func();
>>>>> >> }
>>>>> >>
>>>>> >> earlycse=>
>>>>> >>
>>>>> >> int func(int t) readnone { ... }
>>>>> >>
>>>>> >> void foo(int a, int b) {
>>>>> >>   int x = func(a);
>>>>> >>   int y = x;
>>>>> >> }
>>>>> >>
>>>>> >> GVN will catch this, but EarlyCSE is (at least
supposed to be!)
>>>>> cheaper.
>>>>> >>
>>>>> >>
>>>>> >> Once we have an analysis that can prove that
certain functions
>>>>> can't trap,
>>>>> >> outlining can allow LICM etc. to speculate entire
outlined regions
>>>>> out of
>>>>> >> loops.
>>>>> >>
>>>>> >>
>>>>> >> Generally, I think outlining exposes information
that certain
>>>>> regions of
>>>>> >> the
>>>>> >> program are doing identical things.  We should
expect to get some
>>>>> mileage
>>>>> >> out of
>>>>> >> this information.
>>>>> >>
>>>>> >> -- Sanjoy
>>>>> >> _______________________________________________
>>>>> >> LLVM Developers mailing list
>>>>> >> llvm-dev at lists.llvm.org
>>>>> >>
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>> >
>>>>> >
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> llvm-dev at lists.llvm.org
>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
>>>>
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>>
>>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170801/5d1beca7/attachment.html>

llvm dev - Aug 2017 - [RFC] Add IR level interprocedural outliner for code size.

[llvm-dev] [RFC] Add IR level interprocedural outliner for code size.

[llvm-dev] [RFC] Add IR level interprocedural outliner for code size.

[llvm-dev] [RFC] Add IR level interprocedural outliner for code size.

[llvm-dev] [RFC] Add IR level interprocedural outliner for code size.

[llvm-dev] [RFC] Add IR level interprocedural outliner for code size.