thr3ads.net - llvm dev - [llvm-dev] [RFC] Add IR level interprocedural outliner for code size. [Jul 2017]

Sanjoy Das via llvm-dev

2017-Jul-26 19:07 UTC

[llvm-dev] [RFC] Add IR level interprocedural outliner for code size.

Hi,

On Wed, Jul 26, 2017 at 10:10 AM, Quentin Colombet via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> No, I mean in terms of enabling other optimizations in the pipeline like
> vectorizer. Outliner does not expose any of that.

I have not made a lot of effort to understand the full discussion here (so what
I say below may be off-base), but I think there are some cases where outlining
(especially working with function-attrs) can make optimization easier.

It can help transforms that duplicate code (like loop unrolling and inlining) be
more profitable -- I'm thinking of cases where unrolling/inlining would have to
duplicate a lot of code, but after outlining would require duplicating only a
few call instructions.


It can help EarlyCSE do things that require GVN today:

void foo() {
  ... complex computation that computes func()
  ... complex computation that computes func()
}

outlining=>

int func() { ... }

void foo() {
  int x = func();
  int y = func();
}

functionattrs=>

int func() readonly { ... }

void foo() {
  int x = func();
  int y = func();
}

earlycse=>

int func() readonly { ... }

void foo() {
  int x = func();
  int y = x;
}

GVN will catch this, but EarlyCSE is (at least supposed to be!) cheaper.
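
The pipeline above can be mimicked at source level. The following is only a sketch in C++, not LLVM IR and not output from any pass: the global `table`, the function names, and the squared-sum computation are all made up for illustration. `[[gnu::pure]]` is a GCC/Clang attribute used here as a stand-in for what functionattrs would infer (a function that reads memory but does not write it).

```cpp
#include <cassert>

// Hypothetical global state that the duplicated computation reads.
static int table[4] = {3, 1, 4, 1};

// Before outlining: the same computation appears twice, inline.
int foo_before() {
  int x = 0;
  for (int i = 0; i < 4; ++i) x += table[i] * table[i];
  int y = 0;
  for (int i = 0; i < 4; ++i) y += table[i] * table[i];
  return x + y;
}

// After outlining: the shared computation lives in one function.
// It reads memory but never writes it; [[gnu::pure]] states that explicitly.
[[gnu::pure]] int func() {
  int sum = 0;
  for (int i = 0; i < 4; ++i) sum += table[i] * table[i];
  return sum;
}

int foo_outlined() {
  int x = func();
  int y = func();
  return x + y;
}

// What CSE of the two calls amounts to: no store separates them,
// so the second call can reuse the first result.
int foo_after_cse() {
  int x = func();
  int y = x;
  return x + y;
}

// Self-check: all three forms compute the same value.
void check_equivalent() {
  assert(foo_before() == foo_outlined());
  assert(foo_outlined() == foo_after_cse());
}
```

Recognizing the redundancy in `foo_before` means comparing two loops, whereas in `foo_outlined` it only means comparing two identical calls with no store in between.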


Once we have an analysis that can prove that certain functions can't trap,
outlining can allow LICM etc. to speculate entire outlined regions out of loops.
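
A source-level sketch of that hoisting, under the assumption that the outlined region has already been proven free of side effects and traps. The function `region` and both loops are hypothetical; `[[gnu::const]]` and `noexcept` are used as rough stand-ins for readnone and nounwind, and unsigned arithmetic is chosen so the body has no undefined overflow.

```cpp
#include <cassert>

// Hypothetical outlined region: depends only on its arguments,
// touches no memory, and cannot throw.
[[gnu::const]] unsigned region(unsigned a, unsigned b) noexcept {
  return a * a + b;
}

// Before hoisting: the loop-invariant call is evaluated on every iteration.
unsigned loop_before(const unsigned *data, int n, unsigned a, unsigned b) {
  unsigned sum = 0;
  for (int i = 0; i < n; ++i) sum += data[i] * region(a, b);
  return sum;
}

// After hoisting: the call runs once, ahead of the loop. It also runs when
// n == 0, which is the speculation; that is only safe because the call has
// no side effects and cannot trap.
unsigned loop_hoisted(const unsigned *data, int n, unsigned a, unsigned b) {
  const unsigned r = region(a, b);
  unsigned sum = 0;
  for (int i = 0; i < n; ++i) sum += data[i] * r;
  return sum;
}

// Self-check: both loops agree, including for an empty trip count.
void check_hoist() {
  const unsigned data[3] = {1, 2, 3};
  assert(loop_before(data, 3, 2, 3) == loop_hoisted(data, 3, 2, 3));
  assert(loop_before(data, 0, 2, 3) == loop_hoisted(data, 0, 2, 3));
}
```

Moving a single call is a much smaller step than moving a whole multi-block region, which is the appeal of outlining first.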


Generally, I think outlining exposes information that certain regions of the
program are doing identical things.  We should expect to get some mileage out of
this information.

-- Sanjoy

Sean Silva via llvm-dev

2017-Jul-26 19:54 UTC

[llvm-dev] [RFC] Add IR level interprocedural outliner for code size.

The way I interpret Quentin's statement is something like:

- Inlining turns an interprocedural problem into an intraprocedural problem
- Outlining turns an intraprocedural problem into an interprocedural problem

Insofar as our intraprocedural analyses and transformations are strictly
more powerful than interprocedural, then there is a precise sense in which
inlining exposes optimization opportunities while outlining does not.

Actually, for his internship last summer River wrote a profile-guided
outliner / partial inliner (it didn't try to do deduplication; so it was
more like PartialInliner.cpp). IIRC he found that LLVM's interprocedural
analyses were so bad that there were pretty adverse effects from many of
the outlining decisions. E.g. if you outline from the left side of a
diamond, that side basically becomes a black box to most LLVM analyses and
forces downstream dataflow meet points to give an overly conservative
result, even though our standard intraprocedural analyses would have
happily dug through the left side of the diamond if the code had not been
outlined.

Also, River's patch (the one in this thread) does parameterized outlining.
For example, two sequences containing stores can be outlined even if the
corresponding stores have different pointers. The pointer to be stored to
is passed as a parameter to the outlined function. In that sense, the
outlined function's behavior becomes a conservative approximation of both,
which in principle loses precision.
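
To make the precision loss concrete, here is a hand-written C++ sketch of what parameterized outlining of two store sequences could look like. It is not taken from River's patch; `State`, `outlined`, and the arithmetic are hypothetical.

```cpp
#include <cassert>

// Hypothetical state touched by two near-identical sequences.
struct State {
  int a = 0;
  int b = 0;
  int count = 0;
};

// Before outlining: the sequences differ only in the store destination.
void update_before(State &s, int v) {
  s.a = v * 2 + 1;  // sequence 1 stores to s.a
  s.count += 1;
  s.b = v * 2 + 1;  // sequence 2 stores to s.b
  s.count += 1;
}

// Outlined body: the destination pointer becomes a parameter. Seen in
// isolation, dst and count might alias, which update_before never had to
// consider because the fields were named directly.
static void outlined(int *dst, int *count, int v) {
  *dst = v * 2 + 1;
  *count += 1;
}

// After outlining: each original sequence is a call with its own pointer.
void update_after(State &s, int v) {
  outlined(&s.a, &s.count, v);
  outlined(&s.b, &s.count, v);
}

// Self-check: both versions leave the state identical.
void check_same_state() {
  State x, y;
  update_before(x, 5);
  update_after(y, 5);
  assert(x.a == y.a && x.b == y.b && x.count == y.count);
}
```

An analysis that only sees `outlined` has to cover every possible `dst`, so it describes both call sites at once and neither exactly.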

I like your EarlyCSE example and it is interesting that combined with
functionattrs it can let a "cheap" pass perform a transformation for which
an "expensive" pass would otherwise be needed. Are there any cases where we
only have the "cheap" pass and thus the outlining would be essential for
our optimization pipeline to get the optimization right?

The case that comes to mind for me is cases where we have some cutoff of
search depth. Reducing a sequence to a single call (+ functionattr
inference) can essentially summarize the sequence and effectively increase
search depth, which might give more results. That seems like a bit of a
weak example though.

-- Sean Silva


Sanjoy Das via llvm-dev

2017-Jul-26 20:41 UTC

[llvm-dev] [RFC] Add IR level interprocedural outliner for code size.

Hi,

On Wed, Jul 26, 2017 at 12:54 PM, Sean Silva <chisophugis at gmail.com> wrote:
> The way I interpret Quentin's statement is something like:
>
> - Inlining turns an interprocedural problem into an intraprocedural problem
> - Outlining turns an intraprocedural problem into an interprocedural problem
>
> Insofar as our intraprocedural analyses and transformations are strictly
> more powerful than interprocedural, then there is a precise sense in which
> inlining exposes optimization opportunities while outlining does not.

While I think our intra-proc optimizations are *generally* more
powerful, I don't think they are *always* more powerful.  For
instance, LICM (today) won't hoist full regions but it can hoist
single function calls.  If we can extract out a region into a
readnone+nounwind function call then LICM will hoist it to the
preheader if the safety checks pass.

> Actually, for his internship last summer River wrote a profile-guided
> outliner / partial inliner (it didn't try to do deduplication; so it was
> more like PartialInliner.cpp). IIRC he found that LLVM's interprocedural
> analyses were so bad that there were pretty adverse effects from many of the
> outlining decisions. E.g. if you outline from the left side of a diamond,
> that side basically becomes a black box to most LLVM analyses and forces
> downstream dataflow meet points to give an overly conservative result, even
> though our standard intraprocedural analyses would have happily dug through
> the left side of the diamond if the code had not been outlined.
>
> Also, River's patch (the one in this thread) does parameterized outlining.
> For example, two sequences containing stores can be outlined even if the
> corresponding stores have different pointers. The pointer to be stored to
> is passed as a parameter to the outlined function. In that sense, the
> outlined function's behavior becomes a conservative approximation of both,
> which in principle loses precision.

Can we outline only once we've already done all of these optimizations
that outlining would block?

> I like your EarlyCSE example and it is interesting that combined with
> functionattrs it can let a "cheap" pass perform a transformation for which
> an "expensive" pass would otherwise be needed. Are there any cases where we
> only have the "cheap" pass and thus the outlining would be essential for our
> optimization pipeline to get the optimization right?
>
> The case that comes to mind for me is cases where we have some cutoff of
> search depth. Reducing a sequence to a single call (+ functionattr
> inference) can essentially summarize the sequence and effectively increase
> search depth, which might give more results. That seems like a bit of a weak
> example though.

I don't know if River's patch outlines entire control flow regions at
a time, but if it does then we could use cheap basic block scanning
analyses for things that would normally require CFG-level analysis.

-- Sanjoy

Quentin Colombet via llvm-dev

2017-Jul-27 04:59 UTC

[llvm-dev] [RFC] Add IR level interprocedural outliner for code size.

> On Jul 26, 2017, at 12:07 PM, Sanjoy Das <sanjoy at google.com> wrote:
> Generally, I think outlining exposes information that certain regions of the
> program are doing identical things.  We should expect to get some mileage out
> of this information.

That’s a fair point.
Using outlining is one possible way of doing that, indeed.
We could also get that out of some analysis, or by improving the different
passes, because generally speaking, I don’t think you want to add call
indirections in performance-sensitive areas.