thr3ads.net - llvm dev - [LLVMdev] LLVM Inliner [Nov 2010]

Xinliang David Li

2010-Nov-30 22:36 UTC

[LLVMdev] LLVM Inliner

On Tue, Nov 30, 2010 at 2:29 PM, Chris Lattner <clattner at apple.com> wrote:
>
> On Nov 30, 2010, at 2:19 PM, Xinliang David Li wrote:
>
>> I understand that, but that implies that you have some model for code
>> locality.  Setting a global code growth limit is (in my opinion) a hack
>> unless you are aiming for the whole program to fit in the icache (which I
>> don't think anyone tries to do :).
>>
>
> Yes, global growth limit may be good for size control, but is a hack for
> control icache footprint. However, as I mentioned, the bottom up inline
> scheme make it impossible to use any heuristics involving 'global limit'
> which can be more complicated and fancier than the simple growth limit.
> For instance, there is no restriction that only one global limit can be
> used --- the compiler can partition the call graph into multiple locality
> regions, and set icache limit for each region. The inlining order can be
> done on a region by region basis. For each region, the region limit is
> applied and the priority queue must be used.
>
>
> Yes, I understand that.  But why is a global limit useful?  What problem
> does it solve?  If it is to cap "runaway inlining" there are better ways to
> do it.  You agree that it isn't for icache optimization, so what value does
> it serve?
>
I am not trying to defend 'global growth limit' here :) -- I am arguing for
the usefulness/flexibility of using a priority-based inline ordering. Do we
agree here?

David


>
> -Chris
>
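The per-region scheme described in this message (partition the call graph into locality regions, give each region its own limit, and order inlining by priority within the region) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `CallSite` record, its `benefit` and `growth` fields, and the benefit/growth priority are hypothetical and are not LLVM APIs or anything proposed verbatim in the thread. A real inliner would also re-evaluate priorities after each inline, which this sketch skips.

```python
import heapq
from collections import namedtuple

# Hypothetical call-site record; none of these fields come from LLVM itself.
# `growth` is the estimated code-size increase and is assumed to be > 0.
CallSite = namedtuple("CallSite", "caller callee region benefit growth")

def inline_by_region(call_sites, region_limits):
    """Pick call sites region by region, best benefit/growth ratio first."""
    by_region = {}
    for cs in call_sites:
        by_region.setdefault(cs.region, []).append(cs)

    chosen = []
    for region, sites in sorted(by_region.items()):
        budget = region_limits.get(region, 0)
        # heapq is a min-heap, so negate the priority; the index breaks ties.
        heap = [(-cs.benefit / cs.growth, i, cs) for i, cs in enumerate(sites)]
        heapq.heapify(heap)
        while heap:
            _, _, cs = heapq.heappop(heap)
            if cs.growth <= budget:  # skip sites that no longer fit
                budget -= cs.growth
                chosen.append(cs)
    return chosen
```

Each region spends only its own budget, so a cold region cannot consume growth that a hot region would have used better.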

Chris Lattner

2010-Nov-30 22:44 UTC

[LLVMdev] LLVM Inliner

On Nov 30, 2010, at 2:36 PM, Xinliang David Li wrote:
>> Yes, global growth limit may be good for size control, but is a hack for
>> control icache footprint. However, as I mentioned, the bottom up inline
>> scheme make it impossible to use any heuristics involving 'global limit'
>> which can be more complicated and fancier than the simple growth limit.
>> For instance, there is no restriction that only one global limit can be
>> used --- the compiler can partition the call graph into multiple locality
>> regions, and set icache limit for each region. The inlining order can be
>> done on a region by region basis. For each region, the region limit is
>> applied and the priority queue must be used.
>
> Yes, I understand that.  But why is a global limit useful?  What problem
> does it solve?  If it is to cap "runaway inlining" there are better ways to
> do it.  You agree that it isn't for icache optimization, so what value does
> it serve?
>
> I am not trying to defend 'global growth limit' here :) -- I am arguing the
> usefulness/flexibility of using a priority based inline ordering. do we
> agree here?

I think I'm missing something important.  :)

Can you explain the difference between priority based inlining and a global
threshold?  If you end up inlining everything in your priority queue, then order
doesn't matter.  If you only inline the high priority stuff, then you can
handle this with a threshold.  Can you give a scenario that can't be handled
with bottom-up inlining and a sufficiently smart inlining cost evaluation?

-Chris
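To make the question above concrete, here is a small sketch contrasting the two selection rules being discussed. It assumes each call site carries a fixed, precomputed priority and growth estimate; the tuple layout, function names, and numbers are invented for illustration and do not reflect how LLVM's inliner computes cost.

```python
def priority_inline(sites, budget):
    """Visit sites in descending priority; skip any that no longer fit the budget."""
    chosen, spent = [], 0
    for name, priority, growth in sorted(sites, key=lambda s: -s[1]):
        if spent + growth <= budget:
            chosen.append(name)
            spent += growth
    return chosen, spent

def threshold_inline(sites, threshold):
    """Visit sites in the given (e.g. bottom-up) order; accept any at or above threshold."""
    chosen, spent = [], 0
    for name, priority, growth in sites:
        if priority >= threshold:
            chosen.append(name)
            spent += growth
    return chosen, spent

# (name, priority, growth), listed in an assumed bottom-up visiting order.
sites = [("leaf_a", 2.0, 30), ("leaf_b", 9.0, 30),
         ("mid_c", 5.0, 30), ("top_d", 1.0, 30)]
```

With these static priorities, `priority_inline(sites, 60)` and `threshold_inline(sites, 5.0)` select the same two call sites, which matches the point that a threshold can reproduce the priority order. The catch is that the matching threshold depends on the budget: `threshold_inline(sites, 2.0)` accepts three sites and spends 90 units, so a threshold fixed in advance does not by itself enforce a limit.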

Xinliang David Li

2010-Nov-30 22:58 UTC

[LLVMdev] LLVM Inliner

On Tue, Nov 30, 2010 at 2:44 PM, Chris Lattner <clattner at apple.com> wrote:
>
> On Nov 30, 2010, at 2:36 PM, Xinliang David Li wrote:
>
>> Yes, global growth limit may be good for size control, but is a hack for
>> control icache footprint. However, as I mentioned, the bottom up inline
>> scheme make it impossible to use any heuristics involving 'global limit'
>> which can be more complicated and fancier than the simple growth limit.
>> For instance, there is no restriction that only one global limit can be
>> used --- the compiler can partition the call graph into multiple locality
>> regions, and set icache limit for each region. The inlining order can be
>> done on a region by region basis. For each region, the region limit is
>> applied and the priority queue must be used.
>>
>>
>> Yes, I understand that.  But why is a global limit useful?  What problem
>> does it solve?  If it is to cap "runaway inlining" there are better ways to
>> do it.  You agree that it isn't for icache optimization, so what value does
>> it serve?
>>
>
> I am not trying to defend 'global growth limit' here :) -- I am arguing the
> usefulness/flexibility of using a priority based inline ordering. do we
> agree here?
>
>
> I think I'm missing something important.  :)
>
> Can you explain the difference between priority based inlining and a global
> threshold?  If you end up inlining everything in your priority queue,
>
For large applications, this will never happen.


> then order doesn't matter.  If you only inline the high priority stuff,
> then you can handle this with a threshold.  Can you give a scenario that
> can't be handled with bottom-up inlining and a sufficiently smart inlining
> cost evaluation?
>
It is doable with bottom-up inlining, but it can be complicated. I guess you
would also need a priority queue as a side data structure, and you would
probably also need multiple passes to exhaust the limit.

Another LLVM inliner limitation I want to point out is that, since inline
analysis and transformation are done together, inline decisions cannot be
undone.

Thanks,

David

>
> -Chris
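The last point in this message, that decisions cannot be undone when analysis and transformation happen together, can be illustrated with a sketch that keeps the two phases separate. Everything here is a hypothetical model: the dictionary keys, the function names, and the "drop the lowest-priority decision" revision rule are assumptions made for illustration, not a description of LLVM's inliner or of a design proposed in the thread.

```python
def plan_inlining(sites, threshold):
    """Analysis only: record candidate decisions without touching any code."""
    return [s for s in sites if s["priority"] >= threshold]

def revise_plan(plan, budget):
    """Undo the lowest-priority decisions until the planned growth fits the budget."""
    kept = sorted(plan, key=lambda s: s["priority"], reverse=True)
    while kept and sum(s["growth"] for s in kept) > budget:
        kept.pop()  # drop the lowest-priority remaining decision
    return kept

def apply_plan(plan):
    """Transformation: a stand-in for the actual inlining step."""
    return [f'inline {s["callee"]} into {s["caller"]}' for s in plan]
```

Because nothing is rewritten until `apply_plan` runs, a decision made early in the analysis can still be withdrawn once the total cost is known. An inliner that transforms each call site as soon as it is analyzed has no equivalent of `revise_plan`.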
