On Tue, Nov 30, 2010 at 2:29 PM, Chris Lattner <clattner at apple.com> wrote:

> On Nov 30, 2010, at 2:19 PM, Xinliang David Li wrote:
>
>> I understand that, but that implies that you have some model for code
>> locality. Setting a global code growth limit is (in my opinion) a hack
>> unless you are aiming for the whole program to fit in the icache (which I
>> don't think anyone tries to do :).
>
> Yes, global growth limit may be good for size control, but is a hack for
> controlling icache footprint. However, as I mentioned, the bottom up inline
> scheme makes it impossible to use any heuristics involving 'global limit'
> which can be more complicated and fancier than the simple growth limit. For
> instance, there is no restriction that only one global limit can be used ---
> the compiler can partition the call graph into multiple locality regions,
> and set an icache limit for each region. The inlining order can be done on a
> region by region basis. For each region, the region limit is applied and the
> priority queue must be used.
>
> Yes, I understand that. But why is a global limit useful? What problem
> does it solve? If it is to cap "runaway inlining" there are better ways to
> do it. You agree that it isn't for icache optimization, so what value does
> it serve?

I am not trying to defend 'global growth limit' here :) -- I am arguing the
usefulness/flexibility of using a priority based inline ordering. Do we
agree here?

David

> -Chris

On Nov 30, 2010, at 2:36 PM, Xinliang David Li wrote:

>> Yes, global growth limit may be good for size control, but is a hack for
>> controlling icache footprint. However, as I mentioned, the bottom up inline
>> scheme makes it impossible to use any heuristics involving 'global limit'
>> which can be more complicated and fancier than the simple growth limit. For
>> instance, there is no restriction that only one global limit can be used ---
>> the compiler can partition the call graph into multiple locality regions,
>> and set an icache limit for each region. The inlining order can be done on a
>> region by region basis. For each region, the region limit is applied and the
>> priority queue must be used.
>
> Yes, I understand that. But why is a global limit useful? What problem
> does it solve? If it is to cap "runaway inlining" there are better ways to
> do it. You agree that it isn't for icache optimization, so what value does
> it serve?
>
> I am not trying to defend 'global growth limit' here :) -- I am arguing the
> usefulness/flexibility of using a priority based inline ordering. Do we
> agree here?

I think I'm missing something important. :)

Can you explain the difference between priority based inlining and a global
threshold? If you end up inlining everything in your priority queue, then
order doesn't matter. If you only inline the high priority stuff, then you
can handle this with a threshold. Can you give a scenario that can't be
handled with bottom-up inlining and a sufficiently smart inlining cost
evaluation?

-Chris

On Tue, Nov 30, 2010 at 2:44 PM, Chris Lattner <clattner at apple.com> wrote:

> On Nov 30, 2010, at 2:36 PM, Xinliang David Li wrote:
>
>> Yes, global growth limit may be good for size control, but is a hack for
>> controlling icache footprint. However, as I mentioned, the bottom up inline
>> scheme makes it impossible to use any heuristics involving 'global limit'
>> which can be more complicated and fancier than the simple growth limit. For
>> instance, there is no restriction that only one global limit can be used ---
>> the compiler can partition the call graph into multiple locality regions,
>> and set an icache limit for each region. The inlining order can be done on a
>> region by region basis. For each region, the region limit is applied and the
>> priority queue must be used.
>>
>> Yes, I understand that. But why is a global limit useful? What problem
>> does it solve? If it is to cap "runaway inlining" there are better ways to
>> do it. You agree that it isn't for icache optimization, so what value does
>> it serve?
>
> I am not trying to defend 'global growth limit' here :) -- I am arguing the
> usefulness/flexibility of using a priority based inline ordering. Do we
> agree here?
>
> I think I'm missing something important. :)
>
> Can you explain the difference between priority based inlining and a global
> threshold? If you end up inlining everything in your priority queue,

For large applications, this will never happen.

> then order doesn't matter. If you only inline the high priority stuff,
> then you can handle this with a threshold. Can you give a scenario that
> can't be handled with bottom-up inlining and a sufficiently smart inlining
> cost evaluation?

It is doable in bottom up, but can be complicated. I guess you will also
need a priority queue as a side data structure, and you probably also need
multiple passes to exhaust the limit.

Another LLVM inliner limitation I want to point out is that since inline
analysis and transformation are done together, inline decisions cannot be
undone.

Thanks,

David

> -Chris
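
The two approaches debated in this thread can be contrasted with a small sketch. It is not LLVM code and not either author's implementation: `CallSite`, `benefit`, `growth`, `inline_by_threshold` and `inline_by_priority` are hypothetical names, the numbers are made up, and the model deliberately ignores that real inlining changes caller sizes and exposes new call sites. The first function applies a fixed per-call-site threshold in visiting order; the second pops call sites from a priority queue (benefit per unit of growth) until a growth budget is exhausted.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class CallSite:
    caller: str
    callee: str
    benefit: float  # hypothetical estimated gain, e.g. call frequency
    growth: int     # hypothetical estimated code growth if inlined (> 0)


def inline_by_threshold(sites, max_growth):
    """Visit call sites in the given order (e.g. bottom-up) and accept
    every site whose estimated growth is within a fixed per-site threshold."""
    return [s for s in sites if s.growth <= max_growth]


def inline_by_priority(sites, budget):
    """Greedily accept call sites in order of benefit per unit of growth
    until the growth budget (global, or per region) cannot fit any more."""
    # heapq is a min-heap, so negate the priority; the index breaks ties
    # so CallSite objects themselves are never compared.
    heap = [(-s.benefit / s.growth, i, s) for i, s in enumerate(sites)]
    heapq.heapify(heap)
    chosen, used = [], 0
    while heap:
        _, _, site = heapq.heappop(heap)
        if used + site.growth <= budget:
            chosen.append(site)
            used += site.growth
    return chosen, used


if __name__ == "__main__":
    sites = [
        CallSite("f", "leaf", benefit=10, growth=20),
        CallSite("g", "leaf", benefit=1000, growth=20),
        CallSite("main", "f", benefit=5, growth=40),
        CallSite("main", "g", benefit=800, growth=40),
    ]
    print(inline_by_threshold(sites, max_growth=30))
    print(inline_by_priority(sites, budget=60))
```

With these made-up numbers, the threshold version accepts both calls to `leaf` (including the cold one from `f`) and rejects the hot `main -> g` site, while the priority version spends the same kind of budget on `g -> leaf` and `main -> g`. A per-region limit, as suggested above, would amount to calling `inline_by_priority` once per region with that region's budget.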