Mehdi AMINI via llvm-dev
2017-Aug-01 05:38 UTC
[llvm-dev] [RFC] Add IR level interprocedural outliner for code size.
2017-07-28 21:58 GMT-07:00 Chris Bieneman via llvm-dev <llvm-dev at lists.llvm.org>:

> Apologies for delayed joining of this discussion, but I had a few notes
> from this thread that I really wanted to chime in about.
>
> River,
>
> I don't mean to put you on the spot, but I do want to start on a semantic
> issue. In several places in the thread you used the words "we" and "our" to
> imply that you're not alone in writing this (which is totally fine), but
> your initial thread presented this as entirely your own work. So, when you
> said things like "we feel there's an advantage to being at the IR level",
> can you please clarify who is "we"?
>
> Given that there are a number of disagreements and opinions floating
> around I think it benefits us all to speak clearly about who is taking what
> stances.
>
> One particular disagreement that I think very much needs to be revisited
> in this thread was Jessica's proposal of a pipeline of:
>
>    1. IR outline
>    2. Inline
>    3. MIR outline
>
> In your response to that proposal you dismissed it out of hand with
> "feelings" but not data. Given that the proposal came from Jessica (a
> community member with significant relevant experience in outlining), and it
> was also recognized as interesting by Eric Christopher (a long-time member
> of the community with wide reaching expertise), I think dismissing it may
> have been a little premature.

It isn't clear to me how relevant the *exact* pipeline and ordering of
passes is when considering whether "having an outliner at the IR level" is
a good idea.

> I also want to visit a few procedural notes.
>
> Mehdi commented on the thread that it wouldn't be fair to ask for a
> comparative study because the MIR outliner didn't have one. While I don't
> think anyone is asking for a comparative study, I want to point out that I
> think it is completely fair.
>
> If a new contributor approached the community with a new SROA pass and
> wanted to land it in-tree it would be appropriate to ask for a comparative
> analysis against the existing pass. How is this different?

It seems quite different to me because there is no outliner at the IR
level and so they don't provide the same functionality. The "Why at the IR
level" section of the original email combined with the performance numbers
seems largely enough to me to explain why it isn't redundant with the
Machine-level outliner.
I'd consider this work for inclusion upstream purely on its technical
merit at this point.
Discussing inclusion as part of any of the default pipelines is a
different story.

Similarly last year, the IR-level PGO was also implemented even though we
already had a PGO implementation, because 1) it provided a generic
solution for other frontends (just like here it could be said that it
provides a generic solution for targets) and 2) it supported cases that
FE-PGO didn't (especially around better counter-context using pre-inlining
and such).

> Adding a new IR outliner is a different situation from when the MIR one
> was added. When the MIR outliner was introduced there was no in-tree
> analog.

We still usually discuss design extensively. Skipping the IR-level option
didn't seem obvious to me, to say the least. And it wasn't really
discussed or considered extensively upstream.
If the idea is that implementing a concept at the machine level may
preclude a future implementation at the IR level, it means we should be *a
lot* more picky before accepting such a contribution.
In this case, if I had anticipated any push-back on an IR-level
implementation based only on the fact that we now have a Machine-level one,
I'd likely have pushed back on the machine-level one.

> When someone comes to the community with something that has no existing
> in-tree analog it isn't fair to necessarily ask them to implement it
> multiple different ways to prove their solution is the best.

It may or may not be fair, but there is a tradeoff in how much effort we
would require from them to convince the community that this is *the* right
way to go, depending on what it implies for future approaches.

--
Mehdi

> However, as a community, we do still exercise the right to reject
> contributions we disagree with, and we frequently request changes to the
> implementation (as is shown every time someone tries to add SPIR-V support).
>
> In the LLVM community we have a long history of approaching large
> contributions (especially ones from new contributors) with scrutiny and
> discussion. It would be a disservice to the project to forget that.
>
> River, as a last note. I see that you've started uploading patches to
> Phabricator, and I know you're relatively new to the community. When
> uploading patches it helps to include appropriate reviewers so that the
> right people see the patches as they come in. To that end can you please
> include Jessica as a reviewer? Given her relevant domain experience I think
> her feedback on the patches will be very valuable.
>
> Thank you,
> -Chris
>
> On Jul 26, 2017, at 1:52 PM, River Riddle via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>
> Hey Sanjoy,
>
> On Wed, Jul 26, 2017 at 1:41 PM, Sanjoy Das via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>
>> Hi,
>>
>> On Wed, Jul 26, 2017 at 12:54 PM, Sean Silva <chisophugis at gmail.com>
>> wrote:
>> > The way I interpret Quentin's statement is something like:
>> >
>> > - Inlining turns an interprocedural problem into an intraprocedural
>> >   problem
>> > - Outlining turns an intraprocedural problem into an interprocedural
>> >   problem
>> >
>> > Insofar as our intraprocedural analyses and transformations are strictly
>> > more powerful than interprocedural, then there is a precise sense in
>> > which inlining exposes optimization opportunities while outlining does
>> > not.
>>
>> While I think our intra-proc optimizations are *generally* more
>> powerful, I don't think they are *always* more powerful. For
>> instance, LICM (today) won't hoist full regions but it can hoist
>> single function calls. If we can extract out a region into a
>> readnone+nounwind function call then LICM will hoist it to the
>> preheader if the safety checks pass.
>>
>> > Actually, for his internship last summer River wrote a profile-guided
>> > outliner / partial inliner (it didn't try to do deduplication; so it was
>> > more like PartialInliner.cpp). IIRC he found that LLVM's interprocedural
>> > analyses were so bad that there were pretty adverse effects from many of
>> > the outlining decisions. E.g. if you outline from the left side of a
>> > diamond, that side basically becomes a black box to most LLVM analyses
>> > and forces downstream dataflow meet points to give an overly
>> > conservative result, even though our standard intraprocedural analyses
>> > would have happily dug through the left side of the diamond if the code
>> > had not been outlined.
>> >
>> > Also, River's patch (the one in this thread) does parameterized
>> > outlining. For example, two sequences containing stores can be outlined
>> > even if the corresponding stores have different pointers. The pointer to
>> > be stored to is passed as a parameter to the outlined function. In that
>> > sense, the outlined function's behavior becomes a conservative
>> > approximation of both which in principle loses precision.
>>
>> Can we outline only once we've already done all of these optimizations
>> that outlining would block?
>
> The outliner is able to run at any point in the interprocedural
> pipeline. There are currently two locations: early outlining (pre-inliner)
> and late outlining (practically the last pass to run). It is configured to
> run either Early+Late, or just Late.
>
>> > I like your EarlyCSE example and it is interesting that combined with
>> > functionattrs it can make a "cheap" pass get a transformation for which
>> > an "expensive" pass would otherwise be needed. Are there any cases where
>> > we only have the "cheap" pass and thus the outlining would be essential
>> > for our optimization pipeline to get the optimization right?
>> >
>> > The case that comes to mind for me is cases where we have some cutoff of
>> > search depth. Reducing a sequence to a single call (+ functionattr
>> > inference) can essentially summarize the sequence and effectively
>> > increase search depth, which might give more results. That seems like a
>> > bit of a weak example though.
>>
>> I don't know if River's patch outlines entire control flow regions at
>> a time, but if it does then we could use cheap basic block scanning
>> analyses for things that would normally require CFG-level analysis.
>
> The current patch just supports outlining from within a single block.
> Although I had a working prototype for Region-based outlining, I kept it
> out of this patch for simplicity. So it's entirely possible to add that
> kind of functionality, because I've already tried.
>
> Thanks,
> River Riddle
>
>> -- Sanjoy
>>
>> > -- Sean Silva
>> >
>> > On Wed, Jul 26, 2017 at 12:07 PM, Sanjoy Das via llvm-dev
>> > <llvm-dev at lists.llvm.org> wrote:
>> >>
>> >> Hi,
>> >>
>> >> On Wed, Jul 26, 2017 at 10:10 AM, Quentin Colombet via llvm-dev
>> >> <llvm-dev at lists.llvm.org> wrote:
>> >> > No, I mean in terms of enabling other optimizations in the pipeline
>> >> > like vectorizer. Outliner does not expose any of that.
>> >>
>> >> I have not made a lot of effort to understand the full discussion here
>> >> (so what I say below may be off-base), but I think there are some cases
>> >> where outlining (especially working with function-attrs) can make
>> >> optimization easier.
>> >>
>> >> It can help transforms that duplicate code (like loop unrolling and
>> >> inlining) be more profitable -- I'm thinking of cases where
>> >> unrolling/inlining would have to duplicate a lot of code, but after
>> >> outlining would require duplicating only a few call instructions.
>> >>
>> >> It can help EarlyCSE do things that require GVN today:
>> >>
>> >>   void foo() {
>> >>     ... complex computation that computes func()
>> >>     ... complex computation that computes func()
>> >>   }
>> >>
>> >> outlining=>
>> >>
>> >>   int func() { ... }
>> >>
>> >>   void foo() {
>> >>     int x = func();
>> >>     int y = func();
>> >>   }
>> >>
>> >> functionattrs=>
>> >>
>> >>   int func() readonly { ... }
>> >>
>> >>   void foo() {
>> >>     int x = func();
>> >>     int y = func();
>> >>   }
>> >>
>> >> earlycse=>
>> >>
>> >>   int func() readonly { ... }
>> >>
>> >>   void foo() {
>> >>     int x = func();
>> >>     int y = x;
>> >>   }
>> >>
>> >> GVN will catch this, but EarlyCSE is (at least supposed to be!)
>> >> cheaper.
>> >>
>> >> Once we have an analysis that can prove that certain functions can't
>> >> trap, outlining can allow LICM etc. to speculate entire outlined
>> >> regions out of loops.
>> >>
>> >> Generally, I think outlining exposes information that certain regions
>> >> of the program are doing identical things. We should expect to get some
>> >> mileage out of this information.
>> >>
>> >> -- Sanjoy
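A small illustration of Sanjoy's LICM point quoted above, written by hand in C++ rather than produced by any LLVM pass. The names `outlined_region`, `sum_before`, and `sum_after` are hypothetical and do not come from River's patch. The sketch assumes the outlined region reads no memory and cannot throw, which is roughly what `readnone` + `nounwind` would state at the IR level; under that assumption the call is loop-invariant and can sit in the preheader.

```cpp
#include <cassert>
#include <vector>

// Hypothetical outlined region. It depends only on its arguments and cannot
// throw, which is roughly what readnone + nounwind would express in LLVM IR.
static int outlined_region(int a, int b) noexcept {
    int t = a * a + 3 * b;
    return t ^ (t >> 2);
}

// Before hoisting: the loop-invariant call is re-evaluated every iteration.
int sum_before(const std::vector<int> &xs, int a, int b) {
    int sum = 0;
    for (int x : xs)
        sum += x + outlined_region(a, b);
    return sum;
}

// After hoisting (done by hand here): the call sits in the loop preheader,
// so it runs once even if the loop body runs many times or not at all.
int sum_after(const std::vector<int> &xs, int a, int b) {
    const int invariant = outlined_region(a, b);
    int sum = 0;
    for (int x : xs)
        sum += x + invariant;
    return sum;
}
```

Both functions return the same value for any input. Running the call when the loop executes zero times is only safe because the region cannot trap or have side effects, which is the analysis Sanjoy says is still needed for whole regions.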
Andrey Bokhanko via llvm-dev
2017-Aug-01 08:17 UTC
[llvm-dev] [RFC] Add IR level interprocedural outliner for code size.
...and adding $0.02 to the "IR outline + inline + MIR outline" idea, my gut
feeling (yes, only a "feeling" -- and one coming from my gut, not head!) is
that inlining correcting wrong IR outlining decisions, with MIR outlining
correcting wrong inlining decisions, is absolutely unrealistic and a
heuristics nightmare at best.

Inliner's heuristics are already complex enough and not 100% bulletproof; if
we add IR outliner heuristics to the mix -- and then just a little bit of MIR
outliner heuristics (which are more precise but, as demonstrated above, not
100% precise as well) on top... you can imagine.

Yours,
Andrey

On Tue, Aug 1, 2017 at 10:07 AM, Andrey Bokhanko <andreybokhanko at gmail.com>
wrote:

> All,
>
> +1 to what Mehdi said.
>
> It's a fair concern to question whether we need yet another Outlining
> pass. I believe this concern has been cleared by River -- both with
> theoretical arguments and practical data (benchmark numbers).
>
> Jessica's pipeline proposal is completely orthogonal. It's not fair to
> request River to implement / fit into what she suggested. Sure, it's a
> valid topic to discuss -- but yet completely orthogonal one. If anything,
> accepting River's implementation would enable us to do experiments /
> developments like pipeline changes of this ilk!
>
> Yours,
> Andrey
> ==
> Compiler Architect
> NXP
Daniel Berlin via llvm-dev
2017-Aug-01 17:28 UTC
[llvm-dev] [RFC] Add IR level interprocedural outliner for code size.
> Also as a side note, I think in the original MachineOutliner RFC thread
> there was some confusion as to whether it was possible to solve the code
> folding outlining problem exactly as a graph problem on SSA using standard
> value numbering algorithms in polynomial time.
>
> I can elaborate further, but
> 1. it is easy to see that you can map an arbitrary dag to an isomorphic
> data flow graph in an SSA IR e.g. in LLVM IR or pre-RA MIR
> 2. Given two dags, you can create their respective isomorphic data flow
> graphs (say, put them into two separate functions)
> 3. An exact graph based code folding outliner would be able to discover if
> the two dataflow graphs are isomorphic (that is basically what I mean by
> exact) and outline them.
> 4. Thus, graph isomorphism on dags can be solved with such an algorithm
> and thus the outlining problem is GI-hard and a polynomial time solution
> would be a big breakthrough in CS.

First, you'd have to reduce them in both directions to prove that ;)

All that's been shown is that you can reduce it to a hard problem.
You can also reduce it to 3-SAT, but that doesn't mean anything unless you
can reduce 3-SAT to it.

> 5. The actual problem the outliner is trying to solve is actually more like
> finding subgraphs that are isomorphic, making the problem even harder
> (something like "given dags A and B does there exist a subgraph of A that
> is isomorphic to a subgraph of B")

This assumes, strongly, that this reduction is the way to do it, and also
that SSA/etc. imparts no structure in the reduction that enables you to
solve it faster (i.e. restricts the types of graphs, etc.).

FWIW, the exact argument above holds for value numbering, and since the days
of Kildall, it was believed not solvable in a complete fashion in less than
exponential time due to the need to do graph isomorphism on large value
graphs at join points.

Except, much like RA and other things, it turns out this is not right for
SSA, and in the past few years, it was proved you could do complete value
numbering in polynomial time on SSA.

So with all due respect to Quentin, I don't buy it yet.

Without more, I'd bet that techniques similar to those used to solve value
numbering in polynomial time on SSA could be applied here.

This is because the complete polynomial value numbering techniques are, in
fact, value graph isomorphism ....

So I'd assume the opposite (it can be done in polynomial time) without more.
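To make the "structure helps" intuition concrete, here is a sketch of a much simpler relative of the techniques Daniel mentions: hash-consing value numbering over straight-line, acyclic code. It is not the complete polynomial-time algorithm he refers to, and it does not handle join points, commutativity, or subgraph matching. The types `Inst` and `Table` and the function `numberValues` are hypothetical names for this example only. Because every operand position is ordered and every node carries an opcode, two expression DAGs can be compared with one table lookup per instruction instead of a general isomorphism search.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// One SSA-like instruction: an opcode plus operand indices into the same
// list. Leaves (arguments, constants) have no operands and are named by `op`.
struct Inst {
    std::string op;
    std::vector<int> operands;
};

// Maps (opcode, operand value numbers) to a value number. Sharing one table
// across several instruction lists lets their values be compared directly.
using Key = std::pair<std::string, std::vector<int>>;
using Table = std::map<Key, int>;

// Assigns a value number to every instruction in `body`, which must be listed
// in def-before-use order. Structurally identical expressions get the same
// number, independent of where they appear in the list.
std::vector<int> numberValues(const std::vector<Inst> &body, Table &table) {
    std::vector<int> vn(body.size());
    for (std::size_t i = 0; i < body.size(); ++i) {
        Key key(body[i].op, std::vector<int>());
        for (int operand : body[i].operands)
            key.second.push_back(vn[operand]);
        auto it = table.find(key);
        if (it == table.end())
            it = table.emplace(key, static_cast<int>(table.size())).first;
        vn[i] = it->second;
    }
    return vn;
}
```

Numbering two candidate sequences against the same table and comparing the numbers of their final values tells whether they compute the same expression. Each instruction costs one ordered-map lookup, so the whole pass stays polynomial; the hard cases the thread argues about are the ones this sketch leaves out.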
Chris Bieneman via llvm-dev
2017-Aug-01 18:20 UTC
[llvm-dev] [RFC] Add IR level interprocedural outliner for code size.
> On Aug 1, 2017, at 1:07 AM, Andrey Bokhanko via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > All, > > +1 to what Mehdi said. > > It's a fair concern to question whatever we need yet another Outlining pass. I believe this concern has been cleared by River -- both with theoretical arguments and practical data (benchmark numbers).I want to point out that River has not provided full raw benchmark data, he has provided summarized data. Also, as a community of engineers I think theoretical arguments are dangerous to accept. The community has also not been presented with sufficient information to understand some of the differences between the numbers for the IR outliner and MIR outliner. Since the two are using very similar algorithms for detecting outlining candidates I would expect the numbers to be very similar, however in many cases they are not. I think understanding those differences and comparing them will help inform whether or not the IR outliner needs changes before being integrated to LLVM. I'm not advocating that we shouldn't take the IR outliner (in fact I think IR outlining is an interesting optimization). I'm advocating that we should understand and explore how this pass will fit into LLVM today and in the future.> > Jessica's pipeline proposal is completely orthogonal. It's not fair to request River to implement / fit into what she suggested. Sure, it's a valid topic to discuss -- but yet completely orthogonal one. If anything, accepting River's implementation would enable us to do experiments / developments like pipeline changes of this ilk!I completely disagree. When considering accepting a new pass to LLVM it is completely reasonable, appropriate, and not orthogonal to consider how that pass would interact with other existing passes. 
Also, given that in the past we've requested much more of contributors trying to make large contributions (see the many SPIR-V discussions) I think it is totally fair to request River consider Jessica's suggestion, and if the community felt that was the right approach it would be fair to request him to implement it. Please keep in mind, every new piece of functionality added to LLVM adds a maintenance burden on the community. If we are going to accept the burden of maintaining a new pass, it is reasonable for us, as a community, to request that the pass be engineered in a way that will make it worth that maintenance. Further, I am incredibly frustrated by the fact that people in this thread seem intent on squashing relevant technical discussions or turning them into combative arguments. There is nothing wrong with asking technical questions on an RFC, and we should be encouraging cooperative discussions of present and future implications of all proposals. -Chris> > Yours, > Andrey > ==> Compiler Architect > NXP > > > On Tue, Aug 1, 2017 at 7:38 AM, Mehdi AMINI via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: > > > 2017-07-28 21:58 GMT-07:00 Chris Bieneman via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>>: > Apologies for delayed joining of this discussion, but I had a few notes from this thread that I really wanted to chime in about. > > River, > > I don't mean to put you on the spot, but I do want to start on a semantic issue. In several places in the thread you used the words "we" and "our" to imply that you're not alone in writing this (which is totally fine), but your initial thread presented this as entirely your own work. So, when you said things like "we feel there's an advantage to being at the IR level", can you please clarify who is "we"? 
>
> Given that there are a number of disagreements and opinions floating around I think it benefits us all to speak clearly about who is taking what stances.
>
> One particular disagreement that I think very much needs to be revisited in this thread was Jessica's proposal of a pipeline of:
> 1. IR outline
> 2. Inline
> 3. MIR outline
> In your response to that proposal you dismissed it out of hand with "feelings" but not data. Given that the proposal came from Jessica (a community member with significant relevant experience in outlining), and it was also recognized as interesting by Eric Christopher (a long-time member of the community with wide reaching expertise), I think dismissing it may have been a little premature.
>
> It isn't clear to me how much the *exact* pipeline and ordering of passes is relevant to consider if "having an outliner at the IR level" is a good idea.
>
>
> I also want to visit a few procedural notes.
>
> Mehdi commented on the thread that it wouldn't be fair to ask for a comparative study because the MIR outliner didn't have one. While I don't think anyone is asking for a comparative study, I want to point out that I think it is completely fair.
> If a new contributor approached the community with a new SROA pass and wanted to land it in-tree it would be appropriate to ask for a comparative analysis against the existing pass. How is this different?
>
> It seems quite different to me because there is no outliner at the IR level and so they don't provide the same functionality. The "Why at the IR level" section of the original email combined with the performance numbers seems largely enough to me to explain why it isn't redundant to the Machine-level outliner.
> I'd consider this work for inclusion upstream purely on its technical merit at this point.
> Discussing inclusion as part of any of the default pipeline is a different story.
>
> Similarly last year, the IR-level PGO was also implemented even though we already had a PGO implementation, because 1) it provided a generic solution for other frontends (just like here it could be said that it provides a generic solution for targets) and 2) it supported cases that FE-PGO didn't (especially around better counter-context using pre-inlining and such).
>
>
>
> Adding a new IR outliner is a different situation from when the MIR one was added. When the MIR outliner was introduced there was no in-tree analog.
>
> We still usually discuss design extensively. Skipping the IR-level option didn't seem obvious to me, to say the least. And it wasn't really much discussed/considered extensively upstream.
> If the idea is that implementing a concept at the machine level may preclude a future implementation at the IR level, it means we should be *a lot* more picky before accepting such contribution.
> In this case, if I had anticipated any push-back on an IR-level implementation only based on the fact that we now have a Machine-level one, I'd likely have pushed back on the machine-level one.
>
>
> When someone comes to the community with something that has no existing in-tree analog it isn't fair to necessarily ask them to implement it multiple different ways to prove their solution is the best.
>
> It may or may not be fair, but there is a tradeoff in how much effort we would require them to convince the community that this is *the* right way to go, depending on what it implies for future approaches.
>
> --
> Mehdi
>
> However, as a community, we do still exercise the right to reject contributions we disagree with, and we frequently request changes to the implementation (as is shown every time someone tries to add SPIR-V support).
>
> In the LLVM community we have a long history of approaching large contributions (especially ones from new contributors) with scrutiny and discussion. It would be a disservice to the project to forget that.
>
> River, as a last note. I see that you've started uploading patches to Phabricator, and I know you're relatively new to the community. When uploading patches it helps to include appropriate reviewers so that the right people see the patches as they come in. To that end can you please include Jessica as a reviewer? Given her relevant domain experience I think her feedback on the patches will be very valuable.
>
> Thank you,
> -Chris
>
>> On Jul 26, 2017, at 1:52 PM, River Riddle via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>> Hey Sanjoy,
>>
>> On Wed, Jul 26, 2017 at 1:41 PM, Sanjoy Das via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> Hi,
>>
>> On Wed, Jul 26, 2017 at 12:54 PM, Sean Silva <chisophugis at gmail.com> wrote:
>> > The way I interpret Quentin's statement is something like:
>> >
>> > - Inlining turns an interprocedural problem into an intraprocedural problem
>> > - Outlining turns an intraprocedural problem into an interprocedural problem
>> >
>> > Insofar as our intraprocedural analyses and transformations are strictly
>> > more powerful than interprocedural, then there is a precise sense in which
>> > inlining exposes optimization opportunities while outlining does not.
>>
>> While I think our intra-proc optimizations are *generally* more
>> powerful, I don't think they are *always* more powerful. For
>> instance, LICM (today) won't hoist full regions but it can hoist
>> single function calls. If we can extract out a region into a
>> readnone+nounwind function call then LICM will hoist it to the
>> preheader if the safety checks pass.
>>
>> > Actually, for his internship last summer River wrote a profile-guided
>> > outliner / partial inliner (it didn't try to do deduplication; so it was
>> > more like PartialInliner.cpp). IIRC he found that LLVM's interprocedural
>> > analyses were so bad that there were pretty adverse effects from many of the
>> > outlining decisions. E.g. if you outline from the left side of a diamond,
>> > that side basically becomes a black box to most LLVM analyses and forces
>> > downstream dataflow meet points to give an overly conservative result, even
>> > though our standard intraprocedural analyses would have happily dug through
>> > the left side of the diamond if the code had not been outlined.
>> >
>> > Also, River's patch (the one in this thread) does parameterized outlining.
>> > For example, two sequences containing stores can be outlined even if the
>> > corresponding stores have different pointers. The pointer to be stored to
>> > is passed as a parameter to the outlined function. In that sense, the
>> > outlined function's behavior becomes a conservative approximation of both
>> > which in principle loses precision.
>>
>> Can we outline only once we've already done all of these optimizations
>> that outlining would block?
>>
>> The outliner is able to run at any point in the interprocedural pipeline. There are currently two locations: early outlining (pre-inliner) and late outlining (practically the last pass to run). It is configured to run either Early+Late, or just Late.
>>
>>
>> > I like your EarlyCSE example and it is interesting that combined with
>> > functionattrs it can let a "cheap" pass perform a transformation that
>> > would otherwise need an "expensive" pass. Are there any cases where we
>> > only have the "cheap" pass and thus the outlining would be essential for our
>> > optimization pipeline to get the optimization right?
>> >
>> > The case that comes to mind for me is cases where we have some cutoff of
>> > search depth. Reducing a sequence to a single call (+ functionattr
>> > inference) can essentially summarize the sequence and effectively increase
>> > search depth, which might give more results. That seems like a bit of a weak
>> > example though.
>>
>> I don't know if River's patch outlines entire control flow regions at
>> a time, but if it does then we could use cheap basic block scanning
>> analyses for things that would normally require CFG-level analysis.
>>
>> The current patch just supports outlining from within a single block. Although I had a working prototype for region-based outlining, I kept it out of this patch for simplicity. So it's entirely possible to add that kind of functionality, because I've already tried.
>> Thanks,
>> River Riddle
>>
>>
>> -- Sanjoy
>>
>> >
>> > -- Sean Silva
>> >
>> > On Wed, Jul 26, 2017 at 12:07 PM, Sanjoy Das via llvm-dev
>> > <llvm-dev at lists.llvm.org> wrote:
>> >>
>> >> Hi,
>> >>
>> >> On Wed, Jul 26, 2017 at 10:10 AM, Quentin Colombet via llvm-dev
>> >> <llvm-dev at lists.llvm.org> wrote:
>> >> > No, I mean in terms of enabling other optimizations in the pipeline like
>> >> > vectorizer. Outliner does not expose any of that.
>> >>
>> >> I have not made a lot of effort to understand the full discussion here (so
>> >> what I say below may be off-base), but I think there are some cases where
>> >> outlining (especially working with function-attrs) can make optimization
>> >> easier.
>> >>
>> >> It can help transforms that duplicate code (like loop unrolling and
>> >> inlining) be more profitable -- I'm thinking of cases where
>> >> unrolling/inlining would have to duplicate a lot of code, but after
>> >> outlining would require duplicating only a few call instructions.
>> >>
>> >>
>> >> It can help EarlyCSE do things that require GVN today:
>> >>
>> >> void foo() {
>> >>   ... complex computation that computes func()
>> >>   ... complex computation that computes func()
>> >> }
>> >>
>> >> outlining=>
>> >>
>> >> int func() { ... }
>> >>
>> >> void foo() {
>> >>   int x = func();
>> >>   int y = func();
>> >> }
>> >>
>> >> functionattrs=>
>> >>
>> >> int func() readnone { ... }
>> >>
>> >> void foo() {
>> >>   int x = func();
>> >>   int y = func();
>> >> }
>> >>
>> >> earlycse=>
>> >>
>> >> int func() readnone { ... }
>> >>
>> >> void foo() {
>> >>   int x = func();
>> >>   int y = x;
>> >> }
>> >>
>> >> GVN will catch this, but EarlyCSE is (at least supposed to be!) cheaper.
>> >>
>> >>
>> >> Once we have an analysis that can prove that certain functions can't trap,
>> >> outlining can allow LICM etc. to speculate entire outlined regions out of
>> >> loops.
>> >>
>> >>
>> >> Generally, I think outlining exposes information that certain regions of
>> >> the program are doing identical things. We should expect to get some
>> >> mileage out of this information.
>> >>
>> >> -- Sanjoy
>> >> _______________________________________________
>> >> LLVM Developers mailing list
>> >> llvm-dev at lists.llvm.org
>> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
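To make the parameterized outlining that Sean Silva describes in the quoted thread more concrete, here is a minimal C sketch. It is only an illustration under stated assumptions: the names `counter_a`, `counter_b`, `before_*`, `after_*` and `outlined_seq` are hypothetical and do not come from River's patch, and the real pass works on LLVM IR rather than on C source.

```c
/* Two distinct store targets; in the original code each sequence
   stores to a different one. (Hypothetical names for illustration.) */
int counter_a;
int counter_b;

/* Before outlining: the same computation appears twice, differing
   only in the pointer being stored to. */
void before_a(int v) { counter_a = v * 2 + 1; }
void before_b(int v) { counter_b = v * 2 + 1; }

/* After outlining: one shared body; the differing store target is
   passed in as a parameter. Inside this function the compiler no
   longer knows statically which global dst refers to. */
void outlined_seq(int *dst, int v) { *dst = v * 2 + 1; }

/* Each original site shrinks to a single call. */
void after_a(int v) { outlined_seq(&counter_a, v); }
void after_b(int v) { outlined_seq(&counter_b, v); }
```

The observable behavior is unchanged, but the shared body has to be valid for every pointer it may receive, which is the loss of precision Sean mentions.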
Mehdi AMINI via llvm-dev
2017-Aug-01 19:05 UTC
[llvm-dev] [RFC] Add IR level interprocedural outliner for code size.
2017-08-01 11:03 GMT-07:00 Chris Bieneman <beanz at apple.com>:

>
> On Jul 31, 2017, at 10:38 PM, Mehdi AMINI <joker.eph at gmail.com> wrote:
>
>
>
> 2017-07-28 21:58 GMT-07:00 Chris Bieneman via llvm-dev <
> llvm-dev at lists.llvm.org>:
>
>> Apologies for delayed joining of this discussion, but I had a few notes
>> from this thread that I really wanted to chime in about.
>>
>> River,
>>
>> I don't mean to put you on the spot, but I do want to start on a semantic
>> issue. In several places in the thread you used the words "we" and "our" to
>> imply that you're not alone in writing this (which is totally fine), but
>> your initial thread presented this as entirely your own work. So, when you
>> said things like "we feel there's an advantage to being at the IR level",
>> can you please clarify who is "we"?
>>
>> Given that there are a number of disagreements and opinions floating
>> around I think it benefits us all to speak clearly about who is taking what
>> stances.
>>
>> One particular disagreement that I think very much needs to be revisited
>> in this thread was Jessica's proposal of a pipeline of:
>>
>> 1. IR outline
>> 2. Inline
>> 3. MIR outline
>>
>> In your response to that proposal you dismissed it out of hand with
>> "feelings" but not data. Given that the proposal came from Jessica (a
>> community member with significant relevant experience in outlining), and it
>> was also recognized as interesting by Eric Christopher (a long-time member
>> of the community with wide reaching expertise), I think dismissing it may
>> have been a little premature.
>>
>
> It isn't clear to me how much the *exact* pipeline and ordering of passes
> is relevant to consider if "having an outliner at the IR level" is a good
> idea.
>
>
> I think it is particularly relevant because based on the limited
> performance numbers we've seen it looks like the MIR and IR outliners have
> different benefits. Figuring out a pipeline where one doesn't prevent the
> other from performing good optimizations seems like a reasonable
> precondition to accepting these patches.
>
>
>
>
>> I also want to visit a few procedural notes.
>>
>> Mehdi commented on the thread that it wouldn't be fair to ask for a
>> comparative study because the MIR outliner didn't have one. While I don't
>> think anyone is asking for a comparative study, I want to point out that I
>> think it is completely fair.
>>
>> If a new contributor approached the community with a new SROA pass and
>> wanted to land it in-tree it would be appropriate to ask for a comparative
>> analysis against the existing pass. How is this different?
>>
>
> It seems quite different to me because there is no outliner at the IR
> level and so they don't provide the same functionality. The "Why at the IR
> level" section of the original email combined with the performance numbers
> seems largely enough to me to explain why it isn't redundant to the
> Machine-level outliner.
> I'd consider this work for inclusion upstream purely on its technical
> merit at this point.
>
>
> I believe the technical merit has not been shown clearly enough. The only
> data we've seen has been cherry-picked and there are outstanding technical
> questions about the approach.
>
> Discussing inclusion as part of any of the default pipeline is a different
> story.
>
>
> The patches that were sent out *do* include it in default pass pipelines.
>

Saying we shouldn't include this in any default pipeline for now seems totally reasonable to me. My understanding was that the patches were sent so that folks can reproduce the results; I think these were "WIP", not patches ready to commit.
But again, discussing the implementation and other details is perfectly appropriate. That isn't what I was trying to point at.

>
>
> Similarly last year, the IR-level PGO was also implemented even though we
> already had a PGO implementation, because 1) it provided a generic
> solution for other frontends (just like here it could be said that it
> provides a generic solution for targets) and 2) it supported cases that
> FE-PGO didn't (especially around better counter-context using pre-inlining
> and such).
>
>
>
>>
>> Adding a new IR outliner is a different situation from when the MIR one
>> was added. When the MIR outliner was introduced there was no in-tree
>> analog.
>>
>
> We still usually discuss design extensively. Skipping the IR-level option
> didn't seem obvious to me, to say the least. And it wasn't really much
> discussed/considered extensively upstream.
>
>
> The reasoning for this was covered in the discussions and in Jessica's
> LLVM dev meeting talk. It may not have been widely discussed because it was
> widely agreed on.
>

I disagree with your assessment, and I'm puzzled that you can claim that while your previous email was along the lines of "the community should discuss and decide on a new proposal".
It was widely agreed on by the people who started this project; that's far from a reason not to discuss the pros/cons upstream.
Again, *shrug* on my side as long as it does not preclude other approaches at the IR level.

>
> If the idea is that implementing a concept at the machine level may
> preclude a future implementation at the IR level, it means we should be *a
> lot* more picky before accepting such contribution.
>
>
> Nobody is precluding an IR implementation. We are merely holding the IR
> implementation to the same high standards of justification that we held the
> MIR one to. You may not recall this, but the MIR one took *months* to go
> from RFC to landing in-tree.
>

I'm opposed to holding up the RFC itself solely based on the existence of the MIR outliner. I don't disagree with carefully reviewing the IR outliner implementation, requiring incremental individual patches, etc.; on the contrary, this should happen.

>
> In this case, if I had anticipated any push-back on an IR-level
> implementation only based on the fact that we have now a Machine-level one,
> I'd likely have pushed back on the machine-level one.
>
>
> There is no pushback based solely on the presence of the MIR outliner.
>

Then it seems we're in perfect agreement, but that wasn't my first impression.

--
Mehdi

> One source of inquiry about the merits of the IR outliner is its
> comparison to the MIR outliner, and whether or not the two can play well
> together. This seems like a reasonable line of inquiry to me.
>
>
>
>
>> When someone comes to the community with something that has no existing
>> in-tree analog it isn't fair to necessarily ask them to implement it
>> multiple different ways to prove their solution is the best.
>>
>
> It may or may not be fair, but there is a tradeoff in how much effort we
> would require them to convince the community that this is *the* right way
> to go, depending on what it implies for future approaches.
>
>
> Sure, and several of us are trying to have a conversation with River about
> how the IR outliner will best fit into LLVM and what technical
> considerations have to be made. You arguing that we should just accept the
> patches as they are is counter productive to us being able to ensure that
> the IR outliner is at an appropriate quality and has sufficient technical
> merit.
>
> -Chris
>
>
> --
> Mehdi
>
>
>> However, as a community, we do still exercise the right to reject
>> contributions we disagree with, and we frequently request changes to the
>> implementation (as is shown every time someone tries to add SPIR-V support).
>>
>> In the LLVM community we have a long history of approaching large
>> contributions (especially ones from new contributors) with scrutiny and
>> discussion.
>> It would be a disservice to the project to forget that.