David Greene via llvm-dev
2020-Oct-20 14:24 UTC
[llvm-dev] New Pass Manager and CGSSCPassManager
Hi all,

I've run into a sticky situation with CGSCCPassManager. I have a module pass that needs to run after all inlining has occurred but *before* loops have been optimized significantly. I'm not sure this is possible with the way CGSCCPassManager is formulated, at least not without hackery.

My pass has to be a module pass or a CGSCC pass because it has to add global variables and function declarations. It does not otherwise modify global state or the call graph SCC.

Incidentally WRT my pass, this page about pass requirements seems ambiguous:

http://llvm.org/docs/WritingAnLLVMPass.html

    To be explicit, FunctionPass subclasses are not allowed to:

    1. Inspect or modify a Function other than the one currently being processed.
    2. Add or remove Functions from the current Module.
    3. Add or remove global variables from the current Module.
    4. Maintain state across invocations of runOnFunction (including global data).

#2 is ambiguous to me. Does "add or remove Functions" mean definitions only or both definitions and declarations? It might be helpful to clarify that in this section of the document.

As I understand things, CGSCCPassManager is designed to run things in a bottom-up manner:

    // #1
    for scc in scc_list {
      inline callees everywhere
      do other CG passes
      for function in bottom_up(scc) {
        run function passes
      }
    }

Have I got that right? I have some questions below about this general structure that are only tangentially related to my main issue.

I need to be able to do something like this:

    // #2
    for scc in scc_list {
      inline callees everywhere
      run my pass
      do other CG passes
      for function in bottom_up(scc) {
        run function passes
      }
    }

It doesn't seem possible currently because there's no adaptor to run a module pass inside CGSCCPassManager. You can run a CGSCCPassManager inside a ModulePassManager but not the other way around. It kind of makes sense because a call graph SCC doesn't necessarily contain all of the Functions in a Module. On the other hand, my pass doesn't really care that it might not see all Functions in the Module in a single invocation. It would eventually see them all as CGSCCPassManager processes additional SCCs.

I suppose I could make my pass a CGSCC pass but that seems like overkill for my purposes. Indeed, I had no need to do this with the Old Pass Manager as inlining ran in a ModulePassManager, not a CGSCCPassManager.

When I first looked into this I expected the inliner SCC algorithm to work something like this:

    // #3
    for scc in scc_list {
      for function in bottom_up(scc) {
        inline callees into function
        run function passes
      }
      do other CG passes
    }

But it apparently doesn't work that way. If it did I would be in really bad shape because there would be no way to run my pass after all inlining has occurred but before loops have been significantly altered. It is functionally incorrect for my pass to modify a Function B and have B inlined into the Function A which my pass also modifies.

Another option would be to split the current CGSCC pass pipeline in two, creating one pipeline for things to run before my pass and another for things to run after my pass. But upstream is definitely not interested in my pass so this would be a downstream change and rather burdensome to maintain.

Now for the additional questions about CGSCCPassManager mentioned above.

From the pseudocode #1 above, it looks like all inlining happens before any optimization. This seems sub-optimal to me because transformations may make Functions good inline candidates when they were not previously. Is this a known issue with the current setup? I'm kind of glad it works like #1 (if indeed it does) because it at least makes my goal theoretically attainable. But another part of me really wants it to work like pseudocode #3 because it seems better for optimization.

Thanks for all insights and help!

-David
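The "bottom-up" order that the pseudocode in this message relies on can be made concrete with a small sketch. The following is not LLVM code: it is a toy Python model, assuming the call graph is given as a dictionary from caller to callees, that uses Tarjan's algorithm to list strongly connected components with callees before their callers. The function name `bottom_up_sccs` and the sample graph are made up for illustration.

```python
from typing import Dict, List


def bottom_up_sccs(call_graph: Dict[str, List[str]]) -> List[List[str]]:
    """Return the SCCs of a caller -> callees graph, callee SCCs first."""
    index: Dict[str, int] = {}
    lowlink: Dict[str, int] = {}
    stack: List[str] = []
    on_stack = set()
    sccs: List[List[str]] = []
    counter = 0

    def visit(node: str) -> None:
        nonlocal counter
        index[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for callee in call_graph.get(node, []):
            if callee not in index:
                visit(callee)
                lowlink[node] = min(lowlink[node], lowlink[callee])
            elif callee in on_stack:
                lowlink[node] = min(lowlink[node], index[callee])
        # A node whose lowlink equals its own index is the root of an SCC.
        if lowlink[node] == index[node]:
            scc = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                scc.append(member)
                if member == node:
                    break
            sccs.append(sorted(scc))

    for node in call_graph:
        if node not in index:
            visit(node)
    return sccs


# Hypothetical call graph: "a" and "b" call each other, so they share an SCC.
graph = {
    "main": ["a", "b"],
    "a": ["b", "leaf"],
    "b": ["a"],
    "leaf": [],
}
order = bottom_up_sccs(graph)
print(order)
```

Tarjan's algorithm emits an SCC only after every SCC reachable from it, which is why the result already comes out in callee-first order without a separate sort. This recursive version is only meant for small graphs.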
Arthur Eubanks via llvm-dev
2020-Oct-20 17:31 UTC
[llvm-dev] New Pass Manager and CGSSCPassManager
On Tue, Oct 20, 2020 at 7:24 AM David Greene via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> I've run into a sticky situation with CGSCCPassManager. I have a module
> pass that needs to run after all inlining has occurred but *before*
> loops have been optimized significantly. I'm not sure this is possible
> with the way CGSCCPassManager is formulated, at least not without
> hackery.

Could you explain what your pass does and why it needs to be where it needs to be?

> My pass has to be a module pass or a CGSCC pass because it has to add
> global variables and function declarations. It does not otherwise modify
> global state or the call graph SCC.
>
> Incidentally WRT my pass, this page about pass requirements seems
> ambiguous:
>
> http://llvm.org/docs/WritingAnLLVMPass.html
>
> To be explicit, FunctionPass subclasses are not allowed to:
>
> 1. Inspect or modify a Function other than the one currently being
>    processed.
> 2. Add or remove Functions from the current Module.
> 3. Add or remove global variables from the current Module.
> 4. Maintain state across invocations of runOnFunction (including global
>    data).
>
> #2 is ambiguous to me. Does "add or remove Functions" mean definitions
> only or both definitions and declarations? It might be helpful to
> clarify that in this section of the document.

At least for the NPM, it was designed with potential future concurrency in mind. Modifying the list of functions in a module, even just declarations, could mess with that. http://llvm.org/docs/WritingAnLLVMPass.html is more of a legacy PM tutorial. I started on http://llvm.org/docs/WritingAnLLVMNewPMPass.html for the NPM, I can clarify that there.

> As I understand things, CGSCCPassManager is designed to run things in a
> bottom-up manner:
>
>     // #1
>     for scc in scc_list {
>       inline callees everywhere
>       do other CG passes
>       for function in bottom_up(scc) {
>         run function passes
>       }
>     }
>
> Have I got that right? I have some questions below about this general
> structure that are only tangentially related to my main issue.

The inliner inlines calls within the function, it doesn't look at callers of the current function. A CGSCC pass shouldn't look at anything above the current SCC. As you mentioned below, this is what makes callers see the most optimized version of these functions when deciding to inline or not.

> I suppose I could make my pass a CGSCC pass but that seems like overkill
> for my purposes. Indeed, I had no need to do this with the Old Pass
> Manager as inlining ran in a ModulePassManager, not a CGSCCPassManager.

It doesn't really make sense to run a module pass multiple times because of the number of SCCs/functions. A module pass should just do everything it needs to do once and be done.

> From the pseudocode #1 above, it looks like all inlining happens before
> any optimization. This seems sub-optimal to me because transformations
> may make Functions good inline candidates when they were not previously.
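The point about callers seeing already-optimized callees can be illustrated with a toy model. The sketch below is not how LLVM's inliner is implemented: function "size" is a single integer, the function passes are assumed to halve it, and `INLINE_THRESHOLD`, `simplify`, and `run_pipeline` are hypothetical names. It compares interleaving inlining and function passes per SCC against doing all inlining first.

```python
from typing import Dict, List, Tuple

INLINE_THRESHOLD = 8  # hypothetical size limit, not an LLVM default


def simplify(size: int) -> int:
    """Stand-in for function passes: pretend they halve a function's size."""
    return max(1, size // 2)


def run_pipeline(
    sizes: Dict[str, int],
    calls: Dict[str, List[str]],
    sccs_bottom_up: List[List[str]],
    interleave: bool,
) -> Tuple[Dict[str, int], List[Tuple[str, str]]]:
    sizes = dict(sizes)  # work on a copy so the caller's data is untouched
    inlined: List[Tuple[str, str]] = []  # (callee, caller) pairs

    def inline_into(scc: List[str]) -> None:
        for caller in scc:
            for callee in calls.get(caller, []):
                # Toy rule: never inline within the SCC (recursive calls).
                if callee not in scc and sizes[callee] <= INLINE_THRESHOLD:
                    sizes[caller] += sizes[callee]
                    inlined.append((callee, caller))

    def optimize(scc: List[str]) -> None:
        for fn in scc:
            sizes[fn] = simplify(sizes[fn])

    if interleave:
        # Inline, then optimize, one SCC at a time, callees first.
        for scc in sccs_bottom_up:
            inline_into(scc)
            optimize(scc)
    else:
        # All inlining decisions are made on unoptimized functions.
        for scc in sccs_bottom_up:
            inline_into(scc)
        for scc in sccs_bottom_up:
            optimize(scc)
    return sizes, inlined


sizes = {"leaf": 12, "mid": 10, "top": 4}
calls = {"top": ["mid"], "mid": ["leaf"], "leaf": []}
sccs = [["leaf"], ["mid"], ["top"]]

_, interleaved = run_pipeline(sizes, calls, sccs, interleave=True)
_, inline_first = run_pipeline(sizes, calls, sccs, interleave=False)
print(interleaved)
print(inline_first)
```

In this toy setup the interleaved order inlines `leaf` into `mid` and `mid` into `top`, because each callee has been shrunk before its caller is visited, while the inline-first order inlines nothing. The real pipeline has more moving parts than this model captures.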
Mircea Trofin via llvm-dev
2020-Oct-20 17:36 UTC
[llvm-dev] New Pass Manager and CGSSCPassManager
On Tue, Oct 20, 2020 at 10:32 AM Arthur Eubanks via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> The inliner inlines calls within the function, it doesn't look at callers
> of the current function. A CGSCC pass shouldn't look at anything above the
> current SCC. As you mentioned below, this is what makes callers see the
> most optimized version of these functions when deciding to inline or not.

(maybe a nit) not quite: see "shouldBeDeferred" <https://github.com/llvm/llvm-project/blob/master/llvm/lib/Analysis/InlineAdvisor.cpp#L185>, where the cost of inlining the current caller into its callers is evaluated.
David Greene via llvm-dev
2020-Oct-20 18:14 UTC
[llvm-dev] New Pass Manager and CGSSCPassManager
Arthur Eubanks <aeubanks at google.com> writes:

>> I've run into a sticky situation with CGSCCPassManager. I have a module
>> pass that needs to run after all inlining has occurred but *before*
>> loops have been optimized significantly. I'm not sure this is possible
>> with the way CGSCCPassManager is formulated, at least not without
>> hackery.
>
> Could you explain what your pass does and why it needs to be where it
> needs to be?

Unfortunately I'm not sure I can due to IP issues. I can probably say that in addition to the correctness restriction about not inlining a processed function into another processed function, it wants to see loops as close to the original source form as possible. I can't really ease that restriction as other clients I don't control rely on it.

I can maybe work around the correctness issue by doing a post-SCC cleanup to fix up problem functions. But that would be almost as complicated as the pass itself and so would be best avoided.

> At least for the NPM, it was designed with potential future concurrency in
> mind. Modifying the list of functions in a module, even just declarations,
> could mess with that.

> I can clarify that there.

That would be great, thanks!

> The inliner inlines calls within the function, it doesn't look at callers
> of the current function. A CGSCC pass shouldn't look at anything above the
> current SCC. As you mentioned below, this is what makes callers see the
> most optimized version of these functions when deciding to inline or not.

Ah, good point. It would do all inlining within an SCC, which I'd guess is usually pretty small. If inlining happens across SCCs that could be trouble for me. Reading the code, it's not entirely clear whether that is possible. I guess as inlining proceeds the SCC being processed may become small enough that it can be subsumed into some other SCC. In fact it's rather likely in many cases.

>> I suppose I could make my pass a CGSCC pass but that seems like overkill
>> for my purposes. Indeed, I had no need to do this with the Old Pass
>> Manager as inlining ran in a ModulePassManager, not a CGSCCPassManager.
>
> It doesn't really make sense to run a module pass multiple times because of
> the number of SCCs/functions. A module pass should just do everything it
> needs to do once and be done.

Got it, that makes perfect sense. But I'm left with an even worse problem than I had before. :( I may have to end up disabling various optimizations which would be unfortunate.

-David
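The correctness restriction discussed in this message (a processed function must not be inlined into another processed function) can be stated as a simple check. The sketch below is purely illustrative and assumes something the thread does not describe: that the set of processed functions and a log of (callee, caller) inlining events are available. The name `conflicting_inlines` and the sample data are hypothetical.

```python
from typing import Iterable, List, Set, Tuple


def conflicting_inlines(
    processed: Set[str],
    inline_events: Iterable[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """Return (callee, caller) events where both functions were processed."""
    return [
        (callee, caller)
        for callee, caller in inline_events
        if callee in processed and caller in processed
    ]


# Hypothetical data: the pass touched A and B; three inlining events followed.
processed = {"A", "B"}
events = [("B", "A"), ("C", "A"), ("B", "D")]
problems = conflicting_inlines(processed, events)
print(problems)
```

A post-SCC cleanup of the kind mentioned above would need at least this information to find the functions to fix up; what the fix-up itself involves depends on the pass and is not covered here.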