Johannes Doerfert via llvm-dev
2020-Oct-28 05:48 UTC
[llvm-dev] CGSCC passes (in the new PM)
Hey, Recently I ran into multiple problems with the new PM CGSCC Attributor pass. Instead of going into details now, I want to first ask what we want/expect to work when it comes to CGSCC passes (in the new PM). What I was hoping to (eventually) do are the following things: - Internalize functions, that is, replace all uses of `define weak @foo() {}` with `define private @foo.internal() {}` while keeping @foo around. - Delete functions, both in the current SCC but also children SCCs, e.g., ``` static void bar() { ... } static void baz() { bar(); } void quack() { if (/* something that is basically */ false) baz(); } ``` We visit the singleton SCCs with bar, baz, then quack only to realize when we optimize quack that baz and bar are dead and can be removed. - Delete multiple functions from the current SCC, this is already breaking with our current (inliner-centric) update methods (AFAIKT). - Add/Remove/Modify globals variables. (Not related to the call graph, but I wanted to mention it anyway.) - Outline parts of a function in the current SCC into a new function that will then either become part of the current SCC (even if that is an overapproximation) or a child SCC. - Delete calls to any function. There might be more but this is all I remember right now. If it turns out too many things cannot be supported, it's unclear if the module Attributor pass is not going to be preferable, especially since we can parallelize the pass itself more easily (I think) than multiple concurrent CGSCC passes. I'd appreciate any thoughts on this :) Thanks in advance, Johannes
I'm missing a lot of the context for the initial implementation of the NPM CGSCC infra so I may be wrong in some aspects, but my initial thoughts are below On Tue, Oct 27, 2020 at 10:49 PM Johannes Doerfert < johannesdoerfert at gmail.com> wrote:> Hey, > > Recently I ran into multiple problems with the new PM CGSCC Attributor > pass. > Instead of going into details now, I want to first ask what we > want/expect to work when it comes to CGSCC passes (in the new PM). > > What I was hoping to (eventually) do are the following things: > > - Internalize functions, that is, replace all uses of > `define weak @foo() {}` > with > `define private @foo.internal() {}` > while keeping @foo around. >I can see issues where foo.internal() is created (say when visiting foo()), but the CGSCC pass manager doesn't visit it, and then later on when a caller of foo() now calls foo.internal(), the CGSCC contract is broken since foo.internal() hasn't been visited yet. Pretty sure that modifying all callers of foo() when splitting foo() is bad since you'd be modifying a SCC further up.> > - Delete functions, both in the current SCC but also children SCCs, > e.g., > ``` > static void bar() { ... } > static void baz() { bar(); } > void quack() { if (/* something that is basically */ false) baz(); } > ``` > We visit the singleton SCCs with bar, baz, then quack only to realize > when we optimize quack that baz and bar are dead and can be removed. >The inliner already deletes non-recursive dead functions, so I think this should already work. (Although looking through the code, the inliner wouldn't delete bar in your case which seems like an oversight)> > - Delete multiple functions from the current SCC, this is already > breaking with our current (inliner-centric) update methods (AFAIKT). >To clarify, deleting potentially mutually recursive functions in the current SCC? That seems reasonable, maybe by making the existing LazyCallGraph::removeDeadFunction() more generic.> > - Add/Remove/Modify globals variables. (Not related to the call graph, > but I wanted to mention it anyway.) >Not sure that modifying globals makes sense in a CGSCC pass. Maybe if the global were only used by functions in the current SCC. Any specific examples?> > - Outline parts of a function in the current SCC into a new function > that will then either become part of the current SCC (even if that is > an overapproximation) or a child SCC. >I've actually been trying to work on this to make coroutines work under the NPM. For an example see https://reviews.llvm.org/D88714. I've been meaning to ask for more specifics on exactly what coroutines do to the call graph to better understand what a proper solution would look like.> > - Delete calls to any function. > > There might be more but this is all I remember right now. If it turns > out too many things cannot be supported, it's unclear if the module > Attributor pass is not going to be preferable, especially since we can > parallelize the pass itself more easily (I think) than multiple > concurrent CGSCC passes. >I have a feeling that concurrency at the pass manager level is a long way off.> > I'd appreciate any thoughts on this :) >At a high level, does the attributor work on SCCs? It seems that it works on the call graph, but not in a bottom-up way, making it different from typical CGSCC passes. That makes me think it shouldn't be working with the CGSCC infra. Did you want to run it alongside the inliner/other CGSCC passes? -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20201028/62d82d72/attachment.html>
Johannes Doerfert via llvm-dev
2020-Oct-29 01:11 UTC
[llvm-dev] CGSCC passes (in the new PM)
On 10/28/20 2:36 PM, Arthur Eubanks wrote: > I'm missing a lot of the context for the initial implementation of the NPM > CGSCC infra so I may be wrong in some aspects, but my initial thoughts are > below Yeah, I don't know either so I figured I ask :) > On Tue, Oct 27, 2020 at 10:49 PM Johannes Doerfert < > johannesdoerfert at gmail.com> wrote: > >> Hey, >> >> Recently I ran into multiple problems with the new PM CGSCC Attributor >> pass. >> Instead of going into details now, I want to first ask what we >> want/expect to work when it comes to CGSCC passes (in the new PM). >> >> What I was hoping to (eventually) do are the following things: >> >> - Internalize functions, that is, replace all uses of >> `define weak @foo() {}` >> with >> `define private @foo.internal() {}` >> while keeping @foo around. >> > I can see issues where foo.internal() is created (say when visiting foo()), > but the CGSCC pass manager doesn't visit it, and then later on when a > caller of foo() now calls foo.internal(), the CGSCC contract is broken > since foo.internal() hasn't been visited yet. > Pretty sure that modifying all callers of foo() when splitting foo() is bad > since you'd be modifying a SCC further up. Fair enough, I can do that in the Module pass only I guess. >> - Delete functions, both in the current SCC but also children SCCs, >> e.g., >> ``` >> static void bar() { ... } >> static void baz() { bar(); } >> void quack() { if (/* something that is basically */ false) baz(); } >> ``` >> We visit the singleton SCCs with bar, baz, then quack only to realize >> when we optimize quack that baz and bar are dead and can be removed. >> > The inliner already deletes non-recursive dead functions, so I think this > should already work. (Although looking through the code, the inliner > wouldn't delete bar in your case which seems like an oversight) This "oversight" is consequential though, see the next comment. > >> >> - Delete multiple functions from the current SCC, this is already >> breaking with our current (inliner-centric) update methods (AFAIKT). >> > To clarify, deleting potentially mutually recursive functions in the > current SCC? That seems reasonable, maybe by making the existing > LazyCallGraph::removeDeadFunction() more generic. Yes, but: [This is from my memory so it is a bit fuzzy but I think we can reconstruct the situation if we want to.] The methods updateCGAndAnalysisManagerForCGSCCPass and updateCGAndAnalysisManagerForFunctionPass can, as I understand it, split a SCC if you delete a function from it. They use UpdatedRC in CGSCCUpdateResult to later repair the CG on the "way up". Now if you have a single SCC with 3 functions [bar,baz,quack] of which 2 are dead: ``` static void bar() { quack(); } static void baz() { bar(); } void quack() { if (/* something that is basically */ false) baz(); } ``` You can delete baz and get two SCCs each with one function [bar] and [quack]. Now if you continue deleting, from what I have seen, we break some invariants when we delete bar out of [bar]. I think the problem is that we only have one UpdatedRC in the CGSCCUpdateResult which is later used to repair things. IIRC, UpdatedRC for the oroginal [bar,baz,quack] SCC can point to the [bar] SCC which is deleted though. That is not supposed to happen, and outside the Attributor I don't think can happen. >> >> - Add/Remove/Modify globals variables. (Not related to the call graph, >> but I wanted to mention it anyway.) >> > Not sure that modifying globals makes sense in a CGSCC pass. Maybe if the > global were only used by functions in the current SCC. Any specific > examples? I mean, if I look at all the uses I might look at other SCCs so I'm unsure if this is legal for an CGSCC pass at all to look at globals if we would restrict it to the ones only with uses in the SCC. >> >> - Outline parts of a function in the current SCC into a new function >> that will then either become part of the current SCC (even if that is >> an overapproximation) or a child SCC. >> > I've actually been trying to work on this to make coroutines work under the > NPM. For an example see https://reviews.llvm.org/D88714. I've been meaning > to ask for more specifics on exactly what coroutines do to the call graph > to better understand what a proper solution would look like. > >> >> - Delete calls to any function. >> >> There might be more but this is all I remember right now. If it turns >> out too many things cannot be supported, it's unclear if the module >> Attributor pass is not going to be preferable, especially since we can >> parallelize the pass itself more easily (I think) than multiple >> concurrent CGSCC passes. >> > I have a feeling that concurrency at the pass manager level is a long way > off. Yeah, and I feel passes move away from CGSCC towards module passes. If that is the case, I'll concentrate on the Attributor module pass and look at parallelization of the fixpoint iteration inside the pass instead. >> >> I'd appreciate any thoughts on this :) >> > At a high level, does the attributor work on SCCs? It seems that it works > on the call graph, but not in a bottom-up way, making it different from > typical CGSCC passes. That makes me think it shouldn't be working with the > CGSCC infra. Did you want to run it alongside the inliner/other CGSCC > passes? Running in the inliner loop was the main motivation for the CGSCC version. Along the ability for other CGSCC passes to use it, e.g., we run it in the OpenMPOpt pass. We can run it as either CGSCC or module pass and depending on this thread we might need to add way more restrictions to the CGSCC pass or completely move to a module version after all.