Sean Silva via llvm-dev
2016-Jul-22 08:55 UTC
[llvm-dev] [PM] I think that the new PM needs to learn about inter-analysis dependencies...
The more closely I look at this, the more it seems like there may be a useful incremental step in the transition to the new PM: use the new PM analysis machinery in the old PM. If this is possible, it will simplify the old PM and (hopefully) allow an incremental transition to the new PM instead of a flag day transition for the switch. I.e., AFAICT, the new PM transition is essentially about 2 mostly orthogonal aspects of running optimization pipelines: 1. Analysis computation and analysis result lifetime management (including avoiding analysis recomputation) 2. Running transformation passes over their respective IRUnit's in some order These are conflated in the old PM. In reality, the only interaction between them (with the new PM machinery for 1.) is a small number of places within 2. which need to call 'invalidate'. I'm pretty sure that 2. is fairly similar in the new PM and old PM (the main difference is that the notion of "adapters" is split out in the new PM). The analysis handling seems to be what makes the old PM so difficult to understand (e.g. it is the cause of the multiple inheritance in the implementation). Trying to unify analyses and transformations (and some questionable (in hindsight) implementation decisions) seems to be the main "problem" with the design of the old PM AFAICT (there are other issues, but they are more "nice to have"). IMO it is an anti-pattern to think of analyses as "passes". There are just "analyses" and "transformations" and they are two separate things. In fact, the `run` method on analyses should probably be called `computeResult` or something like that to avoid confusion. And the `run` method on transformations could just as easily be called `performTransformation`. I remember asking and getting and answer from Chandler ( http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150907/299083.html) about how to coexist the old and new PM compatibly so individual passes would be able to work for both and we wouldn't need to have a flag day. I wasn't able to find the discussions that Chandler cited, but I suspect that the answer is that we didn't have enough analyses ported at that point to consider sharing the analysis management between the old and new PM. If this works out it may be the evolutionary path we have all been wanting for this pass manager transition. Fingers crossed. Hopefully I'm not overlooking some major issue. Anyway... back to working on the analysis manager dependency tracking. -- Sean Silva On Thu, Jul 21, 2016 at 1:59 AM, Sean Silva <chisophugis at gmail.com> wrote:> We did some basic sanity checking that memory usage didn't go out of > control (it doesn't; at least with with a simple > preserves-all/preserves-none invalidation scheme and the current LTO > pipeline). Also, I did some basic sanity checking for compile time. The > simple preserves-all/preserves-none invalidation scheme seems marginally > slower, but no real conclusions (besides a simple sanity check) can be > drawn without the real analysis preservation semantics in place. > > I'll start working on fixing the analysis managers. There seem to > basically be two parts (although they may need to be done simultaneously to > make sure all the pieces fit together): > - unify all the analysis managers into a single analysis manager for all > IRUnitT's (requires type-erasing the IRUnit) > - introduce the dependency tracking machinery > > I think I gave a reasonable outline in the two posts above: > - the one starting with "To clarify, it seems like the current new PM is > essentially trying to solve the problem of maintaining/updating a mapping" > - the one starting with "Yeah, the mechanics of maintaining this fully > general mapping are straightforward in the abstract" > > I'm happy to do a centralized writeup if anybody wants. Just let me know. > > As far as changes to the code, the updates to the new PM passes should > hopefully be mechanical (despite there being many of them). However, from > what I can tell, fixing this problem will require touching most lines of > the analysis manager machinery so the diff will probably be a bit scary, > even though I think we can keep the same basic structure (i.e. a per-IRUnit > std::list holding one analysis result (at a stable address) per element, > combined with a DenseMap from (Analysis, IRUnit) to an element of the > per-IRUnit storage list (see AnalysisResultListMapT and AnalysisResultMapT > in include/llvm/IR/PassManager.h)). > The main changes to the analysis manager will be: > - type-erasing the IRUnit > - the elements of the AnalysisResultListMapT will need to keep track of > any dependents > - the analysis manager will need to update those dependencies as it is > re-entered by analyses getting results of other analyses > - the analysis manager will need to walk the dependencies to do transitive > invalidation > > I think the most robust approach is for analysis dependencies to be > implicitly constructed by the analysis manager via tracking entry/exit from > get{,Cached}Result. > One alternative is for analyses to explicitly pass in their ID to > getResult to indicate the dependency, but that seems error-prone (and also > verbose). The issue is that we will need a getResult API that does not > track dependencies for use by transformation passes (since there is no > dependency to track in that case); so an innocuous copy-paste from a > transform pass to an analysis can result in a failure to track dependencies > and risk of use-after-free (we could fight this with the type system, but I > think that would get a bit verbose (I'm willing to try it though if people > would prefer)) > One restriction of the implicit tracking approach is that it requires all > calls into the analysis manager to occur in the `run` method of the > analysis (so that the dependencies are implicitly tracked via re-entrance > to get{,Cached}Result); is this a reasonable restriction? > > > One annoying problem is that I think that the dependency links will need > to be bidirectional. To use the example analysis cache from my other post: > (AssumptionAnalysis, function @bar) -> (AssumptionCache for @bar, > [(SomeModuleAnalysis, module TheModule)]) > (AssumptionAnalysis, function @baz) -> (AssumptionCache for @baz, > [(SomeModuleAnalysis, module TheModule)]) > (SomeModuleAnalysis, module TheModule) -> (SomeModuleAnalysisResult for > TheModule, [(SomeFunctionAnalysis, function @baz)]) > (SomeFunctionAnalysis, function @baz) -> (SomeFunctionAnalysisResult for > @baz, []) > > if we delete function @baz, then the dependent list [(SomeFunctionAnalysis, > function @baz)] for `(SomeModuleAnalysis, module TheModule)` will now > have a stale pointer to function @baz. I think that in practice we can > check to see if `(SomeFunctionAnalysis, function @baz)` is in our hash > table and if it isn't then just ignore the dependency as "this dependent > analysis result has already been freed". In the worst case (memory > allocator reuses the memory for another function) we may spuriously free an > analysis result for a different function. However it is still unsettling > (and may actually be UB in C++?). > Ideally we would track bidirectional links; that way when we remove an > analysis result we also have it remove itself from dependence lists of all > of its dependencies. > > -- Sean Silva > > On Fri, Jul 15, 2016 at 8:40 PM, Sean Silva <chisophugis at gmail.com> wrote: > >> >> >> On Fri, Jul 15, 2016 at 8:39 PM, Sean Silva <chisophugis at gmail.com> >> wrote: >> >>> It looks like there is really no sane fix within the current >>> infrastructure. I've had to essentially trigger invalidation (except in the >>> PreservedAnalyses::all() case) in the function pass manager and function to >>> loop adapters. >>> >> >> invalidation of *everything* I mean. >> >> -- Sean Silva >> >> >>> >>> So we basically need to get the analysis manager dependency tracking >>> fixed. >>> >>> Davide and I will get measurements on the resident set impact of all >>> this caching (even with the overconservative invalidation for now) to see >>> the impact. If there is a big rss impact then we probably want to consider >>> that problem at the same time as the rewrite of the analysis manager. >>> >>> -- Sean Silva >>> >>> On Thu, Jul 14, 2016 at 12:51 AM, Sean Silva <chisophugis at gmail.com> >>> wrote: >>> >>>> >>>> >>>> On Wed, Jul 13, 2016 at 1:48 AM, Sean Silva <chisophugis at gmail.com> >>>> wrote: >>>> >>>>> >>>>> >>>>> On Wed, Jul 13, 2016 at 12:34 AM, Chandler Carruth < >>>>> chandlerc at gmail.com> wrote: >>>>> >>>>>> On Wed, Jul 13, 2016 at 12:25 AM Sean Silva <chisophugis at gmail.com> >>>>>> wrote: >>>>>> >>>>>>> On Tue, Jul 12, 2016 at 11:39 PM, Chandler Carruth < >>>>>>> chandlerc at gmail.com> wrote: >>>>>>> >>>>>>>> On Tue, Jul 12, 2016 at 11:34 PM Sean Silva <chisophugis at gmail.com> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> On Tue, Jul 12, 2016 at 11:32 PM, Xinliang David Li < >>>>>>>>> davidxl at google.com> wrote: >>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Tue, Jul 12, 2016 at 10:57 PM, Chandler Carruth < >>>>>>>>>> chandlerc at gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> Yea, this is a nasty problem. >>>>>>>>>>> >>>>>>>>>>> One important thing to understand is that this is specific to >>>>>>>>>>> analyses which hold references to other analyses. While this isn't unheard >>>>>>>>>>> of, it isn't as common as it could be. Still, definitely something we need >>>>>>>>>>> to address. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> We can call this type of dependencies (holding references) >>>>>>>>>> hard-dependency. The soft dependency refers to the case where analysis 'A' >>>>>>>>>> depends on 'B' during computation, but does not need 'B' once it is >>>>>>>>>> computed. >>>>>>>>>> >>>>>>>>>> There are actually quite a few examples of hard-dependency case. >>>>>>>>>> For instance LoopAccessInfo, LazyValueInfo etc which hold references to >>>>>>>>>> other analyses. >>>>>>>>>> >>>>>>>>>> Problem involving hard-dependency is actually easier to detect, >>>>>>>>>> as it is usually a compile time problem. Issues involving soft dependencies >>>>>>>>>> are more subtle and can lead to wrong code gen. >>>>>>>>>> >>>>>>>>> >>>>>>>>> Did you mean to say that soft-dependency problems are easier to >>>>>>>>> detect? At least my intuition is that soft-dependency is easier because >>>>>>>>> there is no risk of dangling pointers to other analyses. >>>>>>>>> >>>>>>>> >>>>>>>> The issue is that the fact that there is *any* dependency isn't >>>>>>>> clear. >>>>>>>> >>>>>>>> However, I think the only real problem here are these "hard >>>>>>>> dependencies" (I don't really like that term though). For others, only an >>>>>>>> analysis that is *explicitly* preserved survives. So I'm not worried about >>>>>>>> the fact that people have to remember this. >>>>>>>> >>>>>>>> The question is how often there are cross-data-structure >>>>>>>> references. David mentions a few examples, and I'm sure there are more, but >>>>>>>> it isn't clear to me yet whether this is pervasive or occasional. >>>>>>>> >>>>>>> >>>>>>> >>>>>>> I just did a quick run-through of PassRegistry.def and this is what >>>>>>> I found: >>>>>>> >>>>>>> Module analyses: 0/5 hold pointers to other analyses >>>>>>> CallGraph: No pointers to other analyses. >>>>>>> LazyCallGraph: No pointers to other analyses. >>>>>>> ProfileSummaryAnalysis: No pointers to other analyses. >>>>>>> TargetLibraryAnalysis: No pointers to other analyses. >>>>>>> VerifierAnalysis: No pointers to other analyses. >>>>>>> >>>>>>> Module alias analyses: 1/1 keeps pointer to other analysis. >>>>>>> GlobalsAA: Result keeps pointer to TLI (this is a function analysis). >>>>>>> >>>>>>> Function analyses: 9/17 keep pointers to other analysis >>>>>>> AAManager: Its Result holds TLI pointer and pointers to individual >>>>>>> AA result objects. >>>>>>> AssumptionAnalysis: No pointers to other analyses. >>>>>>> BlockFrequencyAnalysis: Its Result holds pointers to LoopInfo and >>>>>>> BPI. >>>>>>> BranchProbabilityAnalysis: Stores no pointers to other analyses. >>>>>>> (uses LoopInfo to "recalculate" though) >>>>>>> DominatorTreeAnalysis: Stores no pointers to other analyses. >>>>>>> PostDominatorTreeAnalysis: Stores no pointers to other analyses. >>>>>>> DemandedBitsAnalysis: Stores pointers to AssumptionCache >>>>>>> and DominatorTree >>>>>>> DominanceFrontierAnalysis: Stores no pointers to other analyses. >>>>>>> (uses DominatorTreeAnalysis for "recalculate" though). >>>>>>> LoopInfo: Uses DominatorTreeAnalysis for "recalculate" but stores no >>>>>>> pointers. >>>>>>> LazyValueAnalysis: Stores pointers to AssumptionCache, >>>>>>> TargetLibraryInfo, DominatorTree. >>>>>>> DependenceAnalysis: Stores pointers to AliasAnalysis, >>>>>>> ScalarEvolution, LoopInfo >>>>>>> MemoryDependenceAnalysis: Stores pointers to AliasAnalysis, >>>>>>> AssumptionCache, TargetLibraryInfo, DominatorTree >>>>>>> MemorySSAAnalysis: Stores pointers to AliasAnalysis, DominatorTree >>>>>>> RegionInfoAnalysis: Stores pointers to DomTree, PostDomTree, >>>>>>> DomFrontier >>>>>>> ScalarEvolutionAnalysis: Stores pointers to TargetLibraryInfo, >>>>>>> AssumptionCache, DominatorTree, LoopInfo >>>>>>> TargetLibraryAnalysis: Has no dependencies >>>>>>> TargetIRAnalysis: Has no dependencies. >>>>>>> >>>>>>> Function alias analyses: 3/5 keep pointers to other analyses >>>>>>> BasicAA: Keeps pointers to TargetLibraryInfo, AssumptionCache, >>>>>>> DominatorTree, LoopInfo >>>>>>> CFLAA: Keeps pointer to TargetLibraryInfo >>>>>>> SCEVAA: Keeps pointer to ScalarEvolution >>>>>>> ScopedNoAliasAA: No dependencies >>>>>>> TypeBasedAA: No dependencies >>>>>>> >>>>>>> >>>>>>> Total: 13/28 analyses (~50%) hold pointers to other analyses. >>>>>>> Of the 15/28 analyses that don't hold pointers, 12/15 simply have no >>>>>>> dependencies. Only 3/15 (BPI, LoopInfo, DominanceFrontier) have >>>>>>> dependencies that are used just for a "recalculate" step that retains no >>>>>>> pointers. >>>>>>> So I think it is fair to say that analyses which hold pointers to >>>>>>> other analyses is not an exceptional case. In fact, analyses that use other >>>>>>> analyses just for a "recalculate" step seems to be the exceptional case >>>>>>> (only 3/28 or about 10%) >>>>>>> >>>>>> >>>>>> Interesting! >>>>>> >>>>>> Most of these look like they hold a pointer to the root analysis as >>>>>> opposed to detailed objects *inside* the analysis? >>>>>> >>>>>> It might make sense to try to handle this very specific pattern in a >>>>>> special way of overriding the invalidate routines is too error prone.... We >>>>>> could try to make this work "automatically" but I'm worried this would be >>>>>> challenging to get right. Open to suggestions of course. >>>>>> >>>>>> Any other ideas about what would make sense to handle this? >>>>>> >>>>>> Does it make sense to override the invalidate routines now and >>>>>> iterate from there? I feel like you've done a lot of the research necessary >>>>>> for this already... >>>>>> >>>>> >>>>> I'll keep pushing forward tomorrow with building test-suite >>>>> successfully using the new PM for the LTO pipeline (I was doing some >>>>> unrelated LLD stuff for most of today). It will be interesting to see how >>>>> many `invalidate` overrides will be needed to avoid these issues for just >>>>> the LTO pipeline on test-suite. >>>>> >>>> >>>> I spent the better part of today working on this and will continue >>>> tomorrow; this problem seems nastier than I thought. For some reason the >>>> LTO pipeline (or something about LTO) seems to hit on these issues much >>>> more (I'm talking like 40k lines of ASan error reports from building >>>> test-suite with the LTO pipeline in the new PM; per-TU steps still using >>>> the old PM). Some notes: >>>> >>>> - BasicAA's dependence on domtree and loopinfo in the new PM seems to >>>> account for quite a few of the problems. >>>> - BasicAA and other stuff are marked (by overriding `invalidate` to >>>> return false) to never be invalidated because they are "stateless". However >>>> they still hold pointers and so they do need to be invalidated. >>>> - CallGraph uses AssertingVH (PR28400) and so I needed a workaround >>>> similar to r274656 in various passes. >>>> - D21921 is holding up -- I haven't hit any issues with the core logic >>>> of that patch. >>>> - AAResults holds handles to various AA result objects. This means it >>>> pretty much always needs to be invalidated unless you are sure that none of >>>> the AA's will get invalidated. >>>> >>>> >>>> The existing `invalidate` method doesn't have the right semantics for >>>> even an error-prone solution :( We are going to need to make some >>>> significant changes to even get basic sanity I think. Perhaps each analysis >>>> can expose a "preserve" static function. E.g. instead of >>>> `PA.preserve<Foo>();` you have to do `Foo::setPreserved(PA);`. >>>> I'm actually not quite sure that that will even work. Once I have >>>> test-suite fully building successfully with the LTO pipeline in the new PM >>>> I'll be able to give a more confident answer (esp. w.r.t. the manager for >>>> different IRUnitT's). >>>> But at this point I'm not confident running *any* pass pipeline in the >>>> new PM without at least assertions+ASan. >>>> >>>> We may want to have a proper design discussion around this problem >>>> though. >>>> >>>> Also I'd like to have test-suite working (by hook or by crook) with LTO >>>> in the new PM so we can get some numbers on the resident set impact of all >>>> this caching; if it is really problematic then we may need to start talking >>>> front-and-center about different invalidation policies for keeping this in >>>> check instead of leaving it as something that we will be able to patch >>>> later. >>>> >>>> >>>> >>>> The more I think about it, the more I'm convinced that the real "hard" >>>> problem that the new PM is exposing us to is having the ability for any >>>> pass to ask for any analysis on any IRUnitT (and any specific IRUnit of >>>> that IRUnitT) and have the result stored somewhere and then invalidated. >>>> This means that "getAnalysisUsage" is not just a list of passes, but much >>>> more complicated and is essentially a set of arbitrary pairs "(analysis, >>>> IRUnit)" (and the associated potential tangle of dependencies between the >>>> state cached on these tuples). With the old PM, you essentially are looking >>>> at a problem of scheduling the lifetime of analyses of the same IRUnit >>>> intermingled with transformation passes on that same IRUnit, so you only >>>> have the "analysis" part of the tuple above, making things much simpler >>>> (and handling dependencies is much simpler too). We've obviously outgrown >>>> this model with examples like LAA, AssumptionCacheTracker, etc. that hack >>>> around this in the old PM. We may want to have a fresh re-examination of >>>> what problems we are exactly trying to solve. >>>> >>>> For me, my main concern now is what changes need to be made in order to >>>> feel confident running a pipeline in the new PM without assertions+ASan. >>>> >>>> >>>> Sorry for the long post, just brain-dumping before heading home. >>>> >>>> -- Sean Silva >>>> >>>> >>>> >>>> >>>>> >>>>> -- Sean Silva >>>>> >>>>> >>>> >>> >> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160722/33b430ea/attachment.html>
Sean Silva via llvm-dev
2016-Jul-24 09:58 UTC
[llvm-dev] [PM] I think that the new PM needs to learn about inter-analysis dependencies...
I've started looking specifically at the existing code and how it needs to be changed. It seems like the concept-based polymorphism stuff in PassManagerInternal.h actually don't need to be changed that much. AnalysisPassConcept and AnalysisResultConcept need to be changed to take a type-erased IRUnit (e.g. a void* or something). AnalysisPassModel and AnalysisResultModel (which inherit from their respective abstract base classes I mentioned above) are already templated on the IRUnitT and so they can just cast the void* back to the right type. Adding the dependency tracking seems like it will be mostly a data structure change with some isolated "algorithmic" changes to track/invalidate dependencies. Most of the methods that need to know the specific IRUnitT type already take the IRUnit as an argument (e.g. getResult, getCachedResult, invalidate) and so it's actually somewhat natural for the analysis manager not be templated on IRUnitT (but rather to have just those methods be templated). Also, up until now I hadn't noticed this weird "registerPass" thing on the analysis manager (AnalysisManagerBase to be specific). Effectively, it allows analyses to hold state (this is separate from analysis *results* which of course can hold state). But the state is effectively global to the AnalysisManager (and hence the pass pipeline, and hence (essentially) the context (since you can currently only run a single pass pipeline concurrently on a single LLVMContext)). The net result is that you can specify a context-global "configuration" for each analysis type (not to be confused with the analysis *result* type!). Right now, AAManager is the only thing that uses it though (the "configuration" is the AA pipeline). btw, I've started to keep a log for this like I do for my projects at home: https://docs.google.com/document/d/1-nrq2y_hTiZhrsJDmH8TzFjFEeApfccs6wwGyaXjSkg/edit?usp=sharing It's fairly stream-of-consciousness. My log style is mostly append-only (I rarely edit stuff from a previous day or from too long ago on the same day; if I do I usually don't overwrite and instead insert something like "(edit: I was actually totally wrong about this, see below)"). So if you want to follow this across multiple days just go to the end and scroll up until you see something that you've already looked at. (right now there is just one day though so there is not very much). (some interesting things to search for are "interesting", "okay", "oh", and "?"; I tend to use these a lot) I've gone back to google docs for this log since it is easier to share on the mailing list. Unfortunately google docs does not have an explicit notion of "cell" like Mathematica does, so I've tried to just insert lots of line breaks for things where there would have just been a cell boundary in Mathematica. (I use Mathematica for my logs at home). -- Sean Silva On Fri, Jul 22, 2016 at 1:55 AM, Sean Silva <chisophugis at gmail.com> wrote:> The more closely I look at this, the more it seems like there may be a > useful incremental step in the transition to the new PM: use the new PM > analysis machinery in the old PM. If this is possible, it will simplify the > old PM and (hopefully) allow an incremental transition to the new PM > instead of a flag day transition for the switch. > > I.e., AFAICT, the new PM transition is essentially about 2 mostly > orthogonal aspects of running optimization pipelines: > 1. Analysis computation and analysis result lifetime management (including > avoiding analysis recomputation) > 2. Running transformation passes over their respective IRUnit's in some > order > > These are conflated in the old PM. In reality, the only interaction > between them (with the new PM machinery for 1.) is a small number of places > within 2. which need to call 'invalidate'. > > I'm pretty sure that 2. is fairly similar in the new PM and old PM (the > main difference is that the notion of "adapters" is split out in the new > PM). The analysis handling seems to be what makes the old PM so difficult > to understand (e.g. it is the cause of the multiple inheritance in the > implementation). Trying to unify analyses and transformations (and some > questionable (in hindsight) implementation decisions) seems to be the main > "problem" with the design of the old PM AFAICT (there are other issues, but > they are more "nice to have"). > > IMO it is an anti-pattern to think of analyses as "passes". There are just > "analyses" and "transformations" and they are two separate things. In fact, > the `run` method on analyses should probably be called `computeResult` or > something like that to avoid confusion. And the `run` method on > transformations could just as easily be called `performTransformation`. > > > I remember asking and getting and answer from Chandler ( > http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150907/299083.html) > about how to coexist the old and new PM compatibly so individual passes > would be able to work for both and we wouldn't need to have a flag day. I > wasn't able to find the discussions that Chandler cited, but I suspect that > the answer is that we didn't have enough analyses ported at that point to > consider sharing the analysis management between the old and new PM. > > > If this works out it may be the evolutionary path we have all been wanting > for this pass manager transition. Fingers crossed. Hopefully I'm not > overlooking some major issue. > > Anyway... back to working on the analysis manager dependency tracking. > > -- Sean Silva > > On Thu, Jul 21, 2016 at 1:59 AM, Sean Silva <chisophugis at gmail.com> wrote: > >> We did some basic sanity checking that memory usage didn't go out of >> control (it doesn't; at least with with a simple >> preserves-all/preserves-none invalidation scheme and the current LTO >> pipeline). Also, I did some basic sanity checking for compile time. The >> simple preserves-all/preserves-none invalidation scheme seems marginally >> slower, but no real conclusions (besides a simple sanity check) can be >> drawn without the real analysis preservation semantics in place. >> >> I'll start working on fixing the analysis managers. There seem to >> basically be two parts (although they may need to be done simultaneously to >> make sure all the pieces fit together): >> - unify all the analysis managers into a single analysis manager for all >> IRUnitT's (requires type-erasing the IRUnit) >> - introduce the dependency tracking machinery >> >> I think I gave a reasonable outline in the two posts above: >> - the one starting with "To clarify, it seems like the current new PM is >> essentially trying to solve the problem of maintaining/updating a mapping" >> - the one starting with "Yeah, the mechanics of maintaining this fully >> general mapping are straightforward in the abstract" >> >> I'm happy to do a centralized writeup if anybody wants. Just let me know. >> >> As far as changes to the code, the updates to the new PM passes should >> hopefully be mechanical (despite there being many of them). However, from >> what I can tell, fixing this problem will require touching most lines of >> the analysis manager machinery so the diff will probably be a bit scary, >> even though I think we can keep the same basic structure (i.e. a per-IRUnit >> std::list holding one analysis result (at a stable address) per element, >> combined with a DenseMap from (Analysis, IRUnit) to an element of the >> per-IRUnit storage list (see AnalysisResultListMapT and AnalysisResultMapT >> in include/llvm/IR/PassManager.h)). >> The main changes to the analysis manager will be: >> - type-erasing the IRUnit >> - the elements of the AnalysisResultListMapT will need to keep track of >> any dependents >> - the analysis manager will need to update those dependencies as it is >> re-entered by analyses getting results of other analyses >> - the analysis manager will need to walk the dependencies to do >> transitive invalidation >> >> I think the most robust approach is for analysis dependencies to be >> implicitly constructed by the analysis manager via tracking entry/exit from >> get{,Cached}Result. >> One alternative is for analyses to explicitly pass in their ID to >> getResult to indicate the dependency, but that seems error-prone (and also >> verbose). The issue is that we will need a getResult API that does not >> track dependencies for use by transformation passes (since there is no >> dependency to track in that case); so an innocuous copy-paste from a >> transform pass to an analysis can result in a failure to track dependencies >> and risk of use-after-free (we could fight this with the type system, but I >> think that would get a bit verbose (I'm willing to try it though if people >> would prefer)) >> One restriction of the implicit tracking approach is that it requires all >> calls into the analysis manager to occur in the `run` method of the >> analysis (so that the dependencies are implicitly tracked via re-entrance >> to get{,Cached}Result); is this a reasonable restriction? >> >> >> One annoying problem is that I think that the dependency links will need >> to be bidirectional. To use the example analysis cache from my other post: >> (AssumptionAnalysis, function @bar) -> (AssumptionCache for @bar, >> [(SomeModuleAnalysis, module TheModule)]) >> (AssumptionAnalysis, function @baz) -> (AssumptionCache for @baz, >> [(SomeModuleAnalysis, module TheModule)]) >> (SomeModuleAnalysis, module TheModule) -> (SomeModuleAnalysisResult for >> TheModule, [(SomeFunctionAnalysis, function @baz)]) >> (SomeFunctionAnalysis, function @baz) -> (SomeFunctionAnalysisResult for >> @baz, []) >> >> if we delete function @baz, then the dependent list [(SomeFunctionAnalysis, >> function @baz)] for `(SomeModuleAnalysis, module TheModule)` will now >> have a stale pointer to function @baz. I think that in practice we can >> check to see if `(SomeFunctionAnalysis, function @baz)` is in our hash >> table and if it isn't then just ignore the dependency as "this dependent >> analysis result has already been freed". In the worst case (memory >> allocator reuses the memory for another function) we may spuriously free an >> analysis result for a different function. However it is still unsettling >> (and may actually be UB in C++?). >> Ideally we would track bidirectional links; that way when we remove an >> analysis result we also have it remove itself from dependence lists of all >> of its dependencies. >> >> -- Sean Silva >> >> On Fri, Jul 15, 2016 at 8:40 PM, Sean Silva <chisophugis at gmail.com> >> wrote: >> >>> >>> >>> On Fri, Jul 15, 2016 at 8:39 PM, Sean Silva <chisophugis at gmail.com> >>> wrote: >>> >>>> It looks like there is really no sane fix within the current >>>> infrastructure. I've had to essentially trigger invalidation (except in the >>>> PreservedAnalyses::all() case) in the function pass manager and function to >>>> loop adapters. >>>> >>> >>> invalidation of *everything* I mean. >>> >>> -- Sean Silva >>> >>> >>>> >>>> So we basically need to get the analysis manager dependency tracking >>>> fixed. >>>> >>>> Davide and I will get measurements on the resident set impact of all >>>> this caching (even with the overconservative invalidation for now) to see >>>> the impact. If there is a big rss impact then we probably want to consider >>>> that problem at the same time as the rewrite of the analysis manager. >>>> >>>> -- Sean Silva >>>> >>>> On Thu, Jul 14, 2016 at 12:51 AM, Sean Silva <chisophugis at gmail.com> >>>> wrote: >>>> >>>>> >>>>> >>>>> On Wed, Jul 13, 2016 at 1:48 AM, Sean Silva <chisophugis at gmail.com> >>>>> wrote: >>>>> >>>>>> >>>>>> >>>>>> On Wed, Jul 13, 2016 at 12:34 AM, Chandler Carruth < >>>>>> chandlerc at gmail.com> wrote: >>>>>> >>>>>>> On Wed, Jul 13, 2016 at 12:25 AM Sean Silva <chisophugis at gmail.com> >>>>>>> wrote: >>>>>>> >>>>>>>> On Tue, Jul 12, 2016 at 11:39 PM, Chandler Carruth < >>>>>>>> chandlerc at gmail.com> wrote: >>>>>>>> >>>>>>>>> On Tue, Jul 12, 2016 at 11:34 PM Sean Silva <chisophugis at gmail.com> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> On Tue, Jul 12, 2016 at 11:32 PM, Xinliang David Li < >>>>>>>>>> davidxl at google.com> wrote: >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Tue, Jul 12, 2016 at 10:57 PM, Chandler Carruth < >>>>>>>>>>> chandlerc at gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> Yea, this is a nasty problem. >>>>>>>>>>>> >>>>>>>>>>>> One important thing to understand is that this is specific to >>>>>>>>>>>> analyses which hold references to other analyses. While this isn't unheard >>>>>>>>>>>> of, it isn't as common as it could be. Still, definitely something we need >>>>>>>>>>>> to address. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> We can call this type of dependencies (holding references) >>>>>>>>>>> hard-dependency. The soft dependency refers to the case where analysis 'A' >>>>>>>>>>> depends on 'B' during computation, but does not need 'B' once it is >>>>>>>>>>> computed. >>>>>>>>>>> >>>>>>>>>>> There are actually quite a few examples of hard-dependency case. >>>>>>>>>>> For instance LoopAccessInfo, LazyValueInfo etc which hold references to >>>>>>>>>>> other analyses. >>>>>>>>>>> >>>>>>>>>>> Problem involving hard-dependency is actually easier to detect, >>>>>>>>>>> as it is usually a compile time problem. Issues involving soft dependencies >>>>>>>>>>> are more subtle and can lead to wrong code gen. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Did you mean to say that soft-dependency problems are easier to >>>>>>>>>> detect? At least my intuition is that soft-dependency is easier because >>>>>>>>>> there is no risk of dangling pointers to other analyses. >>>>>>>>>> >>>>>>>>> >>>>>>>>> The issue is that the fact that there is *any* dependency isn't >>>>>>>>> clear. >>>>>>>>> >>>>>>>>> However, I think the only real problem here are these "hard >>>>>>>>> dependencies" (I don't really like that term though). For others, only an >>>>>>>>> analysis that is *explicitly* preserved survives. So I'm not worried about >>>>>>>>> the fact that people have to remember this. >>>>>>>>> >>>>>>>>> The question is how often there are cross-data-structure >>>>>>>>> references. David mentions a few examples, and I'm sure there are more, but >>>>>>>>> it isn't clear to me yet whether this is pervasive or occasional. >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> I just did a quick run-through of PassRegistry.def and this is what >>>>>>>> I found: >>>>>>>> >>>>>>>> Module analyses: 0/5 hold pointers to other analyses >>>>>>>> CallGraph: No pointers to other analyses. >>>>>>>> LazyCallGraph: No pointers to other analyses. >>>>>>>> ProfileSummaryAnalysis: No pointers to other analyses. >>>>>>>> TargetLibraryAnalysis: No pointers to other analyses. >>>>>>>> VerifierAnalysis: No pointers to other analyses. >>>>>>>> >>>>>>>> Module alias analyses: 1/1 keeps pointer to other analysis. >>>>>>>> GlobalsAA: Result keeps pointer to TLI (this is a function >>>>>>>> analysis). >>>>>>>> >>>>>>>> Function analyses: 9/17 keep pointers to other analysis >>>>>>>> AAManager: Its Result holds TLI pointer and pointers to individual >>>>>>>> AA result objects. >>>>>>>> AssumptionAnalysis: No pointers to other analyses. >>>>>>>> BlockFrequencyAnalysis: Its Result holds pointers to LoopInfo and >>>>>>>> BPI. >>>>>>>> BranchProbabilityAnalysis: Stores no pointers to other analyses. >>>>>>>> (uses LoopInfo to "recalculate" though) >>>>>>>> DominatorTreeAnalysis: Stores no pointers to other analyses. >>>>>>>> PostDominatorTreeAnalysis: Stores no pointers to other analyses. >>>>>>>> DemandedBitsAnalysis: Stores pointers to AssumptionCache >>>>>>>> and DominatorTree >>>>>>>> DominanceFrontierAnalysis: Stores no pointers to other analyses. >>>>>>>> (uses DominatorTreeAnalysis for "recalculate" though). >>>>>>>> LoopInfo: Uses DominatorTreeAnalysis for "recalculate" but stores >>>>>>>> no pointers. >>>>>>>> LazyValueAnalysis: Stores pointers to AssumptionCache, >>>>>>>> TargetLibraryInfo, DominatorTree. >>>>>>>> DependenceAnalysis: Stores pointers to AliasAnalysis, >>>>>>>> ScalarEvolution, LoopInfo >>>>>>>> MemoryDependenceAnalysis: Stores pointers to AliasAnalysis, >>>>>>>> AssumptionCache, TargetLibraryInfo, DominatorTree >>>>>>>> MemorySSAAnalysis: Stores pointers to AliasAnalysis, DominatorTree >>>>>>>> RegionInfoAnalysis: Stores pointers to DomTree, PostDomTree, >>>>>>>> DomFrontier >>>>>>>> ScalarEvolutionAnalysis: Stores pointers to TargetLibraryInfo, >>>>>>>> AssumptionCache, DominatorTree, LoopInfo >>>>>>>> TargetLibraryAnalysis: Has no dependencies >>>>>>>> TargetIRAnalysis: Has no dependencies. >>>>>>>> >>>>>>>> Function alias analyses: 3/5 keep pointers to other analyses >>>>>>>> BasicAA: Keeps pointers to TargetLibraryInfo, AssumptionCache, >>>>>>>> DominatorTree, LoopInfo >>>>>>>> CFLAA: Keeps pointer to TargetLibraryInfo >>>>>>>> SCEVAA: Keeps pointer to ScalarEvolution >>>>>>>> ScopedNoAliasAA: No dependencies >>>>>>>> TypeBasedAA: No dependencies >>>>>>>> >>>>>>>> >>>>>>>> Total: 13/28 analyses (~50%) hold pointers to other analyses. >>>>>>>> Of the 15/28 analyses that don't hold pointers, 12/15 simply have >>>>>>>> no dependencies. Only 3/15 (BPI, LoopInfo, DominanceFrontier) have >>>>>>>> dependencies that are used just for a "recalculate" step that retains no >>>>>>>> pointers. >>>>>>>> So I think it is fair to say that analyses which hold pointers to >>>>>>>> other analyses is not an exceptional case. In fact, analyses that use other >>>>>>>> analyses just for a "recalculate" step seems to be the exceptional case >>>>>>>> (only 3/28 or about 10%) >>>>>>>> >>>>>>> >>>>>>> Interesting! >>>>>>> >>>>>>> Most of these look like they hold a pointer to the root analysis as >>>>>>> opposed to detailed objects *inside* the analysis? >>>>>>> >>>>>>> It might make sense to try to handle this very specific pattern in a >>>>>>> special way of overriding the invalidate routines is too error prone.... We >>>>>>> could try to make this work "automatically" but I'm worried this would be >>>>>>> challenging to get right. Open to suggestions of course. >>>>>>> >>>>>>> Any other ideas about what would make sense to handle this? >>>>>>> >>>>>>> Does it make sense to override the invalidate routines now and >>>>>>> iterate from there? I feel like you've done a lot of the research necessary >>>>>>> for this already... >>>>>>> >>>>>> >>>>>> I'll keep pushing forward tomorrow with building test-suite >>>>>> successfully using the new PM for the LTO pipeline (I was doing some >>>>>> unrelated LLD stuff for most of today). It will be interesting to see how >>>>>> many `invalidate` overrides will be needed to avoid these issues for just >>>>>> the LTO pipeline on test-suite. >>>>>> >>>>> >>>>> I spent the better part of today working on this and will continue >>>>> tomorrow; this problem seems nastier than I thought. For some reason the >>>>> LTO pipeline (or something about LTO) seems to hit on these issues much >>>>> more (I'm talking like 40k lines of ASan error reports from building >>>>> test-suite with the LTO pipeline in the new PM; per-TU steps still using >>>>> the old PM). Some notes: >>>>> >>>>> - BasicAA's dependence on domtree and loopinfo in the new PM seems to >>>>> account for quite a few of the problems. >>>>> - BasicAA and other stuff are marked (by overriding `invalidate` to >>>>> return false) to never be invalidated because they are "stateless". However >>>>> they still hold pointers and so they do need to be invalidated. >>>>> - CallGraph uses AssertingVH (PR28400) and so I needed a workaround >>>>> similar to r274656 in various passes. >>>>> - D21921 is holding up -- I haven't hit any issues with the core >>>>> logic of that patch. >>>>> - AAResults holds handles to various AA result objects. This means it >>>>> pretty much always needs to be invalidated unless you are sure that none of >>>>> the AA's will get invalidated. >>>>> >>>>> >>>>> The existing `invalidate` method doesn't have the right semantics for >>>>> even an error-prone solution :( We are going to need to make some >>>>> significant changes to even get basic sanity I think. Perhaps each analysis >>>>> can expose a "preserve" static function. E.g. instead of >>>>> `PA.preserve<Foo>();` you have to do `Foo::setPreserved(PA);`. >>>>> I'm actually not quite sure that that will even work. Once I have >>>>> test-suite fully building successfully with the LTO pipeline in the new PM >>>>> I'll be able to give a more confident answer (esp. w.r.t. the manager for >>>>> different IRUnitT's). >>>>> But at this point I'm not confident running *any* pass pipeline in the >>>>> new PM without at least assertions+ASan. >>>>> >>>>> We may want to have a proper design discussion around this problem >>>>> though. >>>>> >>>>> Also I'd like to have test-suite working (by hook or by crook) with >>>>> LTO in the new PM so we can get some numbers on the resident set impact of >>>>> all this caching; if it is really problematic then we may need to start >>>>> talking front-and-center about different invalidation policies for keeping >>>>> this in check instead of leaving it as something that we will be able to >>>>> patch later. >>>>> >>>>> >>>>> >>>>> The more I think about it, the more I'm convinced that the real "hard" >>>>> problem that the new PM is exposing us to is having the ability for any >>>>> pass to ask for any analysis on any IRUnitT (and any specific IRUnit of >>>>> that IRUnitT) and have the result stored somewhere and then invalidated. >>>>> This means that "getAnalysisUsage" is not just a list of passes, but much >>>>> more complicated and is essentially a set of arbitrary pairs "(analysis, >>>>> IRUnit)" (and the associated potential tangle of dependencies between the >>>>> state cached on these tuples). With the old PM, you essentially are looking >>>>> at a problem of scheduling the lifetime of analyses of the same IRUnit >>>>> intermingled with transformation passes on that same IRUnit, so you only >>>>> have the "analysis" part of the tuple above, making things much simpler >>>>> (and handling dependencies is much simpler too). We've obviously outgrown >>>>> this model with examples like LAA, AssumptionCacheTracker, etc. that hack >>>>> around this in the old PM. We may want to have a fresh re-examination of >>>>> what problems we are exactly trying to solve. >>>>> >>>>> For me, my main concern now is what changes need to be made in order >>>>> to feel confident running a pipeline in the new PM without assertions+ASan. >>>>> >>>>> >>>>> Sorry for the long post, just brain-dumping before heading home. >>>>> >>>>> -- Sean Silva >>>>> >>>>> >>>>> >>>>> >>>>>> >>>>>> -- Sean Silva >>>>>> >>>>>> >>>>> >>>> >>> >> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160724/be2017bd/attachment.html>
Hal Finkel via llvm-dev
2016-Jul-25 16:27 UTC
[llvm-dev] [PM] I think that the new PM needs to learn about inter-analysis dependencies...
----- Original Message -----> From: "Sean Silva" <chisophugis at gmail.com> > To: "Chandler Carruth" <chandlerc at gmail.com> > Cc: "Xinliang David Li" <davidxl at google.com>, "llvm-dev" > <llvm-dev at lists.llvm.org>, "Davide Italiano" > <dccitaliano at gmail.com>, "Tim Amini Golling" > <mehdi.amini at apple.com>, "Hal Finkel" <hfinkel at anl.gov>, "Sanjoy > Das" <sanjoy at playingwithpointers.com>, "Pete Cooper" > <peter_cooper at apple.com> > Sent: Friday, July 22, 2016 3:55:52 AM > Subject: Re: [PM] I think that the new PM needs to learn about > inter-analysis dependencies...> The more closely I look at this, the more it seems like there may be > a useful incremental step in the transition to the new PM: use the > new PM analysis machinery in the old PM. If this is possible, it > will simplify the old PM and (hopefully) allow an incremental > transition to the new PM instead of a flag day transition for the > switch.> I.e., AFAICT, the new PM transition is essentially about 2 mostly > orthogonal aspects of running optimization pipelines: > 1. Analysis computation and analysis result lifetime management > (including avoiding analysis recomputation) > 2. Running transformation passes over their respective IRUnit's in > some order> These are conflated in the old PM. In reality, the only interaction > between them (with the new PM machinery for 1.) is a small number of > places within 2. which need to call 'invalidate'.> I'm pretty sure that 2. is fairly similar in the new PM and old PM > (the main difference is that the notion of "adapters" is split out > in the new PM). The analysis handling seems to be what makes the old > PM so difficult to understand (e.g. it is the cause of the multiple > inheritance in the implementation). Trying to unify analyses and > transformations (and some questionable (in hindsight) implementation > decisions) seems to be the main "problem" with the design of the old > PM AFAICT (there are other issues, but they are more "nice to > have").> IMO it is an anti-pattern to think of analyses as "passes". There are > just "analyses" and "transformations" and they are two separate > things. In fact, the `run` method on analyses should probably be > called `computeResult` or something like that to avoid confusion.This makes sense to me. We do currently have some "in between" passes, like LCSSA, which are transformations, but are required by other passes, and transform the IR but whose preservation represents properties of the IR. The particulars of how we handle LCSSA aside (e.g. I think we should preserve it more, perhaps everywhere), how are we planning on handling this class of things? Thanks again, Hal> And the `run` method on transformations could just as easily be > called `performTransformation`.> I remember asking and getting and answer from Chandler ( > http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150907/299083.html > ) about how to coexist the old and new PM compatibly so individual > passes would be able to work for both and we wouldn't need to have a > flag day. I wasn't able to find the discussions that Chandler cited, > but I suspect that the answer is that we didn't have enough analyses > ported at that point to consider sharing the analysis management > between the old and new PM.> If this works out it may be the evolutionary path we have all been > wanting for this pass manager transition. Fingers crossed. Hopefully > I'm not overlooking some major issue.> Anyway... back to working on the analysis manager dependency > tracking.> -- Sean Silva> On Thu, Jul 21, 2016 at 1:59 AM, Sean Silva < chisophugis at gmail.com > > wrote:> > We did some basic sanity checking that memory usage didn't go out > > of > > control (it doesn't; at least with with a simple > > preserves-all/preserves-none invalidation scheme and the current > > LTO > > pipeline). Also, I did some basic sanity checking for compile time. > > The simple preserves-all/preserves-none invalidation scheme seems > > marginally slower, but no real conclusions (besides a simple sanity > > check) can be drawn without the real analysis preservation > > semantics > > in place. >> > I'll start working on fixing the analysis managers. There seem to > > basically be two parts (although they may need to be done > > simultaneously to make sure all the pieces fit together): > > > - unify all the analysis managers into a single analysis manager > > for > > all IRUnitT's (requires type-erasing the IRUnit) > > > - introduce the dependency tracking machinery >> > I think I gave a reasonable outline in the two posts above: > > > - the one starting with " To clarify, it seems like the current new > > PM is essentially trying to solve the problem of > > maintaining/updating a mapping" > > > - the one starting with " Yeah, the mechanics of maintaining this > > fully general mapping are straightforward in the abstract" >> > I'm happy to do a centralized writeup if anybody wants. Just let me > > know. >> > As far as changes to the code, t he updates to the new PM passes > > should hopefully be mechanical (despite there being many of them). > > However, from what I can tell, fixing this problem will require > > touching most lines of the analysis manager machinery so the diff > > will probably be a bit scary, even though I think we can keep the > > same basic structure (i.e. a per-IRUnit std::list holding one > > analysis result (at a stable address) per element, combined with a > > DenseMap from (Analysis, IRUnit) to an element of the per-IRUnit > > storage list (see AnalysisResultListMapT and AnalysisResultMapT in > > include/llvm/IR/PassManager.h)). > > > The main changes to the analysis manager will be: > > > - type-erasing the IRUnit > > > - the elements of the AnalysisResultListMapT will need to keep > > track > > of any dependents > > > - the analysis manager will need to update those dependencies as it > > is re-entered by analyses getting results of other analyses > > > - the analysis manager will need to walk the dependencies to do > > transitive invalidation >> > I think the most robust approach is for analysis dependencies to be > > implicitly constructed by the analysis manager via tracking > > entry/exit from get{,Cached}Result. > > > One alternative is for analyses to explicitly pass in their ID to > > getResult to indicate the dependency, but that seems error-prone > > (and also verbose). The issue is that we will need a getResult API > > that does not track dependencies for use by transformation passes > > (since there is no dependency to track in that case); so an > > innocuous copy-paste from a transform pass to an analysis can > > result > > in a failure to track dependencies and risk of use-after-free (we > > could fight this with the type system, but I think that would get a > > bit verbose (I'm willing to try it though if people would prefer)) > > > One restriction of the implicit tracking approach is that it > > requires > > all calls into the analysis manager to occur in the `run` method of > > the analysis (so that the dependencies are implicitly tracked via > > re-entrance to get{,Cached}Result); is this a reasonable > > restriction? >> > One annoying problem is that I think that the dependency links will > > need to be bidirectional. To use the example analysis cache from my > > other post: >> > (AssumptionAnalysis, function @bar) -> (AssumptionCache for @bar, > > [(SomeModuleAnalysis, module TheModule)]) > > > (AssumptionAnalysis, function @baz) -> (AssumptionCache for @baz, > > [(SomeModuleAnalysis, module TheModule)]) > > > (SomeModuleAnalysis, module TheModule) -> (SomeModuleAnalysisResult > > for TheModule, [(SomeFunctionAnalysis, function @baz)]) > > > (SomeFunctionAnalysis, function @baz) -> > > (SomeFunctionAnalysisResult > > for @baz, []) >> > if we delete function @baz, then the dependent list > > [(SomeFunctionAnalysis, function @baz)] for ` (SomeModuleAnalysis, > > module TheModule)` will now have a stale pointer to function @baz. > > I > > think that in practice we can check to see if ` > > (SomeFunctionAnalysis, function @baz)` is in our hash table and if > > it isn't then just ignore the dependency as "this dependent > > analysis > > result has already been freed". In the worst case (memory allocator > > reuses the memory for another function) we may spuriously free an > > analysis result for a different function. However it is still > > unsettling (and may actually be UB in C++?). > > > Ideally we would track bidirectional links; that way when we remove > > an analysis result we also have it remove itself from dependence > > lists of all of its dependencies. >> > -- Sean Silva >> > On Fri, Jul 15, 2016 at 8:40 PM, Sean Silva < chisophugis at gmail.com > > > > > wrote: >> > > On Fri, Jul 15, 2016 at 8:39 PM, Sean Silva < > > > chisophugis at gmail.com > > > > > > > wrote: > > >> > > > It looks like there is really no sane fix within the current > > > > infrastructure. I've had to essentially trigger invalidation > > > > (except > > > > in the PreservedAnalyses::all() case) in the function pass > > > > manager > > > > and function to loop adapters. > > > > > > > > > invalidation of *everything* I mean. > > >> > > -- Sean Silva > > >> > > > So we basically need to get the analysis manager dependency > > > > tracking > > > > fixed. > > > > > >> > > > Davide and I will get measurements on the resident set impact > > > > of > > > > all > > > > this caching (even with the overconservative invalidation for > > > > now) > > > > to see the impact. If there is a big rss impact then we > > > > probably > > > > want to consider that problem at the same time as the rewrite > > > > of > > > > the > > > > analysis manager. > > > > > >> > > > -- Sean Silva > > > > > >> > > > On Thu, Jul 14, 2016 at 12:51 AM, Sean Silva < > > > > chisophugis at gmail.com > > > > > wrote: > > > > > >> > > > > On Wed, Jul 13, 2016 at 1:48 AM, Sean Silva < > > > > > chisophugis at gmail.com > > > > > > > > > > > wrote: > > > > > > > > > >> > > > > > On Wed, Jul 13, 2016 at 12:34 AM, Chandler Carruth < > > > > > > chandlerc at gmail.com > wrote: > > > > > > > > > > > > > > >> > > > > > > On Wed, Jul 13, 2016 at 12:25 AM Sean Silva < > > > > > > > chisophugis at gmail.com > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > >> > > > > > > > On Tue, Jul 12, 2016 at 11:39 PM, Chandler Carruth < > > > > > > > > chandlerc at gmail.com > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > On Tue, Jul 12, 2016 at 11:34 PM Sean Silva < > > > > > > > > > chisophugis at gmail.com > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > On Tue, Jul 12, 2016 at 11:32 PM, Xinliang David Li > > > > > > > > > > < > > > > > > > > > > davidxl at google.com > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > On Tue, Jul 12, 2016 at 10:57 PM, Chandler > > > > > > > > > > > Carruth > > > > > > > > > > > < > > > > > > > > > > > chandlerc at gmail.com > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > Yea, this is a nasty problem. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > One important thing to understand is that this > > > > > > > > > > > > is > > > > > > > > > > > > specific > > > > > > > > > > > > to > > > > > > > > > > > > analyses which hold references to other > > > > > > > > > > > > analyses. > > > > > > > > > > > > While > > > > > > > > > > > > this > > > > > > > > > > > > isn't > > > > > > > > > > > > unheard of, it isn't as common as it could be. > > > > > > > > > > > > Still, > > > > > > > > > > > > definitely > > > > > > > > > > > > something we need to address. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > We can call this type of dependencies (holding > > > > > > > > > > > references) > > > > > > > > > > > hard-dependency. The soft dependency refers to > > > > > > > > > > > the > > > > > > > > > > > case > > > > > > > > > > > where > > > > > > > > > > > analysis 'A' depends on 'B' during computation, > > > > > > > > > > > but > > > > > > > > > > > does > > > > > > > > > > > not > > > > > > > > > > > need > > > > > > > > > > > 'B' once it is computed. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > There are actually quite a few examples of > > > > > > > > > > > hard-dependency > > > > > > > > > > > case. > > > > > > > > > > > For > > > > > > > > > > > instance LoopAccessInfo, LazyValueInfo etc which > > > > > > > > > > > hold > > > > > > > > > > > references > > > > > > > > > > > to > > > > > > > > > > > other analyses. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > Problem involving hard-dependency is actually > > > > > > > > > > > easier > > > > > > > > > > > to > > > > > > > > > > > detect, > > > > > > > > > > > as > > > > > > > > > > > it > > > > > > > > > > > is usually a compile time problem. Issues > > > > > > > > > > > involving > > > > > > > > > > > soft > > > > > > > > > > > dependencies are more subtle and can lead to > > > > > > > > > > > wrong > > > > > > > > > > > code > > > > > > > > > > > gen. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Did you mean to say that soft-dependency problems > > > > > > > > > > are > > > > > > > > > > easier > > > > > > > > > > to > > > > > > > > > > detect? At least my intuition is that > > > > > > > > > > soft-dependency > > > > > > > > > > is > > > > > > > > > > easier > > > > > > > > > > because there is no risk of dangling pointers to > > > > > > > > > > other > > > > > > > > > > analyses. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > The issue is that the fact that there is *any* > > > > > > > > > dependency > > > > > > > > > isn't > > > > > > > > > clear. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > However, I think the only real problem here are these > > > > > > > > > "hard > > > > > > > > > dependencies" (I don't really like that term though). > > > > > > > > > For > > > > > > > > > others, > > > > > > > > > only an analysis that is *explicitly* preserved > > > > > > > > > survives. > > > > > > > > > So > > > > > > > > > I'm > > > > > > > > > not > > > > > > > > > worried about the fact that people have to remember > > > > > > > > > this. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > The question is how often there are > > > > > > > > > cross-data-structure > > > > > > > > > references. > > > > > > > > > David mentions a few examples, and I'm sure there are > > > > > > > > > more, > > > > > > > > > but > > > > > > > > > it > > > > > > > > > isn't clear to me yet whether this is pervasive or > > > > > > > > > occasional. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I just did a quick run-through of PassRegistry.def and > > > > > > > > this > > > > > > > > is > > > > > > > > what > > > > > > > > I > > > > > > > > found: > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > Module analyses: 0/5 hold pointers to other analyses > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > CallGraph: No pointers to other analyses. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > LazyCallGraph: No pointers to other analyses. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > ProfileSummaryAnalysis: No pointers to other analyses. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > TargetLibraryAnalysis: No pointers to other analyses. > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > VerifierAnalysis: No pointers to other analyses. > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > Module alias analyses: 1/1 keeps pointer to other > > > > > > > > analysis. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > GlobalsAA: Result keeps pointer to TLI (this is a > > > > > > > > function > > > > > > > > analysis). > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > Function analyses: 9/17 keep pointers to other analysis > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > AAManager: Its Result holds TLI pointer and pointers to > > > > > > > > individual > > > > > > > > AA > > > > > > > > result objects. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > AssumptionAnalysis: No pointers to other analyses. > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > BlockFrequencyAnalysis: Its Result holds pointers to > > > > > > > > LoopInfo > > > > > > > > and > > > > > > > > BPI. > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > BranchProbabilityAnalysis: Stores no pointers to other > > > > > > > > analyses. > > > > > > > > (uses LoopInfo to "recalculate" though) > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > DominatorTreeAnalysis: Stores no pointers to other > > > > > > > > analyses. > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > PostDominatorTreeAnalysis: Stores no pointers to other > > > > > > > > analyses. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > DemandedBitsAnalysis: Stores pointers to > > > > > > > > AssumptionCache > > > > > > > > and > > > > > > > > DominatorTree > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > DominanceFrontierAnalysis: Stores no pointers to other > > > > > > > > analyses. > > > > > > > > (uses DominatorTreeAnalysis for "recalculate" though). > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > LoopInfo: Uses DominatorTreeAnalysis for "recalculate" > > > > > > > > but > > > > > > > > stores > > > > > > > > no > > > > > > > > pointers. > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > LazyValueAnalysis: Stores pointers to AssumptionCache, > > > > > > > > TargetLibraryInfo, DominatorTree. > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > DependenceAnalysis: Stores pointers to AliasAnalysis, > > > > > > > > ScalarEvolution, LoopInfo > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > MemoryDependenceAnalysis: Stores pointers to > > > > > > > > AliasAnalysis, > > > > > > > > AssumptionCache, TargetLibraryInfo, DominatorTree > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > MemorySSAAnalysis: Stores pointers to AliasAnalysis, > > > > > > > > DominatorTree > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > RegionInfoAnalysis: Stores pointers to DomTree, > > > > > > > > PostDomTree, > > > > > > > > DomFrontier > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > ScalarEvolutionAnalysis: Stores pointers to > > > > > > > > TargetLibraryInfo, > > > > > > > > AssumptionCache, DominatorTree, LoopInfo > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > TargetLibraryAnalysis: Has no dependencies > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > TargetIRAnalysis: Has no dependencies. > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > Function alias analyses: 3/5 keep pointers to other > > > > > > > > analyses > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > BasicAA: Keeps pointers to TargetLibraryInfo, > > > > > > > > AssumptionCache, > > > > > > > > DominatorTree, LoopInfo > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > CFLAA: Keeps pointer to TargetLibraryInfo > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > SCEVAA: Keeps pointer to ScalarEvolution > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > ScopedNoAliasAA: No dependencies > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > TypeBasedAA: No dependencies > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > Total: 13/28 analyses (~50%) hold pointers to other > > > > > > > > analyses. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Of the 15/28 analyses that don't hold pointers, 12/15 > > > > > > > > simply > > > > > > > > have > > > > > > > > no > > > > > > > > dependencies. Only 3/15 (BPI, LoopInfo, > > > > > > > > DominanceFrontier) > > > > > > > > have > > > > > > > > dependencies that are used just for a "recalculate" > > > > > > > > step > > > > > > > > that > > > > > > > > retains no pointers. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > So I think it is fair to say that analyses which hold > > > > > > > > pointers > > > > > > > > to > > > > > > > > other analyses is not an exceptional case. In fact, > > > > > > > > analyses > > > > > > > > that > > > > > > > > use other analyses just for a "recalculate" step seems > > > > > > > > to > > > > > > > > be > > > > > > > > the > > > > > > > > exceptional case (only 3/28 or about 10%) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Interesting! > > > > > > > > > > > > > > > > > > > > >> > > > > > > Most of these look like they hold a pointer to the root > > > > > > > analysis > > > > > > > as > > > > > > > opposed to detailed objects *inside* the analysis? > > > > > > > > > > > > > > > > > > > > >> > > > > > > It might make sense to try to handle this very specific > > > > > > > pattern > > > > > > > in > > > > > > > a > > > > > > > special way of overriding the invalidate routines is too > > > > > > > error > > > > > > > prone.... We could try to make this work "automatically" > > > > > > > but > > > > > > > I'm > > > > > > > worried this would be challenging to get right. Open to > > > > > > > suggestions > > > > > > > of course. > > > > > > > > > > > > > > > > > > > > >> > > > > > > Any other ideas about what would make sense to handle > > > > > > > this? > > > > > > > > > > > > > > > > > > > > >> > > > > > > Does it make sense to override the invalidate routines > > > > > > > now > > > > > > > and > > > > > > > iterate from there? I feel like you've done a lot of the > > > > > > > research > > > > > > > necessary for this already... > > > > > > > > > > > > > > > > > > > > > > > > > > > I'll keep pushing forward tomorrow with building test-suite > > > > > > successfully using the new PM for the LTO pipeline (I was > > > > > > doing > > > > > > some > > > > > > unrelated LLD stuff for most of today). It will be > > > > > > interesting > > > > > > to > > > > > > see how many `invalidate` overrides will be needed to avoid > > > > > > these > > > > > > issues for just the LTO pipeline on test-suite. > > > > > > > > > > > > > > > > > > > > I spent the better part of today working on this and will > > > > > continue > > > > > tomorrow; this problem seems nastier than I thought. For some > > > > > reason > > > > > the LTO pipeline (or something about LTO) seems to hit on > > > > > these > > > > > issues much more (I'm talking like 40k lines of ASan error > > > > > reports > > > > > from building test-suite with the LTO pipeline in the new PM; > > > > > per-TU > > > > > steps still using the old PM). Some notes: > > > > > > > > > >> > > > > - BasicAA's dependence on domtree and loopinfo in the new PM > > > > > seems > > > > > to > > > > > account for quite a few of the problems. > > > > > > > > > > > > > > > - BasicAA and other stuff are marked (by overriding > > > > > `invalidate` > > > > > to > > > > > return false) to never be invalidated because they are > > > > > "stateless". > > > > > However they still hold pointers and so they do need to be > > > > > invalidated. > > > > > > > > > > > > > > > - CallGraph uses AssertingVH (PR28400) and so I needed a > > > > > workaround > > > > > similar to r274656 in various passes. > > > > > > > > > > > > > > > - D21921 is holding up -- I haven't hit any issues with the > > > > > core > > > > > logic of that patch. > > > > > > > > > > > > > > > - AAResults holds handles to various AA result objects. This > > > > > means > > > > > it > > > > > pretty much always needs to be invalidated unless you are > > > > > sure > > > > > that > > > > > none of the AA's will get invalidated. > > > > > > > > > >> > > > > The existing `invalidate` method doesn't have the right > > > > > semantics > > > > > for > > > > > even an error-prone solution :( We are going to need to make > > > > > some > > > > > significant changes to even get basic sanity I think. Perhaps > > > > > each > > > > > analysis can expose a "preserve" static function. E.g. > > > > > instead > > > > > of > > > > > `PA.preserve<Foo>();` you have to do > > > > > `Foo::setPreserved(PA);`. > > > > > > > > > > > > > > > I'm actually not quite sure that that will even work. Once I > > > > > have > > > > > test-suite fully building successfully with the LTO pipeline > > > > > in > > > > > the > > > > > new PM I'll be able to give a more confident answer (esp. > > > > > w.r.t. > > > > > the > > > > > manager for different IRUnitT's). > > > > > > > > > > > > > > > But at this point I'm not confident running *any* pass > > > > > pipeline > > > > > in > > > > > the new PM without at least assertions+ASan. > > > > > > > > > >> > > > > We may want to have a proper design discussion around this > > > > > problem > > > > > though. > > > > > > > > > >> > > > > Also I'd like to have test-suite working (by hook or by > > > > > crook) > > > > > with > > > > > LTO in the new PM so we can get some numbers on the resident > > > > > set > > > > > impact of all this caching; if it is really problematic then > > > > > we > > > > > may > > > > > need to start talking front-and-center about different > > > > > invalidation > > > > > policies for keeping this in check instead of leaving it as > > > > > something that we will be able to patch later. > > > > > > > > > >> > > > > The more I think about it, the more I'm convinced that the > > > > > real > > > > > "hard" problem that the new PM is exposing us to is having > > > > > the > > > > > ability for any pass to ask for any analysis on any IRUnitT > > > > > (and > > > > > any > > > > > specific IRUnit of that IRUnitT) and have the result stored > > > > > somewhere and then invalidated. This means that > > > > > "getAnalysisUsage" > > > > > is not just a list of passes, but much more complicated and > > > > > is > > > > > essentially a set of arbitrary pairs "(analysis, IRUnit)" > > > > > (and > > > > > the > > > > > associated potential tangle of dependencies between the state > > > > > cached > > > > > on these tuples). With the old PM, you essentially are > > > > > looking > > > > > at > > > > > a > > > > > problem of scheduling the lifetime of analyses of the same > > > > > IRUnit > > > > > intermingled with transformation passes on that same IRUnit, > > > > > so > > > > > you > > > > > only have the "analysis" part of the tuple above, making > > > > > things > > > > > much > > > > > simpler (and handling dependencies is much simpler too). > > > > > We've > > > > > obviously outgrown this model with examples like LAA, > > > > > AssumptionCacheTracker, etc. that hack around this in the old > > > > > PM. > > > > > We > > > > > may want to have a fresh re-examination of what problems we > > > > > are > > > > > exactly trying to solve. > > > > > > > > > >> > > > > For me, my main concern now is what changes need to be made > > > > > in > > > > > order > > > > > to feel confident running a pipeline in the new PM without > > > > > assertions+ASan. > > > > > > > > > >> > > > > Sorry for the long post, just brain-dumping before heading > > > > > home. > > > > > > > > > >> > > > > -- Sean Silva > > > > > > > > > >> > > > > > -- Sean Silva > > > > > > > > > > > > > > >-- Hal Finkel Assistant Computational Scientist Leadership Computing Facility Argonne National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160725/7c448c14/attachment.html>
Sean Silva via llvm-dev
2016-Jul-25 22:16 UTC
[llvm-dev] [PM] I think that the new PM needs to learn about inter-analysis dependencies...
On Mon, Jul 25, 2016 at 9:27 AM, Hal Finkel <hfinkel at anl.gov> wrote:> > ------------------------------ > > *From: *"Sean Silva" <chisophugis at gmail.com> > *To: *"Chandler Carruth" <chandlerc at gmail.com> > *Cc: *"Xinliang David Li" <davidxl at google.com>, "llvm-dev" < > llvm-dev at lists.llvm.org>, "Davide Italiano" <dccitaliano at gmail.com>, "Tim > Amini Golling" <mehdi.amini at apple.com>, "Hal Finkel" <hfinkel at anl.gov>, > "Sanjoy Das" <sanjoy at playingwithpointers.com>, "Pete Cooper" < > peter_cooper at apple.com> > *Sent: *Friday, July 22, 2016 3:55:52 AM > *Subject: *Re: [PM] I think that the new PM needs to learn about > inter-analysis dependencies... > > The more closely I look at this, the more it seems like there may be a > useful incremental step in the transition to the new PM: use the new PM > analysis machinery in the old PM. If this is possible, it will simplify the > old PM and (hopefully) allow an incremental transition to the new PM > instead of a flag day transition for the switch. > > I.e., AFAICT, the new PM transition is essentially about 2 mostly > orthogonal aspects of running optimization pipelines: > 1. Analysis computation and analysis result lifetime management (including > avoiding analysis recomputation) > 2. Running transformation passes over their respective IRUnit's in some > order > > These are conflated in the old PM. In reality, the only interaction > between them (with the new PM machinery for 1.) is a small number of places > within 2. which need to call 'invalidate'. > > I'm pretty sure that 2. is fairly similar in the new PM and old PM (the > main difference is that the notion of "adapters" is split out in the new > PM). The analysis handling seems to be what makes the old PM so difficult > to understand (e.g. it is the cause of the multiple inheritance in the > implementation). Trying to unify analyses and transformations (and some > questionable (in hindsight) implementation decisions) seems to be the main > "problem" with the design of the old PM AFAICT (there are other issues, but > they are more "nice to have"). > > IMO it is an anti-pattern to think of analyses as "passes". There are just > "analyses" and "transformations" and they are two separate things. In fact, > the `run` method on analyses should probably be called `computeResult` or > something like that to avoid confusion. > > This makes sense to me. > > We do currently have some "in between" passes, like LCSSA, which are > transformations, but are required by other passes, and transform the IR but > whose preservation represents properties of the IR. The particulars of how > we handle LCSSA aside (e.g. I think we should preserve it more, perhaps > everywhere), how are we planning on handling this class of things? >The new PM doesn't currently have a concept like this. As you mentioned, it is a weird cross between a transformation and an analysis: it can be "invalidated" like an analysis, but "recomputing" it actually mutates the IR like a transformation. I'd like to preface the below with the following: No matter how we ultimately address this requirement, my preference is that we do so in a way that applies to the old PM. This is a case where the old PM supports a richer set of functionality than the new PM. By incrementally refactoring the old PM away from its use of this extra capability and towards whatever "new" way there is to do it, we will understand better what it is that we actually need. (and sorry for the brain dump in the rest of this post) I have not seen any mention of a solution to this problem besides "we shouldn't do that", which is sort of a cop-out. Here is a strawman proposal: If it isn't too expensive, one simple alternative is to have passes just make a call to a utility function to put things in LCSSA if they need it (this utility would also have to invalidate analyses). If that ends up being expensive, we can have a dummy "indicator" analysis IRIsInLCSSAForm which, if cached, means "don't bother to call the utility function". We could maybe just use the LCSSA pass directly to do the transformation. LCSSA could have IRIsInLCSSAForm as an member typedef `IndicatorT` so it can be accessed generically. We could then support an API like: ``` FooTransformation.cpp: PreservedAnalyses FooTransformation::run(Function &F, AnalysisManager AM) { // Must be called before getting analyses, as it might invalidate some. canonicalizeIR<LCSSA>(F, AM); ... } include/IR/Canonicalization.h: template <typename CanonicalizationT, typename IRUnitT> void canonicalizeIR(IRUnitT &IR, AnalysisManager &AM) { using IndicatorT = typename CanonicalizationT::IndicatorAnalysis; if (AM.getCachedResult<IndicatorT>(IR)) return; CanonicalizationT C; PreservedAnalysis PA = C.run(IR, AM); AM.invalidate(IR, PA); (void)AM.getResult<IndicatorT>(IR); } ``` One place that talks about this problem of "requiring a transformation" is http://llvm.org/devmtg/2014-04/PDFs/Talks/Passes.pdf on slide 17. One reason it provides for "we shouldn't do that" is that if you think about these things as "canonicalize the IR into a specific form", then when you have N>1 such dependencies (like some passes do on LoopSimplify and LCSSA), one must have a subset of the requirements of the other. I.e. you can't have two canonicalizations that "fight" each other. Using an explicit mutation API like the strawman above is a bit less bulletproof than scheduling based on statically known interferences between canonicalizations (e.g. CanonicalizationA may invalidate CanonicalizationB, but not the reverse, so it would automatically know to run CanonicalizationA before CanonicalizationB), but given that we have relatively few "canonicalizations" (to give them a name) that use this feature of the old PM, it may be livable (at least in the middle-end, it seems like there is just LCSSA, LoopSimplify, BreakCriticalEdges, and LowerSwitch in calls to addPreservedID/addRequiredID). I don't find the "Causes rampant re-running of invalidated analyses" argument in that slide convincing. If a pass needs the IR in LCSSA then it needs it. There isn't much we can do about that. One invariant I'd like to preserve in the new pass manager is that whatever pipeline is provided on the opt command line, we end up running something "valid"; so a cop-out like "if a pass needs LCSSA, you need to make sure to add LCSSA at an appropriate place before it in the pipeline" is not something I think is reasonable (way too error-prone). Small rant: We already are in this error-prone situation in the new PM with the need to call `getCachedResult` to access analyses from a larger IRUnitT (e.g. the situation I explained in the post-commit thread of r274712); this means that when constructing a pass pipeline to pass to opt, you need to do various things to ensure your pipeline even makes sense: 1. You need to make sure to always add the `requires<foo>` at a higher level for any pass that needs it (how do you find out what each pass needs besides grepping the source?), otherwise you get an assertion failure / report_fatal_error 2. You need to make sure that no intervening transformation at the lower level invalidates the outer analysis before getting to your pass that needs it. E.g. `module(require<foo>,function(pass-that-invalidates-foo,pass-that-requires-foo)`. I don't see the point of this restriction. In a hypothetical future where we run over inner IRUnit's in parallel we would need some awareness to avoid races when getting analyses for outer IRUnit's. But doing that serialization dynamically as the analysis is accessed (and perhaps even centralized in the analysis manager and pass management layers) seems better than just always computing it at a higher level and crossing your fingers that you got your pass pipeline right and the outer analysis doesn't get invalidated before it is needed. -- Sean Silva> > Thanks again, > Hal > > And the `run` method on transformations could just as easily be called > `performTransformation`. > > > I remember asking and getting and answer from Chandler ( > http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150907/299083.html) > about how to coexist the old and new PM compatibly so individual passes > would be able to work for both and we wouldn't need to have a flag day. I > wasn't able to find the discussions that Chandler cited, but I suspect that > the answer is that we didn't have enough analyses ported at that point to > consider sharing the analysis management between the old and new PM. > > > If this works out it may be the evolutionary path we have all been wanting > for this pass manager transition. Fingers crossed. Hopefully I'm not > overlooking some major issue. > > Anyway... back to working on the analysis manager dependency tracking. > > -- Sean Silva > > On Thu, Jul 21, 2016 at 1:59 AM, Sean Silva <chisophugis at gmail.com> wrote: > >> We did some basic sanity checking that memory usage didn't go out of >> control (it doesn't; at least with with a simple >> preserves-all/preserves-none invalidation scheme and the current LTO >> pipeline). Also, I did some basic sanity checking for compile time. The >> simple preserves-all/preserves-none invalidation scheme seems marginally >> slower, but no real conclusions (besides a simple sanity check) can be >> drawn without the real analysis preservation semantics in place. >> >> I'll start working on fixing the analysis managers. There seem to >> basically be two parts (although they may need to be done simultaneously to >> make sure all the pieces fit together): >> - unify all the analysis managers into a single analysis manager for all >> IRUnitT's (requires type-erasing the IRUnit) >> - introduce the dependency tracking machinery >> >> I think I gave a reasonable outline in the two posts above: >> - the one starting with "To clarify, it seems like the current new PM is >> essentially trying to solve the problem of maintaining/updating a mapping" >> - the one starting with "Yeah, the mechanics of maintaining this fully >> general mapping are straightforward in the abstract" >> >> I'm happy to do a centralized writeup if anybody wants. Just let me know. >> >> As far as changes to the code, the updates to the new PM passes should >> hopefully be mechanical (despite there being many of them). However, from >> what I can tell, fixing this problem will require touching most lines of >> the analysis manager machinery so the diff will probably be a bit scary, >> even though I think we can keep the same basic structure (i.e. a per-IRUnit >> std::list holding one analysis result (at a stable address) per element, >> combined with a DenseMap from (Analysis, IRUnit) to an element of the >> per-IRUnit storage list (see AnalysisResultListMapT and AnalysisResultMapT >> in include/llvm/IR/PassManager.h)). >> The main changes to the analysis manager will be: >> - type-erasing the IRUnit >> - the elements of the AnalysisResultListMapT will need to keep track of >> any dependents >> - the analysis manager will need to update those dependencies as it is >> re-entered by analyses getting results of other analyses >> - the analysis manager will need to walk the dependencies to do >> transitive invalidation >> >> I think the most robust approach is for analysis dependencies to be >> implicitly constructed by the analysis manager via tracking entry/exit from >> get{,Cached}Result. >> One alternative is for analyses to explicitly pass in their ID to >> getResult to indicate the dependency, but that seems error-prone (and also >> verbose). The issue is that we will need a getResult API that does not >> track dependencies for use by transformation passes (since there is no >> dependency to track in that case); so an innocuous copy-paste from a >> transform pass to an analysis can result in a failure to track dependencies >> and risk of use-after-free (we could fight this with the type system, but I >> think that would get a bit verbose (I'm willing to try it though if people >> would prefer)) >> One restriction of the implicit tracking approach is that it requires all >> calls into the analysis manager to occur in the `run` method of the >> analysis (so that the dependencies are implicitly tracked via re-entrance >> to get{,Cached}Result); is this a reasonable restriction? >> >> >> One annoying problem is that I think that the dependency links will need >> to be bidirectional. To use the example analysis cache from my other post: >> (AssumptionAnalysis, function @bar) -> (AssumptionCache for @bar, >> [(SomeModuleAnalysis, module TheModule)]) >> (AssumptionAnalysis, function @baz) -> (AssumptionCache for @baz, >> [(SomeModuleAnalysis, module TheModule)]) >> (SomeModuleAnalysis, module TheModule) -> (SomeModuleAnalysisResult for >> TheModule, [(SomeFunctionAnalysis, function @baz)]) >> (SomeFunctionAnalysis, function @baz) -> (SomeFunctionAnalysisResult for >> @baz, []) >> >> if we delete function @baz, then the dependent list [(SomeFunctionAnalysis, >> function @baz)] for `(SomeModuleAnalysis, module TheModule)` will now >> have a stale pointer to function @baz. I think that in practice we can >> check to see if `(SomeFunctionAnalysis, function @baz)` is in our hash >> table and if it isn't then just ignore the dependency as "this dependent >> analysis result has already been freed". In the worst case (memory >> allocator reuses the memory for another function) we may spuriously free an >> analysis result for a different function. However it is still unsettling >> (and may actually be UB in C++?). >> Ideally we would track bidirectional links; that way when we remove an >> analysis result we also have it remove itself from dependence lists of all >> of its dependencies. >> >> -- Sean Silva >> >> On Fri, Jul 15, 2016 at 8:40 PM, Sean Silva <chisophugis at gmail.com> >> wrote: >> >>> >>> >>> On Fri, Jul 15, 2016 at 8:39 PM, Sean Silva <chisophugis at gmail.com> >>> wrote: >>> >>>> It looks like there is really no sane fix within the current >>>> infrastructure. I've had to essentially trigger invalidation (except in the >>>> PreservedAnalyses::all() case) in the function pass manager and function to >>>> loop adapters. >>>> >>> >>> invalidation of *everything* I mean. >>> >>> -- Sean Silva >>> >>> >>>> >>>> So we basically need to get the analysis manager dependency tracking >>>> fixed. >>>> >>>> Davide and I will get measurements on the resident set impact of all >>>> this caching (even with the overconservative invalidation for now) to see >>>> the impact. If there is a big rss impact then we probably want to consider >>>> that problem at the same time as the rewrite of the analysis manager. >>>> >>>> -- Sean Silva >>>> >>>> On Thu, Jul 14, 2016 at 12:51 AM, Sean Silva <chisophugis at gmail.com> >>>> wrote: >>>> >>>>> >>>>> >>>>> On Wed, Jul 13, 2016 at 1:48 AM, Sean Silva <chisophugis at gmail.com> >>>>> wrote: >>>>> >>>>>> >>>>>> >>>>>> On Wed, Jul 13, 2016 at 12:34 AM, Chandler Carruth < >>>>>> chandlerc at gmail.com> wrote: >>>>>> >>>>>>> On Wed, Jul 13, 2016 at 12:25 AM Sean Silva <chisophugis at gmail.com> >>>>>>> wrote: >>>>>>> >>>>>>>> On Tue, Jul 12, 2016 at 11:39 PM, Chandler Carruth < >>>>>>>> chandlerc at gmail.com> wrote: >>>>>>>> >>>>>>>>> On Tue, Jul 12, 2016 at 11:34 PM Sean Silva <chisophugis at gmail.com> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> On Tue, Jul 12, 2016 at 11:32 PM, Xinliang David Li < >>>>>>>>>> davidxl at google.com> wrote: >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Tue, Jul 12, 2016 at 10:57 PM, Chandler Carruth < >>>>>>>>>>> chandlerc at gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> Yea, this is a nasty problem. >>>>>>>>>>>> >>>>>>>>>>>> One important thing to understand is that this is specific to >>>>>>>>>>>> analyses which hold references to other analyses. While this isn't unheard >>>>>>>>>>>> of, it isn't as common as it could be. Still, definitely something we need >>>>>>>>>>>> to address. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> We can call this type of dependencies (holding references) >>>>>>>>>>> hard-dependency. The soft dependency refers to the case where analysis 'A' >>>>>>>>>>> depends on 'B' during computation, but does not need 'B' once it is >>>>>>>>>>> computed. >>>>>>>>>>> >>>>>>>>>>> There are actually quite a few examples of hard-dependency case. >>>>>>>>>>> For instance LoopAccessInfo, LazyValueInfo etc which hold references to >>>>>>>>>>> other analyses. >>>>>>>>>>> >>>>>>>>>>> Problem involving hard-dependency is actually easier to detect, >>>>>>>>>>> as it is usually a compile time problem. Issues involving soft dependencies >>>>>>>>>>> are more subtle and can lead to wrong code gen. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Did you mean to say that soft-dependency problems are easier to >>>>>>>>>> detect? At least my intuition is that soft-dependency is easier because >>>>>>>>>> there is no risk of dangling pointers to other analyses. >>>>>>>>>> >>>>>>>>> >>>>>>>>> The issue is that the fact that there is *any* dependency isn't >>>>>>>>> clear. >>>>>>>>> >>>>>>>>> However, I think the only real problem here are these "hard >>>>>>>>> dependencies" (I don't really like that term though). For others, only an >>>>>>>>> analysis that is *explicitly* preserved survives. So I'm not worried about >>>>>>>>> the fact that people have to remember this. >>>>>>>>> >>>>>>>>> The question is how often there are cross-data-structure >>>>>>>>> references. David mentions a few examples, and I'm sure there are more, but >>>>>>>>> it isn't clear to me yet whether this is pervasive or occasional. >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> I just did a quick run-through of PassRegistry.def and this is what >>>>>>>> I found: >>>>>>>> >>>>>>>> Module analyses: 0/5 hold pointers to other analyses >>>>>>>> CallGraph: No pointers to other analyses. >>>>>>>> LazyCallGraph: No pointers to other analyses. >>>>>>>> ProfileSummaryAnalysis: No pointers to other analyses. >>>>>>>> TargetLibraryAnalysis: No pointers to other analyses. >>>>>>>> VerifierAnalysis: No pointers to other analyses. >>>>>>>> >>>>>>>> Module alias analyses: 1/1 keeps pointer to other analysis. >>>>>>>> GlobalsAA: Result keeps pointer to TLI (this is a function >>>>>>>> analysis). >>>>>>>> >>>>>>>> Function analyses: 9/17 keep pointers to other analysis >>>>>>>> AAManager: Its Result holds TLI pointer and pointers to individual >>>>>>>> AA result objects. >>>>>>>> AssumptionAnalysis: No pointers to other analyses. >>>>>>>> BlockFrequencyAnalysis: Its Result holds pointers to LoopInfo and >>>>>>>> BPI. >>>>>>>> BranchProbabilityAnalysis: Stores no pointers to other analyses. >>>>>>>> (uses LoopInfo to "recalculate" though) >>>>>>>> DominatorTreeAnalysis: Stores no pointers to other analyses. >>>>>>>> PostDominatorTreeAnalysis: Stores no pointers to other analyses. >>>>>>>> DemandedBitsAnalysis: Stores pointers to AssumptionCache >>>>>>>> and DominatorTree >>>>>>>> DominanceFrontierAnalysis: Stores no pointers to other analyses. >>>>>>>> (uses DominatorTreeAnalysis for "recalculate" though). >>>>>>>> LoopInfo: Uses DominatorTreeAnalysis for "recalculate" but stores >>>>>>>> no pointers. >>>>>>>> LazyValueAnalysis: Stores pointers to AssumptionCache, >>>>>>>> TargetLibraryInfo, DominatorTree. >>>>>>>> DependenceAnalysis: Stores pointers to AliasAnalysis, >>>>>>>> ScalarEvolution, LoopInfo >>>>>>>> MemoryDependenceAnalysis: Stores pointers to AliasAnalysis, >>>>>>>> AssumptionCache, TargetLibraryInfo, DominatorTree >>>>>>>> MemorySSAAnalysis: Stores pointers to AliasAnalysis, DominatorTree >>>>>>>> RegionInfoAnalysis: Stores pointers to DomTree, PostDomTree, >>>>>>>> DomFrontier >>>>>>>> ScalarEvolutionAnalysis: Stores pointers to TargetLibraryInfo, >>>>>>>> AssumptionCache, DominatorTree, LoopInfo >>>>>>>> TargetLibraryAnalysis: Has no dependencies >>>>>>>> TargetIRAnalysis: Has no dependencies. >>>>>>>> >>>>>>>> Function alias analyses: 3/5 keep pointers to other analyses >>>>>>>> BasicAA: Keeps pointers to TargetLibraryInfo, AssumptionCache, >>>>>>>> DominatorTree, LoopInfo >>>>>>>> CFLAA: Keeps pointer to TargetLibraryInfo >>>>>>>> SCEVAA: Keeps pointer to ScalarEvolution >>>>>>>> ScopedNoAliasAA: No dependencies >>>>>>>> TypeBasedAA: No dependencies >>>>>>>> >>>>>>>> >>>>>>>> Total: 13/28 analyses (~50%) hold pointers to other analyses. >>>>>>>> Of the 15/28 analyses that don't hold pointers, 12/15 simply have >>>>>>>> no dependencies. Only 3/15 (BPI, LoopInfo, DominanceFrontier) have >>>>>>>> dependencies that are used just for a "recalculate" step that retains no >>>>>>>> pointers. >>>>>>>> So I think it is fair to say that analyses which hold pointers to >>>>>>>> other analyses is not an exceptional case. In fact, analyses that use other >>>>>>>> analyses just for a "recalculate" step seems to be the exceptional case >>>>>>>> (only 3/28 or about 10%) >>>>>>>> >>>>>>> >>>>>>> Interesting! >>>>>>> >>>>>>> Most of these look like they hold a pointer to the root analysis as >>>>>>> opposed to detailed objects *inside* the analysis? >>>>>>> >>>>>>> It might make sense to try to handle this very specific pattern in a >>>>>>> special way of overriding the invalidate routines is too error prone.... We >>>>>>> could try to make this work "automatically" but I'm worried this would be >>>>>>> challenging to get right. Open to suggestions of course. >>>>>>> >>>>>>> Any other ideas about what would make sense to handle this? >>>>>>> >>>>>>> Does it make sense to override the invalidate routines now and >>>>>>> iterate from there? I feel like you've done a lot of the research necessary >>>>>>> for this already... >>>>>>> >>>>>> >>>>>> I'll keep pushing forward tomorrow with building test-suite >>>>>> successfully using the new PM for the LTO pipeline (I was doing some >>>>>> unrelated LLD stuff for most of today). It will be interesting to see how >>>>>> many `invalidate` overrides will be needed to avoid these issues for just >>>>>> the LTO pipeline on test-suite. >>>>>> >>>>> >>>>> I spent the better part of today working on this and will continue >>>>> tomorrow; this problem seems nastier than I thought. For some reason the >>>>> LTO pipeline (or something about LTO) seems to hit on these issues much >>>>> more (I'm talking like 40k lines of ASan error reports from building >>>>> test-suite with the LTO pipeline in the new PM; per-TU steps still using >>>>> the old PM). Some notes: >>>>> >>>>> - BasicAA's dependence on domtree and loopinfo in the new PM seems to >>>>> account for quite a few of the problems. >>>>> - BasicAA and other stuff are marked (by overriding `invalidate` to >>>>> return false) to never be invalidated because they are "stateless". However >>>>> they still hold pointers and so they do need to be invalidated. >>>>> - CallGraph uses AssertingVH (PR28400) and so I needed a workaround >>>>> similar to r274656 in various passes. >>>>> - D21921 is holding up -- I haven't hit any issues with the core >>>>> logic of that patch. >>>>> - AAResults holds handles to various AA result objects. This means it >>>>> pretty much always needs to be invalidated unless you are sure that none of >>>>> the AA's will get invalidated. >>>>> >>>>> >>>>> The existing `invalidate` method doesn't have the right semantics for >>>>> even an error-prone solution :( We are going to need to make some >>>>> significant changes to even get basic sanity I think. Perhaps each analysis >>>>> can expose a "preserve" static function. E.g. instead of >>>>> `PA.preserve<Foo>();` you have to do `Foo::setPreserved(PA);`. >>>>> I'm actually not quite sure that that will even work. Once I have >>>>> test-suite fully building successfully with the LTO pipeline in the new PM >>>>> I'll be able to give a more confident answer (esp. w.r.t. the manager for >>>>> different IRUnitT's). >>>>> But at this point I'm not confident running *any* pass pipeline in the >>>>> new PM without at least assertions+ASan. >>>>> >>>>> We may want to have a proper design discussion around this problem >>>>> though. >>>>> >>>>> Also I'd like to have test-suite working (by hook or by crook) with >>>>> LTO in the new PM so we can get some numbers on the resident set impact of >>>>> all this caching; if it is really problematic then we may need to start >>>>> talking front-and-center about different invalidation policies for keeping >>>>> this in check instead of leaving it as something that we will be able to >>>>> patch later. >>>>> >>>>> >>>>> >>>>> The more I think about it, the more I'm convinced that the real "hard" >>>>> problem that the new PM is exposing us to is having the ability for any >>>>> pass to ask for any analysis on any IRUnitT (and any specific IRUnit of >>>>> that IRUnitT) and have the result stored somewhere and then invalidated. >>>>> This means that "getAnalysisUsage" is not just a list of passes, but much >>>>> more complicated and is essentially a set of arbitrary pairs "(analysis, >>>>> IRUnit)" (and the associated potential tangle of dependencies between the >>>>> state cached on these tuples). With the old PM, you essentially are looking >>>>> at a problem of scheduling the lifetime of analyses of the same IRUnit >>>>> intermingled with transformation passes on that same IRUnit, so you only >>>>> have the "analysis" part of the tuple above, making things much simpler >>>>> (and handling dependencies is much simpler too). We've obviously outgrown >>>>> this model with examples like LAA, AssumptionCacheTracker, etc. that hack >>>>> around this in the old PM. We may want to have a fresh re-examination of >>>>> what problems we are exactly trying to solve. >>>>> >>>>> For me, my main concern now is what changes need to be made in order >>>>> to feel confident running a pipeline in the new PM without assertions+ASan. >>>>> >>>>> >>>>> Sorry for the long post, just brain-dumping before heading home. >>>>> >>>>> -- Sean Silva >>>>> >>>>> >>>>> >>>>> >>>>>> >>>>>> -- Sean Silva >>>>>> >>>>>> >>>>> >>>> >>> >> > > > > -- > Hal Finkel > Assistant Computational Scientist > Leadership Computing Facility > Argonne National Laboratory >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160725/c13c7c4f/attachment-0001.html>
Sean Silva via llvm-dev
2016-Jul-26 08:32 UTC
[llvm-dev] [PM] I think that the new PM needs to learn about inter-analysis dependencies...
I'm not quite sure what post to respond to for a status update, but I guess this one will do (and you can check my log for more info of course). My current working branch for the analysis manager stuff is at https://github.com/chisophugis/llvm/commits/analysis-manager (4ecf6115890bd01caa52c0b99424974e3469291e) I described in my log a bit more my thought process and how to do this without having to churn every single new PM pass (search for "Okay, so now that I think about it, adding the dependency management is a second step after making the analysis manager work for all IRUnit’s."). Essentially, the first step is to still have AnalysisManager<Function>, AnalysisManager<Module>, etc. but the template argument is just a dummy. The real templating is on the methods, which can accept any IRUnit. The advantage of this is that it is source compatible with all the existing code using multiple manager, proxies, etc. The code on my branch at least passes check-llvm. But of course the existing code doesn't test any of the situations where being able to handle multiple IRUnitT's in a single analysis manager would matter. Tomorrow, I get to churn all the passes that have been ported so far, removing the proxies etc. Another thing that just came to me: the current way things work with adaptors means that if a function transformation invalidates a module analysis, the module analysis won't be invalidated until after the ModuleToFunctionPassAdapter returns. So we can end up running an arbitrary number of function transformations that query that stale module analysis. This is (I think?) mostly benign given our current set of module analyses (more info in the log; search for "We may be getting by simply because we seem to not have many module analyses"). But if we invalidate more "correctly" (after every transformation on any contained IRUnit), there's a potential quadratic compile time issue lurking here once have the unified analysis manager: we could easily end up invalidating/recomputing a module analysis once per function visitation. For a pipeline that just does a lot of churn (e.g. function simplification passes we run as part of the regular O3 pipeline; that can delete most code in a function, change the cfg, etc.) it's not clear what useful properties we can guarantee will be preserved which would prevent us from invalidating pretty much all non-immutable outer analyses (module or (theoretically)CGSCC). So for any really nontrivial module analysis, we may end up having to change the interface of module passes to have some way to incrementally recompute themselves as function passes mutate the IR? In some sense, Chandler's CGSCC patch https://reviews.llvm.org/D21464 is trying to do this for a specific module analysis (lazy call graph) and even for just that one (and lots of special handling in the adaptors etc.) it still only gets incrementally updated it after a potentially large set of function passes have run. And it's already quite tricky. Broadly speaking, I can think of two major kinds of "solutions" to this problem: - severely restrict the interaction of function transformations and module analyses in some way. Essentially, we could define a restricted class of module analyses that "are conservatively correct in the face of any transformation that can be made by a function transformation" (this includes immutable module analyses and IIRC things like GlobalsModRef). But the issue is a bit deeper, because it is really about transformations at any smaller IRUnitT. This includes CGSCC. And CGSCC can do substantial interprocedural transformation (e.g. delete a wholefunction IIRC?); we would then need to have another class of module analyses that are "conservatively correct in the face of CGSCC transformations" which seems quite restrictive. -- alternatively or in addition, this can involve restricting what kind of information a function pass can get out of a module analysis - provide some sort of update callback for the outer IRUnit analysis to update itself if the inner one is modified (not sure how practical this will be). E.g. `AM.invalidate<FooModuleAnalysis>(F);` doesn't actually invalidate FooModuleAnalysis, but rather invokes a callback for it to update itself (possibly as an opt-in for the module analysis). -- The current analysis manager does have a `invalidate` hook that it will call, but the argument is always inherently the IRUnit that the analysis itself is cached on so it isn't useful for partial recomputation. We would need to extend that, which requires making the AnalysisResultConcept more complicated (and potentially having to have centralized awareness of all the different IRUnitT's, which until now are fairly decoupled in the new PM). I'm curious what others' thoughts are on this issue. -- Sean Silva On Sun, Jul 24, 2016 at 2:58 AM, Sean Silva <chisophugis at gmail.com> wrote:> I've started looking specifically at the existing code and how it needs to > be changed. It seems like the concept-based polymorphism stuff in > PassManagerInternal.h actually don't need to be changed that much. > AnalysisPassConcept and AnalysisResultConcept need to be changed to take a > type-erased IRUnit (e.g. a void* or something). AnalysisPassModel and > AnalysisResultModel (which inherit from their respective abstract base > classes I mentioned above) are already templated on the IRUnitT and so they > can just cast the void* back to the right type. > > Adding the dependency tracking seems like it will be mostly a data > structure change with some isolated "algorithmic" changes to > track/invalidate dependencies. > > Most of the methods that need to know the specific IRUnitT type already > take the IRUnit as an argument (e.g. getResult, getCachedResult, > invalidate) and so it's actually somewhat natural for the analysis manager > not be templated on IRUnitT (but rather to have just those methods be > templated). > > Also, up until now I hadn't noticed this weird "registerPass" thing on the > analysis manager (AnalysisManagerBase to be specific). Effectively, it > allows analyses to hold state (this is separate from analysis *results* > which of course can hold state). But the state is effectively global to the > AnalysisManager (and hence the pass pipeline, and hence (essentially) the > context (since you can currently only run a single pass pipeline > concurrently on a single LLVMContext)). The net result is that you can > specify a context-global "configuration" for each analysis type (not to be > confused with the analysis *result* type!). Right now, AAManager is the > only thing that uses it though (the "configuration" is the AA pipeline). > > > > btw, I've started to keep a log for this like I do for my projects at > home: > https://docs.google.com/document/d/1-nrq2y_hTiZhrsJDmH8TzFjFEeApfccs6wwGyaXjSkg/edit?usp=sharing > It's fairly stream-of-consciousness. My log style is mostly append-only (I > rarely edit stuff from a previous day or from too long ago on the same day; > if I do I usually don't overwrite and instead insert something like "(edit: > I was actually totally wrong about this, see below)"). So if you want to > follow this across multiple days just go to the end and scroll up until you > see something that you've already looked at. (right now there is just one > day though so there is not very much). > > (some interesting things to search for are "interesting", "okay", "oh", > and "?"; I tend to use these a lot) > > I've gone back to google docs for this log since it is easier to share on > the mailing list. Unfortunately google docs does not have an explicit > notion of "cell" like Mathematica does, so I've tried to just insert lots > of line breaks for things where there would have just been a cell boundary > in Mathematica. (I use Mathematica for my logs at home). > > > -- Sean Silva > > On Fri, Jul 22, 2016 at 1:55 AM, Sean Silva <chisophugis at gmail.com> wrote: > >> The more closely I look at this, the more it seems like there may be a >> useful incremental step in the transition to the new PM: use the new PM >> analysis machinery in the old PM. If this is possible, it will simplify the >> old PM and (hopefully) allow an incremental transition to the new PM >> instead of a flag day transition for the switch. >> >> I.e., AFAICT, the new PM transition is essentially about 2 mostly >> orthogonal aspects of running optimization pipelines: >> 1. Analysis computation and analysis result lifetime management >> (including avoiding analysis recomputation) >> 2. Running transformation passes over their respective IRUnit's in some >> order >> >> These are conflated in the old PM. In reality, the only interaction >> between them (with the new PM machinery for 1.) is a small number of places >> within 2. which need to call 'invalidate'. >> >> I'm pretty sure that 2. is fairly similar in the new PM and old PM (the >> main difference is that the notion of "adapters" is split out in the new >> PM). The analysis handling seems to be what makes the old PM so difficult >> to understand (e.g. it is the cause of the multiple inheritance in the >> implementation). Trying to unify analyses and transformations (and some >> questionable (in hindsight) implementation decisions) seems to be the main >> "problem" with the design of the old PM AFAICT (there are other issues, but >> they are more "nice to have"). >> >> IMO it is an anti-pattern to think of analyses as "passes". There are >> just "analyses" and "transformations" and they are two separate things. In >> fact, the `run` method on analyses should probably be called >> `computeResult` or something like that to avoid confusion. And the `run` >> method on transformations could just as easily be called >> `performTransformation`. >> >> >> I remember asking and getting and answer from Chandler ( >> http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150907/299083.html) >> about how to coexist the old and new PM compatibly so individual passes >> would be able to work for both and we wouldn't need to have a flag day. I >> wasn't able to find the discussions that Chandler cited, but I suspect that >> the answer is that we didn't have enough analyses ported at that point to >> consider sharing the analysis management between the old and new PM. >> >> >> If this works out it may be the evolutionary path we have all been >> wanting for this pass manager transition. Fingers crossed. Hopefully I'm >> not overlooking some major issue. >> >> Anyway... back to working on the analysis manager dependency tracking. >> >> -- Sean Silva >> >> On Thu, Jul 21, 2016 at 1:59 AM, Sean Silva <chisophugis at gmail.com> >> wrote: >> >>> We did some basic sanity checking that memory usage didn't go out of >>> control (it doesn't; at least with with a simple >>> preserves-all/preserves-none invalidation scheme and the current LTO >>> pipeline). Also, I did some basic sanity checking for compile time. The >>> simple preserves-all/preserves-none invalidation scheme seems marginally >>> slower, but no real conclusions (besides a simple sanity check) can be >>> drawn without the real analysis preservation semantics in place. >>> >>> I'll start working on fixing the analysis managers. There seem to >>> basically be two parts (although they may need to be done simultaneously to >>> make sure all the pieces fit together): >>> - unify all the analysis managers into a single analysis manager for all >>> IRUnitT's (requires type-erasing the IRUnit) >>> - introduce the dependency tracking machinery >>> >>> I think I gave a reasonable outline in the two posts above: >>> - the one starting with "To clarify, it seems like the current new PM >>> is essentially trying to solve the problem of maintaining/updating a >>> mapping" >>> - the one starting with "Yeah, the mechanics of maintaining this fully >>> general mapping are straightforward in the abstract" >>> >>> I'm happy to do a centralized writeup if anybody wants. Just let me know. >>> >>> As far as changes to the code, the updates to the new PM passes should >>> hopefully be mechanical (despite there being many of them). However, from >>> what I can tell, fixing this problem will require touching most lines of >>> the analysis manager machinery so the diff will probably be a bit scary, >>> even though I think we can keep the same basic structure (i.e. a per-IRUnit >>> std::list holding one analysis result (at a stable address) per element, >>> combined with a DenseMap from (Analysis, IRUnit) to an element of the >>> per-IRUnit storage list (see AnalysisResultListMapT and AnalysisResultMapT >>> in include/llvm/IR/PassManager.h)). >>> The main changes to the analysis manager will be: >>> - type-erasing the IRUnit >>> - the elements of the AnalysisResultListMapT will need to keep track of >>> any dependents >>> - the analysis manager will need to update those dependencies as it is >>> re-entered by analyses getting results of other analyses >>> - the analysis manager will need to walk the dependencies to do >>> transitive invalidation >>> >>> I think the most robust approach is for analysis dependencies to be >>> implicitly constructed by the analysis manager via tracking entry/exit from >>> get{,Cached}Result. >>> One alternative is for analyses to explicitly pass in their ID to >>> getResult to indicate the dependency, but that seems error-prone (and also >>> verbose). The issue is that we will need a getResult API that does not >>> track dependencies for use by transformation passes (since there is no >>> dependency to track in that case); so an innocuous copy-paste from a >>> transform pass to an analysis can result in a failure to track dependencies >>> and risk of use-after-free (we could fight this with the type system, but I >>> think that would get a bit verbose (I'm willing to try it though if people >>> would prefer)) >>> One restriction of the implicit tracking approach is that it requires >>> all calls into the analysis manager to occur in the `run` method of the >>> analysis (so that the dependencies are implicitly tracked via re-entrance >>> to get{,Cached}Result); is this a reasonable restriction? >>> >>> >>> One annoying problem is that I think that the dependency links will need >>> to be bidirectional. To use the example analysis cache from my other post: >>> (AssumptionAnalysis, function @bar) -> (AssumptionCache for @bar, >>> [(SomeModuleAnalysis, module TheModule)]) >>> (AssumptionAnalysis, function @baz) -> (AssumptionCache for @baz, >>> [(SomeModuleAnalysis, module TheModule)]) >>> (SomeModuleAnalysis, module TheModule) -> (SomeModuleAnalysisResult for >>> TheModule, [(SomeFunctionAnalysis, function @baz)]) >>> (SomeFunctionAnalysis, function @baz) -> (SomeFunctionAnalysisResult for >>> @baz, []) >>> >>> if we delete function @baz, then the dependent list [(SomeFunctionAnalysis, >>> function @baz)] for `(SomeModuleAnalysis, module TheModule)` will now >>> have a stale pointer to function @baz. I think that in practice we can >>> check to see if `(SomeFunctionAnalysis, function @baz)` is in our hash >>> table and if it isn't then just ignore the dependency as "this dependent >>> analysis result has already been freed". In the worst case (memory >>> allocator reuses the memory for another function) we may spuriously free an >>> analysis result for a different function. However it is still unsettling >>> (and may actually be UB in C++?). >>> Ideally we would track bidirectional links; that way when we remove an >>> analysis result we also have it remove itself from dependence lists of all >>> of its dependencies. >>> >>> -- Sean Silva >>> >>> On Fri, Jul 15, 2016 at 8:40 PM, Sean Silva <chisophugis at gmail.com> >>> wrote: >>> >>>> >>>> >>>> On Fri, Jul 15, 2016 at 8:39 PM, Sean Silva <chisophugis at gmail.com> >>>> wrote: >>>> >>>>> It looks like there is really no sane fix within the current >>>>> infrastructure. I've had to essentially trigger invalidation (except in the >>>>> PreservedAnalyses::all() case) in the function pass manager and function to >>>>> loop adapters. >>>>> >>>> >>>> invalidation of *everything* I mean. >>>> >>>> -- Sean Silva >>>> >>>> >>>>> >>>>> So we basically need to get the analysis manager dependency tracking >>>>> fixed. >>>>> >>>>> Davide and I will get measurements on the resident set impact of all >>>>> this caching (even with the overconservative invalidation for now) to see >>>>> the impact. If there is a big rss impact then we probably want to consider >>>>> that problem at the same time as the rewrite of the analysis manager. >>>>> >>>>> -- Sean Silva >>>>> >>>>> On Thu, Jul 14, 2016 at 12:51 AM, Sean Silva <chisophugis at gmail.com> >>>>> wrote: >>>>> >>>>>> >>>>>> >>>>>> On Wed, Jul 13, 2016 at 1:48 AM, Sean Silva <chisophugis at gmail.com> >>>>>> wrote: >>>>>> >>>>>>> >>>>>>> >>>>>>> On Wed, Jul 13, 2016 at 12:34 AM, Chandler Carruth < >>>>>>> chandlerc at gmail.com> wrote: >>>>>>> >>>>>>>> On Wed, Jul 13, 2016 at 12:25 AM Sean Silva <chisophugis at gmail.com> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> On Tue, Jul 12, 2016 at 11:39 PM, Chandler Carruth < >>>>>>>>> chandlerc at gmail.com> wrote: >>>>>>>>> >>>>>>>>>> On Tue, Jul 12, 2016 at 11:34 PM Sean Silva < >>>>>>>>>> chisophugis at gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> On Tue, Jul 12, 2016 at 11:32 PM, Xinliang David Li < >>>>>>>>>>> davidxl at google.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Tue, Jul 12, 2016 at 10:57 PM, Chandler Carruth < >>>>>>>>>>>> chandlerc at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Yea, this is a nasty problem. >>>>>>>>>>>>> >>>>>>>>>>>>> One important thing to understand is that this is specific to >>>>>>>>>>>>> analyses which hold references to other analyses. While this isn't unheard >>>>>>>>>>>>> of, it isn't as common as it could be. Still, definitely something we need >>>>>>>>>>>>> to address. >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> We can call this type of dependencies (holding references) >>>>>>>>>>>> hard-dependency. The soft dependency refers to the case where analysis 'A' >>>>>>>>>>>> depends on 'B' during computation, but does not need 'B' once it is >>>>>>>>>>>> computed. >>>>>>>>>>>> >>>>>>>>>>>> There are actually quite a few examples of hard-dependency >>>>>>>>>>>> case. For instance LoopAccessInfo, LazyValueInfo etc which hold references >>>>>>>>>>>> to other analyses. >>>>>>>>>>>> >>>>>>>>>>>> Problem involving hard-dependency is actually easier to detect, >>>>>>>>>>>> as it is usually a compile time problem. Issues involving soft dependencies >>>>>>>>>>>> are more subtle and can lead to wrong code gen. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Did you mean to say that soft-dependency problems are easier to >>>>>>>>>>> detect? At least my intuition is that soft-dependency is easier because >>>>>>>>>>> there is no risk of dangling pointers to other analyses. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> The issue is that the fact that there is *any* dependency isn't >>>>>>>>>> clear. >>>>>>>>>> >>>>>>>>>> However, I think the only real problem here are these "hard >>>>>>>>>> dependencies" (I don't really like that term though). For others, only an >>>>>>>>>> analysis that is *explicitly* preserved survives. So I'm not worried about >>>>>>>>>> the fact that people have to remember this. >>>>>>>>>> >>>>>>>>>> The question is how often there are cross-data-structure >>>>>>>>>> references. David mentions a few examples, and I'm sure there are more, but >>>>>>>>>> it isn't clear to me yet whether this is pervasive or occasional. >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> I just did a quick run-through of PassRegistry.def and this is >>>>>>>>> what I found: >>>>>>>>> >>>>>>>>> Module analyses: 0/5 hold pointers to other analyses >>>>>>>>> CallGraph: No pointers to other analyses. >>>>>>>>> LazyCallGraph: No pointers to other analyses. >>>>>>>>> ProfileSummaryAnalysis: No pointers to other analyses. >>>>>>>>> TargetLibraryAnalysis: No pointers to other analyses. >>>>>>>>> VerifierAnalysis: No pointers to other analyses. >>>>>>>>> >>>>>>>>> Module alias analyses: 1/1 keeps pointer to other analysis. >>>>>>>>> GlobalsAA: Result keeps pointer to TLI (this is a function >>>>>>>>> analysis). >>>>>>>>> >>>>>>>>> Function analyses: 9/17 keep pointers to other analysis >>>>>>>>> AAManager: Its Result holds TLI pointer and pointers to individual >>>>>>>>> AA result objects. >>>>>>>>> AssumptionAnalysis: No pointers to other analyses. >>>>>>>>> BlockFrequencyAnalysis: Its Result holds pointers to LoopInfo and >>>>>>>>> BPI. >>>>>>>>> BranchProbabilityAnalysis: Stores no pointers to other analyses. >>>>>>>>> (uses LoopInfo to "recalculate" though) >>>>>>>>> DominatorTreeAnalysis: Stores no pointers to other analyses. >>>>>>>>> PostDominatorTreeAnalysis: Stores no pointers to other analyses. >>>>>>>>> DemandedBitsAnalysis: Stores pointers to AssumptionCache >>>>>>>>> and DominatorTree >>>>>>>>> DominanceFrontierAnalysis: Stores no pointers to other analyses. >>>>>>>>> (uses DominatorTreeAnalysis for "recalculate" though). >>>>>>>>> LoopInfo: Uses DominatorTreeAnalysis for "recalculate" but stores >>>>>>>>> no pointers. >>>>>>>>> LazyValueAnalysis: Stores pointers to AssumptionCache, >>>>>>>>> TargetLibraryInfo, DominatorTree. >>>>>>>>> DependenceAnalysis: Stores pointers to AliasAnalysis, >>>>>>>>> ScalarEvolution, LoopInfo >>>>>>>>> MemoryDependenceAnalysis: Stores pointers to AliasAnalysis, >>>>>>>>> AssumptionCache, TargetLibraryInfo, DominatorTree >>>>>>>>> MemorySSAAnalysis: Stores pointers to AliasAnalysis, DominatorTree >>>>>>>>> RegionInfoAnalysis: Stores pointers to DomTree, PostDomTree, >>>>>>>>> DomFrontier >>>>>>>>> ScalarEvolutionAnalysis: Stores pointers to TargetLibraryInfo, >>>>>>>>> AssumptionCache, DominatorTree, LoopInfo >>>>>>>>> TargetLibraryAnalysis: Has no dependencies >>>>>>>>> TargetIRAnalysis: Has no dependencies. >>>>>>>>> >>>>>>>>> Function alias analyses: 3/5 keep pointers to other analyses >>>>>>>>> BasicAA: Keeps pointers to TargetLibraryInfo, AssumptionCache, >>>>>>>>> DominatorTree, LoopInfo >>>>>>>>> CFLAA: Keeps pointer to TargetLibraryInfo >>>>>>>>> SCEVAA: Keeps pointer to ScalarEvolution >>>>>>>>> ScopedNoAliasAA: No dependencies >>>>>>>>> TypeBasedAA: No dependencies >>>>>>>>> >>>>>>>>> >>>>>>>>> Total: 13/28 analyses (~50%) hold pointers to other analyses. >>>>>>>>> Of the 15/28 analyses that don't hold pointers, 12/15 simply have >>>>>>>>> no dependencies. Only 3/15 (BPI, LoopInfo, DominanceFrontier) have >>>>>>>>> dependencies that are used just for a "recalculate" step that retains no >>>>>>>>> pointers. >>>>>>>>> So I think it is fair to say that analyses which hold pointers to >>>>>>>>> other analyses is not an exceptional case. In fact, analyses that use other >>>>>>>>> analyses just for a "recalculate" step seems to be the exceptional case >>>>>>>>> (only 3/28 or about 10%) >>>>>>>>> >>>>>>>> >>>>>>>> Interesting! >>>>>>>> >>>>>>>> Most of these look like they hold a pointer to the root analysis as >>>>>>>> opposed to detailed objects *inside* the analysis? >>>>>>>> >>>>>>>> It might make sense to try to handle this very specific pattern in >>>>>>>> a special way of overriding the invalidate routines is too error prone.... We >>>>>>>> could try to make this work "automatically" but I'm worried this would be >>>>>>>> challenging to get right. Open to suggestions of course. >>>>>>>> >>>>>>>> Any other ideas about what would make sense to handle this? >>>>>>>> >>>>>>>> Does it make sense to override the invalidate routines now and >>>>>>>> iterate from there? I feel like you've done a lot of the research necessary >>>>>>>> for this already... >>>>>>>> >>>>>>> >>>>>>> I'll keep pushing forward tomorrow with building test-suite >>>>>>> successfully using the new PM for the LTO pipeline (I was doing some >>>>>>> unrelated LLD stuff for most of today). It will be interesting to see how >>>>>>> many `invalidate` overrides will be needed to avoid these issues for just >>>>>>> the LTO pipeline on test-suite. >>>>>>> >>>>>> >>>>>> I spent the better part of today working on this and will continue >>>>>> tomorrow; this problem seems nastier than I thought. For some reason the >>>>>> LTO pipeline (or something about LTO) seems to hit on these issues much >>>>>> more (I'm talking like 40k lines of ASan error reports from building >>>>>> test-suite with the LTO pipeline in the new PM; per-TU steps still using >>>>>> the old PM). Some notes: >>>>>> >>>>>> - BasicAA's dependence on domtree and loopinfo in the new PM seems to >>>>>> account for quite a few of the problems. >>>>>> - BasicAA and other stuff are marked (by overriding `invalidate` to >>>>>> return false) to never be invalidated because they are "stateless". However >>>>>> they still hold pointers and so they do need to be invalidated. >>>>>> - CallGraph uses AssertingVH (PR28400) and so I needed a workaround >>>>>> similar to r274656 in various passes. >>>>>> - D21921 is holding up -- I haven't hit any issues with the core >>>>>> logic of that patch. >>>>>> - AAResults holds handles to various AA result objects. This means it >>>>>> pretty much always needs to be invalidated unless you are sure that none of >>>>>> the AA's will get invalidated. >>>>>> >>>>>> >>>>>> The existing `invalidate` method doesn't have the right semantics for >>>>>> even an error-prone solution :( We are going to need to make some >>>>>> significant changes to even get basic sanity I think. Perhaps each analysis >>>>>> can expose a "preserve" static function. E.g. instead of >>>>>> `PA.preserve<Foo>();` you have to do `Foo::setPreserved(PA);`. >>>>>> I'm actually not quite sure that that will even work. Once I have >>>>>> test-suite fully building successfully with the LTO pipeline in the new PM >>>>>> I'll be able to give a more confident answer (esp. w.r.t. the manager for >>>>>> different IRUnitT's). >>>>>> But at this point I'm not confident running *any* pass pipeline in >>>>>> the new PM without at least assertions+ASan. >>>>>> >>>>>> We may want to have a proper design discussion around this problem >>>>>> though. >>>>>> >>>>>> Also I'd like to have test-suite working (by hook or by crook) with >>>>>> LTO in the new PM so we can get some numbers on the resident set impact of >>>>>> all this caching; if it is really problematic then we may need to start >>>>>> talking front-and-center about different invalidation policies for keeping >>>>>> this in check instead of leaving it as something that we will be able to >>>>>> patch later. >>>>>> >>>>>> >>>>>> >>>>>> The more I think about it, the more I'm convinced that the real >>>>>> "hard" problem that the new PM is exposing us to is having the ability for >>>>>> any pass to ask for any analysis on any IRUnitT (and any specific IRUnit of >>>>>> that IRUnitT) and have the result stored somewhere and then invalidated. >>>>>> This means that "getAnalysisUsage" is not just a list of passes, but much >>>>>> more complicated and is essentially a set of arbitrary pairs "(analysis, >>>>>> IRUnit)" (and the associated potential tangle of dependencies between the >>>>>> state cached on these tuples). With the old PM, you essentially are looking >>>>>> at a problem of scheduling the lifetime of analyses of the same IRUnit >>>>>> intermingled with transformation passes on that same IRUnit, so you only >>>>>> have the "analysis" part of the tuple above, making things much simpler >>>>>> (and handling dependencies is much simpler too). We've obviously outgrown >>>>>> this model with examples like LAA, AssumptionCacheTracker, etc. that hack >>>>>> around this in the old PM. We may want to have a fresh re-examination of >>>>>> what problems we are exactly trying to solve. >>>>>> >>>>>> For me, my main concern now is what changes need to be made in order >>>>>> to feel confident running a pipeline in the new PM without assertions+ASan. >>>>>> >>>>>> >>>>>> Sorry for the long post, just brain-dumping before heading home. >>>>>> >>>>>> -- Sean Silva >>>>>> >>>>>> >>>>>> >>>>>> >>>>>>> >>>>>>> -- Sean Silva >>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160726/08fab4ba/attachment-0001.html>