thr3ads.net - llvm dev - [llvm-dev] [PM] I think that the new PM needs to learn about inter-analysis dependencies... [Jul 2016]

If this information is useful, please help other people find it:
Share via:

Hal Finkel via llvm-dev

2016-Jul-25 16:27 UTC

[llvm-dev] [PM] I think that the new PM needs to learn about inter-analysis dependencies...

----- Original Message -----
> From: "Sean Silva" <chisophugis at gmail.com>
> To: "Chandler Carruth" <chandlerc at gmail.com>
> Cc: "Xinliang David Li" <davidxl at google.com>,
"llvm-dev"
> <llvm-dev at lists.llvm.org>, "Davide Italiano"
> <dccitaliano at gmail.com>, "Tim Amini Golling"
> <mehdi.amini at apple.com>, "Hal Finkel" <hfinkel at
anl.gov>, "Sanjoy
> Das" <sanjoy at playingwithpointers.com>, "Pete
Cooper"
> <peter_cooper at apple.com>
> Sent: Thursday, July 21, 2016 3:59:28 AM
> Subject: Re: [PM] I think that the new PM needs to learn about
> inter-analysis dependencies...
> We did some basic sanity checking that memory usage didn't go out of
> control (it doesn't; at least with with a simple
> preserves-all/preserves-none invalidation scheme and the current LTO
> pipeline). Also, I did some basic sanity checking for compile time.
> The simple preserves-all/preserves-none invalidation scheme seems
> marginally slower, but no real conclusions (besides a simple sanity
> check) can be drawn without the real analysis preservation semantics
> in place.
> I'll start working on fixing the analysis managers. There seem to
> basically be two parts (although they may need to be done
> simultaneously to make sure all the pieces fit together):
> - unify all the analysis managers into a single analysis manager for
> all IRUnitT's (requires type-erasing the IRUnit)
> - introduce the dependency tracking machinery
> I think I gave a reasonable outline in the two posts above:
> - the one starting with " To clarify, it seems like the current new
> PM is essentially trying to solve the problem of
> maintaining/updating a mapping"
> - the one starting with " Yeah, the mechanics of maintaining this
> fully general mapping are straightforward in the abstract"
> I'm happy to do a centralized writeup if anybody wants. Just let me
> know.
> As far as changes to the code, t he updates to the new PM passes
> should hopefully be mechanical (despite there being many of them).
> However, from what I can tell, fixing this problem will require
> touching most lines of the analysis manager machinery so the diff
> will probably be a bit scary, even though I think we can keep the
> same basic structure (i.e. a per-IRUnit std::list holding one
> analysis result (at a stable address) per element, combined with a
> DenseMap from (Analysis, IRUnit) to an element of the per-IRUnit
> storage list (see AnalysisResultListMapT and AnalysisResultMapT in
> include/llvm/IR/PassManager.h)).
> The main changes to the analysis manager will be:
> - type-erasing the IRUnit
> - the elements of the AnalysisResultListMapT will need to keep track
> of any dependents
> - the analysis manager will need to update those dependencies as it
> is re-entered by analyses getting results of other analyses
> - the analysis manager will need to walk the dependencies to do
> transitive invalidation
> I think the most robust approach is for analysis dependencies to be
> implicitly constructed by the analysis manager via tracking
> entry/exit from get{,Cached}Result.
> One alternative is for analyses to explicitly pass in their ID to
> getResult to indicate the dependency, but that seems error-prone
> (and also verbose). The issue is that we will need a getResult API
> that does not track dependencies for use by transformation passes
> (since there is no dependency to track in that case); so an
> innocuous copy-paste from a transform pass to an analysis can result
> in a failure to track dependencies and risk of use-after-free (we
> could fight this with the type system, but I think that would get a
> bit verbose (I'm willing to try it though if people would prefer))
> One restriction of the implicit tracking approach is that it requires
> all calls into the analysis manager to occur in the `run` method of
> the analysis (so that the dependencies are implicitly tracked via
> re-entrance to get{,Cached}Result); is this a reasonable
> restriction?
What's the potential use case for getting an analysis outside of something
in, or called by, run()?
> One annoying problem is that I think that the dependency links will
> need to be bidirectional. To use the example analysis cache from my
> other post:
> (AssumptionAnalysis, function @bar) -> (AssumptionCache for @bar,
> [(SomeModuleAnalysis, module TheModule)])
> (AssumptionAnalysis, function @baz) -> (AssumptionCache for @baz,
> [(SomeModuleAnalysis, module TheModule)])
> (SomeModuleAnalysis, module TheModule) -> (SomeModuleAnalysisResult
> for TheModule, [(SomeFunctionAnalysis, function @baz)])
> (SomeFunctionAnalysis, function @baz) -> (SomeFunctionAnalysisResult
> for @baz, [])
> if we delete function @baz, then the dependent list
> [(SomeFunctionAnalysis, function @baz)] for ` (SomeModuleAnalysis,
> module TheModule)` will now have a stale pointer to function @baz. I
> think that in practice we can check to see if `
> (SomeFunctionAnalysis, function @baz)` is in our hash table and if
> it isn't then just ignore the dependency as "this dependent
analysis
> result has already been freed". In the worst case (memory allocator
> reuses the memory for another function) we may spuriously free an
> analysis result for a different function. However it is still
> unsettling (and may actually be UB in C++?).
> Ideally we would track bidirectional links; that way when we remove
> an analysis result we also have it remove itself from dependence
> lists of all of its dependencies.
I think that bidirectional tracking is preferable. I don't see how to avoid
UB otherwise, and spuriously freeing things based on allocator address reuse
will likely lead to non-deterministic behavior (e.g. because a re-run analysis
might be more accurate than a preserved one).

Thanks again, 
Hal 
> -- Sean Silva
> On Fri, Jul 15, 2016 at 8:40 PM, Sean Silva < chisophugis at gmail.com
>
> wrote:
> > On Fri, Jul 15, 2016 at 8:39 PM, Sean Silva < chisophugis at
gmail.com
> > >
> > wrote:
> 
> > > It looks like there is really no sane fix within the current
> > > infrastructure. I've had to essentially trigger invalidation
> > > (except
> > > in the PreservedAnalyses::all() case) in the function pass
> > > manager
> > > and function to loop adapters.
> > 
> 
> > invalidation of *everything* I mean.
> 
> > -- Sean Silva
> 
> > > So we basically need to get the analysis manager dependency
> > > tracking
> > > fixed.
> > 
> 
> > > Davide and I will get measurements on the resident set impact of
> > > all
> > > this caching (even with the overconservative invalidation for
> > > now)
> > > to see the impact. If there is a big rss impact then we probably
> > > want to consider that problem at the same time as the rewrite of
> > > the
> > > analysis manager.
> > 
> 
> > > -- Sean Silva
> > 
> 
> > > On Thu, Jul 14, 2016 at 12:51 AM, Sean Silva <
> > > chisophugis at gmail.com
> > > > wrote:
> > 
> 
> > > > On Wed, Jul 13, 2016 at 1:48 AM, Sean Silva <
> > > > chisophugis at gmail.com
> > > > >
> > > > wrote:
> > > 
> > 
> 
> > > > > On Wed, Jul 13, 2016 at 12:34 AM, Chandler Carruth <
> > > > > chandlerc at gmail.com > wrote:
> > > > 
> > > 
> > 
> 
> > > > > > On Wed, Jul 13, 2016 at 12:25 AM Sean Silva <
> > > > > > chisophugis at gmail.com
> > > > > > >
> > > > > > wrote:
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > On Tue, Jul 12, 2016 at 11:39 PM, Chandler
Carruth <
> > > > > > > chandlerc at gmail.com > wrote:
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > On Tue, Jul 12, 2016 at 11:34 PM Sean
Silva <
> > > > > > > > chisophugis at gmail.com
> > > > > > > > >
> > > > > > > > wrote:
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > On Tue, Jul 12, 2016 at 11:32 PM,
Xinliang David Li <
> > > > > > > > > davidxl at google.com > wrote:
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > > On Tue, Jul 12, 2016 at 10:57
PM, Chandler Carruth
> > > > > > > > > > <
> > > > > > > > > > chandlerc at gmail.com >
wrote:
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > > > Yea, this is a nasty
problem.
> > > > > > > > > > 
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > > > One important thing to
understand is that this is
> > > > > > > > > > > specific
> > > > > > > > > > > to
> > > > > > > > > > > analyses which hold
references to other analyses.
> > > > > > > > > > > While
> > > > > > > > > > > this
> > > > > > > > > > > isn't
> > > > > > > > > > > unheard of, it isn't
as common as it could be.
> > > > > > > > > > > Still,
> > > > > > > > > > > definitely
> > > > > > > > > > > something we need to
address.
> > > > > > > > > > 
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > > We can call this type of
dependencies (holding
> > > > > > > > > > references)
> > > > > > > > > > hard-dependency. The soft
dependency refers to the
> > > > > > > > > > case
> > > > > > > > > > where
> > > > > > > > > > analysis 'A' depends
on 'B' during computation, but
> > > > > > > > > > does
> > > > > > > > > > not
> > > > > > > > > > need
> > > > > > > > > > 'B' once it is
computed.
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > > There are actually quite a few
examples of
> > > > > > > > > > hard-dependency
> > > > > > > > > > case.
> > > > > > > > > > For
> > > > > > > > > > instance LoopAccessInfo,
LazyValueInfo etc which
> > > > > > > > > > hold
> > > > > > > > > > references
> > > > > > > > > > to
> > > > > > > > > > other analyses.
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > > Problem involving
hard-dependency is actually
> > > > > > > > > > easier
> > > > > > > > > > to
> > > > > > > > > > detect,
> > > > > > > > > > as
> > > > > > > > > > it
> > > > > > > > > > is usually a compile time
problem. Issues involving
> > > > > > > > > > soft
> > > > > > > > > > dependencies are more subtle
and can lead to wrong
> > > > > > > > > > code
> > > > > > > > > > gen.
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > Did you mean to say that
soft-dependency problems are
> > > > > > > > > easier
> > > > > > > > > to
> > > > > > > > > detect? At least my intuition is
that soft-dependency
> > > > > > > > > is
> > > > > > > > > easier
> > > > > > > > > because there is no risk of
dangling pointers to
> > > > > > > > > other
> > > > > > > > > analyses.
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > The issue is that the fact that there is
*any*
> > > > > > > > dependency
> > > > > > > > isn't
> > > > > > > > clear.
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > However, I think the only real problem
here are these
> > > > > > > > "hard
> > > > > > > > dependencies" (I don't really
like that term though).
> > > > > > > > For
> > > > > > > > others,
> > > > > > > > only an analysis that is *explicitly*
preserved
> > > > > > > > survives.
> > > > > > > > So
> > > > > > > > I'm
> > > > > > > > not
> > > > > > > > worried about the fact that people have
to remember
> > > > > > > > this.
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > The question is how often there are
> > > > > > > > cross-data-structure
> > > > > > > > references.
> > > > > > > > David mentions a few examples, and
I'm sure there are
> > > > > > > > more,
> > > > > > > > but
> > > > > > > > it
> > > > > > > > isn't clear to me yet whether this
is pervasive or
> > > > > > > > occasional.
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > I just did a quick run-through of
PassRegistry.def and
> > > > > > > this
> > > > > > > is
> > > > > > > what
> > > > > > > I
> > > > > > > found:
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > Module analyses: 0/5 hold pointers to other
analyses
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > CallGraph: No pointers to other analyses.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > LazyCallGraph: No pointers to other analyses.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > ProfileSummaryAnalysis: No pointers to other
analyses.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > TargetLibraryAnalysis: No pointers to other
analyses.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > VerifierAnalysis: No pointers to other
analyses.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > Module alias analyses: 1/1 keeps pointer to
other
> > > > > > > analysis.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > GlobalsAA: Result keeps pointer to TLI (this
is a
> > > > > > > function
> > > > > > > analysis).
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > Function analyses: 9/17 keep pointers to
other analysis
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > AAManager: Its Result holds TLI pointer and
pointers to
> > > > > > > individual
> > > > > > > AA
> > > > > > > result objects.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > AssumptionAnalysis: No pointers to other
analyses.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > BlockFrequencyAnalysis: Its Result holds
pointers to
> > > > > > > LoopInfo
> > > > > > > and
> > > > > > > BPI.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > BranchProbabilityAnalysis: Stores no pointers
to other
> > > > > > > analyses.
> > > > > > > (uses LoopInfo to "recalculate"
though)
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > DominatorTreeAnalysis: Stores no pointers to
other
> > > > > > > analyses.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > PostDominatorTreeAnalysis: Stores no pointers
to other
> > > > > > > analyses.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > DemandedBitsAnalysis: Stores pointers to
AssumptionCache
> > > > > > > and
> > > > > > > DominatorTree
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > DominanceFrontierAnalysis: Stores no pointers
to other
> > > > > > > analyses.
> > > > > > > (uses DominatorTreeAnalysis for
"recalculate" though).
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > LoopInfo: Uses DominatorTreeAnalysis for
"recalculate"
> > > > > > > but
> > > > > > > stores
> > > > > > > no
> > > > > > > pointers.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > LazyValueAnalysis: Stores pointers to
AssumptionCache,
> > > > > > > TargetLibraryInfo, DominatorTree.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > DependenceAnalysis: Stores pointers to
AliasAnalysis,
> > > > > > > ScalarEvolution, LoopInfo
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > MemoryDependenceAnalysis: Stores pointers to
> > > > > > > AliasAnalysis,
> > > > > > > AssumptionCache, TargetLibraryInfo,
DominatorTree
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > MemorySSAAnalysis: Stores pointers to
AliasAnalysis,
> > > > > > > DominatorTree
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > RegionInfoAnalysis: Stores pointers to
DomTree,
> > > > > > > PostDomTree,
> > > > > > > DomFrontier
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > ScalarEvolutionAnalysis: Stores pointers to
> > > > > > > TargetLibraryInfo,
> > > > > > > AssumptionCache, DominatorTree, LoopInfo
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > TargetLibraryAnalysis: Has no dependencies
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > TargetIRAnalysis: Has no dependencies.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > Function alias analyses: 3/5 keep pointers to
other
> > > > > > > analyses
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > BasicAA: Keeps pointers to TargetLibraryInfo,
> > > > > > > AssumptionCache,
> > > > > > > DominatorTree, LoopInfo
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > CFLAA: Keeps pointer to TargetLibraryInfo
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > SCEVAA: Keeps pointer to ScalarEvolution
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > ScopedNoAliasAA: No dependencies
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > TypeBasedAA: No dependencies
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > Total: 13/28 analyses (~50%) hold pointers to
other
> > > > > > > analyses.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > Of the 15/28 analyses that don't hold
pointers, 12/15
> > > > > > > simply
> > > > > > > have
> > > > > > > no
> > > > > > > dependencies. Only 3/15 (BPI, LoopInfo,
> > > > > > > DominanceFrontier)
> > > > > > > have
> > > > > > > dependencies that are used just for a
"recalculate" step
> > > > > > > that
> > > > > > > retains no pointers.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > So I think it is fair to say that analyses
which hold
> > > > > > > pointers
> > > > > > > to
> > > > > > > other analyses is not an exceptional case. In
fact,
> > > > > > > analyses
> > > > > > > that
> > > > > > > use other analyses just for a
"recalculate" step seems to
> > > > > > > be
> > > > > > > the
> > > > > > > exceptional case (only 3/28 or about 10%)
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > Interesting!
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > Most of these look like they hold a pointer to the
root
> > > > > > analysis
> > > > > > as
> > > > > > opposed to detailed objects *inside* the analysis?
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > It might make sense to try to handle this very
specific
> > > > > > pattern
> > > > > > in
> > > > > > a
> > > > > > special way of overriding the invalidate routines
is too
> > > > > > error
> > > > > > prone.... We could try to make this work
"automatically"
> > > > > > but
> > > > > > I'm
> > > > > > worried this would be challenging to get right.
Open to
> > > > > > suggestions
> > > > > > of course.
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > Any other ideas about what would make sense to
handle this?
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > Does it make sense to override the invalidate
routines now
> > > > > > and
> > > > > > iterate from there? I feel like you've done a
lot of the
> > > > > > research
> > > > > > necessary for this already...
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > I'll keep pushing forward tomorrow with building
test-suite
> > > > > successfully using the new PM for the LTO pipeline (I
was
> > > > > doing
> > > > > some
> > > > > unrelated LLD stuff for most of today). It will be
> > > > > interesting
> > > > > to
> > > > > see how many `invalidate` overrides will be needed to
avoid
> > > > > these
> > > > > issues for just the LTO pipeline on test-suite.
> > > > 
> > > 
> > 
> 
> > > > I spent the better part of today working on this and will
> > > > continue
> > > > tomorrow; this problem seems nastier than I thought. For
some
> > > > reason
> > > > the LTO pipeline (or something about LTO) seems to hit on
these
> > > > issues much more (I'm talking like 40k lines of ASan
error
> > > > reports
> > > > from building test-suite with the LTO pipeline in the new
PM;
> > > > per-TU
> > > > steps still using the old PM). Some notes:
> > > 
> > 
> 
> > > > - BasicAA's dependence on domtree and loopinfo in the
new PM
> > > > seems
> > > > to
> > > > account for quite a few of the problems.
> > > 
> > 
> 
> > > > - BasicAA and other stuff are marked (by overriding
> > > > `invalidate`
> > > > to
> > > > return false) to never be invalidated because they are
> > > > "stateless".
> > > > However they still hold pointers and so they do need to be
> > > > invalidated.
> > > 
> > 
> 
> > > > - CallGraph uses AssertingVH (PR28400) and so I needed a
> > > > workaround
> > > > similar to r274656 in various passes.
> > > 
> > 
> 
> > > > - D21921 is holding up -- I haven't hit any issues with
the
> > > > core
> > > > logic of that patch.
> > > 
> > 
> 
> > > > - AAResults holds handles to various AA result objects. This
> > > > means
> > > > it
> > > > pretty much always needs to be invalidated unless you are
sure
> > > > that
> > > > none of the AA's will get invalidated.
> > > 
> > 
> 
> > > > The existing `invalidate` method doesn't have the right
> > > > semantics
> > > > for
> > > > even an error-prone solution :( We are going to need to make
> > > > some
> > > > significant changes to even get basic sanity I think.
Perhaps
> > > > each
> > > > analysis can expose a "preserve" static function.
E.g. instead
> > > > of
> > > > `PA.preserve<Foo>();` you have to do
`Foo::setPreserved(PA);`.
> > > 
> > 
> 
> > > > I'm actually not quite sure that that will even work.
Once I
> > > > have
> > > > test-suite fully building successfully with the LTO pipeline
in
> > > > the
> > > > new PM I'll be able to give a more confident answer
(esp.
> > > > w.r.t.
> > > > the
> > > > manager for different IRUnitT's).
> > > 
> > 
> 
> > > > But at this point I'm not confident running *any* pass
pipeline
> > > > in
> > > > the new PM without at least assertions+ASan.
> > > 
> > 
> 
> > > > We may want to have a proper design discussion around this
> > > > problem
> > > > though.
> > > 
> > 
> 
> > > > Also I'd like to have test-suite working (by hook or by
crook)
> > > > with
> > > > LTO in the new PM so we can get some numbers on the resident
> > > > set
> > > > impact of all this caching; if it is really problematic then
we
> > > > may
> > > > need to start talking front-and-center about different
> > > > invalidation
> > > > policies for keeping this in check instead of leaving it as
> > > > something that we will be able to patch later.
> > > 
> > 
> 
> > > > The more I think about it, the more I'm convinced that
the real
> > > > "hard" problem that the new PM is exposing us to
is having the
> > > > ability for any pass to ask for any analysis on any IRUnitT
> > > > (and
> > > > any
> > > > specific IRUnit of that IRUnitT) and have the result stored
> > > > somewhere and then invalidated. This means that
> > > > "getAnalysisUsage"
> > > > is not just a list of passes, but much more complicated and
is
> > > > essentially a set of arbitrary pairs "(analysis,
IRUnit)" (and
> > > > the
> > > > associated potential tangle of dependencies between the
state
> > > > cached
> > > > on these tuples). With the old PM, you essentially are
looking
> > > > at
> > > > a
> > > > problem of scheduling the lifetime of analyses of the same
> > > > IRUnit
> > > > intermingled with transformation passes on that same IRUnit,
so
> > > > you
> > > > only have the "analysis" part of the tuple above,
making things
> > > > much
> > > > simpler (and handling dependencies is much simpler too).
We've
> > > > obviously outgrown this model with examples like LAA,
> > > > AssumptionCacheTracker, etc. that hack around this in the
old
> > > > PM.
> > > > We
> > > > may want to have a fresh re-examination of what problems we
are
> > > > exactly trying to solve.
> > > 
> > 
> 
> > > > For me, my main concern now is what changes need to be made
in
> > > > order
> > > > to feel confident running a pipeline in the new PM without
> > > > assertions+ASan.
> > > 
> > 
> 
> > > > Sorry for the long post, just brain-dumping before heading
> > > > home.
> > > 
> > 
> 
> > > > -- Sean Silva
> > > 
> > 
> 
> > > > > -- Sean Silva
> > > > 
> > > 
> > 
> 
-- 

Hal Finkel 
Assistant Computational Scientist 
Leadership Computing Facility 
Argonne National Laboratory 
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160725/79a29c82/attachment-0001.html>

Sean Silva via llvm-dev

2016-Jul-25 19:47 UTC

head link

[llvm-dev] [PM] I think that the new PM needs to learn about inter-analysis dependencies...

On Mon, Jul 25, 2016 at 9:27 AM, Hal Finkel <hfinkel at anl.gov> wrote:
>
> ------------------------------
>
> *From: *"Sean Silva" <chisophugis at gmail.com>
> *To: *"Chandler Carruth" <chandlerc at gmail.com>
> *Cc: *"Xinliang David Li" <davidxl at google.com>,
"llvm-dev" <
> llvm-dev at lists.llvm.org>, "Davide Italiano" <dccitaliano
at gmail.com>, "Tim
> Amini Golling" <mehdi.amini at apple.com>, "Hal
Finkel" <hfinkel at anl.gov>,
> "Sanjoy Das" <sanjoy at playingwithpointers.com>,
"Pete Cooper" <
> peter_cooper at apple.com>
> *Sent: *Thursday, July 21, 2016 3:59:28 AM
> *Subject: *Re: [PM] I think that the new PM needs to learn about
> inter-analysis dependencies...
>
> We did some basic sanity checking that memory usage didn't go out of
> control (it doesn't; at least with with a simple
> preserves-all/preserves-none invalidation scheme and the current LTO
> pipeline). Also, I did some basic sanity checking for compile time. The
> simple preserves-all/preserves-none invalidation scheme seems marginally
> slower, but no real conclusions (besides a simple sanity check) can be
> drawn without the real analysis preservation semantics in place.
>
> I'll start working on fixing the analysis managers. There seem to
> basically be two parts (although they may need to be done simultaneously to
> make sure all the pieces fit together):
> - unify all the analysis managers into a single analysis manager for all
> IRUnitT's (requires type-erasing the IRUnit)
> - introduce the dependency tracking machinery
>
> I think I gave a reasonable outline in the two posts above:
> - the one starting with "To clarify, it seems like the current new PM
is
> essentially trying to solve the problem of maintaining/updating a
mapping"
> - the one starting with "Yeah, the mechanics of maintaining this fully
> general mapping are straightforward in the abstract"
>
> I'm happy to do a centralized writeup if anybody wants. Just let me
know.
>
> As far as changes to the code, the updates to the new PM passes should
> hopefully be mechanical (despite there being many of them). However, from
> what I can tell, fixing this problem will require touching most lines of
> the analysis manager machinery so the diff will probably be a bit scary,
> even though I think we can keep the same basic structure (i.e. a per-IRUnit
> std::list holding one analysis result (at a stable address) per element,
> combined with a DenseMap from (Analysis, IRUnit) to an element of the
> per-IRUnit storage list (see AnalysisResultListMapT and AnalysisResultMapT
> in include/llvm/IR/PassManager.h)).
> The main changes to the analysis manager will be:
> - type-erasing the IRUnit
> - the elements of the AnalysisResultListMapT will need to keep track of
> any dependents
> - the analysis manager will need to update those dependencies as it is
> re-entered by analyses getting results of other analyses
> - the analysis manager will need to walk the dependencies to do transitive
> invalidation
>
> I think the most robust approach is for analysis dependencies to be
> implicitly constructed by the analysis manager via tracking entry/exit from
> get{,Cached}Result.
> One alternative is for analyses to explicitly pass in their ID to
> getResult to indicate the dependency, but that seems error-prone (and also
> verbose). The issue is that we will need a getResult API that does not
> track dependencies for use by transformation passes (since there is no
> dependency to track in that case); so an innocuous copy-paste from a
> transform pass to an analysis can result in a failure to track dependencies
> and risk of use-after-free (we could fight this with the type system, but I
> think that would get a bit verbose (I'm willing to try it though if
people
> would prefer))
> One restriction of the implicit tracking approach is that it requires all
> calls into the analysis manager to occur in the `run` method of the
> analysis (so that the dependencies are implicitly tracked via re-entrance
> to get{,Cached}Result); is this a reasonable restriction?
>
> What's the potential use case for getting an analysis outside of
something
> in, or called by, run()?
>
The main thing I had in mind was that if an analysis happens to stash a
pointer directly to the analysis manager, then it may call into the
analysis manager in the query path. For example, you would have this
situation if a module analysis has a query path that wants to lazily
compute a function analysis (i.e. compute the function analysis outside of
the `run` call).

>
>
> One annoying problem is that I think that the dependency links will need
> to be bidirectional. To use the example analysis cache from my other post:
> (AssumptionAnalysis, function @bar) -> (AssumptionCache for @bar,
> [(SomeModuleAnalysis, module TheModule)])
> (AssumptionAnalysis, function @baz) -> (AssumptionCache for @baz,
> [(SomeModuleAnalysis, module TheModule)])
> (SomeModuleAnalysis, module TheModule) -> (SomeModuleAnalysisResult for
> TheModule, [(SomeFunctionAnalysis, function @baz)])
> (SomeFunctionAnalysis, function @baz) -> (SomeFunctionAnalysisResult for
> @baz, [])
>
> if we delete function @baz, then the dependent list 
[(SomeFunctionAnalysis,
> function @baz)] for `(SomeModuleAnalysis, module TheModule)` will now
> have a stale pointer to function @baz. I think that in practice we can
> check to see if `(SomeFunctionAnalysis, function @baz)` is in our hash
> table and if it isn't then just ignore the dependency as "this
dependent
> analysis result has already been freed". In the worst case (memory
> allocator reuses the memory for another function) we may spuriously free an
> analysis result for a different function. However it is still unsettling
> (and may actually be UB in C++?).
> Ideally we would track bidirectional links; that way when we remove an
> analysis result we also have it remove itself from dependence lists of all
> of its dependencies.
>
> I think that bidirectional tracking is preferable. I don't see how to
> avoid UB otherwise, and spuriously freeing things based on allocator
> address reuse will likely lead to non-deterministic behavior (e.g. because
> a re-run analysis might be more accurate than a preserved one).
>
Yeah, it doesn't seem onerous to maintain the bidirectional tracking (I
have a design mocked up in my log). The problem actually ends up being
similar to use/user tracking we have in the IR data structures (although
it's different enough that there isn't really any code to share).

-- Sean Silva

>
> Thanks again,
> Hal
>
> -- Sean Silva
>
> On Fri, Jul 15, 2016 at 8:40 PM, Sean Silva <chisophugis at
gmail.com> wrote:
>
>>
>>
>> On Fri, Jul 15, 2016 at 8:39 PM, Sean Silva <chisophugis at
gmail.com>
>> wrote:
>>
>>> It looks like there is really no sane fix within the current
>>> infrastructure. I've had to essentially trigger invalidation
(except in the
>>> PreservedAnalyses::all() case) in the function pass manager and
function to
>>> loop adapters.
>>>
>>
>> invalidation of *everything* I mean.
>>
>> -- Sean Silva
>>
>>
>>>
>>> So we basically need to get the analysis manager dependency
tracking
>>> fixed.
>>>
>>> Davide and I will get measurements on the resident set impact of
all
>>> this caching (even with the overconservative invalidation for now)
to see
>>> the impact. If there is a big rss impact then we probably want to
consider
>>> that problem at the same time as the rewrite of the analysis
manager.
>>>
>>> -- Sean Silva
>>>
>>> On Thu, Jul 14, 2016 at 12:51 AM, Sean Silva <chisophugis at
gmail.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Wed, Jul 13, 2016 at 1:48 AM, Sean Silva <chisophugis at
gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Wed, Jul 13, 2016 at 12:34 AM, Chandler Carruth <
>>>>> chandlerc at gmail.com> wrote:
>>>>>
>>>>>> On Wed, Jul 13, 2016 at 12:25 AM Sean Silva
<chisophugis at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> On Tue, Jul 12, 2016 at 11:39 PM, Chandler Carruth
<
>>>>>>> chandlerc at gmail.com> wrote:
>>>>>>>
>>>>>>>> On Tue, Jul 12, 2016 at 11:34 PM Sean Silva
<chisophugis at gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> On Tue, Jul 12, 2016 at 11:32 PM, Xinliang
David Li <
>>>>>>>>> davidxl at google.com> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Jul 12, 2016 at 10:57 PM,
Chandler Carruth <
>>>>>>>>>> chandlerc at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Yea, this is a nasty problem.
>>>>>>>>>>>
>>>>>>>>>>> One important thing to understand
is that this is specific to
>>>>>>>>>>> analyses which hold references to
other analyses. While this isn't unheard
>>>>>>>>>>> of, it isn't as common as it
could be. Still, definitely something we need
>>>>>>>>>>> to address.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> We can call this type of dependencies
(holding references)
>>>>>>>>>> hard-dependency. The soft dependency
refers to the case where analysis 'A'
>>>>>>>>>> depends on 'B' during
computation, but does not need 'B' once it is
>>>>>>>>>> computed.
>>>>>>>>>>
>>>>>>>>>> There are actually quite a few examples
of hard-dependency case.
>>>>>>>>>> For instance LoopAccessInfo,
LazyValueInfo etc which hold references to
>>>>>>>>>> other analyses.
>>>>>>>>>>
>>>>>>>>>> Problem involving hard-dependency is
actually easier to detect,
>>>>>>>>>> as it is usually a compile time
problem. Issues involving soft dependencies
>>>>>>>>>> are more subtle and can lead to wrong
code gen.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Did you mean to say that soft-dependency
problems are easier to
>>>>>>>>> detect? At least my intuition is that
soft-dependency is easier because
>>>>>>>>> there is no risk of dangling pointers to
other analyses.
>>>>>>>>>
>>>>>>>>
>>>>>>>> The issue is that the fact that there is *any*
dependency isn't
>>>>>>>> clear.
>>>>>>>>
>>>>>>>> However, I think the only real problem here are
these "hard
>>>>>>>> dependencies" (I don't really like
that term though). For others, only an
>>>>>>>> analysis that is *explicitly* preserved
survives. So I'm not worried about
>>>>>>>> the fact that people have to remember this.
>>>>>>>>
>>>>>>>> The question is how often there are
cross-data-structure
>>>>>>>> references. David mentions a few examples, and
I'm sure there are more, but
>>>>>>>> it isn't clear to me yet whether this is
pervasive or occasional.
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I just did a quick run-through of PassRegistry.def
and this is what
>>>>>>> I found:
>>>>>>>
>>>>>>> Module analyses: 0/5 hold pointers to other
analyses
>>>>>>> CallGraph: No pointers to other analyses.
>>>>>>> LazyCallGraph: No pointers to other analyses.
>>>>>>> ProfileSummaryAnalysis: No pointers to other
analyses.
>>>>>>> TargetLibraryAnalysis: No pointers to other
analyses.
>>>>>>> VerifierAnalysis: No pointers to other analyses.
>>>>>>>
>>>>>>> Module alias analyses: 1/1 keeps pointer to other
analysis.
>>>>>>> GlobalsAA: Result keeps pointer to TLI (this is a
function analysis).
>>>>>>>
>>>>>>> Function analyses: 9/17 keep pointers to other
analysis
>>>>>>> AAManager: Its Result holds TLI pointer and
pointers to individual
>>>>>>> AA result objects.
>>>>>>> AssumptionAnalysis: No pointers to other analyses.
>>>>>>> BlockFrequencyAnalysis: Its Result holds pointers
to LoopInfo and
>>>>>>> BPI.
>>>>>>> BranchProbabilityAnalysis: Stores no pointers to
other analyses.
>>>>>>> (uses LoopInfo to "recalculate" though)
>>>>>>> DominatorTreeAnalysis: Stores no pointers to other
analyses.
>>>>>>> PostDominatorTreeAnalysis: Stores no pointers to
other analyses.
>>>>>>> DemandedBitsAnalysis: Stores pointers to
AssumptionCache
>>>>>>> and DominatorTree
>>>>>>> DominanceFrontierAnalysis: Stores no pointers to
other analyses.
>>>>>>> (uses DominatorTreeAnalysis for
"recalculate" though).
>>>>>>> LoopInfo: Uses DominatorTreeAnalysis for
"recalculate" but stores no
>>>>>>> pointers.
>>>>>>> LazyValueAnalysis: Stores pointers to
AssumptionCache,
>>>>>>> TargetLibraryInfo, DominatorTree.
>>>>>>> DependenceAnalysis: Stores pointers to
AliasAnalysis,
>>>>>>> ScalarEvolution, LoopInfo
>>>>>>> MemoryDependenceAnalysis: Stores pointers to
AliasAnalysis,
>>>>>>> AssumptionCache, TargetLibraryInfo, DominatorTree
>>>>>>> MemorySSAAnalysis: Stores pointers to
AliasAnalysis, DominatorTree
>>>>>>> RegionInfoAnalysis: Stores pointers to DomTree,
PostDomTree,
>>>>>>> DomFrontier
>>>>>>> ScalarEvolutionAnalysis: Stores pointers to
TargetLibraryInfo,
>>>>>>> AssumptionCache, DominatorTree, LoopInfo
>>>>>>> TargetLibraryAnalysis: Has no dependencies
>>>>>>> TargetIRAnalysis: Has no dependencies.
>>>>>>>
>>>>>>> Function alias analyses: 3/5 keep pointers to other
analyses
>>>>>>> BasicAA: Keeps pointers to TargetLibraryInfo,
AssumptionCache,
>>>>>>> DominatorTree, LoopInfo
>>>>>>> CFLAA: Keeps pointer to TargetLibraryInfo
>>>>>>> SCEVAA: Keeps pointer to ScalarEvolution
>>>>>>> ScopedNoAliasAA: No dependencies
>>>>>>> TypeBasedAA: No dependencies
>>>>>>>
>>>>>>>
>>>>>>> Total: 13/28 analyses (~50%) hold pointers to other
analyses.
>>>>>>> Of the 15/28 analyses that don't hold pointers,
12/15 simply have no
>>>>>>> dependencies. Only 3/15 (BPI, LoopInfo,
DominanceFrontier) have
>>>>>>> dependencies that are used just for a
"recalculate" step that retains no
>>>>>>> pointers.
>>>>>>> So I think it is fair to say that analyses which
hold pointers to
>>>>>>> other analyses is not an exceptional case. In fact,
analyses that use other
>>>>>>> analyses just for a "recalculate" step
seems to be the exceptional case
>>>>>>> (only 3/28 or about 10%)
>>>>>>>
>>>>>>
>>>>>> Interesting!
>>>>>>
>>>>>> Most of these look like they hold a pointer to the root
analysis as
>>>>>> opposed to detailed objects *inside* the analysis?
>>>>>>
>>>>>> It might make sense to try to handle this very specific
pattern in a
>>>>>> special way of overriding the invalidate routines is
too error prone.... We
>>>>>> could try to make this work "automatically"
but I'm worried this would be
>>>>>> challenging to get right. Open to suggestions of
course.
>>>>>>
>>>>>> Any other ideas about what would make sense to handle
this?
>>>>>>
>>>>>> Does it make sense to override the invalidate routines
now and
>>>>>> iterate from there? I feel like you've done a lot
of the research necessary
>>>>>> for this already...
>>>>>>
>>>>>
>>>>> I'll keep pushing forward tomorrow with building
test-suite
>>>>> successfully using the new PM for the LTO pipeline (I was
doing some
>>>>> unrelated LLD stuff for most of today). It will be
interesting to see how
>>>>> many `invalidate` overrides will be needed to avoid these
issues for just
>>>>> the LTO pipeline on test-suite.
>>>>>
>>>>
>>>> I spent the better part of today working on this and will
continue
>>>> tomorrow; this problem seems nastier than I thought. For some
reason the
>>>> LTO pipeline (or something about LTO) seems to hit on these
issues much
>>>> more (I'm talking like 40k lines of ASan error reports from
building
>>>> test-suite with the LTO pipeline in the new PM; per-TU steps
still using
>>>> the old PM). Some notes:
>>>>
>>>> - BasicAA's dependence on domtree and loopinfo in the new
PM seems to
>>>> account for quite a few of the problems.
>>>> - BasicAA and other stuff are marked (by overriding
`invalidate` to
>>>> return false) to never be invalidated because they are
"stateless". However
>>>> they still hold pointers and so they do need to be invalidated.
>>>> - CallGraph uses AssertingVH (PR28400) and so I needed a
workaround
>>>> similar to r274656 in various passes.
>>>> - D21921 is holding up -- I haven't hit any issues with the
core logic
>>>> of that patch.
>>>> - AAResults holds handles to various AA result objects. This
means it
>>>> pretty much always needs to be invalidated unless you are sure
that none of
>>>> the AA's will get invalidated.
>>>>
>>>>
>>>> The existing `invalidate` method doesn't have the right
semantics for
>>>> even an error-prone solution :( We are going to need to make
some
>>>> significant changes to even get basic sanity I think. Perhaps
each analysis
>>>> can expose a "preserve" static function. E.g. instead
of
>>>> `PA.preserve<Foo>();` you have to do
`Foo::setPreserved(PA);`.
>>>> I'm actually not quite sure that that will even work. Once
I have
>>>> test-suite fully building successfully with the LTO pipeline in
the new PM
>>>> I'll be able to give a more confident answer (esp. w.r.t.
the manager for
>>>> different IRUnitT's).
>>>> But at this point I'm not confident running *any* pass
pipeline in the
>>>> new PM without at least assertions+ASan.
>>>>
>>>> We may want to have a proper design discussion around this
problem
>>>> though.
>>>>
>>>> Also I'd like to have test-suite working (by hook or by
crook) with LTO
>>>> in the new PM so we can get some numbers on the resident set
impact of all
>>>> this caching; if it is really problematic then we may need to
start talking
>>>> front-and-center about different invalidation policies for
keeping this in
>>>> check instead of leaving it as something that we will be able
to patch
>>>> later.
>>>>
>>>>
>>>>
>>>> The more I think about it, the more I'm convinced that the
real "hard"
>>>> problem that the new PM is exposing us to is having the ability
for any
>>>> pass to ask for any analysis on any IRUnitT (and any specific
IRUnit of
>>>> that IRUnitT) and have the result stored somewhere and then
invalidated.
>>>> This means that "getAnalysisUsage" is not just a list
of passes, but much
>>>> more complicated and is essentially a set of arbitrary pairs
"(analysis,
>>>> IRUnit)" (and the associated potential tangle of
dependencies between the
>>>> state cached on these tuples). With the old PM, you essentially
are looking
>>>> at a problem of scheduling the lifetime of analyses of the same
IRUnit
>>>> intermingled with transformation passes on that same IRUnit, so
you only
>>>> have the "analysis" part of the tuple above, making
things much simpler
>>>> (and handling dependencies is much simpler too). We've
obviously outgrown
>>>> this model with examples like LAA, AssumptionCacheTracker, etc.
that hack
>>>> around this in the old PM. We may want to have a fresh
re-examination of
>>>> what problems we are exactly trying to solve.
>>>>
>>>> For me, my main concern now is what changes need to be made in
order to
>>>> feel confident running a pipeline in the new PM without
assertions+ASan.
>>>>
>>>>
>>>> Sorry for the long post, just brain-dumping before heading
home.
>>>>
>>>> -- Sean Silva
>>>>
>>>>
>>>>
>>>>
>>>>>
>>>>> -- Sean Silva
>>>>>
>>>>>
>>>>
>>>
>>
>
>
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160725/ddd6e413/attachment.html>

Hal Finkel via llvm-dev

2016-Jul-25 20:03 UTC

head link

[llvm-dev] [PM] I think that the new PM needs to learn about inter-analysis dependencies...

----- Original Message -----
> From: "Sean Silva" <chisophugis at gmail.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "Xinliang David Li" <davidxl at google.com>,
"llvm-dev"
> <llvm-dev at lists.llvm.org>, "Davide Italiano"
> <dccitaliano at gmail.com>, "Tim Amini Golling"
> <mehdi.amini at apple.com>, "Sanjoy Das"
> <sanjoy at playingwithpointers.com>, "Pete Cooper"
> <peter_cooper at apple.com>, "Chandler Carruth"
<chandlerc at gmail.com>
> Sent: Monday, July 25, 2016 2:47:19 PM
> Subject: Re: [PM] I think that the new PM needs to learn about
> inter-analysis dependencies...
> On Mon, Jul 25, 2016 at 9:27 AM, Hal Finkel < hfinkel at anl.gov >
> wrote:
> > > From: "Sean Silva" < chisophugis at gmail.com >
> > 
> 
> > > To: "Chandler Carruth" < chandlerc at gmail.com >
> > 
> 
> > > Cc: "Xinliang David Li" < davidxl at google.com
>, "llvm-dev" <
> > > llvm-dev at lists.llvm.org >, "Davide Italiano" <
> > > dccitaliano at gmail.com
> > > >, "Tim Amini Golling" < mehdi.amini at apple.com
>, "Hal Finkel" <
> > > hfinkel at anl.gov >, "Sanjoy Das" < sanjoy at
playingwithpointers.com
> > > >,
> > > "Pete Cooper" < peter_cooper at apple.com >
> > 
> 
> > > Sent: Thursday, July 21, 2016 3:59:28 AM
> > 
> 
> > > Subject: Re: [PM] I think that the new PM needs to learn about
> > > inter-analysis dependencies...
> > 
> 
> > > We did some basic sanity checking that memory usage didn't go
out
> > > of
> > > control (it doesn't; at least with with a simple
> > > preserves-all/preserves-none invalidation scheme and the current
> > > LTO
> > > pipeline). Also, I did some basic sanity checking for compile
> > > time.
> > > The simple preserves-all/preserves-none invalidation scheme seems
> > > marginally slower, but no real conclusions (besides a simple
> > > sanity
> > > check) can be drawn without the real analysis preservation
> > > semantics
> > > in place.
> > 
> 
> > > I'll start working on fixing the analysis managers. There
seem to
> > > basically be two parts (although they may need to be done
> > > simultaneously to make sure all the pieces fit together):
> > 
> 
> > > - unify all the analysis managers into a single analysis manager
> > > for
> > > all IRUnitT's (requires type-erasing the IRUnit)
> > 
> 
> > > - introduce the dependency tracking machinery
> > 
> 
> > > I think I gave a reasonable outline in the two posts above:
> > 
> 
> > > - the one starting with " To clarify, it seems like the
current
> > > new
> > > PM is essentially trying to solve the problem of
> > > maintaining/updating a mapping"
> > 
> 
> > > - the one starting with " Yeah, the mechanics of maintaining
this
> > > fully general mapping are straightforward in the abstract"
> > 
> 
> > > I'm happy to do a centralized writeup if anybody wants. Just
let
> > > me
> > > know.
> > 
> 
> > > As far as changes to the code, t he updates to the new PM passes
> > > should hopefully be mechanical (despite there being many of
> > > them).
> > > However, from what I can tell, fixing this problem will require
> > > touching most lines of the analysis manager machinery so the diff
> > > will probably be a bit scary, even though I think we can keep the
> > > same basic structure (i.e. a per-IRUnit std::list holding one
> > > analysis result (at a stable address) per element, combined with
> > > a
> > > DenseMap from (Analysis, IRUnit) to an element of the per-IRUnit
> > > storage list (see AnalysisResultListMapT and AnalysisResultMapT
> > > in
> > > include/llvm/IR/PassManager.h)).
> > 
> 
> > > The main changes to the analysis manager will be:
> > 
> 
> > > - type-erasing the IRUnit
> > 
> 
> > > - the elements of the AnalysisResultListMapT will need to keep
> > > track
> > > of any dependents
> > 
> 
> > > - the analysis manager will need to update those dependencies as
> > > it
> > > is re-entered by analyses getting results of other analyses
> > 
> 
> > > - the analysis manager will need to walk the dependencies to do
> > > transitive invalidation
> > 
> 
> > > I think the most robust approach is for analysis dependencies to
> > > be
> > > implicitly constructed by the analysis manager via tracking
> > > entry/exit from get{,Cached}Result.
> > 
> 
> > > One alternative is for analyses to explicitly pass in their ID to
> > > getResult to indicate the dependency, but that seems error-prone
> > > (and also verbose). The issue is that we will need a getResult
> > > API
> > > that does not track dependencies for use by transformation passes
> > > (since there is no dependency to track in that case); so an
> > > innocuous copy-paste from a transform pass to an analysis can
> > > result
> > > in a failure to track dependencies and risk of use-after-free (we
> > > could fight this with the type system, but I think that would get
> > > a
> > > bit verbose (I'm willing to try it though if people would
> > > prefer))
> > 
> 
> > > One restriction of the implicit tracking approach is that it
> > > requires
> > > all calls into the analysis manager to occur in the `run` method
> > > of
> > > the analysis (so that the dependencies are implicitly tracked via
> > > re-entrance to get{,Cached}Result); is this a reasonable
> > > restriction?
> > 
> 
> > What's the potential use case for getting an analysis outside of
> > something in, or called by, run()?
> 
> The main thing I had in mind was that if an analysis happens to stash
> a pointer directly to the analysis manager, then it may call into
> the analysis manager in the query path. For example, you would have
> this situation if a module analysis has a query path that wants to
> lazily compute a function analysis (i.e. compute the function
> analysis outside of the `run` call).Yes, I'm pretty sure that we want to support that (a module analysis lazily
getting function-level analysis results on the functions in the module). To put
it another way, we definitely want a module transformation to be able to do
that, and if an analysis can't, that limits our ability to properly
organize/reuse code.

We might have a slightly different interface for that, however, if it can't
be implicit in that case.

-Hal 
> > > One annoying problem is that I think that the dependency links
> > > will
> > > need to be bidirectional. To use the example analysis cache from
> > > my
> > > other post:
> > 
> 
> > > (AssumptionAnalysis, function @bar) -> (AssumptionCache for
@bar,
> > > [(SomeModuleAnalysis, module TheModule)])
> > 
> 
> > > (AssumptionAnalysis, function @baz) -> (AssumptionCache for
@baz,
> > > [(SomeModuleAnalysis, module TheModule)])
> > 
> 
> > > (SomeModuleAnalysis, module TheModule) ->
> > > (SomeModuleAnalysisResult
> > > for TheModule, [(SomeFunctionAnalysis, function @baz)])
> > 
> 
> > > (SomeFunctionAnalysis, function @baz) ->
> > > (SomeFunctionAnalysisResult
> > > for @baz, [])
> > 
> 
> > > if we delete function @baz, then the dependent list
> > > [(SomeFunctionAnalysis, function @baz)] for `
> > > (SomeModuleAnalysis,
> > > module TheModule)` will now have a stale pointer to function
> > > @baz.
> > > I
> > > think that in practice we can check to see if `
> > > (SomeFunctionAnalysis, function @baz)` is in our hash table and
> > > if
> > > it isn't then just ignore the dependency as "this
dependent
> > > analysis
> > > result has already been freed". In the worst case (memory
> > > allocator
> > > reuses the memory for another function) we may spuriously free an
> > > analysis result for a different function. However it is still
> > > unsettling (and may actually be UB in C++?).
> > 
> 
> > > Ideally we would track bidirectional links; that way when we
> > > remove
> > > an analysis result we also have it remove itself from dependence
> > > lists of all of its dependencies.
> > 
> 
> > I think that bidirectional tracking is preferable. I don't see how
> > to
> > avoid UB otherwise, and spuriously freeing things based on
> > allocator
> > address reuse will likely lead to non-deterministic behavior (e.g.
> > because a re-run analysis might be more accurate than a preserved
> > one).
> 
> Yeah, it doesn't seem onerous to maintain the bidirectional tracking
> (I have a design mocked up in my log). The problem actually ends up
> being similar to use/user tracking we have in the IR data structures
> (although it's different enough that there isn't really any code to
> share).
> -- Sean Silva
> > Thanks again,
> 
> > Hal
> 
> > > -- Sean Silva
> > 
> 
> > > On Fri, Jul 15, 2016 at 8:40 PM, Sean Silva <
> > > chisophugis at gmail.com
> > > >
> > > wrote:
> > 
> 
> > > > On Fri, Jul 15, 2016 at 8:39 PM, Sean Silva <
> > > > chisophugis at gmail.com
> > > > >
> > > > wrote:
> > > 
> > 
> 
> > > > > It looks like there is really no sane fix within the
current
> > > > > infrastructure. I've had to essentially trigger
invalidation
> > > > > (except
> > > > > in the PreservedAnalyses::all() case) in the function
pass
> > > > > manager
> > > > > and function to loop adapters.
> > > > 
> > > 
> > 
> 
> > > > invalidation of *everything* I mean.
> > > 
> > 
> 
> > > > -- Sean Silva
> > > 
> > 
> 
> > > > > So we basically need to get the analysis manager
dependency
> > > > > tracking
> > > > > fixed.
> > > > 
> > > 
> > 
> 
> > > > > Davide and I will get measurements on the resident set
impact
> > > > > of
> > > > > all
> > > > > this caching (even with the overconservative
invalidation for
> > > > > now)
> > > > > to see the impact. If there is a big rss impact then we
> > > > > probably
> > > > > want to consider that problem at the same time as the
rewrite
> > > > > of
> > > > > the
> > > > > analysis manager.
> > > > 
> > > 
> > 
> 
> > > > > -- Sean Silva
> > > > 
> > > 
> > 
> 
> > > > > On Thu, Jul 14, 2016 at 12:51 AM, Sean Silva <
> > > > > chisophugis at gmail.com
> > > > > > wrote:
> > > > 
> > > 
> > 
> 
> > > > > > On Wed, Jul 13, 2016 at 1:48 AM, Sean Silva <
> > > > > > chisophugis at gmail.com
> > > > > > >
> > > > > > wrote:
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > On Wed, Jul 13, 2016 at 12:34 AM, Chandler
Carruth <
> > > > > > > chandlerc at gmail.com > wrote:
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > On Wed, Jul 13, 2016 at 12:25 AM Sean
Silva <
> > > > > > > > chisophugis at gmail.com
> > > > > > > > >
> > > > > > > > wrote:
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > On Tue, Jul 12, 2016 at 11:39 PM,
Chandler Carruth <
> > > > > > > > > chandlerc at gmail.com > wrote:
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > > On Tue, Jul 12, 2016 at 11:34
PM Sean Silva <
> > > > > > > > > > chisophugis at gmail.com
> > > > > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > > > On Tue, Jul 12, 2016 at
11:32 PM, Xinliang David
> > > > > > > > > > > Li
> > > > > > > > > > > <
> > > > > > > > > > > davidxl at google.com
> wrote:
> > > > > > > > > > 
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > > > > On Tue, Jul 12, 2016
at 10:57 PM, Chandler
> > > > > > > > > > > > Carruth
> > > > > > > > > > > > <
> > > > > > > > > > > > chandlerc at
gmail.com > wrote:
> > > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > > > > > Yea, this is a
nasty problem.
> > > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > > > > > One important
thing to understand is that
> > > > > > > > > > > > > this
> > > > > > > > > > > > > is
> > > > > > > > > > > > > specific
> > > > > > > > > > > > > to
> > > > > > > > > > > > > analyses which
hold references to other
> > > > > > > > > > > > > analyses.
> > > > > > > > > > > > > While
> > > > > > > > > > > > > this
> > > > > > > > > > > > > isn't
> > > > > > > > > > > > > unheard of, it
isn't as common as it could
> > > > > > > > > > > > > be.
> > > > > > > > > > > > > Still,
> > > > > > > > > > > > > definitely
> > > > > > > > > > > > > something we
need to address.
> > > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > > > > We can call this
type of dependencies (holding
> > > > > > > > > > > > references)
> > > > > > > > > > > > hard-dependency. The
soft dependency refers to
> > > > > > > > > > > > the
> > > > > > > > > > > > case
> > > > > > > > > > > > where
> > > > > > > > > > > > analysis 'A'
depends on 'B' during computation,
> > > > > > > > > > > > but
> > > > > > > > > > > > does
> > > > > > > > > > > > not
> > > > > > > > > > > > need
> > > > > > > > > > > > 'B' once it
is computed.
> > > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > > > > There are actually
quite a few examples of
> > > > > > > > > > > > hard-dependency
> > > > > > > > > > > > case.
> > > > > > > > > > > > For
> > > > > > > > > > > > instance
LoopAccessInfo, LazyValueInfo etc
> > > > > > > > > > > > which
> > > > > > > > > > > > hold
> > > > > > > > > > > > references
> > > > > > > > > > > > to
> > > > > > > > > > > > other analyses.
> > > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > > > > Problem involving
hard-dependency is actually
> > > > > > > > > > > > easier
> > > > > > > > > > > > to
> > > > > > > > > > > > detect,
> > > > > > > > > > > > as
> > > > > > > > > > > > it
> > > > > > > > > > > > is usually a compile
time problem. Issues
> > > > > > > > > > > > involving
> > > > > > > > > > > > soft
> > > > > > > > > > > > dependencies are
more subtle and can lead to
> > > > > > > > > > > > wrong
> > > > > > > > > > > > code
> > > > > > > > > > > > gen.
> > > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > > > Did you mean to say that
soft-dependency problems
> > > > > > > > > > > are
> > > > > > > > > > > easier
> > > > > > > > > > > to
> > > > > > > > > > > detect? At least my
intuition is that
> > > > > > > > > > > soft-dependency
> > > > > > > > > > > is
> > > > > > > > > > > easier
> > > > > > > > > > > because there is no risk
of dangling pointers to
> > > > > > > > > > > other
> > > > > > > > > > > analyses.
> > > > > > > > > > 
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > > The issue is that the fact
that there is *any*
> > > > > > > > > > dependency
> > > > > > > > > > isn't
> > > > > > > > > > clear.
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > > However, I think the only real
problem here are
> > > > > > > > > > these
> > > > > > > > > > "hard
> > > > > > > > > > dependencies" (I
don't really like that term
> > > > > > > > > > though).
> > > > > > > > > > For
> > > > > > > > > > others,
> > > > > > > > > > only an analysis that is
*explicitly* preserved
> > > > > > > > > > survives.
> > > > > > > > > > So
> > > > > > > > > > I'm
> > > > > > > > > > not
> > > > > > > > > > worried about the fact that
people have to remember
> > > > > > > > > > this.
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > > The question is how often
there are
> > > > > > > > > > cross-data-structure
> > > > > > > > > > references.
> > > > > > > > > > David mentions a few examples,
and I'm sure there
> > > > > > > > > > are
> > > > > > > > > > more,
> > > > > > > > > > but
> > > > > > > > > > it
> > > > > > > > > > isn't clear to me yet
whether this is pervasive or
> > > > > > > > > > occasional.
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > I just did a quick run-through of
PassRegistry.def
> > > > > > > > > and
> > > > > > > > > this
> > > > > > > > > is
> > > > > > > > > what
> > > > > > > > > I
> > > > > > > > > found:
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > Module analyses: 0/5 hold pointers
to other analyses
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > CallGraph: No pointers to other
analyses.
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > LazyCallGraph: No pointers to other
analyses.
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > ProfileSummaryAnalysis: No pointers
to other
> > > > > > > > > analyses.
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > TargetLibraryAnalysis: No pointers
to other analyses.
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > VerifierAnalysis: No pointers to
other analyses.
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > Module alias analyses: 1/1 keeps
pointer to other
> > > > > > > > > analysis.
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > GlobalsAA: Result keeps pointer to
TLI (this is a
> > > > > > > > > function
> > > > > > > > > analysis).
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > Function analyses: 9/17 keep
pointers to other
> > > > > > > > > analysis
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > AAManager: Its Result holds TLI
pointer and pointers
> > > > > > > > > to
> > > > > > > > > individual
> > > > > > > > > AA
> > > > > > > > > result objects.
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > AssumptionAnalysis: No pointers to
other analyses.
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > BlockFrequencyAnalysis: Its Result
holds pointers to
> > > > > > > > > LoopInfo
> > > > > > > > > and
> > > > > > > > > BPI.
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > BranchProbabilityAnalysis: Stores
no pointers to
> > > > > > > > > other
> > > > > > > > > analyses.
> > > > > > > > > (uses LoopInfo to
"recalculate" though)
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > DominatorTreeAnalysis: Stores no
pointers to other
> > > > > > > > > analyses.
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > PostDominatorTreeAnalysis: Stores
no pointers to
> > > > > > > > > other
> > > > > > > > > analyses.
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > DemandedBitsAnalysis: Stores
pointers to
> > > > > > > > > AssumptionCache
> > > > > > > > > and
> > > > > > > > > DominatorTree
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > DominanceFrontierAnalysis: Stores
no pointers to
> > > > > > > > > other
> > > > > > > > > analyses.
> > > > > > > > > (uses DominatorTreeAnalysis for
"recalculate"
> > > > > > > > > though).
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > LoopInfo: Uses
DominatorTreeAnalysis for
> > > > > > > > > "recalculate"
> > > > > > > > > but
> > > > > > > > > stores
> > > > > > > > > no
> > > > > > > > > pointers.
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > LazyValueAnalysis: Stores pointers
to
> > > > > > > > > AssumptionCache,
> > > > > > > > > TargetLibraryInfo, DominatorTree.
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > DependenceAnalysis: Stores pointers
to AliasAnalysis,
> > > > > > > > > ScalarEvolution, LoopInfo
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > MemoryDependenceAnalysis: Stores
pointers to
> > > > > > > > > AliasAnalysis,
> > > > > > > > > AssumptionCache, TargetLibraryInfo,
DominatorTree
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > MemorySSAAnalysis: Stores pointers
to AliasAnalysis,
> > > > > > > > > DominatorTree
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > RegionInfoAnalysis: Stores pointers
to DomTree,
> > > > > > > > > PostDomTree,
> > > > > > > > > DomFrontier
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > ScalarEvolutionAnalysis: Stores
pointers to
> > > > > > > > > TargetLibraryInfo,
> > > > > > > > > AssumptionCache, DominatorTree,
LoopInfo
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > TargetLibraryAnalysis: Has no
dependencies
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > TargetIRAnalysis: Has no
dependencies.
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > Function alias analyses: 3/5 keep
pointers to other
> > > > > > > > > analyses
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > BasicAA: Keeps pointers to
TargetLibraryInfo,
> > > > > > > > > AssumptionCache,
> > > > > > > > > DominatorTree, LoopInfo
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > CFLAA: Keeps pointer to
TargetLibraryInfo
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > SCEVAA: Keeps pointer to
ScalarEvolution
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > ScopedNoAliasAA: No dependencies
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > TypeBasedAA: No dependencies
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > Total: 13/28 analyses (~50%) hold
pointers to other
> > > > > > > > > analyses.
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > Of the 15/28 analyses that
don't hold pointers, 12/15
> > > > > > > > > simply
> > > > > > > > > have
> > > > > > > > > no
> > > > > > > > > dependencies. Only 3/15 (BPI,
LoopInfo,
> > > > > > > > > DominanceFrontier)
> > > > > > > > > have
> > > > > > > > > dependencies that are used just for
a "recalculate"
> > > > > > > > > step
> > > > > > > > > that
> > > > > > > > > retains no pointers.
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > So I think it is fair to say that
analyses which hold
> > > > > > > > > pointers
> > > > > > > > > to
> > > > > > > > > other analyses is not an
exceptional case. In fact,
> > > > > > > > > analyses
> > > > > > > > > that
> > > > > > > > > use other analyses just for a
"recalculate" step
> > > > > > > > > seems
> > > > > > > > > to
> > > > > > > > > be
> > > > > > > > > the
> > > > > > > > > exceptional case (only 3/28 or
about 10%)
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > Interesting!
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > Most of these look like they hold a
pointer to the root
> > > > > > > > analysis
> > > > > > > > as
> > > > > > > > opposed to detailed objects *inside* the
analysis?
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > It might make sense to try to handle
this very specific
> > > > > > > > pattern
> > > > > > > > in
> > > > > > > > a
> > > > > > > > special way of overriding the invalidate
routines is
> > > > > > > > too
> > > > > > > > error
> > > > > > > > prone.... We could try to make this work
> > > > > > > > "automatically"
> > > > > > > > but
> > > > > > > > I'm
> > > > > > > > worried this would be challenging to get
right. Open to
> > > > > > > > suggestions
> > > > > > > > of course.
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > Any other ideas about what would make
sense to handle
> > > > > > > > this?
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > Does it make sense to override the
invalidate routines
> > > > > > > > now
> > > > > > > > and
> > > > > > > > iterate from there? I feel like
you've done a lot of
> > > > > > > > the
> > > > > > > > research
> > > > > > > > necessary for this already...
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > I'll keep pushing forward tomorrow with
building
> > > > > > > test-suite
> > > > > > > successfully using the new PM for the LTO
pipeline (I was
> > > > > > > doing
> > > > > > > some
> > > > > > > unrelated LLD stuff for most of today). It
will be
> > > > > > > interesting
> > > > > > > to
> > > > > > > see how many `invalidate` overrides will be
needed to
> > > > > > > avoid
> > > > > > > these
> > > > > > > issues for just the LTO pipeline on
test-suite.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > I spent the better part of today working on this
and will
> > > > > > continue
> > > > > > tomorrow; this problem seems nastier than I
thought. For
> > > > > > some
> > > > > > reason
> > > > > > the LTO pipeline (or something about LTO) seems to
hit on
> > > > > > these
> > > > > > issues much more (I'm talking like 40k lines
of ASan error
> > > > > > reports
> > > > > > from building test-suite with the LTO pipeline in
the new
> > > > > > PM;
> > > > > > per-TU
> > > > > > steps still using the old PM). Some notes:
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > - BasicAA's dependence on domtree and loopinfo
in the new
> > > > > > PM
> > > > > > seems
> > > > > > to
> > > > > > account for quite a few of the problems.
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > - BasicAA and other stuff are marked (by
overriding
> > > > > > `invalidate`
> > > > > > to
> > > > > > return false) to never be invalidated because they
are
> > > > > > "stateless".
> > > > > > However they still hold pointers and so they do
need to be
> > > > > > invalidated.
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > - CallGraph uses AssertingVH (PR28400) and so I
needed a
> > > > > > workaround
> > > > > > similar to r274656 in various passes.
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > - D21921 is holding up -- I haven't hit any
issues with the
> > > > > > core
> > > > > > logic of that patch.
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > - AAResults holds handles to various AA result
objects.
> > > > > > This
> > > > > > means
> > > > > > it
> > > > > > pretty much always needs to be invalidated unless
you are
> > > > > > sure
> > > > > > that
> > > > > > none of the AA's will get invalidated.
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > The existing `invalidate` method doesn't have
the right
> > > > > > semantics
> > > > > > for
> > > > > > even an error-prone solution :( We are going to
need to
> > > > > > make
> > > > > > some
> > > > > > significant changes to even get basic sanity I
think.
> > > > > > Perhaps
> > > > > > each
> > > > > > analysis can expose a "preserve" static
function. E.g.
> > > > > > instead
> > > > > > of
> > > > > > `PA.preserve<Foo>();` you have to do
> > > > > > `Foo::setPreserved(PA);`.
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > I'm actually not quite sure that that will
even work. Once
> > > > > > I
> > > > > > have
> > > > > > test-suite fully building successfully with the
LTO
> > > > > > pipeline
> > > > > > in
> > > > > > the
> > > > > > new PM I'll be able to give a more confident
answer (esp.
> > > > > > w.r.t.
> > > > > > the
> > > > > > manager for different IRUnitT's).
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > But at this point I'm not confident running
*any* pass
> > > > > > pipeline
> > > > > > in
> > > > > > the new PM without at least assertions+ASan.
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > We may want to have a proper design discussion
around this
> > > > > > problem
> > > > > > though.
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > Also I'd like to have test-suite working (by
hook or by
> > > > > > crook)
> > > > > > with
> > > > > > LTO in the new PM so we can get some numbers on
the
> > > > > > resident
> > > > > > set
> > > > > > impact of all this caching; if it is really
problematic
> > > > > > then
> > > > > > we
> > > > > > may
> > > > > > need to start talking front-and-center about
different
> > > > > > invalidation
> > > > > > policies for keeping this in check instead of
leaving it as
> > > > > > something that we will be able to patch later.
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > The more I think about it, the more I'm
convinced that the
> > > > > > real
> > > > > > "hard" problem that the new PM is
exposing us to is having
> > > > > > the
> > > > > > ability for any pass to ask for any analysis on
any IRUnitT
> > > > > > (and
> > > > > > any
> > > > > > specific IRUnit of that IRUnitT) and have the
result stored
> > > > > > somewhere and then invalidated. This means that
> > > > > > "getAnalysisUsage"
> > > > > > is not just a list of passes, but much more
complicated and
> > > > > > is
> > > > > > essentially a set of arbitrary pairs
"(analysis, IRUnit)"
> > > > > > (and
> > > > > > the
> > > > > > associated potential tangle of dependencies
between the
> > > > > > state
> > > > > > cached
> > > > > > on these tuples). With the old PM, you essentially
are
> > > > > > looking
> > > > > > at
> > > > > > a
> > > > > > problem of scheduling the lifetime of analyses of
the same
> > > > > > IRUnit
> > > > > > intermingled with transformation passes on that
same
> > > > > > IRUnit,
> > > > > > so
> > > > > > you
> > > > > > only have the "analysis" part of the
tuple above, making
> > > > > > things
> > > > > > much
> > > > > > simpler (and handling dependencies is much simpler
too).
> > > > > > We've
> > > > > > obviously outgrown this model with examples like
LAA,
> > > > > > AssumptionCacheTracker, etc. that hack around this
in the
> > > > > > old
> > > > > > PM.
> > > > > > We
> > > > > > may want to have a fresh re-examination of what
problems we
> > > > > > are
> > > > > > exactly trying to solve.
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > For me, my main concern now is what changes need
to be made
> > > > > > in
> > > > > > order
> > > > > > to feel confident running a pipeline in the new PM
without
> > > > > > assertions+ASan.
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > Sorry for the long post, just brain-dumping before
heading
> > > > > > home.
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > -- Sean Silva
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > -- Sean Silva
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > --
> 
> > Hal Finkel
> 
> > Assistant Computational Scientist
> 
> > Leadership Computing Facility
> 
> > Argonne National Laboratory
> 
-- 

Hal Finkel 
Assistant Computational Scientist 
Leadership Computing Facility 
Argonne National Laboratory 
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160725/37c7d629/attachment-0001.html>

llvm dev - Jul 2016 - [PM] I think that the new PM needs to learn about inter-analysis dependencies...

[llvm-dev] [PM] I think that the new PM needs to learn about inter-analysis dependencies...

[llvm-dev] [PM] I think that the new PM needs to learn about inter-analysis dependencies...

[llvm-dev] [PM] I think that the new PM needs to learn about inter-analysis dependencies...