thr3ads.net - llvm dev - [LLVMdev] RFC: A plan for stateful alias analysis in LLVM [Jul 2015]

If this information is useful, please help other people find it:
Share via:

Chandler Carruth

2015-Jul-14 02:59 UTC

[LLVMdev] RFC: A plan for stateful alias analysis in LLVM

# Front matter #
First, I want to emphasize that this is not a short-term plan. This is a
long-term plan. When I was looking at refactoring the existing Alias
Analysis infrastructure in LLVM in ways that would break stateful AA, Hal
Finkel asked me to figure out how things would work with the new pass
manager and make sure that was sane.

This plan assumes the new pass manager. There is nothing deeply or
intrinsically tied to it, but the pieces will fit together naturally there,
and I don't see how to make them fit naturally in the existing pass
manager. I've not invested a lot of time in figuring that out as it seems a
sunk cost.

# Core ideas of what stateful AA should look like: #
1) An alias analysis *is* an analysis pass, and if it needs to build up
state, it must do so for a "unit" of IR at a time (using the new
PM's
terminology).
2) Per-IR-unit invalidation from the pass manager is the primary mechanism
to clear out stale (and thus invalid) state from any alias analysis. That
means that most passes will not preserve AA by default (unlike today).
3) To be reasonably scalable, most alias analyses which retain state will
need to handle these invalidation events in a way that gracefully degrades
their results rather than wiping out all state.

I think everything in-tree but GlobalsModRef fits these requirements
already. Notably, this will work well for CFL, Andersen, or generally any
"intersection" based AA. A "union" based AA could work here,
but would have
to do considerable work to be able to tolerate updates that
"invalidate"
its state without simply forced recomputation. I think this is a reasonable
tradeoff as it preserves our ability to experiment, while simplifying the
update model.

The rest of this email contains much more detailed musing on this. Feel
free to stop reading when necessary.

The last section touches on the core functionality that would be needed to
support something like GlobalsModRef in its current form. I'm much less
confident in this part, so it's not part of my core requirements above.

# Details about analysis passes and pass manager invalidation #
My thought is that an alias analysis pass in the new model would be an
analysis pass. It would produce an analysis result that contained any state
necessary to vend AA queries. This result would be cached just like all the
other analysis results are in the pass manager. It would have the same
invalidation rules too.

A really important consequence of this is that analysis results in the new
scheme get notified of invalidation events and can attempt to update their
internal state to *avoid* becoming invalid. AAs can leverage this to do
fast incremental updates to their internal structures if appropriate.

An AA could even use ValueHandles now to automatically stay up-to-date with
a conservatively correct data structure, and just drop all the invalidation
requests. In fact, I suspect many stateful AAs will take exactly this
approach.

# How will users of AA information query it? #
This will work much like the normal analysis queries. I plan to provide an
alias analysis manager at each IR unit that will provide a single query
interface for code to use. This layer can then be configured to query any
AnalysisManager<T> for an IR unit T for the relevant actual AliasAnalysis
implementations.

The AA manager layer will be responsible for the delegation that is
currently done through chaining of AA implementations. It will also be
responsible for enforcing that queries at a particular layer don't violate
IR unit abstraction boundaries. For example, you cannot query a
FuntionAAManager for alias information of two Values from two different
functions.

# Why GlobalsModRef doesn't fit, and what we could do about it #
There is a fundamental construct in GlobalsModRef that exists today, and
seems somewhat useful, but which doesn't fit the above: using a global
analysis of which memory locations are *not* escaped by any function, in
order to conclude that within a given function, arguments, returns, and
loads from globals cannot produce an aliasing memory location. Thus, if you
always directly index or manipulate a global within a function, and it
never escapes that function (including through integer hacks), you know
that some other function can't produce an aliasing location by loading a
pointer from a global or receiving a pointer in an argument.

In *practice* this is actually totally reasonable. Passes don't run around
escaping pointers just because they feel like it. However, we don't have a
good way to really formally reason about such things.

My suggested way to reason about it is to define a new constraint on
transformation passes over an IR unit T (ie, Function, Module, etc.). This
constraint is that a pass may not *observably* escape a pointer value P
from the unit being transformed. Now, "observably" is the key word
here.
The goal is to still allow fundamental transformations like Reg2Mem and
instrumentation passes. However, the guarantee must be that the passes do
not introduce new ways to observe an escape of a pointer. Because any
actual escapes cannot be observed, it is safe for them to happen and for
some other function pass (for example) to transform based on the
non-escaping property computed prior to the transformation. I *think* this
is a reasonable model, but I've not really had enough time thinking about
it to be 100% certain.

With such a model, we could regain most of the power of GlobalsModRef
w.r.t. non-escaping pointers and interprocedural aliasing inference. We
would change it to be a two-pronged analysis. On one hand, a local analysis
would have to prove that one location came from a module-level memory
location (global) and the other location came from some dynamic input to
the function (argument, return, or a load of memory that is def'ed outside
the function), and then the other hand would provide the interprocedural
information about what global memory locations were non-escaping.


Ok, those are all the thoughts I have currently about AA, and where I'm
headed. This should hopefully provide context for some of the upcoming
patches.

I'm also going to write a separate email about specific tactical problems
I've found with GlobalsModRef today (nothing to do with the above
theoretical problems) and how I plan to approach them.

Thanks,
-Chandler
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150714/fe52a9da/attachment.html>

Pete Cooper

2015-Jul-14 05:29 UTC

head link

[LLVMdev] RFC: A plan for stateful alias analysis in LLVM

Firstly, thanks for the detailed update on this...> On Jul 13, 2015, at 7:59 PM, Chandler Carruth <chandlerc at
google.com> wrote:
>
> Ok, those are all the thoughts I have currently about AA, and where I'm
headed. This should hopefully provide context for some of the upcoming patches.‘Headed’ w.r.t. to the new pass manager specifically, or AA in both old and new
pass managers? ‘Upcoming patches’ makes it sound like this is coming soon so
will likely be before the new PM is the only PM.

TBH i’m worried about the surface area we have to debug for
correctness/performance issues when it comes time to adopt the new pass manager.

The new pass manager by design is going to lead to different behavior. Analyses
will be preserved and invalidated in different orders from old to new PMs
leading to differences in behavior of transform passes. Bugs will be found in
transforms themselves as a result of the new PM ultimately (indirectly) giving
them different code. All of this is going to take time to debug and block
adoption of the new PM.

I’d like to see us try get the new PM enabled before increasing the surface area
any more. Removing the bad AA updating code was fine as we still have the same
AA updating behavior between old/new (i.e., no AA updating). But I think adding
new AA API should be done either in both PMs or neither (for now). For one
thing, we can’t have enough confidence that the code is working correctly if it
only works in the new PM which isn’t yet being exposed to as much code as the
old PM. Unit tests can only do so much as we can have confidence that the AA
API is good with a unit test, but its harder to have confidence that a transform
is calling it correctly without lots of code going through the transform pass.

Bugs in transforms related to AA updating will likely be extremely subtle and
its better to have a blame list as short as possible. That can only happen if
the AA code is running on the bots or else we may not catch it until the new PM
is adopted.

Thanks,
Pete
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150713/123872ad/attachment.html>

Chandler Carruth

2015-Jul-14 05:46 UTC

head link

[LLVMdev] RFC: A plan for stateful alias analysis in LLVM

On Mon, Jul 13, 2015 at 10:29 PM Pete Cooper <peter_cooper at apple.com>
wrote:
> Firstly, thanks for the detailed update on this...
>
> On Jul 13, 2015, at 7:59 PM, Chandler Carruth <chandlerc at
google.com>
> wrote:
>
> Ok, those are all the thoughts I have currently about AA, and where I'm
> headed. This should hopefully provide context for some of the upcoming
> patches.
>
> ‘Headed’ w.r.t. to the new pass manager specifically, or AA in both old
> and new pass managers?  ‘Upcoming patches’ makes it sound like this is
> coming soon so will likely be before the new PM is the only PM.
>
> TBH i’m worried about the surface area we have to debug for
> correctness/performance issues when it comes time to adopt the new pass
> manager.
>
> The new pass manager by design is going to lead to different behavior.
> Analyses will be preserved and invalidated in different orders from old to
> new PMs leading to differences in behavior of transform passes.  Bugs will
> be found in transforms themselves as a result of the new PM ultimately
> (indirectly) giving them different code.  All of this is going to take time
> to debug and block adoption of the new PM.
>
Yep. Trust me, I'm not happy about this, and its something I think about a
lot.

I’d like to see us try get the new PM enabled before increasing the
surface> area any more.
>
Just to clarify, I'm not working on AA because I want to. =] I mean, I'm
enjoying it, but I'm *only* doing what is on my critical path of getting
the new PM working. For example, I wrote this email specifically because
Hal encouraged me to write it up prior to proceeding with refactorings to
remove the update API and establish the core APIs that will work with the
new pass manager.

> Removing the bad AA updating code was fine as we still have the same AA
> updating behavior between old/new (i.e., no AA updating).  But I think
> adding new AA API should be done either in both PMs or neither (for now).
> For one thing, we can’t have enough confidence that the code is working
> correctly if it only works in the new PM which isn’t yet being exposed to
> as much code as the old PM.  Unit tests can only do so much as we can have
> confidence that the AA API is good with a unit test, but its harder to have
> confidence that a transform is calling it correctly without lots of code
> going through the transform pass.
>
> Bugs in transforms related to AA updating will likely be extremely subtle
> and its better to have a blame list as short as possible.  That can only
> happen if the AA code is running on the bots or else we may not catch it
> until the new PM is adopted.
>
Preaching to the choir. I tried to make my opening clarify this. I'm not
planning on adding any special functionality for updating until the new
pass manager is firmly in place precisely because the greater divergence
between the two, the worse the transition will be.

Here, I'm just trying to give an idea of the structure I'm imagining
that
will allow us to *eventually* do the update thing. The APIs I'll need to
introduce to get minimal functionality with the new pass manager will
actually go a long way down this path (we'll need FunctionAAManager etc) so
I think Hal was right that I need to lay out this context before doing
stuff. But I plan to do exactly is little as I can get away with until
there is a reasonable baseline entirely based on the new PM.

-Chandler
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150714/d37b0446/attachment.html>

Hal Finkel

2015-Jul-16 19:36 UTC

head link

[LLVMdev] RFC: A plan for stateful alias analysis in LLVM

----- Original Message -----> From: "Chandler Carruth" <chandlerc at google.com>
> To: "LLVM Developers Mailing List" <llvmdev at
cs.uiuc.edu>, "Hal Finkel" <hfinkel at anl.gov>, "Daniel
Berlin"
> <dannyb at google.com>
> Sent: Monday, July 13, 2015 9:59:10 PM
> Subject: RFC: A plan for stateful alias analysis in LLVM
> 
> 
> 
> # Front matter # First, I want to emphasize that this is not a
> short-term plan. This is a long-term plan. When I was looking at
> refactoring the existing Alias Analysis infrastructure in LLVM in
> ways that would break stateful AA, Hal Finkel asked me to figure out
> how things would work with the new pass manager and make sure that
> was sane.
> 
> 
> This plan assumes the new pass manager. There is nothing deeply or
> intrinsically tied to it, but the pieces will fit together naturally
> there, and I don't see how to make them fit naturally in the
> existing pass manager. I've not invested a lot of time in figuring
> that out as it seems a sunk cost.
> 
> 
> # Core ideas of what stateful AA should look like: #
> 1) An alias analysis *is* an analysis pass, and if it needs to build
> up state, it must do so for a "unit" of IR at a time (using the
new
> PM's terminology).
> 2) Per-IR-unit invalidation from the pass manager is the primary
> mechanism to clear out stale (and thus invalid) state from any alias
> analysis. That means that most passes will not preserve AA by
> default (unlike today).
> 3) To be reasonably scalable, most alias analyses which retain state
> will need to handle these invalidation events in a way that
> gracefully degrades their results rather than wiping out all state.
> 
> 
> I think everything in-tree but GlobalsModRef fits these requirements
> already. Notably, this will work well for CFL, Andersen, or
> generally any "intersection" based AA. A "union" based
AA could work
> here, but would have to do considerable work to be able to tolerate
> updates that "invalidate" its state without simply forced
> recomputation. I think this is a reasonable tradeoff as it preserves
> our ability to experiment, while simplifying the update model.
> 
> 
> The rest of this email contains much more detailed musing on this.
> Feel free to stop reading when necessary.
> 
> 
> The last section touches on the core functionality that would be
> needed to support something like GlobalsModRef in its current form.
> I'm much less confident in this part, so it's not part of my core
> requirements above.
> 
> 
> # Details about analysis passes and pass manager invalidation #
> My thought is that an alias analysis pass in the new model would be
> an analysis pass. It would produce an analysis result that contained
> any state necessary to vend AA queries. This result would be cached
> just like all the other analysis results are in the pass manager. It
> would have the same invalidation rules too.
> 
> 
> A really important consequence of this is that analysis results in
> the new scheme get notified of invalidation events and can attempt
> to update their internal state to *avoid* becoming invalid. AAs can
> leverage this to do fast incremental updates to their internal
> structures if appropriate.
> 
> 
> An AA could even use ValueHandles now to automatically stay
> up-to-date with a conservatively correct data structure, and just
> drop all the invalidation requests. In fact, I suspect many stateful
> AAs will take exactly this approach.
> 
> 
> # How will users of AA information query it? #
> This will work much like the normal analysis queries. I plan to
> provide an alias analysis manager at each IR unit that will provide
> a single query interface for code to use. This layer can then be
> configured to query any AnalysisManager<T> for an IR unit T for the
> relevant actual AliasAnalysis implementations.
> 
> 
> The AA manager layer will be responsible for the delegation that is
> currently done through chaining of AA implementations. It will also
> be responsible for enforcing that queries at a particular layer
> don't violate IR unit abstraction boundaries. For example, you
> cannot query a FuntionAAManager for alias information of two Values
> from two different functions.
One thing I'd like to clarify is what is responsible for 'requiring'
the various alias-analysis pass, so that they'll be recomputed if
they're invalidated. So if I have an AliasManager, and I've requested
that it manage a BasicAA and an SCEVAA (for example), and something happens such
that SCEVAA is invalidated, then before future AA queries are processed, the
AliasManager is responsible for triggering the re-computation of SCEVAA as part
of the chaining process (or before the first AA query). That's my
expectation anyway; does that match this model?

Thanks again,
Hal
> 
> 
> # Why GlobalsModRef doesn't fit, and what we could do about it #
> There is a fundamental construct in GlobalsModRef that exists today,
> and seems somewhat useful, but which doesn't fit the above: using a
> global analysis of which memory locations are *not* escaped by any
> function, in order to conclude that within a given function,
> arguments, returns, and loads from globals cannot produce an
> aliasing memory location. Thus, if you always directly index or
> manipulate a global within a function, and it never escapes that
> function (including through integer hacks), you know that some other
> function can't produce an aliasing location by loading a pointer
> from a global or receiving a pointer in an argument.
> 
> 
> In *practice* this is actually totally reasonable. Passes don't run
> around escaping pointers just because they feel like it. However, we
> don't have a good way to really formally reason about such things.
> 
> 
> My suggested way to reason about it is to define a new constraint on
> transformation passes over an IR unit T (ie, Function, Module,
> etc.). This constraint is that a pass may not *observably* escape a
> pointer value P from the unit being transformed. Now,
"observably"
> is the key word here. The goal is to still allow fundamental
> transformations like Reg2Mem and instrumentation passes. However,
> the guarantee must be that the passes do not introduce new ways to
> observe an escape of a pointer. Because any actual escapes cannot be
> observed, it is safe for them to happen and for some other function
> pass (for example) to transform based on the non-escaping property
> computed prior to the transformation. I *think* this is a reasonable
> model, but I've not really had enough time thinking about it to be
> 100% certain.
> 
> 
> With such a model, we could regain most of the power of GlobalsModRef
> w.r.t. non-escaping pointers and interprocedural aliasing inference.
> We would change it to be a two-pronged analysis. On one hand, a
> local analysis would have to prove that one location came from a
> module-level memory location (global) and the other location came
> from some dynamic input to the function (argument, return, or a load
> of memory that is def'ed outside the function), and then the other
> hand would provide the interprocedural information about what global
> memory locations were non-escaping.
> 
> 
> 
> 
> Ok, those are all the thoughts I have currently about AA, and where
> I'm headed. This should hopefully provide context for some of the
> upcoming patches.
> 
> 
> I'm also going to write a separate email about specific tactical
> problems I've found with GlobalsModRef today (nothing to do with the
> above theoretical problems) and how I plan to approach them.
> 
> 
> Thanks,
> -Chandler
-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

Shen Liu

2015-Jul-16 19:53 UTC

head link

[LLVMdev] RFC: A plan for stateful alias analysis in LLVM

Sounds great! Hope the inter-procedure alias analysis can be added in next
version, so next time i can call it directly ^_^

On Thu, Jul 16, 2015 at 3:36 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> ----- Original Message -----
> > From: "Chandler Carruth" <chandlerc at google.com>
> > To: "LLVM Developers Mailing List" <llvmdev at
cs.uiuc.edu>, "Hal Finkel" <
> hfinkel at anl.gov>, "Daniel Berlin"
> > <dannyb at google.com>
> > Sent: Monday, July 13, 2015 9:59:10 PM
> > Subject: RFC: A plan for stateful alias analysis in LLVM
> >
> >
> >
> > # Front matter # First, I want to emphasize that this is not a
> > short-term plan. This is a long-term plan. When I was looking at
> > refactoring the existing Alias Analysis infrastructure in LLVM in
> > ways that would break stateful AA, Hal Finkel asked me to figure out
> > how things would work with the new pass manager and make sure that
> > was sane.
> >
> >
> > This plan assumes the new pass manager. There is nothing deeply or
> > intrinsically tied to it, but the pieces will fit together naturally
> > there, and I don't see how to make them fit naturally in the
> > existing pass manager. I've not invested a lot of time in figuring
> > that out as it seems a sunk cost.
> >
> >
> > # Core ideas of what stateful AA should look like: #
> > 1) An alias analysis *is* an analysis pass, and if it needs to build
> > up state, it must do so for a "unit" of IR at a time (using
the new
> > PM's terminology).
> > 2) Per-IR-unit invalidation from the pass manager is the primary
> > mechanism to clear out stale (and thus invalid) state from any alias
> > analysis. That means that most passes will not preserve AA by
> > default (unlike today).
> > 3) To be reasonably scalable, most alias analyses which retain state
> > will need to handle these invalidation events in a way that
> > gracefully degrades their results rather than wiping out all state.
> >
> >
> > I think everything in-tree but GlobalsModRef fits these requirements
> > already. Notably, this will work well for CFL, Andersen, or
> > generally any "intersection" based AA. A "union"
based AA could work
> > here, but would have to do considerable work to be able to tolerate
> > updates that "invalidate" its state without simply forced
> > recomputation. I think this is a reasonable tradeoff as it preserves
> > our ability to experiment, while simplifying the update model.
> >
> >
> > The rest of this email contains much more detailed musing on this.
> > Feel free to stop reading when necessary.
> >
> >
> > The last section touches on the core functionality that would be
> > needed to support something like GlobalsModRef in its current form.
> > I'm much less confident in this part, so it's not part of my
core
> > requirements above.
> >
> >
> > # Details about analysis passes and pass manager invalidation #
> > My thought is that an alias analysis pass in the new model would be
> > an analysis pass. It would produce an analysis result that contained
> > any state necessary to vend AA queries. This result would be cached
> > just like all the other analysis results are in the pass manager. It
> > would have the same invalidation rules too.
> >
> >
> > A really important consequence of this is that analysis results in
> > the new scheme get notified of invalidation events and can attempt
> > to update their internal state to *avoid* becoming invalid. AAs can
> > leverage this to do fast incremental updates to their internal
> > structures if appropriate.
> >
> >
> > An AA could even use ValueHandles now to automatically stay
> > up-to-date with a conservatively correct data structure, and just
> > drop all the invalidation requests. In fact, I suspect many stateful
> > AAs will take exactly this approach.
> >
> >
> > # How will users of AA information query it? #
> > This will work much like the normal analysis queries. I plan to
> > provide an alias analysis manager at each IR unit that will provide
> > a single query interface for code to use. This layer can then be
> > configured to query any AnalysisManager<T> for an IR unit T for
the
> > relevant actual AliasAnalysis implementations.
> >
> >
> > The AA manager layer will be responsible for the delegation that is
> > currently done through chaining of AA implementations. It will also
> > be responsible for enforcing that queries at a particular layer
> > don't violate IR unit abstraction boundaries. For example, you
> > cannot query a FuntionAAManager for alias information of two Values
> > from two different functions.
>
> One thing I'd like to clarify is what is responsible for
'requiring' the
> various alias-analysis pass, so that they'll be recomputed if
they're
> invalidated. So if I have an AliasManager, and I've requested that it
> manage a BasicAA and an SCEVAA (for example), and something happens such
> that SCEVAA is invalidated, then before future AA queries are processed,
> the AliasManager is responsible for triggering the re-computation of SCEVAA
> as part of the chaining process (or before the first AA query). That's
my
> expectation anyway; does that match this model?
>
> Thanks again,
> Hal
>
> >
> >
> > # Why GlobalsModRef doesn't fit, and what we could do about it #
> > There is a fundamental construct in GlobalsModRef that exists today,
> > and seems somewhat useful, but which doesn't fit the above: using
a
> > global analysis of which memory locations are *not* escaped by any
> > function, in order to conclude that within a given function,
> > arguments, returns, and loads from globals cannot produce an
> > aliasing memory location. Thus, if you always directly index or
> > manipulate a global within a function, and it never escapes that
> > function (including through integer hacks), you know that some other
> > function can't produce an aliasing location by loading a pointer
> > from a global or receiving a pointer in an argument.
> >
> >
> > In *practice* this is actually totally reasonable. Passes don't
run
> > around escaping pointers just because they feel like it. However, we
> > don't have a good way to really formally reason about such things.
> >
> >
> > My suggested way to reason about it is to define a new constraint on
> > transformation passes over an IR unit T (ie, Function, Module,
> > etc.). This constraint is that a pass may not *observably* escape a
> > pointer value P from the unit being transformed. Now,
"observably"
> > is the key word here. The goal is to still allow fundamental
> > transformations like Reg2Mem and instrumentation passes. However,
> > the guarantee must be that the passes do not introduce new ways to
> > observe an escape of a pointer. Because any actual escapes cannot be
> > observed, it is safe for them to happen and for some other function
> > pass (for example) to transform based on the non-escaping property
> > computed prior to the transformation. I *think* this is a reasonable
> > model, but I've not really had enough time thinking about it to be
> > 100% certain.
> >
> >
> > With such a model, we could regain most of the power of GlobalsModRef
> > w.r.t. non-escaping pointers and interprocedural aliasing inference.
> > We would change it to be a two-pronged analysis. On one hand, a
> > local analysis would have to prove that one location came from a
> > module-level memory location (global) and the other location came
> > from some dynamic input to the function (argument, return, or a load
> > of memory that is def'ed outside the function), and then the other
> > hand would provide the interprocedural information about what global
> > memory locations were non-escaping.
> >
> >
> >
> >
> > Ok, those are all the thoughts I have currently about AA, and where
> > I'm headed. This should hopefully provide context for some of the
> > upcoming patches.
> >
> >
> > I'm also going to write a separate email about specific tactical
> > problems I've found with GlobalsModRef today (nothing to do with
the
> > above theoretical problems) and how I plan to approach them.
> >
> >
> > Thanks,
> > -Chandler
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150716/3cc55610/attachment.html>

Pete Cooper

2015-Jul-16 20:32 UTC

head link

[LLVMdev] RFC: A plan for stateful alias analysis in LLVM

> On Jul 16, 2015, at 12:36 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> 
> ----- Original Message -----
>> From: "Chandler Carruth" <chandlerc at google.com>
>> To: "LLVM Developers Mailing List" <llvmdev at
cs.uiuc.edu>, "Hal Finkel" <hfinkel at anl.gov>, "Daniel
Berlin"
>> <dannyb at google.com>
>> Sent: Monday, July 13, 2015 9:59:10 PM
>> Subject: RFC: A plan for stateful alias analysis in LLVM
>> 
>> 
>> 
>> # Front matter # First, I want to emphasize that this is not a
>> short-term plan. This is a long-term plan. When I was looking at
>> refactoring the existing Alias Analysis infrastructure in LLVM in
>> ways that would break stateful AA, Hal Finkel asked me to figure out
>> how things would work with the new pass manager and make sure that
>> was sane.
>> 
>> 
>> This plan assumes the new pass manager. There is nothing deeply or
>> intrinsically tied to it, but the pieces will fit together naturally
>> there, and I don't see how to make them fit naturally in the
>> existing pass manager. I've not invested a lot of time in figuring
>> that out as it seems a sunk cost.
>> 
>> 
>> # Core ideas of what stateful AA should look like: #
>> 1) An alias analysis *is* an analysis pass, and if it needs to build
>> up state, it must do so for a "unit" of IR at a time (using
the new
>> PM's terminology).
>> 2) Per-IR-unit invalidation from the pass manager is the primary
>> mechanism to clear out stale (and thus invalid) state from any alias
>> analysis. That means that most passes will not preserve AA by
>> default (unlike today).
>> 3) To be reasonably scalable, most alias analyses which retain state
>> will need to handle these invalidation events in a way that
>> gracefully degrades their results rather than wiping out all state.
>> 
>> 
>> I think everything in-tree but GlobalsModRef fits these requirements
>> already. Notably, this will work well for CFL, Andersen, or
>> generally any "intersection" based AA. A "union"
based AA could work
>> here, but would have to do considerable work to be able to tolerate
>> updates that "invalidate" its state without simply forced
>> recomputation. I think this is a reasonable tradeoff as it preserves
>> our ability to experiment, while simplifying the update model.
>> 
>> 
>> The rest of this email contains much more detailed musing on this.
>> Feel free to stop reading when necessary.
>> 
>> 
>> The last section touches on the core functionality that would be
>> needed to support something like GlobalsModRef in its current form.
>> I'm much less confident in this part, so it's not part of my
core
>> requirements above.
>> 
>> 
>> # Details about analysis passes and pass manager invalidation #
>> My thought is that an alias analysis pass in the new model would be
>> an analysis pass. It would produce an analysis result that contained
>> any state necessary to vend AA queries. This result would be cached
>> just like all the other analysis results are in the pass manager. It
>> would have the same invalidation rules too.
>> 
>> 
>> A really important consequence of this is that analysis results in
>> the new scheme get notified of invalidation events and can attempt
>> to update their internal state to *avoid* becoming invalid. AAs can
>> leverage this to do fast incremental updates to their internal
>> structures if appropriate.
>> 
>> 
>> An AA could even use ValueHandles now to automatically stay
>> up-to-date with a conservatively correct data structure, and just
>> drop all the invalidation requests. In fact, I suspect many stateful
>> AAs will take exactly this approach.
>> 
>> 
>> # How will users of AA information query it? #
>> This will work much like the normal analysis queries. I plan to
>> provide an alias analysis manager at each IR unit that will provide
>> a single query interface for code to use. This layer can then be
>> configured to query any AnalysisManager<T> for an IR unit T for
the
>> relevant actual AliasAnalysis implementations.
>> 
>> 
>> The AA manager layer will be responsible for the delegation that is
>> currently done through chaining of AA implementations. It will also
>> be responsible for enforcing that queries at a particular layer
>> don't violate IR unit abstraction boundaries. For example, you
>> cannot query a FuntionAAManager for alias information of two Values
>> from two different functions.
> 
> One thing I'd like to clarify is what is responsible for
'requiring' the various alias-analysis pass, so that they'll be
recomputed if they're invalidated. So if I have an AliasManager, and
I've requested that it manage a BasicAA and an SCEVAA (for example), and
something happens such that SCEVAA is invalidated, then before future AA queries
are processed, the AliasManager is responsible for triggering the re-computation
of SCEVAA as part of the chaining process (or before the first AA query).
That's my expectation anyway; does that match this model?That could get expensive if we keep recomputing AA we end up not needing.  We
might need to try find a way to be lazier about it.

Take TBAA for example, if it required any kind of computation, it would be
better to delay it until we actually query AA on a load/store which has the
metadata.  I’m pretty sure TBAA doesn’t have state to recompute, but imagine it
did and hopefully what i said makes sense.

Cheers,
Pete> 
> Thanks again,
> Hal
> 
>> 
>> 
>> # Why GlobalsModRef doesn't fit, and what we could do about it #
>> There is a fundamental construct in GlobalsModRef that exists today,
>> and seems somewhat useful, but which doesn't fit the above: using a
>> global analysis of which memory locations are *not* escaped by any
>> function, in order to conclude that within a given function,
>> arguments, returns, and loads from globals cannot produce an
>> aliasing memory location. Thus, if you always directly index or
>> manipulate a global within a function, and it never escapes that
>> function (including through integer hacks), you know that some other
>> function can't produce an aliasing location by loading a pointer
>> from a global or receiving a pointer in an argument.
>> 
>> 
>> In *practice* this is actually totally reasonable. Passes don't run
>> around escaping pointers just because they feel like it. However, we
>> don't have a good way to really formally reason about such things.
>> 
>> 
>> My suggested way to reason about it is to define a new constraint on
>> transformation passes over an IR unit T (ie, Function, Module,
>> etc.). This constraint is that a pass may not *observably* escape a
>> pointer value P from the unit being transformed. Now,
"observably"
>> is the key word here. The goal is to still allow fundamental
>> transformations like Reg2Mem and instrumentation passes. However,
>> the guarantee must be that the passes do not introduce new ways to
>> observe an escape of a pointer. Because any actual escapes cannot be
>> observed, it is safe for them to happen and for some other function
>> pass (for example) to transform based on the non-escaping property
>> computed prior to the transformation. I *think* this is a reasonable
>> model, but I've not really had enough time thinking about it to be
>> 100% certain.
>> 
>> 
>> With such a model, we could regain most of the power of GlobalsModRef
>> w.r.t. non-escaping pointers and interprocedural aliasing inference.
>> We would change it to be a two-pronged analysis. On one hand, a
>> local analysis would have to prove that one location came from a
>> module-level memory location (global) and the other location came
>> from some dynamic input to the function (argument, return, or a load
>> of memory that is def'ed outside the function), and then the other
>> hand would provide the interprocedural information about what global
>> memory locations were non-escaping.
>> 
>> 
>> 
>> 
>> Ok, those are all the thoughts I have currently about AA, and where
>> I'm headed. This should hopefully provide context for some of the
>> upcoming patches.
>> 
>> 
>> I'm also going to write a separate email about specific tactical
>> problems I've found with GlobalsModRef today (nothing to do with
the
>> above theoretical problems) and how I plan to approach them.
>> 
>> 
>> Thanks,
>> -Chandler
> 
> -- 
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Seemingly Similar Threads

Search for more reasonably related threads

llvm dev - Jul 2015 - [LLVMdev] RFC: A plan for stateful alias analysis in LLVM

[LLVMdev] RFC: A plan for stateful alias analysis in LLVM

[LLVMdev] RFC: A plan for stateful alias analysis in LLVM

[LLVMdev] RFC: A plan for stateful alias analysis in LLVM

[LLVMdev] RFC: A plan for stateful alias analysis in LLVM

[LLVMdev] RFC: A plan for stateful alias analysis in LLVM

[LLVMdev] RFC: A plan for stateful alias analysis in LLVM

Seemingly Similar Threads