thr3ads.net - llvm dev - [LLVMdev] Re: Alias Analysis Design & Implementation and LLVM [Nov 2003]

If this information is useful, please help other people find it:
Share via:

Chris Lattner

2003-Nov-06 18:22 UTC

[LLVMdev] Re: Alias Analysis Design & Implementation and LLVM

On Thu, 6 Nov 2003, Michelle Strout wrote:> I think some clarifications and examples would be helpful.
No problem.  :)
> -  LLVM is in SSA.  It is in SSA before alias analysis has been
> >>>> performed.  With OA, it has been mentioned that the  SSA
generation
> >>>> is
> >>>> incorrect because it doesn't take alias analysis
information into
> >>>> account.  This seems logical, because the definition of
SSA requires
> >>>> that each variable use only has one definition point.  Is
the LLVM
> >>>> in
> >>>> fact not in legal SSA?
> >
> > While an SSA form, LLVM does not use SSA for memory locations, only
> > scalar.
>
> Scalar variables still have a stack location associated with them,
> don't they?
No, they don't.  All scalar values in LLVM represent "virtual
registers":
you cannot take their address, and they do not live in memory.  In fact,
LLVM does not have an "address of" operator.  Stack memory is all
explicitly allocated with the 'alloca' instruction.  See Section 2.3 of
this paper for more discussion on this topic:

http://llvm.cs.uiuc.edu/pubs/2003-09-30-LifelongOptimizationTR.html
> > This gives us some very nice properties: for example the C/C++
> > front-end does not have an "SSA construction pass", instead
all
> > automatic variables live on the stack (in memory) and are accessed via
> > loads & stores.  As such, no values are live across basic blocks,
and
> > there are no PHI nodes in the C frontend output (making the C frontend
> > easier to implement).
>
> I think an example would help here.  I am going to start one to expose
> my confusion and then you can fix it and enlighten all of us. =)
No problem at all, great idea in fact.  I've translated your example into
the following C function, let me know if it's not what you're thinking:

int test(int B, int C, _Bool Z, int *P) {
  int A = B + C;
  if (Z)
    A = 10;
  else
    *P = 20;
  return A;
}
> LLVM created by frontend (keep in mind I am unfamiliar with LLVM
> syntax, just doing something like 3-address code):
Yes, that's exactly right.  The actual (annotated) LLVM generated by
the C frontend looks like this:

int %test(int %B, int %C, bool %Z, int* %P) {
entry:          ; No predecessors!
;;; There is no "address-of" operator in LLVM.  As such, all values
which
;;; can have their address taken (arguments, automatic variables, etc),
;;; get stack allocated versions with the 'alloca' instruction.  The
;;; alloca instruction returns the address of the stack location, so &Z,
;;; for example, would just be compiled into %Z.0 by the frontend.
        %B.0 = alloca int       ; %B.0 is of type int*
        %C.0 = alloca int       ; %C.0 is of type int*
        %Z.0 = alloca bool      ; %Z.0 is of type bool*
        %P.0 = alloca int*      ; %P.0 is of type int**
        %result = alloca int    ; ...
        %A = alloca int

;;; Store the incoming arguments into their stack locations.
        store int %B, int* %B.0
        store int %C, int* %C.0
        store bool %Z, bool* %Z.0
        store int* %P, int** %P.0

;;; A = B + C;   -- I told you the front-end was simple!  :)
        %tmp.0 = load int* %B.0
        %tmp.1 = load int* %C.0
        %tmp.2 = add int %tmp.0, %tmp.1
        store int %tmp.2, int* %A

;;; The conditional branch
        %tmp.3 = load bool* %Z.0
        br bool %tmp.3, label %then, label %else

then:           ; preds = %entry
        store int 10, int* %A
        br label %endif

else:           ; preds = %entry
;;; Load the actual value of P
        %tmp.7 = load int** %P.0

;;; Store through the value of P
        store int 20, int* %tmp.7
        br label %endif

endif:          ; preds = %then, %else
        %tmp.8 = load int* %A

;;; If you have multiple 'return' instructions, they all store into the
;;; 'result' variable, then branch to the exit node.  In this case there
;;; is only a single return, so this is superfluous, but still created.
        store int %tmp.8, int* %result
        %tmp.9 = load int* %result
        ret int %tmp.9
}

I cannot stress enough that the C/C++ front-end has been designed to be
_extremely_ simple, and obviously the above code is grotesquely
inefficient.
> LLVM after the LLVM optimizer turns things into SSA:
> - Do you get rid of the loads and stores?
Yes.
> - In SSA there should be a phi node for a after the if statement.  From
> looking at the original code, it seems that the phi statement should
> look like this:
>      a_3 = phi(a_1,a_2,(*p_1))
> Seems like you need to include something involving *p because it might
> alias the location for a.  How do you handle this?
Yes, though alias analysis.  The optimized version of the above function
is:

int %test(int %B, int %C, bool %Z, int* %P) {
entry:          ; No predecessors!
        %tmp.2 = add int %C, %B             ; <int> [#uses=1]
        br bool %Z, label %then, label %else

then:           ; preds = %entry
        ret int 10

else:           ; preds = %entry
        store int 20, int* %P
        ret int %tmp.2
}

Ok, well the optimizer did a bit of tail duplication.  Without that, we
get this:

int %test(int %B, int %C, bool %Z, int* %P) {
entry:          ; No predecessors!
        %tmp.2 = add int %B, %C
        br bool %Z.1, label %endif, label %else

else:           ; preds = %entry
        store int 20, int* %P
        br label %endif

endif:          ; preds = %entry, %else
        %A.0 = phi int [ %tmp.2, %else ], [ 10, %entry ]
        ret int %A.0
}

There is your PHI node.  In this case, LLVM knows that P cannot alias A (A
is an automatic, P comes from outside of the function), so it promotes it
to be a register with no problem (as it does with all of the other
automatic variables as well).  If you took the address of A, for example,
and did unanalyzable stuff with it, it would leave it as an alloca, then
perform traditional load/store optimizations to eliminate as many accesses
to it as possible.  If you have any particular examples you want me to
run, I can certainly do that.  Also, you can play around with the online
web version of LLVM, here: http://llvm.cs.uiuc.edu/demo/

Note that, by default, we are just doing some simple intraprocedural alias
analysis, so don't expect miracles with the web form. :)

> > Because of this, there was no need to implement a dataflow framework,
> > though doing so would not be hard.  However, I'm not aware of any
> > analysis which requires traditional dataflow, or that cannot be
> > implemented more efficiently in SSA form.  If you are aware of one,
> > please let me know.
> >  :)
>
> Backward dataflow analyses, like live variable analysis, cannot be
> performed with SSA [Choi91, JohnsonPingali93].
I've implemented live variable analysis in LLVM, source here:

http://llvm.cs.uiuc.edu/doxygen/LiveVariables_8h-source.html
http://llvm.cs.uiuc.edu/doxygen/LiveVariables_8cpp-source.html

It uses a very straight-forward and efficient algorithm, described in
Appel's book.
> Attached to this email is an example (color pdf) that illustrates why.
I assume you're talking about the extra IN's in B1 (X_1, x_2)?  Our live
variable analysis doesn't have this problem.  I would have to see more
context to figure out why the algorithm described in the PDF would have
this problem, but it looks like they are not handling the PHI nodes
properly (PHI uses occur in the predecessor blocks, not in the PHI node
blocks): B3 should have an empty IN set.
> Do you convert from SSA back to a non-SSA LLVM to do the live-variable
> analysis that you have implemented in the backends?
The backend uses an SSA-based machine code representation, in which
'virtual registers' in SSA form are used for a majority of the register
references.  The register allocator (which uses the live variable analysis
on the machine code) is responsible from lowering the SSA representation
away and eliminating PHI nodes, producing code with 'physical' register
references only.  Note that the LLVM code generators do not use the main
LLVM code representation for code generation, they use a lower-level,
machine specific, representation.
> At Argonne one of the application specific analyses we need to perform,
> activity analysis, has a forward and backward dataflow analysis
> component.
I still do not understand why you think you cannot perform backward
analysis, but if you are truly convinced, it would be easy to add a DFA
framework.  :)

-Chris

-- 
http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/

Michelle Strout

2003-Nov-10 16:44 UTC

head link

[LLVMdev] Re: Alias Analysis Design & Implementation and LLVM

Chris and everyone else,

Below I summarize my understanding of what llvm does when converting to 
SSA and a clarification on why backward dataflow analyses can not be 
performed on "just" SSA.

>> Scalar variables still have a stack location associated with them,
>> don't they?
>
> No, they don't.  All scalar values in LLVM represent "virtual 
> registers":
> you cannot take their address, and they do not live in memory.  In 
> fact,
> LLVM does not have an "address of" operator.  Stack memory is all
> explicitly allocated with the 'alloca' instruction.  See Section
2.3 of
> this paper for more discussion on this topic:
>
> http://llvm.cs.uiuc.edu/pubs/2003-09-30-LifelongOptimizationTR.html
>
>>> This gives us some very nice properties: for example the C/C++
>>> front-end does not have an "SSA construction pass",
instead all
>>> automatic variables live on the stack (in memory) and are accessed 
>>> via
>>> loads & stores.  As such, no values are live across basic
blocks, and
>>> there are no PHI nodes in the C frontend output (making the C 
>>> frontend
>>> easier to implement).
I made various test programs, ran them through the llvm C frontend, and 
then just called the mem2reg pass, which raises some memory references 
to virtual registers (what people typically think of as scalars) to 
generate an SSA representation.  Any virtual register use satisfies the 
SSA requirements that only one definition reaches that use.

llvmgcc test6.c -o test6.ll -S
llvm-as < test6.ll | opt -mem2reg -simplifycfg | llvm-dis > 
test6_mem2reg.ll

In a sense llvm is doing a simple type of alias analysis when they do
a mem2reg pass using no other analyses.  It appears that the
following memory references are assumed to possibly be aliased and
therefore are not treated as scalar variables (ie. they are always 
loaded
and stored to memory) in the resulting SSA:
- global variables
- fields in a struct allocated with malloc and fields in local structs
- a dereference of any parameter pointer, *p
- an array reference whether the array is malloced or local
- if a local variable has its address taken as parameter to a function,
it could be aliased and therefore is not treated as a scalar
     NOTE: p = &A;
           d = *p + 10;
     is treated with some implicit copy propagation.
     quote from Chris,
     "Ok, there are certain 'optimizations' that you cannot turn
off in
LLVM.
      A trivial example of this is copy-propagation.  LLVM does not 
_even have_
      a copy instruction, so the 'q = &A' instruction is just
deleted,
and any
      references to q use &A instead.  This means that your *q turns 
into a
      direct assignment to A."
Anyone from the LLVM group should correct me if any of the above
observations are incorrect or incomplete.

I think this has the following consequences for the OpenAnalysis 
project:
- Instead of having the most naive alias analysis as default, an alias
analysis that assumes any local variables that do not have their
address taken in any way (var ref in parameter list, addressOf operator,
etc) should be considered to not alias anyone other variable references.
- To write even this most simple alias analysis in OA we must define
an interface to the SourceIR (ROSE/Sage or Open64/Whirl or LLVM) that
allows us to query the necessary information.
>>> Because of this, there was no need to implement a dataflow
framework,
>>> though doing so would not be hard.  However, I'm not aware of
any
>>> analysis which requires traditional dataflow, or that cannot be
>>> implemented more efficiently in SSA form.  If you are aware of one,
>>> please let me know.
>>>  :)
>>
>> Backward dataflow analyses, like live variable analysis, cannot be
>> performed with SSA [Choi91, JohnsonPingali93].
>
> I've implemented live variable analysis in LLVM, source here:
>
> http://llvm.cs.uiuc.edu/doxygen/LiveVariables_8h-source.html
> http://llvm.cs.uiuc.edu/doxygen/LiveVariables_8cpp-source.html
>
> It uses a very straight-forward and efficient algorithm, described in
> Appel's book.
Ok, my previously sent example was bogus. =)  Upon further 
consideration, I realized that all of the papers that said backward 
dataflow analysis can not be done with SSA meant "with only SSA".  In 
other words, an algorithm that uses both the CFG and SSA (like the live 
var analysis in LLVM) can do backward dataflow problems.  Attached to 
this email is another example of  the backward flow problem live 
variable analysis.  This time the SSA graph is shown as well as the 
CFG.  The example shows that constant propagation (a forward analysis) 
can be performed by only traversing the SSA edges, but that live var 
analysis requires the CFG as well.

Cheers,
Michelle


-------------- next part --------------
A non-text attachment was scrubbed...
Name: SSA-woCFG-backward.pdf
Type: application/pdf
Size: 36160 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20031110/05bf4f9b/attachment.pdf>
-------------- next part --------------


======================================  Michelle Mills Strout, Ph.D.
  Argonne National Laboratory, MCS Division
  9700 S. Cass Avenue, Bldg 221, Rm D243
  Argonne, IL  60439-4844

  http://www.mcs.anl.gov/~mstrout
  mstrout at mcs.anl.gov
=======================================

Chris Lattner

2003-Nov-10 17:02 UTC

head link

[LLVMdev] Re: Alias Analysis Design & Implementation and LLVM

On Mon, 10 Nov 2003, Michelle Strout wrote:> In a sense llvm is doing a simple type of alias analysis when they do
> a mem2reg pass using no other analyses.  It appears that the
> following memory references are assumed to possibly be aliased and
> therefore are not treated as scalar variables (ie. they are always
> loaded
> and stored to memory) in the resulting SSA:
> - global variables
> - fields in a struct allocated with malloc and fields in local structs
> - a dereference of any parameter pointer, *p
> - an array reference whether the array is malloced or local
> - if a local variable has its address taken as parameter to a function,
> it could be aliased and therefore is not treated as a scalar
>      NOTE: p = &A;
>            d = *p + 10;
>      is treated with some implicit copy propagation.
Note that copy propagation in general cannot be disabled in LLVM, this is
not specific to pointers.
> Anyone from the LLVM group should correct me if any of the above
> observations are incorrect or incomplete.
I believe that is a correct assessment of the LLVM mem2reg pass.  Note
that there are several other passes (like -scalarrepl, -licm, "-load-vn
-gcse"), that do more advanced analyses as well, and can more aggressively
transform the program based on alias analysis.

LICM, for example, turns this loop:

  for ( ... )
    *P = *P + ...

into

  tmp = *P;
  for ( ... )
    tmp += ...
  *P = tmp;

when it is safe (a common example of this is when *P is actually a global
variable).
> Upon further consideration, I realized that all of the papers that said
> backward dataflow analysis can not be done with SSA meant "with only
> SSA".  In other words, an algorithm that uses both the CFG and SSA
(like
> the live var analysis in LLVM) can do backward dataflow problems.
> Attached to this email is another example of the backward flow problem
> live variable analysis.  This time the SSA graph is shown as well as the
> CFG.  The example shows that constant propagation (a forward analysis)
> can be performed by only traversing the SSA edges, but that live var
> analysis requires the CFG as well.
Ah, that makes perfect sense.  Ok, I see what you mean.  I forgot that
some compilers do not always have a CFG implicitly available.

Just to be perfectly clear with my previous response, there are cases when
solving problems with dense bit-vector dataflow can be faster than using
'sparse' SSA solutions: bit-vectors have a tremendous amount of
parallelism. That said, we do just about everything without dataflow in
LLVM, and it seems to work pretty well so far. *shrug*

-Chris

-- 
http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/

Maybe Matching Threads

Search for more possibly parallel threads

llvm dev - Nov 2003 - [LLVMdev] Re: Alias Analysis Design & Implementation and LLVM

[LLVMdev] Re: Alias Analysis Design & Implementation and LLVM

[LLVMdev] Re: Alias Analysis Design & Implementation and LLVM

[LLVMdev] Re: Alias Analysis Design & Implementation and LLVM

Maybe Matching Threads