Hi,
I met an alias analysis problem in the LICM phase. I am using the following
piece of code to present this problem.
      1 int size;
      2 int ** AAA;
      3 void * xalloc(int);
      4
      5 void foo(void)
      6 {
      7   int i;
      8   AAA = (int**) xalloc(size * sizeof(int*));
      9
     10   for (i=0; i<size; i++)
     11     AAA[i] = 0;
     12 }
This code tries to initialize an array of pointers. The array is
allocated from heap.
"AAA" points to the beginning of the array and it is a global
variable.
I got the following IR dump after LICM:
    82 *** IR Dump After Loop Invariant Code Motion ***
    83 for.body:                                         ; preds
%for.body.lr.ph, %for.body
    84   %i.02 = phi i32 [ 0, %for.body.lr.ph ], [ %inc, %for.body ]
    85   %4 = load i32*** @AAA, align 4, !tbaa !3
    86   %arrayidx = getelementptr inbounds i32** %4, i32 %i.02
    87   store i32* null, i32** %arrayidx, align 4, !tbaa !3
    88   %inc = add nsw i32 %i.02, 1
    89   %cmp = icmp slt i32 %inc, %3
    90   br i1 %cmp, label %for.body, label %for.cond.for.end_crit_edge
I was expecting that, line 85: "%4 = load i32*** @AAA", would be
treated as loop
invariant code and be moved out of the loop body. However, it didn't.
The reason is
that, LLVM thinks that pointer "@AAA" would alias with pointer
"&AAA[i]". I found GCC
has no problem to move this load out of the loop body.
Then I went to read the current LLVM (v2.9) alias analysis code, i.e.
BasicAA and TBAA.
I found that TBAA does not differentiate pointers. Any pointer will be
given the same tbaa meta-data
named "any pointer". After reading the TBAA code, I don't believe
that
my problem can be
easily fixed in its framework.
Since the "Anderson" alias analysis code is already moved out from
LLVM, we can only count
on BasicAA and TBAA for all alias analysis problems at this moment.
But these two piece of code
can only do very limited alias analysis and would become a big
performance bottleneck to
other optimization phases, e.g. instruction scheduling, LICM. Can
anybody tell me:
1. How to solve the above alias analysis problem in an acceptable short time?
2. Is there any existing project that tries to enhance LLVM alias analysis?
Thank you!
-- 
Best Regards
Gan
On 11/3/11 4:07 PM, Gan wrote:> Hi, > > [snip] > > Since the "Anderson" alias analysis code is already moved out from > LLVM, we can only count > on BasicAA and TBAA for all alias analysis problems at this moment. > But these two piece of code > can only do very limited alias analysis and would become a big > performance bottleneck to > other optimization phases, e.g. instruction scheduling, LICM. Can > anybody tell me: > > 1. How to solve the above alias analysis problem in an acceptable short time? > 2. Is there any existing project that tries to enhance LLVM alias analysis?The poolalloc project (directions for downloading it are on the SAFECode web page at http://sva.cs.illinois.edu/docs/Install.html) contains a points-to analysis called DSA. DSA is a unification-based points-to analysis. The interface to use it as an AliasAnalysis pass was removed since no one was using it and it was probably buggy. There are currently two routes for re-adding it. One is to review and fix up a patch that was submitted earlier this year. The other approach is to rewrite the code from scratch. The former approach may be easier, but I don't know how well the code would work in practice. The latter approach could be easily done using some code that we wrote (and could probably distribute if someone wants to tackle this problem). Bug 11130 (http://llvm.org/bugs/show_bug.cgi?id=11130) is tracking interest in this feature so that we can give it an appropriate priority. Until one of us gets time to implement it or we get a contributor that provides a clean patch that works with LLVM 3.0, though, it's like to remain unfixed. You can, of course, always examine the points-to graph directly. Another alternative is to look at the analyses by Ben Hardekopf (http://www.cs.ucsb.edu/~benh/downloads.html). I'm not sure if his code offers an AliasAnalysis interface, though. -- John T.> > Thank you! >
Hi John, Thank you for your reply! Please see my other questions below:> DSA is a unification-based points-to analysis. The interface to use it as > an AliasAnalysis pass was removed since no one was using it and it was > probably buggy.- When you say "it was probably buggy", what is buggy? the "interface", or the AliasAnalysis? I'm so surprise that no one is using it.> There are currently two routes for re-adding it. One is to > review and fix up a patch that was submitted earlier this year. The other > approach is to rewrite the code from scratch. The former approach may be > easier, but I don't know how well the code would work in practice. The > latter approach could be easily done using some code that we wrote (and > could probably distribute if someone wants to tackle this problem).I can see that you like the latter approach. Let me think whether I have time to do it. Do you have any plan to integrate DSA into LLVM mailine, say the 3.0 release? I learned from your previous emails that you are working on this goal. What is the current status? Thank you again for your reply! -- Best Regards Gan
Gan wrote:> Hi, > > I met an alias analysis problem in the LICM phase. I am using the following > piece of code to present this problem. > > 1 int size; > 2 int ** AAA; > 3 void * xalloc(int); > 4 > 5 void foo(void) > 6 { > 7 int i; > 8 AAA = (int**) xalloc(size * sizeof(int*)); > 9 > 10 for (i=0; i<size; i++) > 11 AAA[i] = 0; > 12 } > > This code tries to initialize an array of pointers. The array is > allocated from heap. > "AAA" points to the beginning of the array and it is a global variable. > > I got the following IR dump after LICM: > > 82 *** IR Dump After Loop Invariant Code Motion *** > 83 for.body: ; preds > %for.body.lr.ph, %for.body > 84 %i.02 = phi i32 [ 0, %for.body.lr.ph ], [ %inc, %for.body ] > 85 %4 = load i32*** @AAA, align 4, !tbaa !3 > 86 %arrayidx = getelementptr inbounds i32** %4, i32 %i.02 > 87 store i32* null, i32** %arrayidx, align 4, !tbaa !3 > 88 %inc = add nsw i32 %i.02, 1 > 89 %cmp = icmp slt i32 %inc, %3 > 90 br i1 %cmp, label %for.body, label %for.cond.for.end_crit_edge > > I was expecting that, line 85: "%4 = load i32*** @AAA", would be > treated as loop > invariant code and be moved out of the loop body. However, it didn't. > The reason is > that, LLVM thinks that pointer "@AAA" would alias with pointer > "&AAA[i]". I found GCC > has no problem to move this load out of the loop body. > > Then I went to read the current LLVM (v2.9) alias analysis code, i.e. > BasicAA and TBAA. > I found that TBAA does not differentiate pointers. Any pointer will be > given the same tbaa meta-data > named "any pointer". After reading the TBAA code, I don't believe that > my problem can be > easily fixed in its framework. > > Since the "Anderson" alias analysis code is already moved out from > LLVM, we can only count > on BasicAA and TBAA for all alias analysis problems at this moment. > But these two piece of code > can only do very limited alias analysis and would become a big > performance bottleneck to > other optimization phases, e.g. instruction scheduling, LICM. Can > anybody tell me: > > 1. How to solve the above alias analysis problem in an acceptable short time? > 2. Is there any existing project that tries to enhance LLVM alias analysis?Thanks for this testcase, I think this is a serious problem. I see two possible approaches, hopefully there are more. 1. Use the fact that there's only one store and then make a select out of it. Pretending for a moment that partially overlapping writes don't exist over an i32*, the value loaded in %.pre is either %1 because we didn't overwrite it, or null because we did. Thus, we can transform it into: %cmp = icmp eq i32** %arrayidx2, bitcast (i32*** @AAA to i32**) %.pre = select i1 %cmp, i32** null, i32** %1 which saves a load at the cost of a compare. Sadly, LLVM fails to optimize out the compare afterward. (You'd think that the prospect of storing through a null pointer would be enough for llvm to realize that one arm of the select is unreachable. That optimization would not be hard to add.) 2. Find the least fixed point. Currently, your "xalloc" method could potentially return &AAA itself so the first step is to mark it as only returning pointers which do not alias any existing pointers, as such: void * xalloc(int) __attribute__((malloc)); Then we're starting the loop with two pointers (&AAA and AAA) which we know do not alias, and we'll continue to assume that pointers can't alias until there is evidence to the contrary. Because we know they don't alias when entering the loop, we'll never encounter evidence that %arrayidx could alias @AAA, then GVN would eliminate the load %.pre against the store before the start of the loop. Unfortunately, this would require writing a new AA pass, as it's too expensive to add to -basicaa. I suspect that GCC is doing option 2, or something else I haven't thought of (TBAA? if so, why doesn't our TBAA do as well?). Nick
On 11/4/2011 3:29 AM, Nick Lewycky wrote:> I suspect that GCC is doing option 2, or something else I haven't > thought of (TBAA? if so, why doesn't our TBAA do as well?).Nick, The problem is that LLVM's implementation of TBAA does not distinguish between these two types: i32*** @AAA and i32** %arrayidx. We believe that type based alias analysis should, in general, be able to figure out that those two variables cannot meaningfully alias. -Anshu -- Qualcomm Innovation Center, Inc is a member of Code Aurora Forum