thr3ads.net - llvm dev - [LLVMdev] Alias Analysis Problem in LICM [Nov 2011]

If this information is useful, please help other people find it:
Share via:

Gan

2011-Nov-03 21:07 UTC

[LLVMdev] Alias Analysis Problem in LICM

Hi,

I met an alias analysis problem in the LICM phase. I am using the following
piece of code to present this problem.

      1 int size;
      2 int ** AAA;
      3 void * xalloc(int);
      4
      5 void foo(void)
      6 {
      7   int i;
      8   AAA = (int**) xalloc(size * sizeof(int*));
      9
     10   for (i=0; i<size; i++)
     11     AAA[i] = 0;
     12 }

This code tries to initialize an array of pointers. The array is
allocated from heap.
"AAA" points to the beginning of the array and it is a global
variable.

I got the following IR dump after LICM:

    82 *** IR Dump After Loop Invariant Code Motion ***
    83 for.body:                                         ; preds
%for.body.lr.ph, %for.body
    84   %i.02 = phi i32 [ 0, %for.body.lr.ph ], [ %inc, %for.body ]
    85   %4 = load i32*** @AAA, align 4, !tbaa !3
    86   %arrayidx = getelementptr inbounds i32** %4, i32 %i.02
    87   store i32* null, i32** %arrayidx, align 4, !tbaa !3
    88   %inc = add nsw i32 %i.02, 1
    89   %cmp = icmp slt i32 %inc, %3
    90   br i1 %cmp, label %for.body, label %for.cond.for.end_crit_edge

I was expecting that, line 85: "%4 = load i32*** @AAA", would be
treated as loop
invariant code and be moved out of the loop body. However, it didn't.
The reason is
that, LLVM thinks that pointer "@AAA" would alias with pointer
"&AAA[i]". I found GCC
has no problem to move this load out of the loop body.

Then I went to read the current LLVM (v2.9) alias analysis code, i.e.
BasicAA and TBAA.
I found that TBAA does not differentiate pointers. Any pointer will be
given the same tbaa meta-data
named "any pointer". After reading the TBAA code, I don't believe
that
my problem can be
easily fixed in its framework.

Since the "Anderson" alias analysis code is already moved out from
LLVM, we can only count
on BasicAA and TBAA for all alias analysis problems at this moment.
But these two piece of code
can only do very limited alias analysis and would become a big
performance bottleneck to
other optimization phases, e.g. instruction scheduling, LICM. Can
anybody tell me:

1. How to solve the above alias analysis problem in an acceptable short time?
2. Is there any existing project that tries to enhance LLVM alias analysis?

Thank you!

-- 
Best Regards

Gan

John Criswell

2011-Nov-03 21:30 UTC

head link

[LLVMdev] Alias Analysis Problem in LICM

On 11/3/11 4:07 PM, Gan wrote:> Hi,
>
> [snip]
>
> Since the "Anderson" alias analysis code is already moved out
from
> LLVM, we can only count
> on BasicAA and TBAA for all alias analysis problems at this moment.
> But these two piece of code
> can only do very limited alias analysis and would become a big
> performance bottleneck to
> other optimization phases, e.g. instruction scheduling, LICM. Can
> anybody tell me:
>
> 1. How to solve the above alias analysis problem in an acceptable short
time?
> 2. Is there any existing project that tries to enhance LLVM alias analysis?
The poolalloc project (directions for downloading it are on the SAFECode 
web page at http://sva.cs.illinois.edu/docs/Install.html) contains a 
points-to analysis called DSA.

DSA is a unification-based points-to analysis.  The interface to use it 
as an AliasAnalysis pass was removed since no one was using it and it 
was probably buggy.  There are currently two routes for re-adding it.  
One is to review and fix up a patch that was submitted earlier this 
year.  The other approach is to rewrite the code from scratch.  The 
former approach may be easier, but I don't know how well the code would 
work in practice.  The latter approach could be easily done using some 
code that we wrote (and could probably distribute if someone wants to 
tackle this problem).

Bug 11130 (http://llvm.org/bugs/show_bug.cgi?id=11130) is tracking 
interest in this feature so that we can give it an appropriate 
priority.  Until one of us gets time to implement it or we get a 
contributor that provides a clean patch that works with LLVM 3.0, 
though, it's like to remain unfixed.

You can, of course, always examine the points-to graph directly.

Another alternative is to look at the analyses by Ben Hardekopf 
(http://www.cs.ucsb.edu/~benh/downloads.html).  I'm not sure if his code 
offers an AliasAnalysis interface, though.

-- John T.
>
> Thank you!
>

Gan

2011-Nov-03 22:27 UTC

head link

[LLVMdev] Alias Analysis Problem in LICM

Hi John,

Thank you for your reply! Please see my other questions below:
> DSA is a unification-based points-to analysis.  The interface to use it as
> an AliasAnalysis pass was removed since no one was using it and it was
> probably buggy.
- When you say "it was probably buggy", what is buggy? the
"interface", or the AliasAnalysis?

I'm so surprise that no one is using it.
> There are currently two routes for re-adding it.  One is to
> review and fix up a patch that was submitted earlier this year.  The other
> approach is to rewrite the code from scratch.  The former approach may be
> easier, but I don't know how well the code would work in practice.  The
> latter approach could be easily done using some code that we wrote (and
> could probably distribute if someone wants to tackle this problem).
I can see that you like the latter approach. Let me think whether I
have time to do it.

Do you have any plan to integrate DSA into LLVM mailine, say the 3.0 release?
I learned from your previous emails that you are working on this goal.
What is the
current status?

Thank you again for your reply!

-- 
Best Regards

Gan

Nick Lewycky

2011-Nov-04 08:29 UTC

head link

[LLVMdev] Alias Analysis Problem in LICM

Gan wrote:> Hi,
>
> I met an alias analysis problem in the LICM phase. I am using the following
> piece of code to present this problem.
>
>        1 int size;
>        2 int ** AAA;
>        3 void * xalloc(int);
>        4
>        5 void foo(void)
>        6 {
>        7   int i;
>        8   AAA = (int**) xalloc(size * sizeof(int*));
>        9
>       10   for (i=0; i<size; i++)
>       11     AAA[i] = 0;
>       12 }
>
> This code tries to initialize an array of pointers. The array is
> allocated from heap.
> "AAA" points to the beginning of the array and it is a global
variable.
>
> I got the following IR dump after LICM:
>
>      82 *** IR Dump After Loop Invariant Code Motion ***
>      83 for.body:                                         ; preds >
%for.body.lr.ph, %for.body
>      84   %i.02 = phi i32 [ 0, %for.body.lr.ph ], [ %inc, %for.body ]
>      85   %4 = load i32*** @AAA, align 4, !tbaa !3
>      86   %arrayidx = getelementptr inbounds i32** %4, i32 %i.02
>      87   store i32* null, i32** %arrayidx, align 4, !tbaa !3
>      88   %inc = add nsw i32 %i.02, 1
>      89   %cmp = icmp slt i32 %inc, %3
>      90   br i1 %cmp, label %for.body, label %for.cond.for.end_crit_edge
>
> I was expecting that, line 85: "%4 = load i32*** @AAA", would be
> treated as loop
> invariant code and be moved out of the loop body. However, it didn't.
> The reason is
> that, LLVM thinks that pointer "@AAA" would alias with pointer
> "&AAA[i]". I found GCC
> has no problem to move this load out of the loop body.
>
> Then I went to read the current LLVM (v2.9) alias analysis code, i.e.
> BasicAA and TBAA.
> I found that TBAA does not differentiate pointers. Any pointer will be
> given the same tbaa meta-data
> named "any pointer". After reading the TBAA code, I don't
believe that
> my problem can be
> easily fixed in its framework.
>
> Since the "Anderson" alias analysis code is already moved out
from
> LLVM, we can only count
> on BasicAA and TBAA for all alias analysis problems at this moment.
> But these two piece of code
> can only do very limited alias analysis and would become a big
> performance bottleneck to
> other optimization phases, e.g. instruction scheduling, LICM. Can
> anybody tell me:
>
> 1. How to solve the above alias analysis problem in an acceptable short
time?
> 2. Is there any existing project that tries to enhance LLVM alias analysis?
Thanks for this testcase, I think this is a serious problem. I see two 
possible approaches, hopefully there are more.

   1. Use the fact that there's only one store and then make a select 
out of it. Pretending for a moment that partially overlapping writes 
don't exist over an i32*, the value loaded in %.pre is either %1 because 
we didn't overwrite it, or null because we did. Thus, we can transform 
it into:

     %cmp = icmp eq i32** %arrayidx2, bitcast (i32*** @AAA to i32**)
     %.pre = select i1 %cmp, i32** null, i32** %1

   which saves a load at the cost of a compare. Sadly, LLVM fails to 
optimize out the compare afterward. (You'd think that the prospect of 
storing through a null pointer would be enough for llvm to realize that 
one arm of the select is unreachable. That optimization would not be 
hard to add.)

   2. Find the least fixed point. Currently, your "xalloc" method
could
potentially return &AAA itself so the first step is to mark it as only 
returning pointers which do not alias any existing pointers, as such:

     void * xalloc(int) __attribute__((malloc));

   Then we're starting the loop with two pointers (&AAA and AAA) which 
we know do not alias, and we'll continue to assume that pointers can't 
alias until there is evidence to the contrary. Because we know they 
don't alias when entering the loop, we'll never encounter evidence that 
%arrayidx could alias @AAA, then GVN would eliminate the load %.pre 
against the store before the start of the loop.

   Unfortunately, this would require writing a new AA pass, as it's too 
expensive to add to -basicaa.

I suspect that GCC is doing option 2, or something else I haven't 
thought of (TBAA? if so, why doesn't our TBAA do as well?).

Nick

Anshuman Dasgupta

2011-Nov-04 15:37 UTC

head link

[LLVMdev] Alias Analysis Problem in LICM

On 11/4/2011 3:29 AM, Nick Lewycky wrote:> I suspect that GCC is doing option 2, or something else I haven't 
> thought of (TBAA? if so, why doesn't our TBAA do as well?). 
Nick,

The problem is that LLVM's implementation of TBAA does not distinguish 
between these two types: i32*** @AAA and i32** %arrayidx. We believe 
that type based alias analysis should, in general, be able to figure out 
that those two variables cannot meaningfully alias.

-Anshu

--
Qualcomm Innovation Center, Inc is a member of Code Aurora Forum

Seemingly Similar Threads

Search for more seemingly similar threads

llvm dev - Nov 2011 - [LLVMdev] Alias Analysis Problem in LICM

[LLVMdev] Alias Analysis Problem in LICM

[LLVMdev] Alias Analysis Problem in LICM

[LLVMdev] Alias Analysis Problem in LICM

[LLVMdev] Alias Analysis Problem in LICM

[LLVMdev] Alias Analysis Problem in LICM

Seemingly Similar Threads