thr3ads.net - llvm dev - [LLVMdev] alias information in codegen [Apr 2008]

If this information is useful, please help other people find it:
Share via:

Dan Gohman

2008-Apr-03 20:00 UTC

[LLVMdev] alias information in codegen

There have been a few queries about this recently, and I've done some
work in this area recently, so I'm posting a summary of what the major
outstanding issues are.

* BasicAliasAnalysis, the default AliasAnalysis implementation, doesn't
   understand lowered GEPs, integer arithmetic, or PHIs, and the
   regular codegen process involves passes that lower GEPs.

One way to solve this is to use a different AliasAnalysis  
implementation.
I haven't looked at it in detail, but I'd guess that the Andersen's
pass handles all of these constructs reasonably well. And it has the
advantage of being much more capable than the BasicAliasAnalysis.

Another alternative is to develop a SCEV-based alias analysis. The SCEV
framework currently has the opposite problem; it understands integer
arithmetic and PHIs quite well, but doesn't currently understand GEPs.
I know it can be made to understand GEPs though, and it's on my TODO
list to implement this.

* There are still a variety of places in SelectionDAG creation that
   don't preserve SVOperand/SVOffset (as well as alignment and  
volatile).

These places need to be found and fixed. This is pretty straight- 
forward,
and the places that need changing can be found by inserting some
strategic assert(SVOperand)'s.

  * The DAGCombiner's alias-analysis handling isn't ideal.

This is the -combiner-alias-analysis and -combiner-global-alias-analysis
options. They basically work, though they aren't turned on by default
currently. The algorithm used wants some scrutiny as well.

  * MachineInstrs currently don't reliably record information about
    memory accesses.

This is being addressed by MemOperands. However, currently there is a
problem; the current code misses memory references in anonymous
patterns, like this in x86:

def : Pat<(zextloadi64i32 addr:$src),
           (SUBREG_TO_REG (i64 0), (MOV32rm addr:$src),  
x86_subreg_32bit)>;

I can provide more details about what's going on here if anyone's
interested.

However, for people just interested in post-regalloc scheduling and
VLIW packing and similar things, MemOperands aren't the only approach.
A potentially better way to do this would be to extend MachineInstrs
to preserve the chain dependencies from the SelectionDAG.

This would be more efficient, as the SelectionDAG already needs a full
dependence graph, so there's no need to make any further alias queries.
There would be some details to resolve, such as how to handle loads
and stores inserted after instruction selection. Also, this depends on
the SelectionDAG having good dependence information, but improving
that would be beneficial for pre-regalloc scheduling as well.

Dan

Chris Lattner

2008-Apr-04 02:33 UTC

head link

[LLVMdev] alias information in codegen

On Apr 3, 2008, at 1:00 PM, Dan Gohman wrote:> * BasicAliasAnalysis, the default AliasAnalysis implementation,  
> doesn't
>   understand lowered GEPs, integer arithmetic, or PHIs, and the
>   regular codegen process involves passes that lower GEPs.
Sure.
> One way to solve this is to use a different AliasAnalysis
> implementation.
> I haven't looked at it in detail, but I'd guess that the
Andersen's
> pass handles all of these constructs reasonably well. And it has the
> advantage of being much more capable than the BasicAliasAnalysis.
I haven't tried, but I seriously doubt it.  Most AA's specifically  
just look at pointers and interprocedural ones are much more  
interested in this than local ones: this reduces the number of values  
to track, speeding up analysis and reducing memory use.
>
> Another alternative is to develop a SCEV-based alias analysis. The  
> SCEV
> framework currently has the opposite problem; it understands integer
> arithmetic and PHIs quite well, but doesn't currently understand GEPs.
> I know it can be made to understand GEPs though, and it's on my TODO
> list to implement this.
This could be interesting, but you're basically attacking the  
dependence analysis problem here.  Dependence analysis can be used to  
disambiguate pointers (and basicaa does a simple form of this) but  
that is orthogonal to the rest of the problem.

I think it would be better by starting to make LSR just not throw away  
pointer information when it is trivial to preserve it.  This is  
obviously not always the case, but it is VERY VERY VERY often the  
case, so we should see how far that gets us.  If nothing else, it  
makes the IR sent into the codegen simpler and smaller.
> * There are still a variety of places in SelectionDAG creation that
>   don't preserve SVOperand/SVOffset (as well as alignment and
> volatile).
>
> These places need to be found and fixed. This is pretty straight-
> forward,
> and the places that need changing can be found by inserting some
> strategic assert(SVOperand)'s.
This is also a serious issue that can manifest as bugs already, so it  
is a correctness issue, not a performance issue.
>  * The DAGCombiner's alias-analysis handling isn't ideal.
>
> This is the -combiner-alias-analysis and -combiner-global-alias- 
> analysis
> options. They basically work, though they aren't turned on by default
> currently. The algorithm used wants some scrutiny as well.
Yep, it would be very interesting to rewrite the combiner aa stuff to  
be more scalable.  We've seen significant improvements with it, and it  
currently just does trivial base+offset disambiguation.  It would be  
much more interesting to throw full AliasAnalysis info into it. :)

-Chris

Florian Brandner

2008-Apr-04 07:03 UTC

head link

[LLVMdev] alias information in codegen

On Thursday 03 April 2008 22:00:34 Dan Gohman wrote:> However, for people just interested in post-regalloc scheduling and
> VLIW packing and similar things, MemOperands aren't the only approach.
> A potentially better way to do this would be to extend MachineInstrs
> to preserve the chain dependencies from the SelectionDAG.
the selection dag may already contain unnecessary dependencies, even with 
alias analysis turned on. it is built after codegenprepare and thus the 
lowered GEPs cause imprecise results already at this point.

florian

Duncan Sands

2008-Apr-04 09:24 UTC

head link

[LLVMdev] alias information in codegen

Hi,
> * There are still a variety of places in SelectionDAG creation that
>    don't preserve SVOperand/SVOffset (as well as alignment and  
> volatile).
> 
> These places need to be found and fixed. This is pretty straight- 
> forward,
> and the places that need changing can be found by inserting some
> strategic assert(SVOperand)'s.
I've seen at least one place that passed through the original
SVOperand and SVOffset, but incorrectly forget to adjust SVOffset.
The assert you suggest wouldn't catch this.  There really need
to be some new or better methods to make it hard to get this kind
of thing wrong (currently it is hard to get it right!).

Ciao,

Duncan.

Dan Gohman

2008-Apr-05 16:00 UTC

head link

[LLVMdev] alias information in codegen

On Fri, April 4, 2008 2:24 am, Duncan Sands wrote:> Hi,
>
>> * There are still a variety of places in SelectionDAG creation that
>>    don't preserve SVOperand/SVOffset (as well as alignment and
>> volatile).
>>
>> These places need to be found and fixed. This is pretty straight-
>> forward,
>> and the places that need changing can be found by inserting some
>> strategic assert(SVOperand)'s.
>
> I've seen at least one place that passed through the original
> SVOperand and SVOffset, but incorrectly forget to adjust SVOffset.
> The assert you suggest wouldn't catch this.  There really need
> to be some new or better methods to make it hard to get this kind
> of thing wrong (currently it is hard to get it right!).
I agree that it's tricky to get these right. Do you have any
ideas for how the values might be verified?

One thing we can work on is simplifying SelectionDAG::getLoad and
getStore, or maybe providing better alternative convenience methods.
Right now they're quite chatty.

Dan

Dan Gohman

2008-Apr-05 16:13 UTC

head link

[LLVMdev] alias information in codegen

On Fri, April 4, 2008 12:03 am, Florian Brandner wrote:> On Thursday 03 April 2008 22:00:34 Dan Gohman wrote:
>> However, for people just interested in post-regalloc scheduling and
>> VLIW packing and similar things, MemOperands aren't the only
approach.
>> A potentially better way to do this would be to extend MachineInstrs
>> to preserve the chain dependencies from the SelectionDAG.
>
> the selection dag may already contain unnecessary dependencies, even with
> alias analysis turned on. it is built after codegenprepare and thus the
> lowered GEPs cause imprecise results already at this point.
Right; this idea depends on the other items being addressed,
giving the SelectiondDAG better dependence information, so it's
a somewhat longer-term strategy. But it has the nice property that
improvements to the dependence information can benefit both
pre-regalloc and post-regalloc scheduling.

Dan

Daniel Berlin

2008-Apr-07 03:14 UTC

head link

[LLVMdev] alias information in codegen

On Thu, Apr 3, 2008 at 10:33 PM, Chris Lattner <sabre at nondot.org>
wrote:> On Apr 3, 2008, at 1:00 PM, Dan Gohman wrote:
>  > * BasicAliasAnalysis, the default AliasAnalysis implementation,
>  > doesn't
>  >   understand lowered GEPs, integer arithmetic, or PHIs, and the
>  >   regular codegen process involves passes that lower GEPs.
>
>  Sure.
>
>
>  > One way to solve this is to use a different AliasAnalysis
>  > implementation.
>  > I haven't looked at it in detail, but I'd guess that the
Andersen's
>  > pass handles all of these constructs reasonably well. And it has the
>  > advantage of being much more capable than the BasicAliasAnalysis.
>
>  I haven't tried, but I seriously doubt it.  Most AA's specifically
>  just look at pointers and interprocedural ones are much more
>  interested in this than local ones: this reduces the number of values
>  to track, speeding up analysis and reducing memory use.
>
I agree with Chris here completely..
It will help with GEP's eventually (field sensitive versions are being
worked on), but otherwise, it isn't going to help you here.
Trying to do pointer analysis on very lowered forms is both expensive
and hard.  It is like trying to build a bookshelf out of mashed
potatoes.
:)

Dan Gohman

2008-Apr-07 15:00 UTC

head link

[LLVMdev] alias information in codegen

On Thu, April 3, 2008 7:33 pm, Chris Lattner wrote:> On Apr 3, 2008, at 1:00 PM, Dan Gohman wrote:
>> * BasicAliasAnalysis, the default AliasAnalysis implementation,
>> doesn't
>>   understand lowered GEPs, integer arithmetic, or PHIs, and the
>>   regular codegen process involves passes that lower GEPs.
>
> Sure.
>
>> One way to solve this is to use a different AliasAnalysis
>> implementation.
>> I haven't looked at it in detail, but I'd guess that the
Andersen's
>> pass handles all of these constructs reasonably well. And it has the
>> advantage of being much more capable than the BasicAliasAnalysis.
>
> I haven't tried, but I seriously doubt it.  Most AA's specifically
> just look at pointers and interprocedural ones are much more
> interested in this than local ones: this reduces the number of values
> to track, speeding up analysis and reducing memory use.
I just tried it, and it did work. Perhaps this means that the
current -anders-aa pass is doing more work than it should.
>> Another alternative is to develop a SCEV-based alias analysis. The
>> SCEV
>> framework currently has the opposite problem; it understands integer
>> arithmetic and PHIs quite well, but doesn't currently understand
GEPs.
>> I know it can be made to understand GEPs though, and it's on my
TODO
>> list to implement this.
>
> This could be interesting, but you're basically attacking the
> dependence analysis problem here.  Dependence analysis can be used to
> disambiguate pointers (and basicaa does a simple form of this) but
> that is orthogonal to the rest of the problem.
I was thinking of starting with something very simple here: an
AliasAnalysis implementation where alias queries are implemented by
computing a SCEV for each pointer and doing a getMinusSCEV with them.
If the result is a constant, it can be handled. If the result is a
subtraction of two simpler values, they can be handed off to the
next analysis in the chain.
> I think it would be better by starting to make LSR just not throw away
> pointer information when it is trivial to preserve it.  This is
> obviously not always the case, but it is VERY VERY VERY often the
> case, so we should see how far that gets us.  If nothing else, it
> makes the IR sent into the codegen simpler and smaller.
I agree this would be useful.

Dan

Possibly Parallel Threads

Search for more seemingly similar threads

llvm dev - Apr 2008 - [LLVMdev] alias information in codegen

[LLVMdev] alias information in codegen

[LLVMdev] alias information in codegen

[LLVMdev] alias information in codegen

[LLVMdev] alias information in codegen

[LLVMdev] alias information in codegen

[LLVMdev] alias information in codegen

[LLVMdev] alias information in codegen

[LLVMdev] alias information in codegen

Possibly Parallel Threads