thr3ads.net - llvm dev - [LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass [Nov 2011]

Hal Finkel

2011-Nov-11 23:11 UTC

[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass

On Fri, 2011-11-11 at 23:55 +0100, Tobias Grosser wrote:
> On 11/11/2011 11:36 PM, Hal Finkel wrote:
> > On Thu, 2011-11-10 at 23:07 +0100, Tobias Grosser wrote:
> >> On 11/08/2011 11:29 PM, Hal Finkel wrote:
> >> Talking about this I looked again into ScalarEvolution.
> >>
> >> To analyze a load, you would do:
> >>
> >> LoadInst *Load = ...
> >> Value *Pointer = Load->getPointer();
> >> const SCEV *PointerSCEV = SE->getSCEV(Pointer);
> >> const SCEVUnknown *PointerBase =
> >>     dyn_cast<SCEVUnknown>(SE->getPointerBase(PointerSCEV));
> >>
> >> if (!PointerBase)
> >>     return 'Analysis failed'
> >>
> >> const Value *BaseValue = PointerBase->getValue();
> >>
> >> You get the offset between two load addresses with SE->getMinusSCEV().
> >> The size of an element is SE->getSizeOfExpr().
> >>
> >
> > The AliasAnalysis class has a set of interfaces that can be used to
> > preserve the analysis even when some things are changed. Does
> > ScalarEvolution have a similar capability?
> 
> You can state that your pass preserves ScalarEvolution. In this case all 
> analysis results are by default preserved and it is your job to 
> invalidate the scalar evolution for the loops/values where it needs to
> be recalculated.
> 
> The relevant functions are
> 
> ScalarEvolution::forgetValue(Value *)
> ScalarEvolution::forgetLoop(Loop *)

Since the vectorization pass is currently just a basic-block pass, I
think that I should need only forgetValue, right? I suppose that I would
call that on all of the values that are fused.

Also, using getPointerBase to get the base pointer seems simple enough,
but how should I use getMinusSCEV to get the offset? Should I call it on
each load pointer and its base pointer, or between the two load pointers
once I know that they share the same base? And once I do that, how do I
get the offset (if known)? I see the get[Uns|S]ignedRange functions, but
if there is a way to directly get a constant value, then that would be
more straightforward.

Thanks again,
Hal

> Cheers
> Tobi
-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory

Tobias Grosser

2011-Nov-11 23:37 UTC

On 11/12/2011 12:11 AM, Hal Finkel wrote:
> On Fri, 2011-11-11 at 23:55 +0100, Tobias Grosser wrote:
>> On 11/11/2011 11:36 PM, Hal Finkel wrote:
>>> On Thu, 2011-11-10 at 23:07 +0100, Tobias Grosser wrote:
>>>> On 11/08/2011 11:29 PM, Hal Finkel wrote:
>>>> Talking about this I looked again into ScalarEvolution.
>>>>
>>>> To analyze a load, you would do:
>>>>
>>>> LoadInst *Load = ...
>>>> Value *Pointer = Load->getPointer();
>>>> const SCEV *PointerSCEV = SE->getSCEV(Pointer);
>>>> const SCEVUnknown *PointerBase =
>>>>     dyn_cast<SCEVUnknown>(SE->getPointerBase(PointerSCEV));
>>>>
>>>> if (!PointerBase)
>>>>     return 'Analysis failed'
>>>>
>>>> const Value *BaseValue = PointerBase->getValue();
>>>>
>>>> You get the offset between two load addresses with
>>>> SE->getMinusSCEV(). The size of an element is
>>>> SE->getSizeOfExpr().
>>>>
>>>
>>> The AliasAnalysis class has a set of interfaces that can be used
>>> to preserve the analysis even when some things are changed. Does
>>> ScalarEvolution have a similar capability?
>>
>> You can state that your pass preserves ScalarEvolution. In this
>> case all analysis results are by default preserved and it is your
>> job to invalidate the scalar evolution for the loops/values where
>> it needs to be recalculated.
>>
>> The relevant functions are
>>
>> ScalarEvolution::forgetValue(Value *)
>
> Since the vectorization pass is currently just a basic-block pass, I
> think that I should need only forgetValue, right? I suppose that I
> would call that on all of the values that are fused.

You call it on all the values/instructions that are removed and on
those whose computed result has changed.
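
To illustrate the "preserved by default, invalidated by hand" contract,
here is a minimal toy model in plain C++. It does not use LLVM:
ToyAnalysisCache, get, forget and the integer value ids are all
hypothetical names made up for this sketch. It only shows why a stale
result survives until the pass explicitly forgets the value.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Toy stand-in for a preserved analysis: results are cached per value and
// stay in the cache until the transformation explicitly forgets them.
// This models the idea only; it does not use any LLVM class.
class ToyAnalysisCache {
  std::unordered_map<int, std::string> Cache; // value id -> cached result
  int Recomputations = 0;

public:
  // Return the cached result for ValueId, computing it on first use.
  // CurrentExpr is what a fresh analysis of the value would produce now.
  std::string get(int ValueId, const std::string &CurrentExpr) {
    assert(ValueId >= 0 && "value ids are non-negative in this toy");
    auto It = Cache.find(ValueId);
    if (It == Cache.end()) {
      ++Recomputations;
      It = Cache.emplace(ValueId, CurrentExpr).first;
    }
    return It->second;
  }

  // Counterpart of forgetValue: drop the entry so the next query recomputes.
  void forget(int ValueId) { Cache.erase(ValueId); }

  int recomputations() const { return Recomputations; }
};
```

If a value is rewritten and forget is not called, get keeps returning
the old expression; after forget, the next query sees the new one.
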

> Also, using getPointerBase to get the base pointer seems simple
> enough, but how should I use getMinusSCEV to get the offset.

This should give you the offset from the base pointer:

LoadInst *Load = ...
Value *Pointer = Load->getPointerOperand();
const SCEV *PointerSCEV = SE->getSCEV(Pointer);
const SCEVUnknown *PointerBase =
    dyn_cast<SCEVUnknown>(SE->getPointerBase(PointerSCEV));

if (!PointerBase)
   return 'Analysis failed'

// getMinusSCEV takes two SCEVs, so pass PointerSCEV rather than the Value.
const SCEV *OffsetFromBase = SE->getMinusSCEV(PointerSCEV, PointerBase);

> Should I call it on each load pointer and its base pointer, or
> between the two load pointers once I know that they share the same
> base.

That depends on what you want.

> And once I do that, how do I get the offset (if known). I see the
> get[Uns|S]ignedRange functions, but if there is a way to directly get
> a constant value, then that would be more straightforward.

I assume you want to know if two load addresses are either identical,
have stride one (an offset of +1), or have some more complicated
relationship.

What might work is the following (entirely untested):

LoadInst *LoadOne = ...
LoadInst *LoadTwo = ...

Value *PointerOne = LoadOne->getPointerOperand();
Value *PointerTwo = LoadTwo->getPointerOperand();

const SCEV *PointerOneSCEV = SE->getSCEV(PointerOne);
const SCEV *PointerTwoSCEV = SE->getSCEV(PointerTwo);

// If this is a trivial offset we get something like 1*sizeof(long)
const SCEV *Offset = SE->getMinusSCEV(PointerOneSCEV, PointerTwoSCEV);

// Now we divide it by the element size. The type of the loaded value is
// the element type.
Type *AllocTy = LoadOne->getType();
const SCEV *TypeOfSCEV = SE->getSizeOfExpr(AllocTy);
const SCEV *OffsetInElements = SE->getUDivExpr(Offset, TypeOfSCEV);

if (const SCEVConstant *IntOffsetSCEV
	= dyn_cast<SCEVConstant>(OffsetInElements)) {
   ConstantInt *IntOffset = IntOffsetSCEV->getValue();
   return IntOffset;
} else {
   return "This seems to be a complicated offset";
}
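
The division step can be modelled with plain integers, which also makes
one caveat visible: getUDivExpr is an unsigned division, so if the first
address lies below the second, the negative byte difference would not
come out as a small negative element count. The sketch below assumes a
signed treatment is what is wanted. offsetInElements and isConsecutive
are hypothetical helpers, not LLVM functions, and plain int64_t values
stand in for constant SCEVs.

```cpp
#include <cassert>
#include <cstdint>

// Convert a constant byte distance between two addresses into a distance
// in elements. Returns false when the distance is not a whole number of
// elements; on success the result is written to Elements.
bool offsetInElements(int64_t ByteOffset, int64_t ElemSize,
                      int64_t &Elements) {
  assert(ElemSize > 0 && "element size must be positive");
  if (ByteOffset % ElemSize != 0)
    return false;
  // Signed division keeps the direction: -1 means "one element before".
  Elements = ByteOffset / ElemSize;
  return true;
}

// True when the second access directly follows the first in memory.
// ByteOffsetTwoMinusOne is (address of second) - (address of first).
bool isConsecutive(int64_t ByteOffsetTwoMinusOne, int64_t ElemSize) {
  int64_t N = 0;
  return offsetInElements(ByteOffsetTwoMinusOne, ElemSize, N) && N == 1;
}
```

With 8-byte elements, a byte difference of 8 counts as consecutive,
while 0 (identical addresses), -8 (reverse order) and 4 (not a whole
element) do not.
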

Let me know if this or something similar worked.
Tobi

Hal Finkel

2011-Nov-15 23:38 UTC

Tobias,

I've attached the latest version of my autovectorization patch. I was
able to add support for using the ScalarEvolution analysis for
load/store pairing (thanks for your help!). This led to a modest
performance increase and a modest compile-time increase. This version
also has a cutoff as you suggested (although the default value is set
high (4000 instructions between pairs) because setting it lower led to
performance regressions in both run time and compile time).
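
As a rough illustration of what such a cutoff does, the sketch below
enumerates candidate pairs in one block and stops scanning once two
instructions are more than MaxDistance positions apart. This is an
assumption about the general shape of the search, not the code from the
attached patch; candidatePairs and IsPairable are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Enumerate candidate pairs (I, J) of pairable instructions in one block,
// skipping any pair separated by more than MaxDistance positions.
// IsPairable[K] says whether the instruction at position K is a candidate.
std::vector<std::pair<std::size_t, std::size_t>>
candidatePairs(const std::vector<bool> &IsPairable, std::size_t MaxDistance) {
  std::vector<std::pair<std::size_t, std::size_t>> Pairs;
  for (std::size_t I = 0; I < IsPairable.size(); ++I) {
    if (!IsPairable[I])
      continue;
    // The cutoff bounds the inner scan, so the work per instruction is
    // proportional to MaxDistance rather than to the block size.
    for (std::size_t J = I + 1;
         J < IsPairable.size() && J - I <= MaxDistance; ++J)
      if (IsPairable[J])
        Pairs.emplace_back(I, J);
  }
  return Pairs;
}
```

A large limit keeps more distant pairs available at the price of a
longer scan; a small one shortens the scan but can miss pairs.
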

At this point there seem to be no unexpected test-suite compile
failures. Some of the tests, however, do seem to compute the wrong
output. I've yet to determine whether this is due to bad instruction
fusing or some error in a later stage. I'll try running the failing
tests in the interpreter and/or on another platform.

Thanks again,
Hal

-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory
-------------- next part --------------
A non-text attachment was scrubbed...
Name: llvm_bb_vectorize-20111115-2.diff
Type: text/x-patch
Size: 81446 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20111115/7d99b311/attachment.bin>
