thr3ads.net - llvm dev - [LLVMdev] Loop vectorizer [Oct 2012]

Hal Finkel

2012-Oct-17 07:43 UTC

[LLVMdev] Loop vectorizer

----- Original Message -----
> From: "Ralf Karrenberg" <Chareos at gmx.de>
> To: llvmdev at cs.uiuc.edu
> Sent: Wednesday, October 17, 2012 2:13:08 AM
> Subject: Re: [LLVMdev] Loop vectorizer
> 
> Hi everybody,
> 
> On 10/17/12 12:32 AM, Hal Finkel wrote:
> >>> Do you have a plan for xforms to increase the amount of
> >>> vectorization?
> >>
> >> Yes. We will need to implement a predication phase and to design the
> >> interaction with other loop transformations. Also, this will have to
> >> work well with the cost model. We also need to think of a good way to
> >> detect early on if the transformations are likely to be effective,
> >> because we currently don't have a good way of undoing compiler
> >> transformations.
> >>
> >> I think that a simple if-converter will be a good place to start.
> >> What do you think?
> >
> > Quick comment: IIRC, Ralf Karrenberg has already implemented this
> > (as part of his WFV project:
> > https://github.com/karrenberg/wfv/tree/llvm_30). It might be
> > worthwhile to work on cleaning up his implementation instead of
> > starting from scratch.
> >
> >   -Hal
> 
> WFV [1] does indeed include phases that correspond to full control-flow
> to data-flow conversion (not just if-conversion, it can flatten all
> kinds of control flow including nested loops with multiple exits etc.).
>
> I am currently working on a full re-implementation of the WFV algorithm
> on top of the latest trunk.
> One part of it that is basically finished is an analysis pass that I
> call "vectorization analysis", which annotates a function (WFV works
> on entire functions) with metadata used during control-flow to
> data-flow conversion and instruction vectorization.

Is there a reason to use metadata here as opposed to just keeping state in the
analysis pass?

> To give you a broad idea, this includes information like:
> - uniform/varying operation
> - same/consecutive/random index vector (for load/store)
> - aligned/unaligned index vector (for load/store)
> - operations that can not be vectorized (marked as "split", e.g.
>   non-vectorizable types etc.)
> - operations that need to be split and guarded (e.g. unknown calls,
>   stores)
> - mandatory/optional blocks (renamed from "divergent"/"non-divergent"
>   in [2])
> - divergent/non-divergent loops

Sounds great!
> 
> Generally, it would be possible to implement a loop vectorizer on top of
> WFV simply by running a loop dependency analysis to determine if the
> loop in question is vectorizable, extracting the loop body into a
> function, running WFV on it, and inlining the call again.

I presume that we could refactor your code in combination with Nadav's work
to directly vectorize loop bodies as well. Do you disagree?

> I am willing to provide all of my implementation as soon as required.
> I hope to have mostly finished the rewrite at that point.

I encourage you to do this as soon as possible, otherwise I think that we might
miss the opportunity to take advantage of your work in current development.

Thanks again,
Hal
> 
> Cheers,
> Ralf
> 
> 
> [1] "Whole-Function Vectorization", Karrenberg and Hack, CGO'11
> [2] "Improving Performance of OpenCL on CPUs", Karrenberg and Hack,
>     CC'12
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> 
-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory
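
For readers unfamiliar with the if-conversion step mentioned in the quoted discussion, here is a minimal sketch in plain C++ rather than LLVM IR. It assumes a hypothetical loop body with a single data-dependent branch; the names `scaleBranchy`, `scaleIfConverted`, and `checkEquivalent` are invented for illustration and do not come from WFV or LLVM. The second version computes both sides of the branch and picks one with a select, which is the branch-free form that can be mapped onto vector lanes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Original loop body with control flow: the branch depends on the data.
std::vector<int> scaleBranchy(const std::vector<int> &in) {
  std::vector<int> out(in.size());
  for (std::size_t i = 0; i < in.size(); ++i) {
    if (in[i] > 0)
      out[i] = in[i] * 2;
    else
      out[i] = -in[i];
  }
  return out;
}

// If-converted form: both sides are computed unconditionally and a
// per-element mask selects which result is kept.
std::vector<int> scaleIfConverted(const std::vector<int> &in) {
  std::vector<int> out(in.size());
  for (std::size_t i = 0; i < in.size(); ++i) {
    const bool mask = in[i] > 0;    // former branch condition
    const int thenVal = in[i] * 2;  // former "then" block
    const int elseVal = -in[i];     // former "else" block
    out[i] = mask ? thenVal : elseVal;
  }
  return out;
}

// Sanity check: both forms must agree on the given input.
void checkEquivalent(const std::vector<int> &in) {
  assert(scaleBranchy(in) == scaleIfConverted(in));
}
```

Executing both sides is only safe here because neither side has side effects. Operations such as stores or unknown calls would need to be guarded instead, which matches the "split and guarded" category in Ralf's list.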

Ralf Karrenberg

2012-Oct-18 09:48 UTC

[LLVMdev] Loop vectorizer

Hi Hal,

On 10/17/12 9:43 AM, Hal Finkel wrote:
>> I am currently working on a full re-implementation of the WFV algorithm
>> on top of the latest trunk.
>> One part of it that is basically finished is an analysis pass that I
>> call "vectorization analysis", which annotates a function (WFV works
>> on entire functions) with metadata used during control-flow to
>> data-flow conversion and instruction vectorization.
>
> Is there a reason to use metadata here as opposed to just keeping state in
> the analysis pass?

Yes, two practical ones:
1) I don't need to maintain an additional instruction->properties
mapping and I don't need a map lookup every time I want to check if a
block/instruction has a certain property (which happens quite often).
2) For debugging purposes, I don't need to write my own
AssemblyAnnotationWriter; all information is directly visible in the IR.

However, other approaches may of course be viable. My last
implementation kept its own state, but the metadata approach feels a lot
more convenient (even though my block and argument metadata patch was
refused ;) ).

>> Generally, it would be possible to implement a loop vectorizer on top of
>> WFV simply by running a loop dependency analysis to determine if the
>> loop in question is vectorizable, extracting the loop body into a
>> function, running WFV on it, and inlining the call again.
>
> I presume that we could refactor your code in combination with Nadav's
> work to directly vectorize loop bodies as well. Do you disagree?

I only meant to describe one way of using the code. Due to its
complexity, I think that moving forward step by step is the right
approach to include all that functionality in LLVM.

>> I am willing to provide all of my implementation as soon as required.
>> I hope to have mostly finished the rewrite at that point.
>
> I encourage you to do this as soon as possible, otherwise I think that we
> might miss the opportunity to take advantage of your work in current
> development.

I'll do my best :).

Cheers,
Ralf
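
The usage described in the message above (check that the loop is vectorizable, extract the loop body into a function, vectorize that function, inline the call again) can be pictured with a toy example in plain C++. This is only a sketch under stated assumptions: the width `W = 4` and the names `bodyScalar`, `bodyW`, and `runLoop` are hypothetical, the W-wide body is written by hand rather than produced by WFV, and the dependency analysis is assumed to have succeeded because each iteration touches only its own element.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t W = 4; // assumed vectorization width (hypothetical)

// Step 1: the loop body, extracted into a function of one element.
float bodyScalar(float x) { return x * x + 1.0f; }

// Step 2: a W-wide version of the same function. In the scheme from the
// thread this would be generated by the vectorizer; here it is hand-written.
std::array<float, W> bodyW(const std::array<float, W> &x) {
  std::array<float, W> r{};
  for (std::size_t lane = 0; lane < W; ++lane)
    r[lane] = x[lane] * x[lane] + 1.0f;
  return r;
}

// Step 3: the rewritten loop calls the wide body on full chunks and falls
// back to the scalar body for the remaining iterations.
std::vector<float> runLoop(const std::vector<float> &in) {
  std::vector<float> out(in.size());
  std::size_t i = 0;
  for (; i + W <= in.size(); i += W) {
    std::array<float, W> chunk{};
    for (std::size_t lane = 0; lane < W; ++lane)
      chunk[lane] = in[i + lane];
    const std::array<float, W> res = bodyW(chunk);
    for (std::size_t lane = 0; lane < W; ++lane)
      out[i + lane] = res[lane];
  }
  for (; i < in.size(); ++i) // scalar remainder loop
    out[i] = bodyScalar(in[i]);
  assert(i == in.size()); // every element was processed exactly once
  return out;
}
```

Calling `runLoop` on six elements processes the first four through `bodyW` and the last two through `bodyScalar`; after inlining, the call overhead of step 3 would disappear.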

Duncan Sands

2012-Oct-18 10:15 UTC

[LLVMdev] Loop vectorizer

Hi Ralf,
> On 10/17/12 9:43 AM, Hal Finkel wrote:
>>> I am currently working on a full re-implementation of the WFV algorithm
>>> on top of the latest trunk.
>>> One part of it that is basically finished is an analysis pass that I
>>> call "vectorization analysis", which annotates a function (WFV works
>>> on entire functions) with metadata used during control-flow to
>>> data-flow conversion and instruction vectorization.
>>
>> Is there a reason to use metadata here as opposed to just keeping state
>> in the analysis pass?
>
> Yes, two practical ones:
> 1) I don't need to maintain an additional instruction->properties mapping
> and I don't need a map lookup every time I want to check if a
> block/instruction has a certain property (which happens quite often).
> 2) For debugging purposes, I don't need to write my own
> AssemblyAnnotationWriter but all information is directly visible in the IR.

I know it is tempting to store information in the IR but I'm not sure you
realize how expensive it is.  Your (1) makes me laugh, since adding and querying
metadata is going to be many times more expensive than a map lookup; and as for
(2) that's why analysis passes have a method for printing out the info they
contain.

Ciao, Duncan.
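
The alternative raised in this reply, keeping the results inside the analysis object and giving it a print method, might look roughly like the toy model below. It deliberately avoids LLVM headers: `Instr`, `VecProps`, and `VectorizationInfo` are hypothetical stand-ins, not LLVM or WFV types, and the sketch makes no claim about the relative cost of either approach.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for an IR instruction.
struct Instr {
  std::string name;
};

// Two of the properties listed earlier in the thread.
struct VecProps {
  bool uniform = false;     // same value in every lane
  bool consecutive = false; // consecutive index vector (load/store)
};

// Analysis results kept in a side table keyed by instruction address,
// instead of being attached to the IR.
class VectorizationInfo {
public:
  void set(const Instr *i, VecProps p) { props_[i] = p; }

  // Unknown instructions get default (all false) properties.
  VecProps get(const Instr *i) const {
    assert(i != nullptr);
    auto it = props_.find(i);
    return it == props_.end() ? VecProps{} : it->second;
  }

  // Debug dump of one instruction's properties.
  void print(std::ostream &os, const Instr *i) const {
    const VecProps p = get(i);
    os << i->name << ": " << (p.uniform ? "uniform" : "varying");
    if (p.consecutive)
      os << ", consecutive";
    os << '\n';
  }

  // Convenience wrapper around print() for tests and logging.
  std::string str(const Instr *i) const {
    std::ostringstream os;
    print(os, i);
    return os.str();
  }

private:
  std::unordered_map<const Instr *, VecProps> props_;
};
```

Each property query here costs one hash lookup, which is the overhead Ralf's point (1) wanted to avoid and which Duncan argues is smaller than the cost of metadata.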
