----- Original Message -----
> From: "Ralf Karrenberg" <Chareos at gmx.de>
> To: llvmdev at cs.uiuc.edu
> Sent: Wednesday, October 17, 2012 2:13:08 AM
> Subject: Re: [LLVMdev] Loop vectorizer
>
> Hi everybody,
>
> On 10/17/12 12:32 AM, Hal Finkel wrote:
> >>> Do you have a plan for xforms to increase the amount of
> >>> vectorization?
> >>
> >> Yes. We will need to implement a predication phase and to design the
> >> interaction with other loop transformations. Also, this will have to
> >> work well with the cost model. We also need to think of a good way to
> >> detect early on if the transformations are likely to be effective,
> >> because we currently don't have a good way of undoing compiler
> >> transformations.
> >>
> >> I think that a simple if-converter will be a good place to start.
> >> What do you think?
> >
> > Quick comment: IIRC, Ralf Karrenberg has already implemented this
> > (as part of his WFV project:
> > https://github.com/karrenberg/wfv/tree/llvm_30). It might be
> > worthwhile to work on cleaning up his implementation instead of
> > starting from scratch.
> >
> > -Hal
>
> WFV [1] does indeed include phases that correspond to full control-flow
> to data-flow conversion (not just if-conversion, it can flatten all
> kinds of control flow including nested loops with multiple exits etc.).
>
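For readers who have not seen control-flow to data-flow conversion before, here
is a minimal sketch of the simplest case (a single if/else). It is an
illustration in Python, not code from WFV: the names scalar_kernel, select and
vector_kernel are made up for this example, and Python lists stand in for
vector registers.

```python
def scalar_kernel(x):
    # Original control flow: exactly one of the two paths runs per input.
    if x > 0:
        return x * 2
    else:
        return -x


def select(mask, a, b):
    # Per-lane blend: take a[i] where mask[i] is true, otherwise b[i].
    return [ai if m else bi for m, ai, bi in zip(mask, a, b)]


def vector_kernel(xs):
    # After conversion: the branch condition becomes a mask, both paths
    # are computed for all lanes, and a select merges the results.
    mask = [x > 0 for x in xs]
    then_val = [x * 2 for x in xs]
    else_val = [-x for x in xs]
    return select(mask, then_val, else_val)
```

Both paths are executed for every lane, which is why operations with side
effects (stores, unknown calls) need the extra guarding mentioned further down.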
> I am currently working on a full re-implementation of the WFV algorithm
> on top of the latest trunk.
> One part of it that is basically finished is an analysis pass that I
> call "vectorization analysis", which annotates a function (WFV works
> on entire functions) with metadata used during control-flow to data-flow
> conversion and instruction vectorization.
Is there a reason to use metadata here as opposed to just keeping state in the
analysis pass?
> To give you a broad idea, this includes information like:
> - uniform/varying operation
> - same/consecutive/random index vector (for load/store)
> - aligned/unaligned index vector (for load/store)
> - operations that can not be vectorized (marked as "split", e.g.
> non-vectorizable types etc.)
> - operations that need to be split and guarded (e.g. unknown calls,
> stores)
> - mandatory/optional blocks (renamed from "divergent"/"non-divergent"
> in [2])
> - divergent/non-divergent loops
Sounds great!
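To make the same/consecutive/random distinction concrete, here is a small
sketch of how I read it. It classifies the per-lane indices of one vector
memory access. This is only an illustration: the function name
classify_index_vector is hypothetical, and it inspects concrete index values,
whereas a real analysis would have to derive the same property statically from
the IR.

```python
def classify_index_vector(indices):
    """Classify the per-lane indices of one vectorized load/store."""
    # "same": every lane accesses the same element.
    if all(i == indices[0] for i in indices):
        return "same"
    # "consecutive": lane k accesses element base + k.
    if all(b - a == 1 for a, b in zip(indices, indices[1:])):
        return "consecutive"
    # "random": anything else, e.g. strided or data-dependent indices.
    return "random"
```

Presumably "same" allows a scalar access plus a broadcast, "consecutive" allows
a single vector load/store, and "random" needs a gather/scatter or a split into
scalar accesses.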
>
> Generally, it would be possible to implement a loop vectorizer on top of
> WFV simply by running a loop dependency analysis to determine if the
> loop in question is vectorizable, extracting the loop body into a
> function, running WFV on it, and inlining the call again.
I presume that we could refactor your code in combination with Nadav's work
to directly vectorize loop bodies as well. Do you disagree?
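As a rough picture of the extract/vectorize/inline scheme, here is a sketch in
Python. Everything in it is an assumption made for illustration: W is an
assumed vector width, body is the loop body extracted into a function,
body_wfv is a hand-written stand-in for the W-wide function that WFV would
generate, and the scalar remainder loop is my addition. It also assumes the
dependency analysis has already shown that the iterations are independent.

```python
W = 4  # assumed vector width


def body(i, a, b):
    # Loop body extracted into a function of the induction variable.
    return a[i] + b[i]


def body_wfv(i0, a, b):
    # Stand-in for the vectorized body: computes lanes i0 .. i0 + W - 1.
    return [a[i0 + lane] + b[i0 + lane] for lane in range(W)]


def vectorized_loop(a, b):
    n = len(a)
    out = [0] * n
    i = 0
    # Main loop: one call to the vectorized body per W iterations.
    while i + W <= n:
        out[i:i + W] = body_wfv(i, a, b)
        i += W
    # Remainder loop: leftover iterations run through the scalar body.
    for j in range(i, n):
        out[j] = body(j, a, b)
    return out
```

Vectorizing the loop body directly would skip the extraction and inlining
steps but keep the same main-loop/remainder structure.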
>
> I am willing to provide all of my implementation as soon as required.
> I hope to have mostly finished the rewrite at that point.
I encourage you to do this as soon as possible, otherwise I think that we might
miss the opportunity to take advantage of your work in current development.
Thanks again,
Hal
>
> Cheers,
> Ralf
>
>
> [1] "Whole-Function Vectorization", Karrenberg and Hack, CGO'11
> [2] "Improving Performance of OpenCL on CPUs", Karrenberg and Hack, CC'12
>
--
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory