thr3ads.net - llvm dev - [LLVMdev] Loop vectorizer [Oct 2012]

If this information is useful, please help other people find it:
Share via:

Hal Finkel

2012-Oct-16 22:32 UTC

[LLVMdev] Loop vectorizer

----- Original Message -----> Hi Preston!
> 
> >
> > I'd start by making a plan (a design!) with goals and stuff.
> > Publish it so we can see what you mean by "vectorization".
> 
> I will send a separate email later, but here is a quick overview. I
> see the vectorizer having four main components:
> 
> 1. Preparation passes: If-conversion, loop transformations, etc.
> 
> 2. Cost model - This unit decides on the best vectorization factor
> (could be 1).
> 
> 3. Legality check - This unit checks if it is *legal* (from a
> correctness point of view) to vectorize the program. This is target
> independent. Also, this unit needs to describe which transformation
> are needed to make this loop vectorizeable. For example: if-conversion
> is required if the control flow is not uniform for all iterations of
> the loop.
> 
> 4. Vectorization - This is where the actual widening of the
> instructions happen. Every time we improve #3 by detecting more
> vectorizeable loops, we need to add the mechanism for actually
> generating code for it. For example, we can detect reduction
> variables, but we also need to be able to generate code for them.
> 
> > I.e., do you mean for real vector machines, a la Cray, or for
> > the packed media instructions that are currently so popular?
> 
> I am mostly interested in generating code for the popular SIMD cpus
> (X86, ARM, PPC, etc).
> 
> > What's your idea for finding vectorizable loops?
> 
> I started with a very very basic check because I needed to start
> somewhere. We will need to design a mode advanced check.
> 
> > Do you have a plan for xforms to increase the amount of
> > vectorization?
> 
> Yes. We will need to implement a predication phase and to design the
> interaction with other loop transformations. Also, this will have to
> work well with the cost model. We also need to think of a good way to
> detect early on if the transformations are likely to be effective,
> because we currently don't have a good way of undoing compiler
> transformations.
> 
> I think that a simple if-converter will be a good place to start. What
> do you think ?
Quick comment: IIRC, Ralf Karrenberg has already implemented this (as part of
his WVF project: https://github.com/karrenberg/wfv/tree/llvm_30). It might be
worthwhile to work on cleaning up his implementation instead of starting from
scratch.

 -Hal
> 
> > I've written a dependence analyzer that ought to be suitable
> > (if it isn't, we should fix it).
> 
> I know :) Yes, this is really good. Any serious loop transformation
> and multi-dimentional array vectorization will need this. I am glad
> that you are working on this and I do want to use it.
> 
> In the near future, I plan to work on the code generation cost-model
> aspects.
> 
> The vectorization legality check that I wrote is very simple and very
> conservative. It will be great if you could contribute in that area.
> 
> Thanks,
> Nadav
> 
> >
> > Thanks,
> > Preston
> 
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory

Rao, Shivarama

2012-Oct-17 02:06 UTC

head link

[LLVMdev] Loop vectorizer

Hi Nadav,

Do you have any small write-up of current design of loop vectorizer?. If so, can
you please send it across?. I want to see if there are dependencies such as
unrolling for the vectorization.

In the design we may also have to consider BB vectorizer and loop vectorizer
working well together with no ambiguous requirements/dependencies.

Regards,
Shivaram


-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On
Behalf Of Hal Finkel
Sent: Wednesday, October 17, 2012 4:02 AM
To: Nadav Rotem
Cc: preston briggs; llvmdev at cs.uiuc.edu List
Subject: Re: [LLVMdev] Loop vectorizer



----- Original Message -----> Hi Preston!
> 
> >
> > I'd start by making a plan (a design!) with goals and stuff.
> > Publish it so we can see what you mean by "vectorization".
> 
> I will send a separate email later, but here is a quick overview. I
> see the vectorizer having four main components:
> 
> 1. Preparation passes: If-conversion, loop transformations, etc.
> 
> 2. Cost model - This unit decides on the best vectorization factor
> (could be 1).
> 
> 3. Legality check - This unit checks if it is *legal* (from a
> correctness point of view) to vectorize the program. This is target
> independent. Also, this unit needs to describe which transformation
> are needed to make this loop vectorizeable. For example: if-conversion
> is required if the control flow is not uniform for all iterations of
> the loop.
> 
> 4. Vectorization - This is where the actual widening of the
> instructions happen. Every time we improve #3 by detecting more
> vectorizeable loops, we need to add the mechanism for actually
> generating code for it. For example, we can detect reduction
> variables, but we also need to be able to generate code for them.
> 
> > I.e., do you mean for real vector machines, a la Cray, or for
> > the packed media instructions that are currently so popular?
> 
> I am mostly interested in generating code for the popular SIMD cpus
> (X86, ARM, PPC, etc).
> 
> > What's your idea for finding vectorizable loops?
> 
> I started with a very very basic check because I needed to start
> somewhere. We will need to design a mode advanced check.
> 
> > Do you have a plan for xforms to increase the amount of
> > vectorization?
> 
> Yes. We will need to implement a predication phase and to design the
> interaction with other loop transformations. Also, this will have to
> work well with the cost model. We also need to think of a good way to
> detect early on if the transformations are likely to be effective,
> because we currently don't have a good way of undoing compiler
> transformations.
> 
> I think that a simple if-converter will be a good place to start. What
> do you think ?
Quick comment: IIRC, Ralf Karrenberg has already implemented this (as part of
his WVF project: https://github.com/karrenberg/wfv/tree/llvm_30). It might be
worthwhile to work on cleaning up his implementation instead of starting from
scratch.

 -Hal
> 
> > I've written a dependence analyzer that ought to be suitable
> > (if it isn't, we should fix it).
> 
> I know :) Yes, this is really good. Any serious loop transformation
> and multi-dimentional array vectorization will need this. I am glad
> that you are working on this and I do want to use it.
> 
> In the near future, I plan to work on the code generation cost-model
> aspects.
> 
> The vectorization legality check that I wrote is very simple and very
> conservative. It will be great if you could contribute in that area.
> 
> Thanks,
> Nadav
> 
> >
> > Thanks,
> > Preston
> 
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory
_______________________________________________
LLVM Developers mailing list
LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Rao, Shivarama

2012-Oct-17 03:12 UTC

head link

[LLVMdev] Loop vectorizer

Hi Nadav,

   I went through the patch and understood the design and implementation.
It's well documented :)

Thanks,
Shivaram
 

-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On
Behalf Of Rao, Shivarama
Sent: Wednesday, October 17, 2012 7:36 AM
To: Hal Finkel; Nadav Rotem
Cc: preston briggs; llvmdev at cs.uiuc.edu List
Subject: Re: [LLVMdev] Loop vectorizer

Hi Nadav,

Do you have any small write-up of current design of loop vectorizer?. If so, can
you please send it across?. I want to see if there are dependencies such as
unrolling for the vectorization.

In the design we may also have to consider BB vectorizer and loop vectorizer
working well together with no ambiguous requirements/dependencies.

Regards,
Shivaram


-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On
Behalf Of Hal Finkel
Sent: Wednesday, October 17, 2012 4:02 AM
To: Nadav Rotem
Cc: preston briggs; llvmdev at cs.uiuc.edu List
Subject: Re: [LLVMdev] Loop vectorizer



----- Original Message -----> Hi Preston!
> 
> >
> > I'd start by making a plan (a design!) with goals and stuff.
> > Publish it so we can see what you mean by "vectorization".
> 
> I will send a separate email later, but here is a quick overview. I
> see the vectorizer having four main components:
> 
> 1. Preparation passes: If-conversion, loop transformations, etc.
> 
> 2. Cost model - This unit decides on the best vectorization factor
> (could be 1).
> 
> 3. Legality check - This unit checks if it is *legal* (from a
> correctness point of view) to vectorize the program. This is target
> independent. Also, this unit needs to describe which transformation
> are needed to make this loop vectorizeable. For example: if-conversion
> is required if the control flow is not uniform for all iterations of
> the loop.
> 
> 4. Vectorization - This is where the actual widening of the
> instructions happen. Every time we improve #3 by detecting more
> vectorizeable loops, we need to add the mechanism for actually
> generating code for it. For example, we can detect reduction
> variables, but we also need to be able to generate code for them.
> 
> > I.e., do you mean for real vector machines, a la Cray, or for
> > the packed media instructions that are currently so popular?
> 
> I am mostly interested in generating code for the popular SIMD cpus
> (X86, ARM, PPC, etc).
> 
> > What's your idea for finding vectorizable loops?
> 
> I started with a very very basic check because I needed to start
> somewhere. We will need to design a mode advanced check.
> 
> > Do you have a plan for xforms to increase the amount of
> > vectorization?
> 
> Yes. We will need to implement a predication phase and to design the
> interaction with other loop transformations. Also, this will have to
> work well with the cost model. We also need to think of a good way to
> detect early on if the transformations are likely to be effective,
> because we currently don't have a good way of undoing compiler
> transformations.
> 
> I think that a simple if-converter will be a good place to start. What
> do you think ?
Quick comment: IIRC, Ralf Karrenberg has already implemented this (as part of
his WVF project: https://github.com/karrenberg/wfv/tree/llvm_30). It might be
worthwhile to work on cleaning up his implementation instead of starting from
scratch.

 -Hal
> 
> > I've written a dependence analyzer that ought to be suitable
> > (if it isn't, we should fix it).
> 
> I know :) Yes, this is really good. Any serious loop transformation
> and multi-dimentional array vectorization will need this. I am glad
> that you are working on this and I do want to use it.
> 
> In the near future, I plan to work on the code generation cost-model
> aspects.
> 
> The vectorization legality check that I wrote is very simple and very
> conservative. It will be great if you could contribute in that area.
> 
> Thanks,
> Nadav
> 
> >
> > Thanks,
> > Preston
> 
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory
_______________________________________________
LLVM Developers mailing list
LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev



_______________________________________________
LLVM Developers mailing list
LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Ralf Karrenberg

2012-Oct-17 07:13 UTC

head link

[LLVMdev] Loop vectorizer

Hi everybody,

On 10/17/12 12:32 AM, Hal Finkel wrote:>>> Do you have a plan for xforms to increase the amount of
>>> vectorization?
>>
>> Yes. We will need to implement a predication phase and to design the
>> interaction with other loop transformations. Also, this will have to
>> work well with the cost model. We also need to think of a good way to
>> detect early on if the transformations are likely to be effective,
>> because we currently don't have a good way of undoing compiler
>> transformations.
>>
>> I think that a simple if-converter will be a good place to start. What
>> do you think ?
>
> Quick comment: IIRC, Ralf Karrenberg has already implemented this (as part
of his WVF project: https://github.com/karrenberg/wfv/tree/llvm_30). It might be
worthwhile to work on cleaning up his implementation instead of starting from
scratch.
>
>   -Hal
WFV [1] does indeed include phases that correspond to full control-flow 
to data-flow conversion (not just if-conversion, it can flatten all 
kinds of control flow including nested loops with multiple exits etc.).

I am currently working on a full re-implementation of the WFV algorithm 
on top of the latest trunk.
One part of it that is basically finished is an analysis pass that I 
call "vectorization analysis", which annotates a function (WFV works
on
entire functions) with metadata used during control-flow to data-flow 
conversion and instruction vectorization.
To give you a broad idea, this includes information like:
- uniform/varying operation
- same/consecutive/random index vector (for load/store)
- aligned/unaligned index vector (for load/store)
- operations that can not be vectorized (marked as "split", e.g. 
non-vectorizable types etc.)
- operations that need to be split and guarded (e.g. unknown calls, stores)
- mandatory/optional blocks (renamed from
"divergent"/"non-divergent" in
[2])
- divergent/non-divergent loops

Generally, it would be possible to implement a loop vectorizer on top of 
WFV simply by running a loop dependency analysis to determine if the 
loop in question is vectorizable, extracting the loop body into a 
function, running WFV on it, and inlining the call again.

I am willing to provide all of my implementation as soon as required.
I hope to have mostly finished the rewrite at that point.

Cheers,
Ralf


[1] "Whole-Function Vectorization", Karrenberg and Hack, CGO'11
[2] "Improving Performance of OpenCL on CPUs", Karrenberg and Hack,
CC'12

Hal Finkel

2012-Oct-17 07:43 UTC

head link

[LLVMdev] Loop vectorizer

----- Original Message -----> From: "Ralf Karrenberg" <Chareos at gmx.de>
> To: llvmdev at cs.uiuc.edu
> Sent: Wednesday, October 17, 2012 2:13:08 AM
> Subject: Re: [LLVMdev] Loop vectorizer
> 
> Hi everybody,
> 
> On 10/17/12 12:32 AM, Hal Finkel wrote:
> >>> Do you have a plan for xforms to increase the amount of
> >>> vectorization?
> >>
> >> Yes. We will need to implement a predication phase and to design
> >> the
> >> interaction with other loop transformations. Also, this will have
> >> to
> >> work well with the cost model. We also need to think of a good way
> >> to
> >> detect early on if the transformations are likely to be effective,
> >> because we currently don't have a good way of undoing compiler
> >> transformations.
> >>
> >> I think that a simple if-converter will be a good place to start.
> >> What
> >> do you think ?
> >
> > Quick comment: IIRC, Ralf Karrenberg has already implemented this
> > (as part of his WVF project:
> > https://github.com/karrenberg/wfv/tree/llvm_30). It might be
> > worthwhile to work on cleaning up his implementation instead of
> > starting from scratch.
> >
> >   -Hal
> 
> WFV [1] does indeed include phases that correspond to full
> control-flow
> to data-flow conversion (not just if-conversion, it can flatten all
> kinds of control flow including nested loops with multiple exits
> etc.).
> 
> I am currently working on a full re-implementation of the WFV
> algorithm
> on top of the latest trunk.
> One part of it that is basically finished is an analysis pass that I
> call "vectorization analysis", which annotates a function (WFV
works
> on
> entire functions) with metadata used during control-flow to data-flow
> conversion and instruction vectorization.
Is there a reason to use metadata here as opposed to just keeping state in the
analysis pass?
> To give you a broad idea, this includes information like:
> - uniform/varying operation
> - same/consecutive/random index vector (for load/store)
> - aligned/unaligned index vector (for load/store)
> - operations that can not be vectorized (marked as "split", e.g.
> non-vectorizable types etc.)
> - operations that need to be split and guarded (e.g. unknown calls,
> stores)
> - mandatory/optional blocks (renamed from
"divergent"/"non-divergent"
> in
> [2])
> - divergent/non-divergent loops
Sounds great!
> 
> Generally, it would be possible to implement a loop vectorizer on top
> of
> WFV simply by running a loop dependency analysis to determine if the
> loop in question is vectorizable, extracting the loop body into a
> function, running WFV on it, and inlining the call again.
I presume that we could refactor your code in combination with Nadav's work
to directly vectorize loop bodies as well. Do you disagree?
> 
> I am willing to provide all of my implementation as soon as required.
> I hope to have mostly finished the rewrite at that point.
I encourage you to do this as soon as possible, otherwise I think that we might
miss the opportunity to take advantage of your work in current development.

Thanks again,
Hal
> 
> Cheers,
> Ralf
> 
> 
> [1] "Whole-Function Vectorization", Karrenberg and Hack,
CGO'11
> [2] "Improving Performance of OpenCL on CPUs", Karrenberg and
Hack,
> CC'12
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> 
-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory

Maybe Matching Threads

Search for more maybe matching threads

llvm dev - Oct 2012 - [LLVMdev] Loop vectorizer

[LLVMdev] Loop vectorizer

[LLVMdev] Loop vectorizer

[LLVMdev] Loop vectorizer

[LLVMdev] Loop vectorizer

[LLVMdev] Loop vectorizer

Maybe Matching Threads