thr3ads.net - llvm dev - [LLVMdev] Vectorization metadata [Apr 2012]

Renato Golin

2012-Apr-18 16:30 UTC

[LLVMdev] Vectorization metadata

Hal,

I'm opening a new discussion on vectorization metadata, since it has
little to do with fp-math. ;)

What kind of metadata would you annotate in the instructions? If I
remember from your talk, you're not doing any loop or whole-function
analysis, possibly leaving it for Polly to help you along the way.

I remember discussing it with Tobias that Polly could have three main steps:

1. Early analysis and annotation: a step that wouldn't modify code,
but extensively annotate (with metadata), so that itself, and other
passes like yours, could benefit from the polyhedral model.

2. Full polyhedral code modification: use the annotation of the
previous pass to extensively modify code. This is what Polly does
today, but the result of the analysis is not benefiting anyone except
for Polly.

This step can be fused with step 1 for performance reasons, but it would
be good to be able to run only the analysis part for the benefit of the
annotation, without the heavy modifications. This will be fundamental
for independently testing vectorization passes that depend on Polly's
metadata.

3. Code generation steps. As you said in your talk, and we discussed
in the fp-math thread, some code-generation steps could be aware of
the optimizations done via the metadata that was left in it.

That will require some guarantees on metadata semantics and
persistence that are not available today... Anyway, I'm not sure any
metadata-hardening will be very well accepted... ;)

-- 
cheers,
--renato

http://systemcall.org/

Hal Finkel

2012-Apr-18 16:54 UTC

[LLVMdev] Vectorization metadata

On Wed, 18 Apr 2012 17:30:11 +0100
Renato Golin <rengolin at systemcall.org> wrote:
> Hal,
> 
> I'm opening a new discussion on vectorization metadata, since it has
> little to do with fp-math. ;)
Fair enough, but I was actually talking about how fp-math, etc. metadata
is updated during vectorization. When vectorization fuses
originally-independent instructions, it has the same metadata issues as
GVN, etc.
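
To make the fusion problem concrete, here is a minimal sketch of one conservative merge rule for an fp-math style accuracy annotation. The function name and the rule itself are assumptions made for illustration; they are not taken from this thread and are not claimed to be what LLVM does.

```python
from typing import List, Optional


def merge_fp_accuracy(bounds: List[Optional[float]]) -> Optional[float]:
    """Pick the accuracy annotation a fused (vector) instruction may keep.

    Each entry is the maximum error, in ULPs, that one original scalar
    instruction tolerated, or None if that instruction carried no
    annotation and therefore has to stay fully precise.
    """
    if not bounds or any(b is None for b in bounds):
        # A single un-annotated input forces the fused result to be exact,
        # so the annotation is dropped.
        return None
    # Otherwise only the tightest bound is valid for every lane.
    return min(bounds)
```

Under this assumed rule, fusing instructions annotated with 2.5 and 0.5 ULPs keeps 0.5, and fusing an annotated instruction with an un-annotated one drops the annotation entirely.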

Metadata specifically for vectorization is another interesting topic,
but I don't have any specific ideas for this at the moment. That having
been said, I think that we do need to think about metadata that will
help with vectorization; we might want to tag instructions as safe for
speculative execution, for example. We might want to tag loops with a
specific unrolling factor. We might want to be able to pass along
specific alias independence results. None of these things are really
specific to vectorization, but will generally have an impact on it.

 -Hal

-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory

Hongbin Zheng

2012-Apr-18 18:11 UTC

[LLVMdev] Vectorization metadata

On Thu, Apr 19, 2012 at 12:30 AM, Renato Golin <rengolin at systemcall.org> wrote:
> Hal,
>
> I'm opening a new discussion on vectorization metadata, since it has
> little to do with fp-math. ;)
>
> What kind of metadata would you annotate in the instructions? If I
> remember from your talk, you're not doing any loop or whole-function
> analysis, possibly leaving it for Polly to help you along the way.
>
> I remember discussing it with Tobias that Polly could have three main
> steps:
>
> 1. Early analysis and annotation: a step that wouldn't modify code,
> but extensively annotate (with metadata), so that itself, and other
> passes like yours, could benefit from the polyhedral model.

Hi Renato,

Instead of exporting the polyhedral model of the program with
metadata, another possible solution is designing a generic "Loop
Parallelism" analysis interface just like the AliasAnalysis group.
For a particular loop, the interface simply answers how many loop
iterations can run in parallel. With information provided by this
interface we can unroll the loop to expose vectorizable iterations and
apply vectorization to the unrolled loop with BBVectorizer.

Like AliasAnalysis, we can have different implementations of loop
parallelism analysis: a lightweight implementation based on SCEV (or
the LoopDependency Analysis), and an implementation based on the
polyhedral model analysis implemented in Polly (called polyhedral loop
parallelism analysis). However, the analysis result of Polly is not
visible at the scope of FunctionPass/LoopPass, as all Polly passes are
RegionPasses right now.
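
A minimal sketch of such an interface, modelled in Python rather than as real LLVM passes. Every class and function name here is hypothetical, and the "distance" implementation is only a stand-in for a SCEV-style analysis: it assumes that a smallest loop-carried dependence distance d lets d consecutive iterations run in parallel.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Loop:
    name: str
    trip_count: int
    # Distances of loop-carried dependences: () means none were found,
    # None means the analysis could not tell.
    carried_distances: Optional[Tuple[int, ...]]


class LoopParallelism(ABC):
    @abstractmethod
    def parallel_iterations(self, loop: Loop) -> int:
        """Lower bound on consecutive iterations that may run in parallel."""


class NoParallelism(LoopParallelism):
    """Always-safe default, in the spirit of a 'no information' analysis."""

    def parallel_iterations(self, loop: Loop) -> int:
        return 1


class DistanceParallelism(LoopParallelism):
    """Lightweight stand-in: uses the smallest carried dependence distance."""

    def parallel_iterations(self, loop: Loop) -> int:
        if loop.carried_distances is None:
            return 1  # unknown dependences: stay serial
        if not loop.carried_distances:
            return max(1, loop.trip_count)  # no carried dependences at all
        return max(1, min(min(loop.carried_distances), loop.trip_count))


class ChainedParallelism(LoopParallelism):
    """Each answer is a safe lower bound, so the best (largest) one wins."""

    def __init__(self, analyses: Iterable[LoopParallelism]):
        self.analyses = list(analyses)

    def parallel_iterations(self, loop: Loop) -> int:
        return max([1] + [a.parallel_iterations(loop) for a in self.analyses])


def unroll_factor(analysis: LoopParallelism, loop: Loop, vector_width: int) -> int:
    # Unroll only as far as both the loop and the target vector width allow.
    return max(1, min(analysis.parallel_iterations(loop), vector_width))
```

A client such as an unroller would only talk to `LoopParallelism`, so a polyhedral implementation could be added to the chain later without changing the client.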

To allow Polly to export its analysis result to FunctionPass/LoopPass, we
need to make the polyhedral loop parallelism analysis a FunctionPass
and schedule it before all Polly passes, but have it do nothing in its
runOnFunction method. After that, we can let another Polly pass fill
the actual analysis results into the polyhedral loop parallelism
analysis pass. By doing this, other FunctionPasses/LoopPasses can query
the parallelism information calculated by Polly.
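
The placeholder idea can be sketched as follows, again in Python with hypothetical names: a result holder that does nothing when it runs, a later region-level step that fills it, and a client that queries it. This only illustrates the data flow; it does not model the LLVM pass manager.

```python
from typing import Dict


class PolyhedralLoopParallelism:
    """Placeholder analysis: scheduled first, computes nothing itself."""

    def __init__(self) -> None:
        self._results: Dict[str, int] = {}

    def run_on_function(self, function_name: str) -> bool:
        # Intentionally empty, mirroring the do-nothing runOnFunction;
        # returning False means "nothing was modified".
        return False

    def record(self, loop_name: str, iterations: int) -> None:
        # Called later by the pass that actually has the polyhedral results.
        self._results[loop_name] = iterations

    def parallel_iterations(self, loop_name: str) -> int:
        # Loops nobody reported on are treated as serial.
        return self._results.get(loop_name, 1)


def fill_from_regions(region_results: Dict[str, int],
                      holder: PolyhedralLoopParallelism) -> None:
    """Stand-in for the region-level pass that publishes its findings."""
    for loop_name, iterations in region_results.items():
        holder.record(loop_name, iterations)
```

Until the filling step has run, every query falls back to 1, so a client that runs too early is merely pessimistic rather than wrong.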

If the parallelism information is available outside Polly, we can also
find some way to move code generation support for OpenMP, vectorization
and CUDA from Polly to the LLVM transformation library. After that, we
can also generate such code based on the analysis result of the
SCEV-based parallelism analysis.

best regards
ether

Renato Golin

2012-Apr-18 19:17 UTC

[LLVMdev] Vectorization metadata

Hi Ether,

On 18 April 2012 19:11, Hongbin Zheng <etherzhhb at gmail.com> wrote:
> Instead of exporting the polyhedral model of the program with
> metadata, another possible solution is designing a generic "Loop
> Parallelism" analysis interface just like the AliasAnalysis group.
> For a particular loop, the interface simply answer how many loop
> iterations can run in parallel. With information provided by this
> interface we can unroll the loop to expose vectorizable iterations and
> apply vectorization to the unrolled loop with BBVectorizer.
In the long run, this kind of parallelism detector should be replaced
by Polly, but it could be a starting point. I only fear that the
burden might outweigh the benefits in the short term.

> To allow polly export its analysis result to FunctionPass/LoopPass, we
> need to make the polyhedral loop parallelism analysis became a
> FunctionPass, and schedule it before all polly passes but do nothing
> in its runOnFunction method, after that we can let another pass of
> polly to fill the actually analysis results into the polyhedral loop
> parallelism analysis pass. By doing this, other
> FunctionPasses/LoopPasses can query the parallelism information
> calculated by Polly.
That's the idea. Would be good if you could have both
analysis+transform AND analysis-only pre-passes, to allow a more fine
grained control over vectorization (and ease tests of other passes
that use Polly's info).

> If the parallelism information is available outside polly, we can also
> find some way to move code generation support for OpenMP, Vecorization
> and CUDA from Polly to LLVM transformation library, after that we can
> also generate such code base on the analysis result of the SCEV based
> parallelism analysis.
LLVM already has OpenMP support, maybe we should follow a similar
standard, or common them up.

CUDA would be closer to OpenCL than to OpenMP or Polly. I'm not sure
there is a feasible way to make sure the semantics remain the same
across such drastic changes of paradigm.

-- 
cheers,
--renato

http://systemcall.org/

Renato Golin

2012-Apr-18 19:34 UTC

[LLVMdev] Vectorization metadata

On 18 April 2012 17:54, Hal Finkel <hfinkel at anl.gov> wrote:
> Metadata specifically for vectorization is another interesting topic,
> but I don't have any specific ideas for this at the moment. That having
> been said, I think that we do need to think about metadata that will
> help with vectorization; we might want to tag instructions as safe for
> speculative execution, for example. We might want to tag loops with a
> specific unrolling factor. We might want to be able to pass along
> specific alias independence results. None of these things are really
> specific to vectorization, but will generally have an impact on it.
I think this is a very important feature for vectorization. If we
start building small passes for small vectorization steps (like one
for hoisting loop constants, another to simplify the induction range,
another to unroll loops), we might not be able to predict the best
strategy, since early changes might shadow better strategies later.

Having metadata allows one to infer what's the best strategy as a
whole, and apply it, rather than hoping for a good sequence of
passes... We still can have separate passes for each task, but not run
them all on all code all the time.

So, if an early analysis pass annotates that in one particular loop
you should only hoist the loop constants (aggressive inlining is
possible, for example), while in another you should actually unroll,
then each pass can run independently and trust the metadata on each
loop/block/instruction.
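
A minimal sketch of that division of labour. The metadata key `vec.strategy`, the loop features, and the thresholds are all made up for illustration; the point is only that one early step decides per loop, and each later pass acts solely on the loops tagged for it.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List

STRATEGY_KEY = "vec.strategy"  # hypothetical metadata name


@dataclass
class Loop:
    name: str
    trip_count: int
    invariant_ops: int  # operations that do not change across iterations
    metadata: Dict[str, str] = field(default_factory=dict)


def annotate(loops: Iterable[Loop]) -> None:
    """Early analysis: choose one strategy per loop without changing code."""
    for loop in loops:
        if loop.trip_count <= 8:  # assumed heuristic: short loops get unrolled
            loop.metadata[STRATEGY_KEY] = "unroll"
        elif loop.invariant_ops > 0:  # otherwise hoist if there is anything to hoist
            loop.metadata[STRATEGY_KEY] = "hoist"
        # Loops that match neither rule are left without an annotation.


def loops_for(strategy: str, loops: Iterable[Loop]) -> List[str]:
    """What a transform pass would select: only loops tagged for it."""
    return [l.name for l in loops if l.metadata.get(STRATEGY_KEY) == strategy]
```

Here the unrolling pass would process only the loops returned by `loops_for("unroll", loops)` and leave the rest untouched, instead of every pass running on all code.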

-- 
cheers,
--renato

http://systemcall.org/

Pekka Jääskeläinen

2012-Apr-19 06:25 UTC

[LLVMdev] Vectorization metadata

Hi Hal,

On 04/18/2012 07:54 PM, Hal Finkel wrote:
> We might want to be able to pass along
> specific alias independence results. None of these things are really
> specific to vectorization, but will generally have an impact on it.
For what it's worth, in pocl we do something along these lines now [1].

We annotate the OpenCL C kernel instructions with the OpenCL work
item id and the "parallel region id" (region between barriers).
As you probably know, in OpenCL C the work items are fully independent
"threads of execution" between the barrier regions which is useful
information to pass along.

This metadata is used to both guide a (modified) bb-vectorizer to
perform the work group auto-vectorization (whole function vectorization,
if you will) more efficiently and to improve the alias analysis for
instruction scheduling (and other optimizations that might benefit).
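
As a rough illustration of how such annotations can sharpen a dependence query: the sketch below is not pocl's implementation, and the `Access` record and `may_depend` helper are hypothetical. It only encodes the rule stated above, that different work items are independent within one parallel region.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Access:
    work_item: int  # OpenCL work item id attached as metadata
    region: int     # parallel region id (the region between two barriers)
    is_write: bool


def may_depend(a: Access, b: Access,
               fallback: Callable[[Access, Access], bool] = lambda a, b: True) -> bool:
    """Return False only when the two accesses are known to be independent."""
    if not (a.is_write or b.is_write):
        return False  # two reads never conflict
    if a.region == b.region and a.work_item != b.work_item:
        # Different work items between the same pair of barriers are
        # independent threads of execution, so no ordering is required.
        return False
    # Same work item, or different regions: defer to ordinary alias analysis.
    return fallback(a, b)
```

With the default fallback the helper is fully conservative everywhere the work-item rule does not apply, which is why a scheduler or vectorizer can use it without any other changes.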

The benefit of not just directly vectorizing the parallel regions is that
we can choose to wg-vectorize and/or to statically parallelize at the
instruction level, using the same input from pocl.

It would be really nice to have a set of "standard independence
metadata" in LLVM that would cover also this scenario.

[1] http://bazaar.launchpad.net/~pocl/pocl/trunk/revision/237

BR,
-- 
--Pekka
