Displaying 3 results from an estimated 3 matches for "r149761".
2012 Feb 04
[LLVMdev] [BBVectorizer] Obvious vectorization benefit, but req-chain is too short
...tcast ([1024 x float]* @B to <4
> x float>*), align 16
> ret i32 0
> }
>
> Is there any way we can make this case work by default? Maybe we can
> decrease the req-chain to 2 and increase the cost for non-stride-one
> loads or stores?
Try it now (after r149761). If this "solution" causes other problems,
then we may need to think of something more sophisticated.
-Hal
>
> Another, probably unrelated, point: I also tried a run with
> -bb-vectorize-req-chain-depth=1. The generated code is full of
> shufflevector instructions and eig...
2012 Feb 04
[LLVMdev] [BBVectorizer] Obvious vectorization benefit, but req-chain is too short
Hello,
Thanks for your work on the bb-vectorizer. It looks like a
promising pass to be used for multi-work-item-vectorization in
pocl.
On 02/04/2012 06:21 AM, Hal Finkel wrote:
> Try it now (after r149761). If this "solution" causes other problems,
> then we may need to think of something more sophisticated.
I wonder if the case where a store is the last user of the value could be
treated as a special case. The case where the code reads, computes
and writes values in a fully paralleliz...
2012 Feb 03
[LLVMdev] [BBVectorizer] Obvious vectorization benefit, but req-chain is too short
Hi Hal,
this is one of the first test cases for which I would love to have improved
vectorizer support. I sent it out earlier, but I think now is a good time
to look into it again, since the vectorizer has been committed.
The basic example is a set of scalar loads that load four consecutive
elements and store them right back. For me this is an obvious case
where vectorization is beneficial
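
For readers without the full message, the pattern described above might look roughly like the following C sketch. This is only an illustration under assumptions: the array B and its size come from the quoted IR, while the source array A, the function name copy4, and the index parameter are hypothetical and do not appear in the thread. Each scalar load feeds exactly one store, so the dependency chains are very short, which is presumably why a minimum required chain depth would reject the pairing.

```c
#include <stddef.h>

/* B mirrors the [1024 x float] global @B in the quoted IR.
 * A is an assumed source array; it is not named in the excerpt. */
float A[1024];
float B[1024];

/* Hypothetical helper: copy four consecutive floats with scalar
 * loads and stores only. Every load has a single user (its store),
 * so each chain is load -> store and nothing longer. */
void copy4(size_t i) {
    float a0 = A[i + 0];
    float a1 = A[i + 1];
    float a2 = A[i + 2];
    float a3 = A[i + 3];

    B[i + 0] = a0;
    B[i + 1] = a1;
    B[i + 2] = a2;
    B[i + 3] = a3;
}
```

A vectorizer that accepts such short chains could, in principle, turn the four loads into one <4 x float> load and the four stores into one <4 x float> store, matching the shape of the store shown in the quoted IR.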