Displaying 6 results from an estimated 6 matches for "canvectorizeinstrs".
2017 Apr 14
Separate LoopVectorize LLVM pass
Hello.
I am trying to build my own copy of the LoopVectorize.cpp pass as a separate pass, outside the LLVM
trunk, as described in http://llvm.org/docs/CMake.html#embedding-llvm-in-your-project. Has
anybody tried something like this?
I added the following close to the end of the .cpp file:
/* this line seems to be required - it allows to run this pass
as an embedded pass by giving opt -my-loop-vectorize
2013 Nov 08
[LLVMdev] loop vectorizer and storing to uniform addresses
On 7 November 2013 17:18, Frank Winter <fwinter at jlab.org> wrote:
> LV: We don't allow storing to uniform addresses
>
This triggers because the vectorizer didn't recognize sum[q] as a reduction
variable in canVectorizeInstrs(), but did recognize that sum[q] is loop
invariant in canVectorizeMemory().
I'm guessing the nested loop was unrolled because of the low trip-count,
and removed, so it ended up as:
float foo( int start , int end , float * A )
{
float sum[4] = {0.,0.,0.,0.};
for (int i = start ; i < end...
2013 Nov 08
[LLVMdev] loop vectorizer and storing to uniform addresses
I am trying my luck on this global reduction kernel:
float foo( int start , int end , float * A )
{
float sum[4] = {0.,0.,0.,0.};
for (int i = start ; i < end ; ++i ) {
for (int q = 0 ; q < 4 ; ++q )
sum[q] += A[i*4+q];
}
return sum[0]+sum[1]+sum[2]+sum[3];
}
LV: Checking a loop in "foo"
LV: Found a loop: for.cond1
LV: Found an induction variable.
LV: We
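For comparison, here is a minimal sketch of the same kernel with the inner loop unrolled by hand into four scalar accumulators. This rewrite is not taken from the thread. The function names foo_scalar and check_same_result are hypothetical, and whether the loop vectorizer then treats s0..s3 as reduction variables depends on the LLVM version and on flags (floating-point reductions generally need reassociation to be allowed, for example via -ffast-math). The sketch only shows that the two forms compute the same value.

```c
#include <assert.h>

/* Original form from the message: the accumulators live in an array,
 * so every iteration stores to the loop-invariant addresses sum[0..3]. */
float foo(int start, int end, float *A)
{
    float sum[4] = {0.f, 0.f, 0.f, 0.f};
    for (int i = start; i < end; ++i) {
        for (int q = 0; q < 4; ++q)
            sum[q] += A[i * 4 + q];
    }
    return sum[0] + sum[1] + sum[2] + sum[3];
}

/* Hypothetical rewrite (an assumption, not from the thread): four scalar
 * accumulators instead of an array, so no store to a uniform address
 * remains in the loop body. */
float foo_scalar(int start, int end, const float *A)
{
    float s0 = 0.f, s1 = 0.f, s2 = 0.f, s3 = 0.f;
    for (int i = start; i < end; ++i) {
        s0 += A[i * 4 + 0];
        s1 += A[i * 4 + 1];
        s2 += A[i * 4 + 2];
        s3 += A[i * 4 + 3];
    }
    return s0 + s1 + s2 + s3;
}

/* Both versions add the elements in the same order, so the results
 * match exactly on this small input. */
void check_same_result(void)
{
    float A[8] = {1.f, 2.f, 3.f, 4.f, 5.f, 6.f, 7.f, 8.f};
    assert(foo(0, 2, A) == foo_scalar(0, 2, A));
    assert(foo_scalar(0, 2, A) == 36.0f);
}
```

Keeping the summation order identical in both versions makes the comparison exact without relaxed floating-point semantics. A vectorized build may still reorder the additions if reassociation is enabled.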
2013 Nov 08
[LLVMdev] loop vectorizer and storing to uniform addresses
...wrote:
> On 7 November 2013 17:18, Frank Winter <fwinter at jlab.org
> <mailto:fwinter at jlab.org>> wrote:
>
> LV: We don't allow storing to uniform addresses
>
>
> This triggers because the vectorizer didn't recognize sum[q] as a reduction
> variable in canVectorizeInstrs(), but did recognize that sum[q] is loop
> invariant in canVectorizeMemory().
>
> I'm guessing the nested loop was unrolled because of the low
> trip-count, and removed, so it ended up as:
>
> float foo( int start , int end , float * A )
> {
> float sum[4] = {0.,0.,0...
2018 May 14
Query on unswitching + vectorization
* Looks like some sort of pass ordering issue; it will vectorize if indvars runs sometime between loop unswitch and the vectorizer.
That insight is helpful. I scheduled induction variable canonicalization (indvars) before loop vectorization, and the loop was vectorized.
The indvars pass depends heavily on SCEV. If there is a scalar like tmp that is of real type, we may not be able to get the
2020 May 05
Missing vectorization of loop due to load late in the loop
Hi,
TL;DR: A loop doesn't get vectorized due to the interaction of
loop-rotate, licm and instcombine. What to do about it?
Full story:
In the benchmarks for our out-of-tree target we have a case that we
would like to get vectorized, but currently it isn't. I've done some
digging to see why and have some kind of idea what prevents it, but I
don't know what the best way to fix