Displaying 12 results from an estimated 12 matches for "_z3barmmpfs_s_".
2013 Oct 30
2
[LLVMdev] loop vectorizer
...logue of the function (where storing of arguments on the stack happens) which then got eliminated later on (since I don't see any vector instructions in the final IR). Below the debug output of the SLP pass:
>
> Args: opt -O1 -vectorize-slp -debug loop.ll -S
> SLP: Analyzing blocks in _Z3barmmPfS_S_.
> SLP: Found 2 stores to vectorize.
> SLP: Analyzing a store chain of length 2.
> SLP: Trying to vectorize starting at PHIs (1)
> SLP: Vectorizing a list of length = 2.
> SLP: Vectorizing a list of length = 2.
> SLP: Vectorizing a list of length = 2.
>
> IR produced:
>...
2013 Oct 30
0
[LLVMdev] loop vectorizer
...g in the prologue of the
function (where storing of arguments on the stack happens) which then
got eliminated later on (since I don't see any vector instructions in
the final IR). Below the debug output of the SLP pass:
Args: opt -O1 -vectorize-slp -debug loop.ll -S
SLP: Analyzing blocks in _Z3barmmPfS_S_.
SLP: Found 2 stores to vectorize.
SLP: Analyzing a store chain of length 2.
SLP: Trying to vectorize starting at PHIs (1)
SLP: Vectorizing a list of length = 2.
SLP: Vectorizing a list of length = 2.
SLP: Vectorizing a list of length = 2.
IR produced:
define void @_Z3barmmPfS_S_(i64 %start, i64...
2013 Oct 30
3
[LLVMdev] loop vectorizer
...s this is the SLP vectorizer.
No, while the BB vectorizer is doing a form of SLP vectorization, there is a separate SLP vectorization pass which uses a different algorithm. You can pass -vectorize-slp to opt.
-Hal
>
> BBV: using target information
> BBV: fusing loop #1 for for.body in _Z3barmmPfS_S_...
> BBV: found 2 instructions with candidate pairs
> BBV: found 0 pair connections.
> BBV: done!
>
> However, this was run on the unrolled loop (I guess).
>
> Here is the IR printed by 'opt':
>
> entry:
> %cmp9 = icmp ult i64 %start, %end
> br i1 %cmp9,...
2013 Oct 30
0
[LLVMdev] loop vectorizer
...int64_t ir0 = ( (i/inner) * 2 + 0 ) * inner + i%4;
const std::uint64_t ir1 = ir0 + 4;
c[ ir0 ] = a[ ir0 ] + b[ ir0 ];
c[ ir1 ] = a[ ir1 ] + b[ ir1 ];
}
}
still, neither the SLP nor the loop vectorizer does anything of effect:
SLP: Analyzing blocks in _Z3barmmPfS_S_.
SLP: Found 2 stores to vectorize.
SLP: Analyzing a store chain of length 2.
SLP: Trying to vectorize starting at PHIs (1)
SLP: Vectorizing a list of length = 2.
SLP: Vectorizing a list of length = 2.
SLP: Vectorizing a list of length = 2.
LV: Checking a loop in "_Z3barmmPfS_S_"
LV: Foun...
2013 Oct 31
0
[LLVMdev] loop vectorizer
...+
(i+3)%4;
c[ ir0 ] = a[ ir0 ] + b[ ir0 ];
c[ ir1 ] = a[ ir1 ] + b[ ir1 ];
}
}
}
This should be an ideal test case for the SLP vectorizer, right?
It seems I am out of luck:
opt -O3 -vectorize-slp -debug loop.ll -S
SLP: Analyzing blocks in _Z3barmmPfS_S_.
SLP: Found 8 stores to vectorize.
SLP: Analyzing a store chain of length 8.
SLP: Trying to vectorize starting at PHIs (1)
SLP: Vectorizing a list of length = 2.
SLP: Vectorizing a list of length = 2.
SLP: Vectorizing a list of length = 2.
But the resulting IR is not showing any vector instruction...
2013 Nov 01
2
[LLVMdev] loop vectorizer: this loop is not worth vectorizing
...er + q;
const std::uint64_t ir1 = ( i * 2 + 1 ) * inner + q;
c[ ir0 ] = a[ ir0 ] + b[ ir0 ];
c[ ir1 ] = a[ ir1 ] + b[ ir1 ];
}
}
}
the loop vectorizer complains as well, but the produced code is vectorized:
LV: Checking a loop in "_Z3barmmPfS_S_"
LV: Found a loop: for.body4
LV: Found an induction variable.
LV: Found unvectorizable type.
LV: Can't vectorize the instructions or CFG
LV: Not vectorizing.
; Function Attrs: nounwind uwtable
define void @_Z3barmmPfS_S_(i64 %start, i64 %end, float* noalias %c,
float* noalias %a, float*...
2013 Nov 01
0
[LLVMdev] loop vectorizer: this loop is not worth vectorizing
...+ 1 ) * inner + q;
>
> c[ ir0 ] = a[ ir0 ] + b[ ir0 ];
> c[ ir1 ] = a[ ir1 ] + b[ ir1 ];
> }
> }
> }
>
> the loop vectorizer complains as well, but the produced code is
> vectorized:
>
> LV: Checking a loop in "_Z3barmmPfS_S_"
> LV: Found a loop: for.body4
> LV: Found an induction variable.
> LV: Found unvectorizable type.
> LV: Can't vectorize the instructions or CFG
> LV: Not vectorizing.
>
> ; Function Attrs: nounwind uwtable
> define void @_Z3barmmPfS_S_(i64 %start, i64 %end, float*...
2013 Oct 30
0
[LLVMdev] loop vectorizer
I ran the BB vectorizer as I guess this is the SLP vectorizer.
BBV: using target information
BBV: fusing loop #1 for for.body in _Z3barmmPfS_S_...
BBV: found 2 instructions with candidate pairs
BBV: found 0 pair connections.
BBV: done!
However, this was run on the unrolled loop (I guess).
Here is the IR printed by 'opt':
entry:
%cmp9 = icmp ult i64 %start, %end
br i1 %cmp9, label %for.body, label %for.end
for.body:...
2013 Oct 31
2
[LLVMdev] loop vectorizer
...ir0 ];
> c[ ir1 ] = a[ ir1 ] + b[ ir1 ];
> }
> }
> }
>
>
> This should be an ideal test case for the SLP vectorizer, right?
>
> It seems I am out of luck:
>
> opt -O3 -vectorize-slp -debug loop.ll -S
>
> SLP: Analyzing blocks in _Z3barmmPfS_S_.
> SLP: Found 8 stores to vectorize.
> SLP: Analyzing a store chain of length 8.
> SLP: Trying to vectorize starting at PHIs (1)
> SLP: Vectorizing a list of length = 2.
> SLP: Vectorizing a list of length = 2.
> SLP: Vectorizing a list of length = 2.
2013 Oct 30
3
[LLVMdev] loop vectorizer
On 30 October 2013 09:25, Nadav Rotem <nrotem at apple.com> wrote:
> The access pattern to arrays a and b is non-linear. Unrolled loops are
> usually handled by the SLP-vectorizer. Are ir0 and ir1 consecutive for all
> values of i?
>
Based on his list of values, it seems that the induction stride is linear
within each block of 4 iterations, but it's not a clear
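To make the "linear within each block of 4 iterations" observation concrete, here is a minimal sketch that evaluates the index formula quoted elsewhere in this thread (`ir0 = ((i/inner)*2 + 0)*inner + i%4`, `ir1 = ir0 + 4`). The value `inner = 4` and the helper names `ir0_index`, `ir1_index` and `ir0_strides` are assumptions made for illustration; they are not taken from the original posts.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Index formula as quoted in the thread. The helper names are hypothetical.
constexpr std::uint64_t ir0_index(std::uint64_t i, std::uint64_t inner) {
    return ((i / inner) * 2 + 0) * inner + i % 4;
}

// The second access sits 4 elements after the first one.
constexpr std::uint64_t ir1_index(std::uint64_t i, std::uint64_t inner) {
    return ir0_index(i, inner) + 4;
}

// Differences between ir0 of consecutive iterations, for i = 0 .. n-1.
std::vector<std::int64_t> ir0_strides(std::uint64_t n, std::uint64_t inner) {
    std::vector<std::int64_t> strides;
    for (std::uint64_t i = 0; i + 1 < n; ++i) {
        strides.push_back(static_cast<std::int64_t>(ir0_index(i + 1, inner)) -
                          static_cast<std::int64_t>(ir0_index(i, inner)));
    }
    return strides;
}

// Sanity check under the assumption inner == 4: the stride is 1 inside a
// block of 4 iterations and jumps at the block boundary.
inline void check_block_pattern() {
    const std::vector<std::int64_t> s = ir0_strides(8, 4);
    assert(s[0] == 1 && s[1] == 1 && s[2] == 1); // contiguous inside a block
    assert(s[3] == 5);                           // jump from index 3 to 8
}
```

Under that assumption, `ir0` walks 0, 1, 2, 3 and then jumps to 8, while `ir1` covers 4, 5, 6, 7 in the same four iterations. Each access is unit-stride inside a block, but the induction variable does not map to a single linear stride across the whole loop.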
2013 Oct 30
3
[LLVMdev] loop vectorizer
Hi Frank,
> We are looking at a variety of target architectures. Ultimately we aim to run on BG/Q and Intel Xeon Phi (native). However, running on those architectures with the LLVM technology is planned for some point in the future. As a first step we would target vanilla x86 with SSE/AVX 128/256 as a proof of concept.
Great! It should be easy to support these targets. When you said wide-vectors I assumed
2013 Oct 31
0
[LLVMdev] loop vectorizer
...] + b[ ir1 ];
>> }
>> }
>> }
>>
>>
>> This should be an ideal test case for the SLP vectorizer, right?
>>
>> It seems I am out of luck:
>>
>> opt -O3 -vectorize-slp -debug loop.ll -S
>>
>> SLP: Analyzing blocks in _Z3barmmPfS_S_.
>> SLP: Found 8 stores to vectorize.
>> SLP: Analyzing a store chain of length 8.
>> SLP: Trying to vectorize starting at PHIs (1)
>> SLP: Vectorizing a list of length = 2.
>> SLP: Vectorizing a list of length = 2.
>> SLP: Vectorizing a list of length = 2.
>...