Displaying 12 results from an estimated 12 matches for "_z3barmmpfs_s_".

2013 Oct 30
2
[LLVMdev] loop vectorizer
...logue of the function (where storing of arguments on the stack happens) which then got eliminated later on (since I don't see any vector instructions in the final IR). Below the debug output of the SLP pass: > > Args: opt -O1 -vectorize-slp -debug loop.ll -S > SLP: Analyzing blocks in _Z3barmmPfS_S_. > SLP: Found 2 stores to vectorize. > SLP: Analyzing a store chain of length 2. > SLP: Trying to vectorize starting at PHIs (1) > SLP: Vectorizing a list of length = 2. > SLP: Vectorizing a list of length = 2. > SLP: Vectorizing a list of length = 2. > > IR produced: >...
2013 Oct 30
0
[LLVMdev] loop vectorizer
...g in the prologue of the function (where storing of arguments on the stack happens) which then got eliminated later on (since I don't see any vector instructions in the final IR). Below the debug output of the SLP pass: Args: opt -O1 -vectorize-slp -debug loop.ll -S SLP: Analyzing blocks in _Z3barmmPfS_S_. SLP: Found 2 stores to vectorize. SLP: Analyzing a store chain of length 2. SLP: Trying to vectorize starting at PHIs (1) SLP: Vectorizing a list of length = 2. SLP: Vectorizing a list of length = 2. SLP: Vectorizing a list of length = 2. IR produced: define void @_Z3barmmPfS_S_(i64 %start, i64...
2013 Oct 30
3
[LLVMdev] loop vectorizer
...s this is the SLP vectorizer. No, while the BB vectorizer is doing a form of SLP vectorization, there is a separate SLP vectorization pass which uses a different algorithm. You can pass -vectorize-slp to opt. -Hal > > BBV: using target information > BBV: fusing loop #1 for for.body in _Z3barmmPfS_S_... > BBV: found 2 instructions with candidate pairs > BBV: found 0 pair connections. > BBV: done! > > However, this was run on the unrolled loop (I guess). > > Here is the IR printed by 'opt': > > entry: > %cmp9 = icmp ult i64 %start, %end > br i1 %cmp9,...
2013 Oct 30
0
[LLVMdev] loop vectorizer
...int64_t ir0 = ( (i/inner) * 2 + 0 ) * inner + i%4; const std::uint64_t ir1 = ir0 + 4; c[ ir0 ] = a[ ir0 ] + b[ ir0 ]; c[ ir1 ] = a[ ir1 ] + b[ ir1 ]; } } still neither the SLP nor the loop vectorizer do anything of effect: SLP: Analyzing blocks in _Z3barmmPfS_S_. SLP: Found 2 stores to vectorize. SLP: Analyzing a store chain of length 2. SLP: Trying to vectorize starting at PHIs (1) SLP: Vectorizing a list of length = 2. SLP: Vectorizing a list of length = 2. SLP: Vectorizing a list of length = 2. LV: Checking a loop in "_Z3barmmPfS_S_" LV: Foun...
2013 Oct 31
0
[LLVMdev] loop vectorizer
...+ (i+3)%4; c[ ir0 ] = a[ ir0 ] + b[ ir0 ]; c[ ir1 ] = a[ ir1 ] + b[ ir1 ]; } } } This should be an ideal test case for the SLP vectorizer, right? It seems, I am out of luck: opt -O3 -vectorize-slp -debug loop.ll -S SLP: Analyzing blocks in _Z3barmmPfS_S_. SLP: Found 8 stores to vectorize. SLP: Analyzing a store chain of length 8. SLP: Trying to vectorize starting at PHIs (1) SLP: Vectorizing a list of length = 2. SLP: Vectorizing a list of length = 2. SLP: Vectorizing a list of length = 2. But the resulting IR is not showing any vector instruction...
2013 Nov 01
2
[LLVMdev] loop vectorizer: this loop is not worth vectorizing
...er + q; const std::uint64_t ir1 = ( i * 2 + 1 ) * inner + q; c[ ir0 ] = a[ ir0 ] + b[ ir0 ]; c[ ir1 ] = a[ ir1 ] + b[ ir1 ]; } } } the loop vectorizer complains as well, but the produced code is vectorized: LV: Checking a loop in "_Z3barmmPfS_S_" LV: Found a loop: for.body4 LV: Found an induction variable. LV: Found unvectorizable type. LV: Can't vectorize the instructions or CFG LV: Not vectorizing. ; Function Attrs: nounwind uwtable define void @_Z3barmmPfS_S_(i64 %start, i64 %end, float* noalias %c, float* noalias %a, float*...
2013 Nov 01
0
[LLVMdev] loop vectorizer: this loop is not worth vectorizing
...+ 1 ) * inner + q; > > c[ ir0 ] = a[ ir0 ] + b[ ir0 ]; > c[ ir1 ] = a[ ir1 ] + b[ ir1 ]; > } > } > } > > the loop vectorizer complains as well, but the produced code is > vectorized: > > LV: Checking a loop in "_Z3barmmPfS_S_" > LV: Found a loop: for.body4 > LV: Found an induction variable. > LV: Found unvectorizable type. > LV: Can't vectorize the instructions or CFG > LV: Not vectorizing. > > ; Function Attrs: nounwind uwtable > define void @_Z3barmmPfS_S_(i64 %start, i64 %end, float*...
2013 Oct 30
0
[LLVMdev] loop vectorizer
I ran the BB vectorizer as I guess this is the SLP vectorizer. BBV: using target information BBV: fusing loop #1 for for.body in _Z3barmmPfS_S_... BBV: found 2 instructions with candidate pairs BBV: found 0 pair connections. BBV: done! However, this was run on the unrolled loop (I guess). Here is the IR printed by 'opt': entry: %cmp9 = icmp ult i64 %start, %end br i1 %cmp9, label %for.body, label %for.end for.body:...
2013 Oct 31
2
[LLVMdev] loop vectorizer
...ir0 ]; > c[ ir1 ] = a[ ir1 ] + b[ ir1 ]; > } > } > } > > > This should be an ideal test case for the SLP vectorizer, right? > > It seems, I am out of luck: > > opt -O3 -vectorize-slp -debug loop.ll -S > > SLP: Analyzing blocks in _Z3barmmPfS_S_. > SLP: Found 8 stores to vectorize. > SLP: Analyzing a store chain of length 8. > SLP: Trying to vectorize starting at PHIs (1) > SLP: Vectorizing a list of length = 2. > SLP: Vectorizing a list of length = 2. > SLP: Vectorizing a list of length = 2. -------------- next part ---...
2013 Oct 30
3
[LLVMdev] loop vectorizer
On 30 October 2013 09:25, Nadav Rotem <nrotem at apple.com> wrote: > The access pattern to arrays a and b is non-linear. Unrolled loops are > usually handled by the SLP-vectorizer. Are ir0 and ir1 consecutive for all > values for i ? > Based on his list of values, it seems that the induction stride is linear within each block of 4 iterations, but it's not a clear
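A minimal sketch for checking the question quoted above (whether ir0 and ir1 are consecutive for all values of i). It uses the index expressions from the loop body quoted in one of the other results, ir0 = ( (i/inner) * 2 + 0 ) * inner + i%4 and ir1 = ir0 + 4. The value of inner is not visible in these excerpts, so inner = 4 in the usage note is an assumption, and the function names ir0_of, ir1_of and ir0_is_unit_stride are hypothetical helpers, not code from the thread.

```cpp
#include <cassert>
#include <cstdint>

// Index expression for ir0, copied from the loop body quoted in the thread.
// `inner` is left as a parameter because its value is not shown in the excerpts.
std::uint64_t ir0_of(std::uint64_t i, std::uint64_t inner) {
    assert(inner > 0);  // guard against division by zero
    return ((i / inner) * 2 + 0) * inner + i % 4;
}

// ir1 is defined in the quoted loop as ir0 + 4.
std::uint64_t ir1_of(std::uint64_t i, std::uint64_t inner) {
    return ir0_of(i, inner) + 4;
}

// Hypothetical helper: returns true when ir0 advances by exactly 1 between
// every pair of neighbouring iterations i and i+1 inside [start, end).
bool ir0_is_unit_stride(std::uint64_t start, std::uint64_t end,
                        std::uint64_t inner) {
    for (std::uint64_t i = start; i + 1 < end; ++i) {
        if (ir0_of(i + 1, inner) != ir0_of(i, inner) + 1) {
            return false;
        }
    }
    return true;
}
```

Under the assumed inner = 4, ir0_is_unit_stride(0, 4, 4) holds while ir0_is_unit_stride(0, 8, 4) does not, because ir0 jumps from 3 to 8 between i = 3 and i = 4. That agrees with the reply's observation that the stride is linear within each block of 4 iterations.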
2013 Oct 30
3
[LLVMdev] loop vectorizer
Hi Frank, > We are looking at a variety of target architectures. Ultimately we aim to run on BG/Q and Intel Xeon Phi (native). However, running on those architectures with the LLVM technology is planned in some future. As a first step we would target vanilla x86 with SSE/AVX 128/256 as a proof-of-concept. Great! It should be easy to support these targets. When you said wide-vectors I assumed
2013 Oct 31
0
[LLVMdev] loop vectorizer
...] + b[ ir1 ]; >> } >> } >> } >> >> >> This should be an ideal test case for the SLP vectorizer, right? >> >> It seems, I am out of luck: >> >> opt -O3 -vectorize-slp -debug loop.ll -S >> >> SLP: Analyzing blocks in _Z3barmmPfS_S_. >> SLP: Found 8 stores to vectorize. >> SLP: Analyzing a store chain of length 8. >> SLP: Trying to vectorize starting at PHIs (1) >> SLP: Vectorizing a list of length = 2. >> SLP: Vectorizing a list of length = 2. >> SLP: Vectorizing a list of length = 2. >...