Displaying 6 results from an estimated 6 matches for "add51".
Did you mean: add1
2013 Oct 30 · 0 · [LLVMdev] loop vectorizer
...; preds = %entry, %for.body
%storemerge10 = phi i64 [ %inc, %for.body ], [ %start, %entry ]
%div = lshr i64 %storemerge10, 2
%mul1 = shl i64 %div, 3
%rem = and i64 %storemerge10, 3
%add2 = or i64 %mul1, %rem
%0 = lshr i64 %storemerge10, 1
%add51 = shl i64 %0, 2
%mul6 = or i64 %rem, %add51
%add8 = or i64 %mul6, 4
%arrayidx = getelementptr inbounds float* %a, i64 %add2
%1 = load float* %arrayidx, align 4
%arrayidx9 = getelementptr inbounds float* %b, i64 %add2
%2 = load float* %arrayidx9, align 4
%add10 = fadd float %1,...
2013 Oct 30 · 3 · [LLVMdev] loop vectorizer
On 30 October 2013 09:25, Nadav Rotem <nrotem at apple.com> wrote:
> The access pattern to arrays a and b is non-linear. Unrolled loops are
> usually handled by the SLP-vectorizer. Are ir0 and ir1 consecutive for all
> values of i?
>
Based on his list of values, it seems that the induction stride is linear
within each block of 4 iterations, but it's not a clear...
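Worked out from the quoted IR, the two indices reduce to ir0 = (i/4)*8 + i%4 (%add2) and ir1 = ir0 + 4 (%add8): within a block of four iterations both indices advance by 1, but they jump by 4 at each block boundary, so neither is a single linear stride. A minimal C sketch of a loop with that index pattern is below; the function name, the destination array c, and the exact statement form are assumptions, not taken from the thread, and only the index arithmetic and the a[]/b[] loads come from the IR.

#include <stdint.h>

/* Sketch only: index arithmetic matching the quoted IR.
   ir0 corresponds to %add2, ir1 to %add8. */
void loop_body_sketch(uint64_t start, uint64_t end,
                      float *restrict c,
                      const float *restrict a,
                      const float *restrict b)
{
    for (uint64_t i = start; i < end; ++i) {
        uint64_t ir0 = (i / 4) * 8 + i % 4;  /* linear within a 4-iteration block */
        uint64_t ir1 = ir0 + 4;              /* second half of the same block     */
        c[ir0] = a[ir0] + b[ir0];            /* fadd of the two loads (%add10)    */
        c[ir1] = a[ir1] + b[ir1];
    }
}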
2013 Oct 30 · 3 · [LLVMdev] loop vectorizer
...end
>
> for.body: ; preds = %entry, %for.body
> %storemerge10 = phi i64 [ %inc, %for.body ], [ %start, %entry ]
> %div = lshr i64 %storemerge10, 2
> %mul1 = shl i64 %div, 3
> %rem = and i64 %storemerge10, 3
> %add2 = or i64 %mul1, %rem
> %0 = lshr i64 %storemerge10, 1
> %add51 = shl i64 %0, 2
> %mul6 = or i64 %rem, %add51
> %add8 = or i64 %mul6, 4
> %arrayidx = getelementptr inbounds float* %a, i64 %add2
> %1 = load float* %arrayidx, align 4
> %arrayidx9 = getelementptr inbounds float* %b, i64 %add2
> %2 = load float* %arrayidx9, align 4
> %add10 = f...
2013 Oct 30 · 0 · [LLVMdev] loop vectorizer
...reds = %entry, %for.body
>> %storemerge10 = phi i64 [ %inc, %for.body ], [ %start, %entry ]
>> %div = lshr i64 %storemerge10, 2
>> %mul1 = shl i64 %div, 3
>> %rem = and i64 %storemerge10, 3
>> %add2 = or i64 %mul1, %rem
>> %0 = lshr i64 %storemerge10, 1
>> %add51 = shl i64 %0, 2
>> %mul6 = or i64 %rem, %add51
>> %add8 = or i64 %mul6, 4
>> %arrayidx = getelementptr inbounds float* %a, i64 %add2
>> %1 = load float* %arrayidx, align 4
>> %arrayidx9 = getelementptr inbounds float* %b, i64 %add2
>> %2 = load float* %arrayidx9,...
2013 Oct 30 · 2 · [LLVMdev] loop vectorizer
...;>> %storemerge10 = phi i64 [ %inc, %for.body ], [ %start, %entry ]
>>> %div = lshr i64 %storemerge10, 2
>>> %mul1 = shl i64 %div, 3
>>> %rem = and i64 %storemerge10, 3
>>> %add2 = or i64 %mul1, %rem
>>> %0 = lshr i64 %storemerge10, 1
>>> %add51 = shl i64 %0, 2
>>> %mul6 = or i64 %rem, %add51
>>> %add8 = or i64 %mul6, 4
>>> %arrayidx = getelementptr inbounds float* %a, i64 %add2
>>> %1 = load float* %arrayidx, align 4
>>> %arrayidx9 = getelementptr inbounds float* %b, i64 %add2
>>> %2 =...
2013 Oct 30 · 0 · [LLVMdev] loop vectorizer
...= phi i64 [ %inc, %for.body ], [ %start, %entry ]
>>>> %div = lshr i64 %storemerge10, 2
>>>> %mul1 = shl i64 %div, 3
>>>> %rem = and i64 %storemerge10, 3
>>>> %add2 = or i64 %mul1, %rem
>>>> %0 = lshr i64 %storemerge10, 1
>>>> %add51 = shl i64 %0, 2
>>>> %mul6 = or i64 %rem, %add51
>>>> %add8 = or i64 %mul6, 4
>>>> %arrayidx = getelementptr inbounds float* %a, i64 %add2
>>>> %1 = load float* %arrayidx, align 4
>>>> %arrayidx9 = getelementptr inbounds float* %b, i64 %ad...