Displaying 8 results from an estimated 8 matches for "storemerge10".
2013 Oct 30
0
[LLVMdev] loop vectorizer
...r connections.
BBV: done!
However, this was run on the unrolled loop (I guess).
Here is the IR printed by 'opt':
entry:
%cmp9 = icmp ult i64 %start, %end
br i1 %cmp9, label %for.body, label %for.end
for.body: ; preds = %entry, %for.body
%storemerge10 = phi i64 [ %inc, %for.body ], [ %start, %entry ]
%div = lshr i64 %storemerge10, 2
%mul1 = shl i64 %div, 3
%rem = and i64 %storemerge10, 3
%add2 = or i64 %mul1, %rem
%0 = lshr i64 %storemerge10, 1
%add51 = shl i64 %0, 2
%mul6 = or i64 %rem, %add51
%add8 = or i64 %mul6, 4...
2013 Oct 30
3
[LLVMdev] loop vectorizer
On 30 October 2013 09:25, Nadav Rotem <nrotem at apple.com> wrote:
> The access pattern to arrays a and b is non-linear. Unrolled loops are
> usually handled by the SLP-vectorizer. Are ir0 and ir1 consecutive for all
> values for i ?
>
Based on his list of values, it seems that the induction stride is linear
within each block of 4 iterations, but it's not a clear
2013 Oct 30
3
[LLVMdev] loop vectorizer
...ons.
> BBV: done!
>
> However, this was run on the unrolled loop (I guess).
>
> Here is the IR printed by 'opt':
>
> entry:
> %cmp9 = icmp ult i64 %start, %end
> br i1 %cmp9, label %for.body, label %for.end
>
> for.body: ; preds = %entry, %for.body
> %storemerge10 = phi i64 [ %inc, %for.body ], [ %start, %entry ]
> %div = lshr i64 %storemerge10, 2
> %mul1 = shl i64 %div, 3
> %rem = and i64 %storemerge10, 3
> %add2 = or i64 %mul1, %rem
> %0 = lshr i64 %storemerge10, 1
> %add51 = shl i64 %0, 2
> %mul6 = or i64 %rem, %add51
> %add8 = or...
2013 Oct 30
0
[LLVMdev] loop vectorizer
...owever, this was run on the unrolled loop (I guess).
>>
>> Here is the IR printed by 'opt':
>>
>> entry:
>> %cmp9 = icmp ult i64 %start, %end
>> br i1 %cmp9, label %for.body, label %for.end
>>
>> for.body: ; preds = %entry, %for.body
>> %storemerge10 = phi i64 [ %inc, %for.body ], [ %start, %entry ]
>> %div = lshr i64 %storemerge10, 2
>> %mul1 = shl i64 %div, 3
>> %rem = and i64 %storemerge10, 3
>> %add2 = or i64 %mul1, %rem
>> %0 = lshr i64 %storemerge10, 1
>> %add51 = shl i64 %0, 2
>> %mul6 = or i64 %...
2013 Oct 30
2
[LLVMdev] loop vectorizer
...op (I guess).
>>>
>>> Here is the IR printed by 'opt':
>>>
>>> entry:
>>> %cmp9 = icmp ult i64 %start, %end
>>> br i1 %cmp9, label %for.body, label %for.end
>>>
>>> for.body: ; preds = %entry, %for.body
>>> %storemerge10 = phi i64 [ %inc, %for.body ], [ %start, %entry ]
>>> %div = lshr i64 %storemerge10, 2
>>> %mul1 = shl i64 %div, 3
>>> %rem = and i64 %storemerge10, 3
>>> %add2 = or i64 %mul1, %rem
>>> %0 = lshr i64 %storemerge10, 1
>>> %add51 = shl i64 %0, 2
...
2013 Nov 01
2
[LLVMdev] loop vectorizer: this loop is not worth vectorizing
...i64 %end, 2
%cmp9 = icmp ult i64 %div, %div1
br i1 %cmp9, label %for.body4.preheader, label %for.end20
for.body4.preheader: ; preds = %entry
br label %for.body4
for.body4: ; preds = %for.body4.preheader, %for.body4
%storemerge10 = phi i64 [ %inc19, %for.body4 ], [ %div, %for.body4.preheader ]
%mul5 = shl i64 %storemerge10, 3
%add82 = or i64 %mul5, 4
%arrayidx = getelementptr inbounds float* %a, i64 %mul5
%arrayidx11 = getelementptr inbounds float* %b, i64 %mul5
%arrayidx13 = getelementptr inbounds float* %c...
2013 Nov 01
0
[LLVMdev] loop vectorizer: this loop is not worth vectorizing
...4 %div, %div1
> br i1 %cmp9, label %for.body4.preheader, label %for.end20
>
> for.body4.preheader: ; preds = %entry
> br label %for.body4
>
> for.body4: ; preds = %for.body4.preheader, %for.body4
> %storemerge10 = phi i64 [ %inc19, %for.body4 ], [ %div, %for.body4.preheader ]
> %mul5 = shl i64 %storemerge10, 3
> %add82 = or i64 %mul5, 4
> %arrayidx = getelementptr inbounds float* %a, i64 %mul5
> %arrayidx11 = getelementptr inbounds float* %b, i64 %mul5
> %arrayidx13 = getelem...
2013 Oct 30
0
[LLVMdev] loop vectorizer
...>>>> Here is the IR printed by 'opt':
>>>>
>>>> entry:
>>>> %cmp9 = icmp ult i64 %start, %end
>>>> br i1 %cmp9, label %for.body, label %for.end
>>>>
>>>> for.body: ; preds = %entry, %for.body
>>>> %storemerge10 = phi i64 [ %inc, %for.body ], [ %start, %entry ]
>>>> %div = lshr i64 %storemerge10, 2
>>>> %mul1 = shl i64 %div, 3
>>>> %rem = and i64 %storemerge10, 3
>>>> %add2 = or i64 %mul1, %rem
>>>> %0 = lshr i64 %storemerge10, 1
>>>>...