Displaying 13 results from an estimated 13 matches for "add14".
2012 Jan 17
0
[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass
Hi,
On Fri, Dec 30, 2011 at 3:09 AM, Tobias Grosser <tobias at grosser.es> wrote:
> Since my intuition seems to be wrong, I am very eager to see and
> understand an example where a search limit of 4000 is really needed.
>
To get the ball rolling again, I attached a testcase that can be tuned
to understand the impact on compile time for different sizes of a
basic block. One can also
2011 Dec 30
3
[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass
On 12/29/2011 06:32 PM, Hal Finkel wrote:
> On Thu, 2011-12-29 at 15:00 +0100, Tobias Grosser wrote:
>> On 12/14/2011 01:25 AM, Hal Finkel wrote:
>> One thing that I would still like to have is a test case where
>> bb-vectorize-search-limit is needed to avoid exponential compile time
>> growth and another test case that is not optimized, if
>>
2012 Jan 24
4
[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass
...ars.iv.next, %do.body9 ], [ 0, %for.body ]
%arrayidx11 = getelementptr inbounds [100 x i32]* %B, i64 0, i64 %indvars.iv
%8 = load i32* %arrayidx11, align 4, !tbaa !0
%arrayidx13 = getelementptr inbounds [100 x i32]* %C, i64 0, i64 %indvars.iv
%9 = load i32* %arrayidx13, align 4, !tbaa !0
%add14 = add nsw i32 %9, %8
%arrayidx16 = getelementptr inbounds [100 x i32]* %A, i64 0, i64 %indvars.iv
%mul21 = mul nsw i32 %add14, %9
%sub = sub nsw i32 %add14, %mul21
%mul41 = mul nsw i32 %add14, %sub
%sub48 = sub nsw i32 %add14, %mul41
%mul62 = mul nsw i32 %add14, %sub48
%sub69 = sub ns...
2013 Oct 30
3
[LLVMdev] loop vectorizer
...nbounds float* %c, i64 %add2
> store float %add10, float* %arrayidx11, align 4
> %arrayidx12 = getelementptr inbounds float* %a, i64 %add8
> %3 = load float* %arrayidx12, align 4
> %arrayidx13 = getelementptr inbounds float* %b, i64 %add8
> %4 = load float* %arrayidx13, align 4
> %add14 = fadd float %3, %4
> %arrayidx15 = getelementptr inbounds float* %c, i64 %add8
> store float %add14, float* %arrayidx15, align 4
> %inc = add i64 %storemerge10, 1
> %exitcond = icmp eq i64 %inc, %end
> br i1 %exitcond, label %for.end, label %for.body
>
> for.end: ; preds = %f...
2013 Oct 30
2
[LLVMdev] loop vectorizer
...lementptr
inbounds float* %c, i64 %add8 SCEV: ((4 * %add8)<nsw> + %c)<nsw>
LV: Src Scev: ((4 * %add2)<nsw> + %c)<nsw>Sink Scev: ((4 * %add8)<nsw> +
%c)<nsw>(Induction step: 0)
LV: Distance for store float %add10, float* %arrayidx11, align 4 to
store float %add14, float* %arrayidx15, align 4: ((4 * %add8)<nsw> + (-4
* %add2))
Non-consecutive pointer access
LV: We don't need a runtime memory check.
LV: Can't vectorize due to memory conflicts
LV: Not vectorizing.
Here is the code:
entry:
%cmp14 = icmp ult i64 %start, %end
br i1 %cmp14, lab...
2013 Oct 30
0
[LLVMdev] loop vectorizer
...elementptr inbounds float* %c, i64 %add2
store float %add10, float* %arrayidx11, align 4
%arrayidx12 = getelementptr inbounds float* %a, i64 %add8
%2 = load float* %arrayidx12, align 4
%arrayidx13 = getelementptr inbounds float* %b, i64 %add8
%3 = load float* %arrayidx13, align 4
%add14 = fadd float %2, %3
%arrayidx15 = getelementptr inbounds float* %c, i64 %add8
store float %add14, float* %arrayidx15, align 4
%inc = add i64 %i.015, 1
%exitcond = icmp eq i64 %inc, %end
br i1 %exitcond, label %for.end, label %for.body
for.end:...
2013 Oct 30
2
[LLVMdev] loop vectorizer
...s float* %c, i64 %add2
> store float %add10, float* %arrayidx11, align 4
> %arrayidx12 = getelementptr inbounds float* %a, i64 %add8
> %2 = load float* %arrayidx12, align 4
> %arrayidx13 = getelementptr inbounds float* %b, i64 %add8
> %3 = load float* %arrayidx13, align 4
> %add14 = fadd float %2, %3
> %arrayidx15 = getelementptr inbounds float* %c, i64 %add8
> store float %add14, float* %arrayidx15, align 4
> %inc = add i64 %i.015, 1
> %exitcond = icmp eq i64 %inc, %end
> br i1 %exitcond, label %for.end, label %for.body
>
> for.end:...
2013 Oct 30
0
[LLVMdev] loop vectorizer
...elementptr inbounds float* %c, i64 %add2
store float %add10, float* %arrayidx11, align 4
%arrayidx12 = getelementptr inbounds float* %a, i64 %add8
%3 = load float* %arrayidx12, align 4
%arrayidx13 = getelementptr inbounds float* %b, i64 %add8
%4 = load float* %arrayidx13, align 4
%add14 = fadd float %3, %4
%arrayidx15 = getelementptr inbounds float* %c, i64 %add8
store float %add14, float* %arrayidx15, align 4
%inc = add i64 %storemerge10, 1
%exitcond = icmp eq i64 %inc, %end
br i1 %exitcond, label %for.end, label %for.body
for.end:...
2013 Oct 30
3
[LLVMdev] loop vectorizer
On 30 October 2013 09:25, Nadav Rotem <nrotem at apple.com> wrote:
> The access pattern to arrays a and b is non-linear. Unrolled loops are
> usually handled by the SLP-vectorizer. Are ir0 and ir1 consecutive for all
> values of i?
>
Based on his list of values, it seems that the induction stride is linear
within each block of 4 iterations, but it's not a clear
2013 Oct 30
0
[LLVMdev] loop vectorizer
...tr inbounds float* %c, i64 %add8 SCEV: ((4 * %add8)<nsw> + %c)<nsw>
> LV: Src Scev: ((4 * %add2)<nsw> + %c)<nsw>Sink Scev: ((4 * %add8)<nsw> + %c)<nsw>(Induction step: 0)
> LV: Distance for store float %add10, float* %arrayidx11, align 4 to store float %add14, float* %arrayidx15, align 4: ((4 * %add8)<nsw> + (-4 * %add2))
> Non-consecutive pointer access
> LV: We don't need a runtime memory check.
> LV: Can't vectorize due to memory conflicts
> LV: Not vectorizing.
>
> Here is the code:
>
> entry:
> %cmp14 = icmp...
2018 Jul 06
2
Verify that we only get loop metadata on latches
...; preds = %do.body
br label %do.body6, !llvm.loop !8
do.body6: ; preds = %do.body6, %do.end
...
br i1 %cmp17, label %do.body6, label %do.end18, !llvm.loop !8
do.end18: ; preds = %do.body6
ret i32 %add14
}
*** IR Dump After Simplify the CFG ***
; Function Attrs: nounwind
define i32 @test(i32* %a, i32 %n) local_unnamed_addr #0 {
entry:
br label %do.body, !llvm.loop !2
do.body: ; preds = %do.body, %entry
...
br i1 %cmp, label %do.body, label %do.body6, !...
2013 Oct 30
0
[LLVMdev] loop vectorizer
...; store float %add10, float* %arrayidx11, align 4
>> %arrayidx12 = getelementptr inbounds float* %a, i64 %add8
>> %2 = load float* %arrayidx12, align 4
>> %arrayidx13 = getelementptr inbounds float* %b, i64 %add8
>> %3 = load float* %arrayidx13, align 4
>> %add14 = fadd float %2, %3
>> %arrayidx15 = getelementptr inbounds float* %c, i64 %add8
>> store float %add14, float* %arrayidx15, align 4
>> %inc = add i64 %i.015, 1
>> %exitcond = icmp eq i64 %inc, %end
>> br i1 %exitcond, label %for.end, label %for.body
>>...
2013 Feb 14
1
[LLVMdev] LiveIntervals analysis problem
...; preds = %if.end12.i
%shl.i = shl nuw nsw i32 %conv5.i, 1
%conv14.i = trunc i32 %shl.i to i16
%.pre338 = load i16* %incdec.ptr.i, align 2, !tbaa !5
%phitmp = add i32 %i.025.i, 1
br label %for.body.i
eshdn1.exit: ; preds = %if.end12.i
%add147 = add nsw i32 %sub, 1
br label %mdfin
mdfin: ; preds = %if.end141, %eshdn1.exit, %if.end10
%exp.addr.0 = phi i32 [ %sub, %if.end10 ], [ %add147, %eshdn1.exit ], [ %sub, %if.end141 ]
%arrayidx149 = getelementptr inbounds i16* %s, i32 12
store i16...