Displaying 13 results from an estimated 13 matches for "add14".
2012 Jan 17
0
[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass
Hi,
On Fri, Dec 30, 2011 at 3:09 AM, Tobias Grosser <tobias at grosser.es> wrote:
> Since my intuition seems to be wrong, I am very eager to see and
> understand an example where a search limit of 4000 is really needed.
>
To get the ball rolling again, I attached a testcase that can be tuned
to understand the impact on compile time for different sizes of a
basic block. One can also
2011 Dec 30
3
[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass
On 12/29/2011 06:32 PM, Hal Finkel wrote:
> On Thu, 2011-12-29 at 15:00 +0100, Tobias Grosser wrote:
>> On 12/14/2011 01:25 AM, Hal Finkel wrote:
>> One thing that I would still like to have is a test case where
>> bb-vectorize-search-limit is needed to avoid exponential compile time
>> growth and another test case that is not optimized, if
>>
2012 Jan 24
4
[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass
...ars.iv.next, %do.body9 ], [ 0, %for.body ]
%arrayidx11 = getelementptr inbounds [100 x i32]* %B, i64 0, i64 %indvars.iv
%8 = load i32* %arrayidx11, align 4, !tbaa !0
%arrayidx13 = getelementptr inbounds [100 x i32]* %C, i64 0, i64 %indvars.iv
%9 = load i32* %arrayidx13, align 4, !tbaa !0
%add14 = add nsw i32 %9, %8
%arrayidx16 = getelementptr inbounds [100 x i32]* %A, i64 0, i64 %indvars.iv
%mul21 = mul nsw i32 %add14, %9
%sub = sub nsw i32 %add14, %mul21
%mul41 = mul nsw i32 %add14, %sub
%sub48 = sub nsw i32 %add14, %mul41
%mul62 = mul nsw i32 %add14, %sub48
%sub69 = sub ns...
2013 Oct 30
3
[LLVMdev] loop vectorizer
...nbounds float* %c, i64 %add2
> store float %add10, float* %arrayidx11, align 4
> %arrayidx12 = getelementptr inbounds float* %a, i64 %add8
> %3 = load float* %arrayidx12, align 4
> %arrayidx13 = getelementptr inbounds float* %b, i64 %add8
> %4 = load float* %arrayidx13, align 4
> %add14 = fadd float %3, %4
> %arrayidx15 = getelementptr inbounds float* %c, i64 %add8
> store float %add14, float* %arrayidx15, align 4
> %inc = add i64 %storemerge10, 1
> %exitcond = icmp eq i64 %inc, %end
> br i1 %exitcond, label %for.end, label %for.body
>
> for.end: ; preds = %f...
2013 Oct 30
2
[LLVMdev] loop vectorizer
...lementptr
inbounds float* %c, i64 %add8 SCEV: ((4 * %add8)<nsw> + %c)<nsw>
LV: Src Scev: ((4 * %add2)<nsw> + %c)<nsw>Sink Scev: ((4 * %add8)<nsw> +
%c)<nsw>(Induction step: 0)
LV: Distance for store float %add10, float* %arrayidx11, align 4 to
store float %add14, float* %arrayidx15, align 4: ((4 * %add8)<nsw> + (-4
* %add2))
Non-consecutive pointer access
LV: We don't need a runtime memory check.
LV: Can't vectorize due to memory conflicts
LV: Not vectorizing.
Here is the code:
entry:
%cmp14 = icmp ult i64 %start, %end
br i1 %cmp14, lab...
2013 Oct 30
0
[LLVMdev] loop vectorizer
...elementptr inbounds float* %c, i64 %add2
store float %add10, float* %arrayidx11, align 4
%arrayidx12 = getelementptr inbounds float* %a, i64 %add8
%2 = load float* %arrayidx12, align 4
%arrayidx13 = getelementptr inbounds float* %b, i64 %add8
%3 = load float* %arrayidx13, align 4
%add14 = fadd float %2, %3
%arrayidx15 = getelementptr inbounds float* %c, i64 %add8
store float %add14, float* %arrayidx15, align 4
%inc = add i64 %i.015, 1
%exitcond = icmp eq i64 %inc, %end
br i1 %exitcond, label %for.end, label %for.body
for.end:...
2013 Oct 30
2
[LLVMdev] loop vectorizer
...s float* %c, i64 %add2
> store float %add10, float* %arrayidx11, align 4
> %arrayidx12 = getelementptr inbounds float* %a, i64 %add8
> %2 = load float* %arrayidx12, align 4
> %arrayidx13 = getelementptr inbounds float* %b, i64 %add8
> %3 = load float* %arrayidx13, align 4
> %add14 = fadd float %2, %3
> %arrayidx15 = getelementptr inbounds float* %c, i64 %add8
> store float %add14, float* %arrayidx15, align 4
> %inc = add i64 %i.015, 1
> %exitcond = icmp eq i64 %inc, %end
> br i1 %exitcond, label %for.end, label %for.body
>
> for.end:...
2013 Oct 30
0
[LLVMdev] loop vectorizer
...elementptr inbounds float* %c, i64 %add2
store float %add10, float* %arrayidx11, align 4
%arrayidx12 = getelementptr inbounds float* %a, i64 %add8
%3 = load float* %arrayidx12, align 4
%arrayidx13 = getelementptr inbounds float* %b, i64 %add8
%4 = load float* %arrayidx13, align 4
%add14 = fadd float %3, %4
%arrayidx15 = getelementptr inbounds float* %c, i64 %add8
store float %add14, float* %arrayidx15, align 4
%inc = add i64 %storemerge10, 1
%exitcond = icmp eq i64 %inc, %end
br i1 %exitcond, label %for.end, label %for.body
for.end:...
2013 Oct 30
3
[LLVMdev] loop vectorizer
On 30 October 2013 09:25, Nadav Rotem <nrotem at apple.com> wrote:
> The access pattern to arrays a and b is non-linear. Unrolled loops are
> usually handled by the SLP-vectorizer. Are ir0 and ir1 consecutive for all
> values of i?
>
Based on his list of values, it seems that the induction stride is linear
within each block of 4 iterations, but it's not a clear
2013 Oct 30
0
[LLVMdev] loop vectorizer
...tr inbounds float* %c, i64 %add8 SCEV: ((4 * %add8)<nsw> + %c)<nsw>
> LV: Src Scev: ((4 * %add2)<nsw> + %c)<nsw>Sink Scev: ((4 * %add8)<nsw> + %c)<nsw>(Induction step: 0)
> LV: Distance for store float %add10, float* %arrayidx11, align 4 to store float %add14, float* %arrayidx15, align 4: ((4 * %add8)<nsw> + (-4 * %add2))
> Non-consecutive pointer access
> LV: We don't need a runtime memory check.
> LV: Can't vectorize due to memory conflicts
> LV: Not vectorizing.
>
> Here is the code:
>
> entry:
> %cmp14 = icmp...
2018 Jul 06
2
Verify that we only get loop metadata on latches
...; preds = %do.body
br label %do.body6, !llvm.loop !8
do.body6: ; preds = %do.body6, %do.end
...
br i1 %cmp17, label %do.body6, label %do.end18, !llvm.loop !8
do.end18: ; preds = %do.body6
ret i32 %add14
}
*** IR Dump After Simplify the CFG ***
; Function Attrs: nounwind
define i32 @test(i32* %a, i32 %n) local_unnamed_addr #0 {
entry:
br label %do.body, !llvm.loop !2
do.body: ; preds = %do.body, %entry
...
br i1 %cmp, label %do.body, label %do.body6, !...
2013 Oct 30
0
[LLVMdev] loop vectorizer
...; store float %add10, float* %arrayidx11, align 4
>> %arrayidx12 = getelementptr inbounds float* %a, i64 %add8
>> %2 = load float* %arrayidx12, align 4
>> %arrayidx13 = getelementptr inbounds float* %b, i64 %add8
>> %3 = load float* %arrayidx13, align 4
>> %add14 = fadd float %2, %3
>> %arrayidx15 = getelementptr inbounds float* %c, i64 %add8
>> store float %add14, float* %arrayidx15, align 4
>> %inc = add i64 %i.015, 1
>> %exitcond = icmp eq i64 %inc, %end
>> br i1 %exitcond, label %for.end, label %for.body
>>...
2013 Feb 14
1
[LLVMdev] LiveIntervals analysis problem
...; preds = %if.end12.i
%shl.i = shl nuw nsw i32 %conv5.i, 1
%conv14.i = trunc i32 %shl.i to i16
%.pre338 = load i16* %incdec.ptr.i, align 2, !tbaa !5
%phitmp = add i32 %i.025.i, 1
br label %for.body.i
eshdn1.exit: ; preds = %if.end12.i
%add147 = add nsw i32 %sub, 1
br label %mdfin
mdfin: ; preds = %if.end141, %eshdn1.exit, %if.end10
%exp.addr.0 = phi i32 [ %sub, %if.end10 ], [ %add147, %eshdn1.exit ], [ %sub, %if.end141 ]
%arrayidx149 = getelementptr inbounds i16* %s, i32 12
store i16...