thr3ads.net - search: "mul5"

[LLVMdev] loop vectorizer: this loop is not worth vectorizing

2013 Nov 01

2

...reheader, label %for.end20 for.body4.preheader: ; preds = %entry br label %for.body4 for.body4: ; preds = %for.body4.preheader, %for.body4 %storemerge10 = phi i64 [ %inc19, %for.body4 ], [ %div, %for.body4.preheader ] %mul5 = shl i64 %storemerge10, 3 %add82 = or i64 %mul5, 4 %arrayidx = getelementptr inbounds float* %a, i64 %mul5 %arrayidx11 = getelementptr inbounds float* %b, i64 %mul5 %arrayidx13 = getelementptr inbounds float* %c, i64 %mul5 %arrayidx14 = getelementptr inbounds float* %a, i64 %add82...

[LLVMdev] loop vectorizer: this loop is not worth vectorizing

2013 Nov 01

0

....body4.preheader: ; preds = %entry > br label %for.body4 > > for.body4: ; preds = > %for.body4.preheader, %for.body4 > %storemerge10 = phi i64 [ %inc19, %for.body4 ], [ %div, > %for.body4.preheader ] > %mul5 = shl i64 %storemerge10, 3 > %add82 = or i64 %mul5, 4 > %arrayidx = getelementptr inbounds float* %a, i64 %mul5 > %arrayidx11 = getelementptr inbounds float* %b, i64 %mul5 > %arrayidx13 = getelementptr inbounds float* %c, i64 %mul5 > %arrayidx14 = getelementptr inbounds flo...

[RFC] New pass: LoopExitValues

2015 Sep 03

2

...ph %lsr.iv8 = phi i32* [ %scevgep9, %for.body.4 ], [ %lsr.iv5, %for.body.4.lr.ph ] %lsr.iv3 = phi i32* [ %scevgep4, %for.body.4 ], [ %lsr.iv1, %for.body.4.lr.ph ] %lsr.iv = phi i32 [ %lsr.iv.next, %for.body.4 ], [ %Size, %for.body.4.lr.ph ] %3 = load i32, i32* %lsr.iv8, align 4, !tbaa !1 %mul5 = mul i32 %3, %Val store i32 %mul5, i32* %lsr.iv3, align 4, !tbaa !1 %lsr.iv.next = add i32 %lsr.iv, -1 %scevgep4 = getelementptr i32, i32* %lsr.iv3, i32 1 %scevgep9 = getelementptr i32, i32* %lsr.iv8, i32 1 %exitcond = icmp eq i32 %lsr.iv.next, 0 br i1 %exitcond, label %for.cond.cleanu...

[RFC] New pass: LoopExitValues

2015 Sep 26

2

...iv8 = phi i32* [ %scevgep9, %for.body.4 ], [ %uglygep13, %for.body.4.lr.ph ] %lsr.iv3 = phi i32* [ %scevgep4, %for.body.4 ], [ %uglygep1516, %for.body.4.lr.ph ] %lsr.iv = phi i32 [ %lsr.iv.next, %for.body.4 ], [ %Size, %for.body.4.lr.ph ] %1 = load i32, i32* %lsr.iv8, align 4, !tbaa !0 %mul5 = mul i32 %1, %Val store i32 %mul5, i32* %lsr.iv3, align 4, !tbaa !0 %lsr.iv.next = add i32 %lsr.iv, -1 %scevgep4 = getelementptr i32, i32* %lsr.iv3, i32 1 %scevgep9 = getelementptr i32, i32* %lsr.iv8, i32 1 %exitcond = icmp eq i32 %lsr.iv.next, 0 br i1 %exitcond, label %for.cond....

[RFC] New pass: LoopExitValues

2015 Sep 10

2

...> > %for.body.4.lr.ph ] > > %lsr.iv3 = phi i32* [ %scevgep4, %for.body.4 ], [ %lsr.iv1, > > %for.body.4.lr.ph ] > > %lsr.iv = phi i32 [ %lsr.iv.next, %for.body.4 ], [ %Size, % > for.body.4.lr.ph ] > > %3 = load i32, i32* %lsr.iv8, align 4, !tbaa !1 > > %mul5 = mul i32 %3, %Val > > store i32 %mul5, i32* %lsr.iv3, align 4, !tbaa !1 > > %lsr.iv.next = add i32 %lsr.iv, -1 > > %scevgep4 = getelementptr i32, i32* %lsr.iv3, i32 1 > > %scevgep9 = getelementptr i32, i32* %lsr.iv8, i32 1 > > %exitcond = icmp eq i32 %lsr.iv...

[RFC] New pass: LoopExitValues

2015 Sep 01

2

On Mon, Aug 31, 2015 at 5:52 PM, Jake VanAdrighem <jvanadrighem at gmail.com> wrote: > Do you have some specific performance measurements? Averaging 4 runs of 10000 iterations each of Coremark on my X86_64 desktop showed: -O2 performance: +2.9% faster with the L.E.V. pass -Os size: 1.5% smaller with the L.E.V. pass In the case of Coremark, the benefit comes mainly from the matrix

[LLVMdev] Issue with Machine Verifier and earlyclobber

2012 Jul 15

0

...18776A0000000 %mul1 = fmul float %days, 0x3FEF8A09A0000000 %add2 = fadd float %mul1, 0x4076587740000000 %mul3 = fmul float %days, 0x3E81B35CC0000000 %sub = fsub float 0x3FFEA235C0000000, %mul3 %call = tail call float @dsin(float %add2) nounwind readnone %mul4 = fmul float %sub, %call %mul5 = fmul float %days, 0x3E27C04CA0000000 %sub6 = fsub float 0x3F94790B80000000, %mul5 %mul7 = fmul float %add2, 2.000000e+00 %call8 = tail call float @dsin(float %mul7) nounwind readnone %mul9 = fmul float %sub6, %call8 %add10 = fadd float %mul4, %mul9 %add11 = fadd float %add, %add10 %...

[RFC] New pass: LoopExitValues

2015 Sep 23

3

On Wed, Sep 23, 2015 at 12:00 PM, Hal Finkel <hfinkel at anl.gov> wrote: >> >> Should we try the patch in its current location, namely after LSR? > > Sure; post the patch as you have it so we can look at what's going on. > http://reviews.llvm.org/D12494 One particular point: The algorithm checks that SCEVs are equal when their raw pointers are equal. Is

[LLVMdev] Issue with Machine Verifier and earlyclobber

2012 Jul 15

2

On Jul 15, 2012, at 9:20 AM, Borja Ferrer <borja.ferav at gmail.com> wrote: > Jakob, one more hint, I've placed some asserts around the code you added and noticed that the InlineSpiller::insertReload() function is not being called. > > 2012/7/14 Borja Ferrer <borja.ferav at gmail.com> > Hello Jakob, > > I'm still getting the error, I can give you any other

[RFC] New pass: LoopExitValues

2015 Sep 11

5

Hi Steve it seems the general consensus is that the patch feels like a work-around for a problem with LSR (and possibly other loop transformations) that introduces redundant instructions. It is probably best to file a bug and a few of your test cases. Thanks Gerolf > On Sep 10, 2015, at 4:37 PM, Steve King via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > On Thu, Sep 10, 2015

[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass

2011 Dec 02

5

...; + %i2 = load double* %c, align 8 > + %add = fadd double %mul, %i2 > + %arrayidx3 = getelementptr inbounds double* %a, i64 1 > + %i3 = load double* %arrayidx3, align 8 > + %arrayidx4 = getelementptr inbounds double* %b, i64 1 > + %i4 = load double* %arrayidx4, align 8 > + %mul5 = fmul double %i3, %i4 > + %arrayidx6 = getelementptr inbounds double* %c, i64 1 > + %i5 = load double* %arrayidx6, align 8 > + %add7 = fadd double %mul5, %i5 > + %mul9 = fmul double %add, %i1 > + %add11 = fadd double %mul9, %i2 > + %mul13 = fmul double %add7, %i4 > + %a...

[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass

2011 Dec 14

0

...ign 8 > > + %add = fadd double %mul, %i2 > > + %arrayidx3 = getelementptr inbounds double* %a, i64 1 > > + %i3 = load double* %arrayidx3, align 8 > > + %arrayidx4 = getelementptr inbounds double* %b, i64 1 > > + %i4 = load double* %arrayidx4, align 8 > > + %mul5 = fmul double %i3, %i4 > > + %arrayidx6 = getelementptr inbounds double* %c, i64 1 > > + %i5 = load double* %arrayidx6, align 8 > > + %add7 = fadd double %mul5, %i5 > > + %mul9 = fmul double %add, %i1 > > + %add11 = fadd double %mul9, %i2 > > + %mul13 = fmu...

[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass

2011 Nov 23

0

On Mon, 2011-11-21 at 21:22 -0600, Hal Finkel wrote: > On Mon, 2011-11-21 at 11:55 -0600, Hal Finkel wrote: > > Tobias, > > > > I've attached an updated patch. It contains a few bug fixes and many > > (refactoring and coding-convention) changes inspired by your comments. > > > > I'm currently trying to fix the bug responsible for causing a compile

[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass

2011 Dec 02

0

...ign 8 > > + %add = fadd double %mul, %i2 > > + %arrayidx3 = getelementptr inbounds double* %a, i64 1 > > + %i3 = load double* %arrayidx3, align 8 > > + %arrayidx4 = getelementptr inbounds double* %b, i64 1 > > + %i4 = load double* %arrayidx4, align 8 > > + %mul5 = fmul double %i3, %i4 > > + %arrayidx6 = getelementptr inbounds double* %c, i64 1 > > + %i5 = load double* %arrayidx6, align 8 > > + %add7 = fadd double %mul5, %i5 > > + %mul9 = fmul double %add, %i1 > > + %add11 = fadd double %mul9, %i2 > > + %mul13 = fmu...

[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass

2011 Nov 22

5

On Mon, 2011-11-21 at 11:55 -0600, Hal Finkel wrote: > Tobias, > > I've attached an updated patch. It contains a few bug fixes and many > (refactoring and coding-convention) changes inspired by your comments. > > I'm currently trying to fix the bug responsible for causing a compile > failure when compiling >
