thr3ads.net - search: "scevgep4"

2015 Sep 03

2

[RFC] New pass: LoopExitValues

...eq i32 %inc10, %Size
  br i1 %exitcond27, label %for.cond.cleanup.loopexit, label %for.body.4.lr.ph
for.body.4: ; preds = %for.body.4, %for.body.4.lr.ph
  %lsr.iv8 = phi i32* [ %scevgep9, %for.body.4 ], [ %lsr.iv5, %for.body.4.lr.ph ]
  %lsr.iv3 = phi i32* [ %scevgep4, %for.body.4 ], [ %lsr.iv1, %for.body.4.lr.ph ]
  %lsr.iv = phi i32 [ %lsr.iv.next, %for.body.4 ], [ %Size, %for.body.4.lr.ph ]
  %3 = load i32, i32* %lsr.iv8, align 4, !tbaa !1
  %mul5 = mul i32 %3, %Val
  store i32 %mul5, i32* %lsr.iv3, align 4, !tbaa !1
  %lsr.iv.next = add i32 %lsr.iv, -1
  %sc...

[LLVMdev] Unrolling an arithmetic expression inside a loop

2010 Nov 23

2

...!tbaa !0
  %5 = mul nsw i32 %4, %3
  store i32 %5, i32* %scevgep9, align 4, !tbaa !0
2) In exec1 however it fails to recognize that the temporary variables are not reused anywhere and fails to simplify the arithmetic expression, producing:
  %scevgep = getelementptr i32* %X, i64 %indvar
  %scevgep4 = getelementptr i32* %Y, i64 %indvar
  %scevgep5 = getelementptr i32* %res, i64 %indvar
  %3 = load i32* %scevgep, align 4, !tbaa !0
  %4 = load i32* %scevgep4, align 4, !tbaa !0
  %5 = add nsw i32 %4, %3
  %6 = mul nsw i32 %5, %4
  %7 = mul nsw i32 %4, %4
  %8 = sub i32 %6, %7
  st...
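The add/mul/mul/sub sequence in the snippet computes (y + x) * y - y * y, which is algebraically just x * y. The sketch below is not from the thread: it models the four IR instructions with 32-bit wrap-around arithmetic in Python and compares them against a single multiply. The function names and the assumption that a plain multiply is the simplification the poster hoped for are illustrative only, and the `nsw` flags are ignored here.

```python
MASK32 = 0xFFFFFFFF  # model i32 values as unsigned 32-bit integers


def unsimplified(x: int, y: int) -> int:
    """Mirror the IR: %5 = add, %6 = mul, %7 = mul, %8 = sub."""
    t5 = (y + x) & MASK32    # %5 = add i32 %4, %3
    t6 = (t5 * y) & MASK32   # %6 = mul i32 %5, %4
    t7 = (y * y) & MASK32    # %7 = mul i32 %4, %4
    return (t6 - t7) & MASK32  # %8 = sub i32 %6, %7


def simplified(x: int, y: int) -> int:
    """Hypothetical simplified form: a single multiply."""
    return (x * y) & MASK32


if __name__ == "__main__":
    for x, y in [(3, 4), (0, 7), (0xFFFFFFFF, 0x12345678)]:
        print(hex(unsimplified(x, y)), hex(simplified(x, y)))
```

The two functions agree for every input because the identity holds in arithmetic modulo 2^32, so wrap-around does not break it.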

[RFC] New pass: LoopExitValues

2015 Sep 10

2

...p.loopexit, label %for.body.4.lr.ph
> >
> > for.body.4: ; preds = %for.body.4, %for.body.4.lr.ph
> > %lsr.iv8 = phi i32* [ %scevgep9, %for.body.4 ], [ %lsr.iv5, %for.body.4.lr.ph ]
> > %lsr.iv3 = phi i32* [ %scevgep4, %for.body.4 ], [ %lsr.iv1, %for.body.4.lr.ph ]
> > %lsr.iv = phi i32 [ %lsr.iv.next, %for.body.4 ], [ %Size, %for.body.4.lr.ph ]
> > %3 = load i32, i32* %lsr.iv8, align 4, !tbaa !1
> > %mul5 = mul i32 %3, %Val
> > store i32 %mul5, i32* %lsr.iv3, alig...

[RFC] New pass: LoopExitValues

2015 Sep 26

2

...* %Dst14, i32 %lsr.iv10
  %uglygep1516 = bitcast i8* %uglygep15 to i32*
  br label %for.body.4
for.body.4: ; preds = %for.body.4, %for.body.4.lr.ph
  %lsr.iv8 = phi i32* [ %scevgep9, %for.body.4 ], [ %uglygep13, %for.body.4.lr.ph ]
  %lsr.iv3 = phi i32* [ %scevgep4, %for.body.4 ], [ %uglygep1516, %for.body.4.lr.ph ]
  %lsr.iv = phi i32 [ %lsr.iv.next, %for.body.4 ], [ %Size, %for.body.4.lr.ph ]
  %1 = load i32, i32* %lsr.iv8, align 4, !tbaa !0
  %mul5 = mul i32 %1, %Val
  store i32 %mul5, i32* %lsr.iv3, align 4, !tbaa !0
  %lsr.iv.next = add i32 %lsr.iv,...

[RFC] New pass: LoopExitValues

2015 Sep 01

2

On Mon, Aug 31, 2015 at 5:52 PM, Jake VanAdrighem <jvanadrighem at gmail.com> wrote:
> Do you have some specific performance measurements?
Averaging 4 runs of 10000 iterations each of Coremark on my X86_64 desktop showed:
-O2 performance: +2.9% faster with the L.E.V. pass
-Os size: 1.5% smaller with the L.E.V. pass
In the case of Coremark, the benefit comes mainly from the matrix

[RFC] New pass: LoopExitValues

2015 Sep 23

3

On Wed, Sep 23, 2015 at 12:00 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>> Should we try the patch in its current location, namely after LSR?
> Sure; post the patch as you have it so we can look at what's going on.
http://reviews.llvm.org/D12494
One particular point: The algorithm checks that SCEVs are equal when their raw pointers are equal. Is

[LLVMdev] Bignum development

2010 Jun 13

2

... ; <i128> [#uses=1]
> %27 = add i128 %23, %26 ; <i128> [#uses=1]
> %28 = add i128 %27, %25 ; <i128> [#uses=2]
> %29 = trunc i128 %28 to i64 ; <i64> [#uses=1]
> store i64 %29, i64* %scevgep4.i, align 8
> %30 = lshr i128 %28, 64 ; <i128> [#uses=1]
> %31 = trunc i128 %30 to i64 ; <i64> [#uses=1]
> %exitcond = icmp eq i64 %tmp.i, 999 ; <i1> [#uses=1]
>
> In other words, it just extends everything t...

[LLVMdev] Bignum development

2010 Jun 12

0

...2.i to i128 ; <i128> [#uses=1]
  %27 = add i128 %23, %26 ; <i128> [#uses=1]
  %28 = add i128 %27, %25 ; <i128> [#uses=2]
  %29 = trunc i128 %28 to i64 ; <i64> [#uses=1]
  store i64 %29, i64* %scevgep4.i, align 8
  %30 = lshr i128 %28, 64 ; <i128> [#uses=1]
  %31 = trunc i128 %30 to i64 ; <i64> [#uses=1]
  %exitcond = icmp eq i64 %tmp.i, 999 ; <i1> [#uses=1]
In other words, it just extends everything to an i128 and adds. T...
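The pattern in the quoted IR (add in i128, truncate to i64 for the low limb, shift right by 64 for the carry) can be modeled in a few lines. This is a minimal sketch, not code from the thread: `add_limb` and `bignum_add` are hypothetical names, limbs are assumed to be unsigned 64-bit values stored least significant first, and Python's unbounded integers stand in for the i128 intermediate.

```python
MASK64 = (1 << 64) - 1  # a limb is an unsigned 64-bit value


def add_limb(a: int, b: int, carry_in: int = 0) -> tuple[int, int]:
    """Add two limbs plus a carry through a wider intermediate."""
    wide = a + b + carry_in   # models the i128 adds of zero-extended operands
    low = wide & MASK64       # models: trunc i128 ... to i64
    carry_out = wide >> 64    # models: lshr i128 ..., 64 followed by trunc
    return low, carry_out


def bignum_add(xs: list[int], ys: list[int]) -> tuple[list[int], int]:
    """Add two equal-length limb lists, least significant limb first."""
    if len(xs) != len(ys):
        raise ValueError("operands must have the same number of limbs")
    out, carry = [], 0
    for a, b in zip(xs, ys):
        low, carry = add_limb(a, b, carry)
        out.append(low)
    return out, carry


if __name__ == "__main__":
    print(bignum_add([MASK64, MASK64], [1, 0]))
```

Because each limb and the incoming carry fit well inside 128 bits, the wide sum never overflows, and the carry out is always 0 or 1.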

[LLVMdev] Bignum development

2010 Jun 13

0

...<i128> [#uses=1]
>> %27 = add i128 %23, %26 ; <i128> [#uses=1]
>> %28 = add i128 %27, %25 ; <i128> [#uses=2]
>> %29 = trunc i128 %28 to i64 ; <i64> [#uses=1]
>> store i64 %29, i64* %scevgep4.i, align 8
>> %30 = lshr i128 %28, 64 ; <i128> [#uses=1]
>> %31 = trunc i128 %30 to i64 ; <i64> [#uses=1]
>> %exitcond = icmp eq i64 %tmp.i, 999 ; <i1> [#uses=1]
>>
>> In other words, it just...

[LLVMdev] Enabling vectorization with LLVM 3.3 for a DSL emitting LLVM IR

2013 Jul 05

0

On 07/04/2013 01:39 PM, Stéphane Letz wrote:
> Hi,
>
> Our DSL can generate C or directly generate LLVM IR. With LLVM 3.3, we can vectorize the C produced code using clang with -O3, or clang with -O1 then opt -O3 -vectorize-loops. But the same program generating LLVM IR version cannot be vectorized with opt -O3 -vectorize-loops. So our guess is that our generated LLVM IR lacks some

[LLVMdev] Enabling vectorization with LLVM 3.3 for a DSL emitting LLVM IR

2013 Jul 04

3

Hi,
Our DSL can generate C or directly generate LLVM IR. With LLVM 3.3, we can vectorize the C produced code using clang with -O3, or clang with -O1 then opt -O3 -vectorize-loops. But the same program generating LLVM IR version cannot be vectorized with opt -O3 -vectorize-loops. So our guess is that our generated LLVM IR lacks some information that is needed by the vectorization passes to

[LLVMdev] Bignum development

2010 Jun 13

2

...s=1]
>>> %27 = add i128 %23, %26 ; <i128> [#uses=1]
>>> %28 = add i128 %27, %25 ; <i128> [#uses=2]
>>> %29 = trunc i128 %28 to i64 ; <i64> [#uses=1]
>>> store i64 %29, i64* %scevgep4.i, align 8
>>> %30 = lshr i128 %28, 64 ; <i128> [#uses=1]
>>> %31 = trunc i128 %30 to i64 ; <i64> [#uses=1]
>>> %exitcond = icmp eq i64 %tmp.i, 999 ; <i1> [#uses=1]
>>>
>>> In o...

[LLVMdev] Bignum development

2010 Jun 11

3

On Fri, Jun 11, 2010 at 3:28 PM, Bill Hart <goodwillhart at googlemail.com> wrote:
> Hi Eli,
>
> On 11 June 2010 22:44, Eli Friedman <eli.friedman at gmail.com> wrote:
>> On Fri, Jun 11, 2010 at 10:37 AM, Bill Hart <goodwillhart at googlemail.com> wrote:
>>> a) What plans are there to support addition, subtraction,
>>> multiplication, division,

[LLVMdev] Bignum development

2010 Jun 11

4

Hi all, After searching for a decent compiler backend for ages (google sometimes isn't helpful), I recently stumbled upon LLVM. Woot!! I work on bignum arithmetic (I'm a professional mathematician) and have recently decided to switch from developing GPL'd bignum code to BSD licensed code. (See http://www.mpir.org/ which I contributed to for a while - a fork of GMP). Please bear with
