thr3ads.net - similar to: "[LLVMdev] Loop unrolling analysis next steps"

[SCEV] getMulExpr could be extremely slow when creating SCEVs for a long chain of add/mul instructions

2016 Aug 03

Hi, I'm working on a slow-compile problem caused by SCEV (PR28830), and I need your suggestions on how to fix it. The loop below causes ScalarEvolution::getMulExpr to hang.

    int get(unsigned n) {
      unsigned i, j, mult = 1;
      for (i = 0; i < 1; i++) {
        for (j = 0; j < 30; j++) {
          mult *= n++;
        }
      }
      return mult;
    }

the inner loop is completely unrolled
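To make the shape of the problem concrete, here is a small sketch that is not part of the original thread. It hand-unrolls the same multiply chain for four iterations instead of 30. The function names `get_loop` and `get_unrolled4`, the `trips` parameter, and the `unsigned` return type are assumptions chosen for illustration. After full unrolling, `mult` is a product of consecutive values n * (n+1) * (n+2) * ..., which is the kind of long multiply chain the message describes.

```c
/* Loop form of the reported pattern, with the inner trip count as a
   parameter (hypothetical helper). Unsigned arithmetic wraps modulo
   2^N, so the result is well defined even when the product overflows. */
unsigned get_loop(unsigned n, unsigned trips) {
    unsigned mult = 1;
    for (unsigned j = 0; j < trips; j++) {
        mult *= n++;
    }
    return mult;
}

/* The same computation after unrolling by hand for trips == 4.
   Each statement multiplies the running product by the next value of n,
   so the final value is n * (n + 1) * (n + 2) * (n + 3). */
unsigned get_unrolled4(unsigned n) {
    unsigned mult = 1;
    mult *= n;
    mult *= n + 1;
    mult *= n + 2;
    mult *= n + 3;
    return mult;
}
```

With 30 iterations the unrolled chain has 30 such multiplies, each operand being an add of the original `n`.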

llvm is getting slower, January edition

2017 Jan 18

On 1/18/17 3:55 PM, Davide Italiano via llvm-dev wrote: > On Tue, Jan 17, 2017 at 6:02 PM, Mikhail Zolotukhin > <mzolotukhin at apple.com> wrote: >> Hi, >> >> Continuing recent efforts in understanding compile time slowdowns, I looked at some historical data: I picked one test and tried to pin-point commits that affected its compile-time. The data I have is not 100%

llvm is getting slower, January edition

2017 Jan 18

Hi, Continuing recent efforts in understanding compile time slowdowns, I looked at some historical data: I picked one test and tried to pin-point commits that affected its compile-time. The data I have is not 100% accurate, but hopefully it helps to provide an overview of what's going on with compile time in LLVM and give a better understanding of what changes usually impact compile time.

llvm is getting slower, January edition

2017 Jan 20

Ah but how did you compile the clang-4.0 you were using? Does it run faster if you compile it with clang-4.0? :) On Fri, Jan 20, 2017 at 4:09 AM, Mehdi Amini via llvm-dev < llvm-dev at lists.llvm.org> wrote: > Hi, > > On this topic, I just tried to build ToT with clang-3.9.1 and clang-4.0 > and the total time to complete `ninja clang` on this machine went from > 12m54s to

Loop Unrolling Fail in Simple Vectorized loop

2016 Oct 13

Thanks for the explanation. But I am a little confused with the following fact. Can't LLVM keep vectorizable_elements as a symbolic value and convert the loop to say; for(unsigned i = 0; i < vectorizable_elements ; i += 2){ //main loop } for(unsigned i=0 ; i < vectorizable_elements % 2; i++){ //fix up } Why does it have to reason about the range of vectorizable_elements? Even

Loop Unrolling Fail in Simple Vectorized loop

2016 Oct 13

If count > MAX_UINT-4 your loop loops indefinitely with an increment of 4, I think. On Thu, Oct 13, 2016 at 4:42 PM, Charith Mendis via llvm-dev < llvm-dev at lists.llvm.org> wrote: > So, I tried unrolling the following simple loop. > > int unroll(unsigned * a, unsigned * b, unsigned *c, unsigned count){ > > for(unsigned i=0; i<count; i++){ > > a[i] =
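The wraparound concern in this reply can be tried out at a smaller scale. The sketch below is an illustration only, not code from the thread: it models `for (i = 0; i < count; i += step)` with an 8-bit counter so the wrap happens after a few dozen steps. The name `counter_wraps` and the 8-bit width are assumptions.

```c
#include <stdint.h>

/* Simulates `for (i = 0; i < count; i += step)` with an 8-bit counter.
   Returns 1 if the increment wraps past UINT8_MAX (or makes no progress)
   while the exit test still holds, and 0 if the loop exits normally.
   For a step that divides 256, such as 4, a wrap means the counter
   revisits the same values forever and the real loop never exits. */
int counter_wraps(uint8_t count, uint8_t step) {
    uint8_t i = 0;
    while (i < count) {
        uint8_t next = (uint8_t)(i + step);  /* reduced modulo 256 */
        if (next <= i) {
            return 1;  /* wrapped around, or step == 0 */
        }
        i = next;
    }
    return 0;
}
```

With a step of 4, a bound of 252 still exits because the counter lands exactly on 252, while a bound of 253 or more lets the counter step from 252 past 255 and wrap to 0.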

question about llvm partial unrolling/runtime unrolling

2015 Oct 12

Hi, I am trying to do loop unrolling with loops that don't have a constant loop counter. I would highly appreciate it if anyone could help me with this. What I want to do is to turn loop (n) { <loop body> } into loop (n/4) { <loop body> <loop body> <loop body> <loop body> } loop (n%4) { <loop
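The transformation asked for here can be written out by hand. The following is a minimal sketch, assuming a simple array sum as the loop body; `sum_rolled` and `sum_unrolled4` are hypothetical names, and this is hand-written C rather than the output of any LLVM pass.

```c
#include <stddef.h>

/* Reference: the original rolled loop over n elements. */
unsigned sum_rolled(const unsigned *a, size_t n) {
    unsigned s = 0;
    for (size_t i = 0; i < n; i++) {
        s += a[i];
    }
    return s;
}

/* Unrolled by 4 with a runtime trip count: a main loop that runs
   n / 4 times with four copies of the body, followed by a remainder
   loop that runs n % 4 times. */
unsigned sum_unrolled4(const unsigned *a, size_t n) {
    unsigned s = 0;
    size_t i = 0;
    size_t main_end = n - (n % 4);  /* largest multiple of 4 not above n */

    for (; i < main_end; i += 4) {
        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
    }
    for (; i < n; i++) {            /* at most 3 leftover iterations */
        s += a[i];
    }
    return s;
}
```

Bounding the main loop by `main_end` rather than `n` keeps it from reading past the end of the array when `n` is not a multiple of 4.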

question about llvm partial unrolling/runtime unrolling

2015 Oct 16

Hi Hal, I did opt.exe -S -debug -loop-unroll -unroll-runtime=true -unroll-count=4 csShader.ll and it prints out: Args: opt.exe -S -debug -loop-unroll -unroll-runtime=true -unroll-count=4 csShader.ll Loop Unroll: F[build_cs_5_0] Loop %loop_entry Loop Size = 82 partially unrolling with count: 1 Thanks, Frances On Thu, Oct 15, 2015 at 9:35 PM, Hal Finkel <hfinkel at anl.gov>

LoopSimplify pass prevents loop unrolling

2017 Jun 30

Hi All, In the attached test case, there is an unnested loop with 2 iterations. The loop latch block is terminated by an unconditional branch, so simplifycfg folds the almost empty latch block into its predecessor which is the loop header. This results in an additional backedge in the CFG, so when LoopRotate pass is called it canonicalizes the loop into a nested loop. However, now the loop

[IndVarSimplify] Narrow IV's are not eliminated resulting in inefficient code

2016 Apr 23

Hi Sanjoy, Thank you for looking into this! Yes, your patch does fix my larger test case too. My algorithm gets a 2x performance improvement with the patch, as the loop now has fewer instructions and unrolls successfully without any extra #pragmas. I also ran the LLVM tests against the patch. There are 6 new failures: Analysis/LoopAccessAnalysis/number-of-memchecks.ll

[LLVMdev] alloc_size metadata

2012 Jun 01

Hi, Sorry for the delay; comments below. >>>> This is actually non-trivial to accomplish. >>>> Metadata doesn't count as a user, so internal functions with no >>>> other usage will get removed. >>> >>> I thought that it is possible to have passes run before the optimizer >>> performs such deletions. Is this not practical? Another

LoopSimplify pass prevents loop unrolling

2017 Jun 30

On 6/30/2017 7:48 AM, Balaram Makam via llvm-dev wrote: > > Edit. Predecessor -> successor. > > *From:* llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] *On Behalf > Of *Balaram Makam via llvm-dev > *Sent:* Friday, June 30, 2017 10:47 AM > *To:* llvm-dev at lists.llvm.org > *Subject:* [llvm-dev] LoopSimplify pass prevents loop unrolling > > Hi All, > >

Loop Unrolling Fail in Simple Vectorized loop

2016 Oct 12

Hi all, Attached herewith is a simple vectorized function with loops performing a simple shuffle. I want all loops (inner and outer) to be unrolled by 2, so I used -unroll-count=2. The inner loops (with k as the induction variable and constant trip counts) unroll fully, but the outer loop (with j) fails to unroll. The LLVM code is also attached with the inner loops fully unrolled. To

[LLVMdev] [PATCH] Teaching ScalarEvolution to handle IV=add(zext(trunc(IV)), Step)

2012 Dec 10

Hello all, I wanted to get some feedback on this patch for ScalarEvolution. It addresses a performance problem I am seeing for a simple benchmark. Starting with this C code:

    signed char foo(void)
    {
      const int count = 8000;
      signed char result = 0;
      int j;

      for (j = 0; j < count; ++j) {
        result += (result_t)(3);
      }

      return result;
    }

I
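The arithmetic behind a narrow accumulator like `result` can be sketched as follows. This is an illustration under stated assumptions, not the patch under discussion: it uses `uint8_t` instead of `signed char` so that wraparound is well defined in C, and the names `accumulate_narrow` and `accumulate_closed_form` are hypothetical. It shows that truncating after every add gives the same value as computing the sum at full width and truncating once, which is what makes a closed form for such a loop possible.

```c
#include <stdint.h>

/* Adds 3 to an 8-bit accumulator `count` times. The addition happens at
   int width (integer promotion) and is truncated back to 8 bits on each
   iteration, mirroring an add of a zero-extended truncated value. */
uint8_t accumulate_narrow(int count) {
    uint8_t result = 0;
    for (int j = 0; j < count; ++j) {
        result = (uint8_t)(result + 3);
    }
    return result;
}

/* Closed form: compute 3 * count at full width, truncate once.
   Assumes count is non-negative. */
uint8_t accumulate_closed_form(int count) {
    return (uint8_t)(3u * (unsigned)count);
}
```

For `count = 8000` both functions return 192, since 24000 modulo 256 is 192.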

[LLVMdev] SCEV implementation and limitations, do we need "pow"?

2014 Feb 05

Hi, I was looking at some bugs to play with, and I started with http://llvm.org/bugs/show_bug.cgi?id=18606 As I commented there, a loop is unrolled and exhibits this pattern:

    %mul.1 = mul i32 %mul, %mul
    %mul.2 = mul i32 %mul.1, %mul.1
    ....

With an unroll factor of 32, the last multiply has 2^32 terms in its SCEV expression. (I mean I expect it would have those terms if I was patient
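The growth described here is easy to check with a small sketch. The code below is illustrative only and does not use any LLVM API; `square_chain` and `flattened_operands` are hypothetical names. Each step squares the previous value, so after k steps the value is the starting value raised to the power 2^k, and a representation that flattens the product into a single operand list would need 2^k operands.

```c
#include <stdint.h>

/* Applies v = v * v `steps` times, like the chain
   %mul.1 = mul %mul, %mul; %mul.2 = mul %mul.1, %mul.1; ...
   Arithmetic is modulo 2^32, matching i32 multiplication. */
uint32_t square_chain(uint32_t x, unsigned steps) {
    uint32_t v = x;
    for (unsigned i = 0; i < steps; i++) {
        v = (uint32_t)((uint64_t)v * v);  /* widen, then truncate to 32 bits */
    }
    return v;
}

/* Number of operands if the product after `steps` squarings were
   flattened into one n-ary multiply: it doubles at every step.
   Valid for steps < 64. */
uint64_t flattened_operands(unsigned steps) {
    return (uint64_t)1 << steps;
}
```

A power-style node, as the subject line suggests, would let the same value be described with a base and an exponent instead of an operand list that doubles at each step.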

LoopSimplify pass prevents loop unrolling

2017 Jun 30

On 6/30/2017 11:38 AM, Balaram Makam wrote: > > Thanks Eli, > > I was looking at this code which keeps track of loop headers but is > checking if the destination of branch is a loop header sufficient? > This prevents merging empty preheaders into the loop headers as well. > There isn't really any reason to collapse preheaders anyway; LoopSimplify will recreate them,

[LLVMdev] SCEV implementation and limitations, do we need "pow"?

2014 Feb 08

On 2/7/14, 10:24 AM, Andrew Trick wrote: > > On Feb 5, 2014, at 12:54 AM, Mehdi Amini <mehdi.amini at silkan.com > <mailto:mehdi.amini at silkan.com>> wrote: > >> Hi, >> >> I was looking at some bugs to play with, and I started with >> http://llvm.org/bugs/show_bug.cgi?id=18606 >> >> As I commented there, a loop is unrolled and exhibit

[LLVMdev] SCEV bottom value

2012 Oct 08

On Sun, 7 Oct 2012 18:53:59 -0700 Preston Briggs <preston.briggs at gmail.com> wrote: > I'd like a value, call it Bottom, such that > > SE->getAddExpr(Bottom, X) => Bottom > SE->getMulExpr(Bottom, X) => Bottom > isKnownPredicate(any, Bottom, X) => false > etc. > > > I can write code to make NULL work like I want, but it would be > simpler
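The behaviour requested for Bottom is that of an absorbing element. The toy model below sketches those rules in plain C; it is not the ScalarEvolution API, and `Expr`, `BOTTOM`, `make_const`, `expr_add`, `expr_mul`, and `expr_known_less` are all hypothetical names introduced for illustration.

```c
#include <stdbool.h>

/* A toy expression value: either a known integer or Bottom. */
typedef struct {
    bool is_bottom;
    long value;      /* meaningful only when is_bottom is false */
} Expr;

static const Expr BOTTOM = { true, 0 };

Expr make_const(long v) {
    Expr e = { false, v };
    return e;
}

/* add(Bottom, X) => Bottom, and symmetrically. */
Expr expr_add(Expr a, Expr b) {
    if (a.is_bottom || b.is_bottom) return BOTTOM;
    return make_const(a.value + b.value);
}

/* mul(Bottom, X) => Bottom, and symmetrically. */
Expr expr_mul(Expr a, Expr b) {
    if (a.is_bottom || b.is_bottom) return BOTTOM;
    return make_const(a.value * b.value);
}

/* Any predicate involving Bottom is not known to hold. */
bool expr_known_less(Expr a, Expr b) {
    if (a.is_bottom || b.is_bottom) return false;
    return a.value < b.value;
}
```

Because every operation checks for Bottom first, callers can chain operations without testing each intermediate result and inspect only the final value.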

[LLVMdev] SCEV bottom value

2012 Oct 08

Hi Preston, I was wondering ... "Bottom" is a bit overloaded as far as terms go. Would SCEVNaN be a better name for this beast? Sameer. > -----Original Message----- > From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On > Behalf Of Sameer Sahasrabuddhe > Sent: Monday, October 08, 2012 9:16 AM > To: preston.briggs at gmail.com > Cc: LLVM

[LLVMdev] [polly] scev codegen (first step to remove the dependence on ivcanon pass)

2012 Dec 03

Tobias Grosser wrote: > You create a map from the old_loop to a symbolic expression. What type would > this symbolic expression have? Would it be a SCEVExpr? evaluateAtIteration takes a scev, so apply will take a scev, or a map (loop->scev). You can always build a ScevUnknown from an SSA name and use that in the apply. > At the moment, we calculate at the beginning of each >
