Displaying 4 results from an estimated 4 matches for "computeexitlimitfromcond".

2013 Sep 27
2
[LLVMdev] Trip count and Loop Vectorizer
...ctorThreshold). So right now, for any loop with a trip count of 0, or with a value >= 16, LV will unroll. With the change to the lower bound, it will also include the loop with 0 trip count. SCEV returns 0 trip count for this case, because it identifies that there is no backedge taken.

ScalarEvolution::ComputeExitLimitFromCond () {
  ...
  if (L->contains(FBB) == !CI->getZExtValue())
  { }
  else
    // The backedge is never taken.
    return getConstant(CI->getType(), 0);
}

From: Nadav Rotem [mailto:nrotem at apple.com]
Sent: Friday, September 27, 2013 1:03 PM
To: Murali, Sriram
Cc: llvmdev at cs.uiuc.edu
Subject: Re...
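The condition quoted in this snippet can be read as a small truth table. The sketch below is a toy model, not LLVM code: `backedgeNeverTaken`, `condValue`, and `falseTargetInLoop` are hypothetical names standing in for the `else` branch, `CI->getZExtValue()`, and `L->contains(FBB)`. It assumes the branch is an exiting branch with exactly one successor inside the loop, and it says nothing about the branch body that the snippet leaves empty.

```cpp
#include <cassert>

// Toy model of the quoted check (hypothetical helper, not an LLVM API).
// condValue         : the constant branch condition (CI->getZExtValue() != 0)
// falseTargetInLoop : whether the false successor is inside the loop
//                     (L->contains(FBB))
// Returns true exactly when the quoted code reaches its `else` branch,
// i.e. the case commented "The backedge is never taken".
bool backedgeNeverTaken(bool condValue, bool falseTargetInLoop) {
  return falseTargetInLoop != !condValue;
}

// Walks the four combinations of the two inputs.
void checkTruthTable() {
  assert(backedgeNeverTaken(true, true));    // always branches to the out-of-loop true target
  assert(backedgeNeverTaken(false, false));  // always branches to the out-of-loop false target
  assert(!backedgeNeverTaken(false, true));  // always stays in the loop
  assert(!backedgeNeverTaken(true, false));  // always stays in the loop
}
```

In the two "never taken" rows the constant condition always selects the successor outside the loop, which matches the post's statement that SCEV reports a count of 0 for this case.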
2013 Sep 27
0
[LLVMdev] Trip count and Loop Vectorizer
...So right now, any loop with Trip Count as 0, or with value >=16, LV with unroll. With the change to the lower bound, it will also include the loop with 0 trip count.
> SCEV returns 0 trip count for this case, because it identifies that there is no backedge taken.
>
> ScalarEvolution::ComputeExitLimitFromCond () {
>   …
>   if (L->contains(FBB) == !CI->getZExtValue())
>   { }
>   else
>     // The backedge is never taken.
>     return getConstant(CI->getType(), 0);
> }
> From: Nadav Rotem [mailto:nrotem at apple.com]
> Sent: Friday, September 27, 2013 1:03 PM
> To: Mu...
2013 Sep 27
0
[LLVMdev] Trip count and Loop Vectorizer
Hi Sriram, Thanks for performing this analysis. The problem here, both for memcpy and the vectorizer, is that we can’t predict the size of “n”, even though the only use of “n” is for the loop bound for the alloca [4 x [8 x i32]]. If you change the unroll condition to TC >= 0 then you will disable loop unrolling for all loops because getSmallConstantTripCount returns an unsigned number. You
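The point about the unsigned return value can be illustrated with a short sketch. It is a stand-alone model, not the vectorizer's code: `tripCountOrZero`, `passesLowerBoundZero`, and `passesLowerBound` are hypothetical helpers, and the use of 0 for "unknown" is an assumption about how such a query reports a trip count it cannot determine.

```cpp
#include <cassert>
#include <limits>

// Hypothetical stand-in for a trip-count query that returns an unsigned
// value. Assumption: 0 means "unknown or not a small constant".
unsigned tripCountOrZero(bool known, unsigned count) {
  return known ? count : 0u;
}

// A lower bound of zero filters nothing: every unsigned value satisfies it.
bool passesLowerBoundZero(unsigned tc) { return tc >= 0u; }

// A strictly positive lower bound does separate 0 from real counts.
bool passesLowerBound(unsigned tc, unsigned minTC) { return tc >= minTC; }

// Exercises both predicates on the smallest and largest unsigned values.
void checkBounds() {
  assert(passesLowerBoundZero(0u));
  assert(passesLowerBoundZero(std::numeric_limits<unsigned>::max()));
  assert(!passesLowerBound(tripCountOrZero(false, 8u), 1u));
  assert(passesLowerBound(tripCountOrZero(true, 8u), 1u));
}
```

Because `tc >= 0u` holds for every `unsigned`, a condition written as TC >= 0 applies to all loops alike, which is the concern raised in the reply.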
2013 Sep 27
2
[LLVMdev] Trip count and Loop Vectorizer
Hi, I am trying to get a small loop to *not vectorize* for cases where it doesn't make sense. For instance, this loop:

void foo(int a[4][8], int n) {
  int b[4][8];
  for (int i = 0; i < 4; i++) {
    for (int j = 0; j < n; j++) {
      a[i][j] = b[i][j];
    }
  }
}

* Has a maximum of 8 ints to copy. LLVM tries to use memcpy for the inner loop. It is not helpful to perform