similar to: [LLVMdev] Trip count and Loop Vectorizer

Displaying 20 results from an estimated 2000 matches similar to: "[LLVMdev] Trip count and Loop Vectorizer"

2013 Sep 27
0
[LLVMdev] Trip count and Loop Vectorizer
Hi Sriram, Thanks for performing this analysis. The problem here, both for memcpy and the vectorizer, is that we can’t predict the size of “n”, even though the only use of ’n’ is for the loop bound for the alloca [4 x [8 x i32]]. If you change the unroll condition to TC >= 0 then you will disable loop unrolling for all loops because getSmallConstantTripCount returns an unsigned number. You
2013 Sep 27
2
[LLVMdev] Trip count and Loop Vectorizer
Hi Nadav, Thanks for the response. I forgot to mention that there is an upper limit of 16 for the Trip Count check, TinyTripCountVectorThreshold = 16; if (TC > 0u && TC < TinyTripCountVectorThreshold). So right now, any loop with Trip Count as 0, or with value >=16, LV with unroll. With the change to the lower bound, it will also include the loop with 0 trip count. SCEV returns 0
2013 Sep 27
0
[LLVMdev] Trip count and Loop Vectorizer
Sriram, The problem is that you want to unroll/vectorize many loops with non-constant loop count - it is a trade-off of which case you estimate as more likely. int foo(int *ptr, int n) { for ( .. i <n) ptr[i] = ... } The question is: is it more likely to have “n” such that unrolling is beneficial or not. Now, you could probably write an analysis that bounds the loop count (for the
2013 Apr 30
3
[LLVMdev] Improving the usability of LNT
Hi Daniel, I made some changes to the LNT perf reporting tool to make it more user friendly by adding some features: 1. Make the sidebar and the navigation bar stationary, so that it is easy to navigate the site 2. Have the pop-down menu for the items in the navigation bar, activate upon hovering the mouse, rather than clicking the item 3. Add a nav-link in the sidebar for the
2013 May 02
0
[LLVMdev] Improving the usability of LNT
Wow, that sounds great! Thanks for working on this, and yes, please, send the patches! --renato On 30 April 2013 16:23, Murali, Sriram <sriram.murali at intel.com> wrote: > Hi Daniel,**** > > I made some changes to the LNT perf reporting tool to make it more user > friendly by adding some features:**** > > **1. **Make the sidebar and the navigation bar stationary,
2013 Apr 15
1
[LLVMdev] State of Loop Unrolling and Vectorization in LLVM
Hi , I have a test case (and a micro benchmark made out of the test case) to check if loop unrolling and loop vectorization is efficiently done on LLVM. Here is the test case (credits: Tyler Nowicki) {code} extern float * array; extern int array_size; float g() { int i; float total = 0; for(i = 0; i < array_size; i++) { total += array[i]; } return total; } {code} When
2013 Sep 27
2
[LLVMdev] Trip count and Loop Vectorizer
On Sep 27, 2013, at 12:47 PM, Arnold Schwaighofer <aschwaighofer at apple.com> wrote: > so you could infer that n must be smaller than 8 (because you know the range of the other dimension). The question is how often does such an example occur, where this is possible, to make such an effort justifiable? smaller equal, of course ;)
2013 Sep 27
0
[LLVMdev] Trip count and Loop Vectorizer
Hey Arnold, I have run into this situation many times while benchmarking. I think it is best if this is addressed using a simple heuristic. For that, we need to identify the loop cost and decide if it makes sense to completely unroll the loop, or partially unroll. I am unsure of the optimal way to implement this though. I want to run it by the list to get any ideas floating around :) Thanks
2013 Jan 29
4
[LLVMdev] [Patch][Review Requested][Compilation Time] Avoid frequent copy of elements in LoopStrengthReduce
Hello, This patch aims to improve compile time performance by increasing the SCEV vector size in LoopStrengthReduce. It is observed that the BaseRegs vector size is 4 in most cases, and elements are frequently copied when it is initialized as SmallVector<const SCEV *, 2> BaseRegs. Our benchmark results show that the compilation time performance improved by ~0.5%. Patch by Wan Xiaofei.
2012 Apr 04
4
[LLVMdev] Disabling x87 instructions for a sub-target
Hello there, I recently started working on the LLVM backend for a target that doesn't support x87 instructions. Currently, I am in the process of completely disabling some x87 instructions such as fcomi, fcompi,... for a specific sub-target. I also do not have SSE enabled for my sub-target, and llvm resorts to fcomi* instructions for FP compare instructions. Is there a way to bypass the
2013 Jan 29
0
[LLVMdev] [Patch][Review Requested][Compilation Time] Avoid frequent copy of elements in LoopStrengthReduce
On Tue, Jan 29, 2013 at 3:59 PM, Murali, Sriram <sriram.murali at intel.com> wrote: > Our benchmark results show that the compilation time performance improved by > ~0.5%. That's fairly small; what was the standard deviation, confidence interval, etc? -- Sean Silva
2013 Feb 28
2
[LLVMdev] Calling with register indirect reference instead of memory indirect reference.
Hi, I am working on a small optimization feature to replace the calls with indirect reference using a memory with an indirect reference using register. The purpose of this feature is to improve the performance of calls to functions referred to by function pointers. The motivation behind this work is that gcc does this optimization. Here is a small test case, that will generate an indirect call
2012 Apr 04
0
[LLVMdev] Disabling x87 instructions for a sub-target
Hi Sriram, I'm not sure if I understand your question correctly: Do you need to generate code that contains no x87 floating-point instructions altogether, but uses calls into a soft-float library instead? That behaviour can be enabled using the "-soft-float" flag, as far as I know. Or is it only about the fcomi* instructions, which are not supported by pre-Pentium Pro chips? Then I
2016 Sep 16
4
SCEV cannot compute the trip count of Simple loop
Hi Deepali, SCEV reports the backedge taken count as "((-1 * (sext i32 (3 + %x) to i64))<nsw> + ((sext i32 (3 + %x) to i64) smax (sext i32 (6 + %x) to i64)))", so symbolically it does have an answer. Ideally SCEV should be able to exploit <nsw> on (3 + %x) and (6 + %x) to fold the expression above to "3", but due to some systemic issues SCEV can't exploit
2010 Apr 06
2
[LLVMdev] Get the loop trip count variable
Thanks a lot for your guys' help!!! I guess once I am able to get *V* (which probably is a pointer to a Value object), then, I can instrument some code at the IR level to dump V. As long as I maintain V at this pass stage, I should be able to dump the loop trip count. This is true, isn't it? Basically, what I am going to do is to add a function call before the loop body, such as:
2010 Aug 12
0
[LLVMdev] Questions about trip count
Dear guys, I am having problems to obtain good information from the LoopInfo. I am always getting a trip count of 0, even though I am clearly passing a loop with a constant bound. I am using this pass below: void testLoopInfo(const Function& F) const { const LoopInfo *LI = &getAnalysis<LoopInfo>(); Function::const_iterator BB = F.begin(), E = F.end(); for (; BB !=
2008 May 09
0
[LLVMdev] [PATCH] Split LoopUnroll pass into mechanism and policy
Hi All, the attached patch performs the splitting in the proposed manner. before applying the patch, please execute svn cp lib/Transforms/Scalar/LoopUnroll.cpp lib/Transforms/Utils/UnrollLoop.cpp to make the patch apply and preserve proper history. Transforms/Utils/UnrollLoop.cpp contains the unrollLoop function, which is now used by the LoopUnroll pass. I've also moved the
2016 Sep 16
3
SCEV cannot compute the trip count of Simple loop
I have modified the example test case for UB error, still it didn’t unroll void foo(int x) { int p, i = 1; int mat[9][9][9]; for (p = (x+1) ; p < (x+3) ;p++) mat[x][p-1][i] = mat[x][p-1][i] + 5; } Regard, Deepali From: Kevin Choi [mailto:code.kchoi at gmail.com] Sent: Friday, September 16, 2016 1:20 PM To: Rai, Deepali Cc: llvm-dev at lists.llvm.org Subject: Re: [llvm-dev] SCEV
2008 May 07
8
[LLVMdev] [PATCH] Split LoopUnroll pass into mechanism and policy
Hello Matthijs, Separating mechanism from policy is a good thing for the LoopUnroll pass. Instead of moving the policy to a subclass though, I think it'd be better to move the mechanism, the unrollLoop function, out to be a standalone utility function, with the LoopInfo object passed in explicitly. FoldBlockIntoPredecessor would also be good to make into a standalone utility function, since
2013 May 23
2
[LLVMdev] LLVM Loop Vectorizer puzzle
Hi, I have the llvm loop vectorizer to complie the following sample: //================= int test(int *a, int n) { for(int i = 0; i < n; i++) { a[i] += i; } return 0; } //================ The corresponded .ll file has a loop preheader: //================ for.body.lr.ph: ; preds = %entry