similar to: LoopVectorize fails to vectorize more complex loops

Displaying 20 results from an estimated 6000 matches similar to: "LoopVectorize fails to vectorize more complex loops"

2018 Jun 11
2
LoopVectorize fails to vectorize code with condition on reduction
Hello. I'm not able to vectorize this simple C loop doing basically what could be called predicated sum-reduction: #define NMAX 1000 int colOccupied[NMAX]; void Func(int N) { int numSol = 0; for (int c = 0; c < N; c++) { if (colOccupied[c] == 0) numSol++; } return numSol; } The compiler
2017 Apr 14
2
Separate LoopVectorize LLVM pass
Hello. I am trying to create my own LoopVectorize.cpp pass as a separate pass from the LLVM trunk, as described in http://llvm.org/docs/CMake.html#embedding-llvm-in-your-project. Did anybody try something like this? I added close to the end of the .cpp file: /* this line seems to be required - it allows to run this pass as an embedded pass by giving opt -my-loop-vectorize
2016 Aug 21
2
LoopVectorize module - some possible enhancements
Hello, Michael, I'd like to ask if we can enhance the LoopVectorize LLVM module (I am currently using a version from Jul 2016). More exactly: - do you envision to support in the near future LLVM IR gather and scatter intrinsics (as described at http://llvm.org/docs/LangRef.html#llvm-masked-gather-intrinsics and scatter)? I see you have defined some methods that should
2015 Feb 26
1
[LLVMdev] RFC: Loop versioning for LICM
Hi Ashutosh, Have you been following the recent Loop Access Analysis work? LAA was split out from the Loop Vectorizer that have been performing the kind of loop versioning that you describe. The main reason was to be able to share this functionality with other passes. Loop Access Analysis is an analysis pass that computes basic memory dependence and the runtime checks. The versioning decision
2015 Feb 26
6
[LLVMdev] RFC: Loop versioning for LICM
I like to propose a new loop multi versioning optimization for LICM. For now I kept this for LICM only, but it can be used in multiple places. The main motivation is to allow optimizations stuck because of memory alias dependencies. Most of the time when alias analysis is unsure about memory access and it says may-alias. This un surety from alias analysis restrict some of the memory based
2016 Aug 01
2
LLVM Loop vectorizer - 2 vector.body blocks appear
Hello. Mikhail, with the more recent version of the LoopVectorize.cpp code (retrieved at the beginning of July 2016) I ran the following piece of C code: void foo(long *A, long *B, long *C, long N) { for (long i = 0; i < N; ++i) { C[i] = A[i] + B[i]; } } The vectorized LLVM program I obtain contains 2 vector.body blocks - one named
2016 Feb 18
3
[LLVMdev] LLVM loop vectorizer
Hi Alex, I'm not aware of efforts on loop coalescing in LLVM, but probably polly can do something like this. Also, one related thought: it might be worth making it a separate pass, not a part of loop vectorizer. LLVM already has several 'utility' passes (e.g. loop rotation), which primarily aims at enabling other passes. Thanks, Michael > On Feb 15, 2016, at 6:44 AM, RCU
2012 Dec 10
3
[LLVMdev] [PATCH] Teaching ScalarEvolution to handle IV=add(zext(trunc(IV)), Step)
Hello all, I wanted to get some feedback on this patch for ScalarEvolution. It addresses a performance problem I am seeing for simple benchmark. Starting with this C code: 01: signed char foo(void) 02: { 03: const int count = 8000; 04: signed char result = 0; 05: int j; 06: 07: for (j = 0; j < count; ++j) { 08: result += (result_t)(3); 09: } 10: 11: return result; 12: } I
2016 Jun 04
4
[LLVMdev] LLVM loop vectorizer
Hi Alex, I think the changes you want are actually not vectorizer related. Vectorizer just uses data provided by other passes. What you probably might want is to look into routine Loop::getStartLoc() (see lib/Analysis/LoopInfo.cpp). If you find a way to improve it, patches are welcome:) Thanks, Michael > On Jun 3, 2016, at 6:13 PM, Alex Susu <alex.e.susu at gmail.com> wrote: >
2016 Jun 07
2
[LLVMdev] LLVM loop vectorizer
Hi Alex, This has been very recently fixed by Hal. See http://reviews.llvm.org/rL270771 Adam > On Jun 4, 2016, at 3:13 AM, Alex Susu via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > Hello. > Mikhail, I come back to this older thread. > I need to do a few changes to LoopVectorize.cpp. > > One of them is related to figuring out the exact C source line
2016 Apr 20
2
[IndVarSimplify] Narrow IV's are not eliminated resulting in inefficient code
​Hi, Would you be able to kindly check and assist with the IndVarSimplify / SCEV problem I got in the latest LLVM, please? Sometimes IndVarSimplify may not eliminate narrow IV's when there actually exists such a possibility. This may affect other LLVM passes and result in inefficient code. The reproducing test 'indvar_test.cpp' is attached. The problem is with the second
2015 Jul 08
7
[LLVMdev] LLVM loop vectorizer
Hello. I am trying to vectorize a CSR SpMV (sparse matrix vector multiplication) procedure but the LLVM loop vectorizer is not able to handle such code. I am using cland and llvm version 3.4 (on Ubuntu 12.10). I use the -fvectorize option with clang and -loop-vectorize with opt-3.4 . The CSR SpMV function is inspired from
2014 Mar 18
4
[LLVMdev] E = L->begin() in LoopVectorize
Hi, I'm studying loop vectorizer. I don't understand the code yet. But it looks not right to assign L->begin() to E. Is it a typo? Thanks, Liang diff --git a/lib/Transforms/Vectorize/LoopVectorize.cpp b/lib/Transforms/Vectorize/LoopVectorize.cpp index 435c005..87b5d79 100644 --- a/lib/Transforms/Vectorize/LoopVectorize.cpp +++ b/lib/Transforms/Vectorize/LoopVectorize.cpp @@
2014 Mar 18
2
[LLVMdev] E = L->begin() in LoopVectorize
Looking at it now, curious why no tests failed. On Tue, Mar 18, 2014 at 2:48 PM, Jim Grosbach <grosbach at apple.com> wrote: > Almost certainly, yes. Nice catch! > > > On Mar 18, 2014, at 2:38 PM, Liang Wang <netcasper at gmail.com> wrote: > > > Hi, > > > > I'm studying loop vectorizer. I don't understand the code yet. But > > it
2013 Nov 08
1
[LLVMdev] loop vectorizer and storing to uniform addresses
I changed the input C to using a 64 bit type for the loop index (this eliminates 'sext' instructions in the IR) Here the IR produced with clang -O0 define float @foo(i64 %start, i64 %end, float* %A) #0 { entry: %start.addr = alloca i64, align 8 %end.addr = alloca i64, align 8 %A.addr = alloca float*, align 8 %sum = alloca [4 x float], align 16 %i = alloca i64, align 8
2015 Apr 24
5
[LLVMdev] Loss of precision with very large branch weights
In PR 22718, we are looking at issues with long running applications producing non-representative frequencies. For example, in these two loops: int g = 0; __attribute__((noinline)) void bar() { g++; } extern int printf(const char*, ...); int main() { int i, j, k; for (i = 0; i < 1000000; i++) bar(); printf ("g = %d\n", g); g = 0; for (i = 0; i < 500000; i++)
2015 Mar 04
2
[LLVMdev] RFC: Loop versioning for LICM
> On Mar 3, 2015, at 1:29 AM, Nema, Ashutosh <Ashutosh.Nema at amd.com <mailto:Ashutosh.Nema at amd.com>> wrote: > > Hi Adam, > > Thanks for looking into LoopVersioning work. > > I have gone through recent LoopAccessAnalysis changes and found some of the stuff > overlaps (i.e. runtime memory check, loop access analysis etc.). LoopVersioning can > use
2016 Dec 09
5
TableGen - Help to implement a form of gather/scatter operations for Mips MSA
Hello. I read on page 4 of http://www.cs.fsu.edu/~whalley/cda5155/chap4.pdf that gather and scatter operations exist for Mips, named LVI and SVI, respectively. Did anyone think of implementing in the LLVM Mips back end (part of the MSA vector instructions) gather and scatter operations? If so, can you share with me the TableGen spec? (I tried to start from LD_DESC_BASE, but it
2015 Jun 12
4
[LLVMdev] Loop Vectorization and Store-Load Forwarding issue
I have been looking into this small test case (Part A) where loop vectorization is disabled due to possible store-load forwarding conflict (Part B). As you can see, due to the presence of dependence distance 2 the loop is vectorizable only for a width of 2. However, the presence of dependence distance 15 (due to y[j-15]) results in store-load forwarding issue as store packet of y[16:17] (iteration
2013 Apr 03
2
[LLVMdev] Packed instructions generaetd by LoopVectorize?
Hi, I have a question about LoopVectorize. I wrote a simple test case, a dot product loop and found that packed instructions are generated when input arrays are integer, but not when they are float or double. If I modify the float example in http://llvm.org/docs/Vectorizers.html by adding restrict to the input arrays packed instructions are generated. Although it should not be required I tried