thr3ads.net - similar to: "LoopVectorize fails to vectorize code with condition on reduction"

Displaying 20 results from an estimated 1000 matches similar to: "LoopVectorize fails to vectorize code with condition on reduction"

LoopVectorize fails to vectorize more complex loops

2018 Jul 07

Hello. Could you please tell me why the first loop of the following program (also maybe the commented loop) doesn't get vectorized with LoopVectorize (from a recent LLVM build from the SVN repository from Jun 2018)? typedef short TYPE; TYPE data[1400][1200]; void kernel_covariance(int m, int n, TYPE mean[1200]) { int i, j, k; for (j = 0; j < m; j++) { mean[j] =

[LLVMdev] Packed instructions generated by LoopVectorize?

2013 Apr 03

Hi, I have a question about LoopVectorize. I wrote a simple test case, a dot product loop, and found that packed instructions are generated when the input arrays are integer, but not when they are float or double. If I modify the float example in http://llvm.org/docs/Vectorizers.html by adding restrict to the input arrays, packed instructions are generated. Although it should not be required I tried

[LLVMdev] Packed instructions generated by LoopVectorize?

2013 Apr 03

Hi Tyler, Try adding -ffast-math. We can only vectorize reduction variables if it is safe to reorder floating point operations. Thanks, Nadav On Apr 3, 2013, at 10:29 AM, "Nowicki, Tyler" <tyler.nowicki at intel.com> wrote: > Hi, > > I have a question about LoopVectorize. I wrote a simple test case, a dot product loop and found that packed instructions are
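Why reordering matters can be seen in a small sketch. The function names and the sample data below are hypothetical and not taken from the thread, and the sketch assumes float arithmetic is evaluated in IEEE-754 single precision (as with SSE on x86-64). The first function sums strictly left to right, as the C source specifies; the second keeps two interleaved partial sums, roughly what a vectorizer with a vector width of 2 would compute.

```c
#include <stddef.h>

/* Strict left-to-right float reduction, as written in the source. */
float sum_in_order(const float *a, size_t n) {
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Same reduction with two interleaved partial sums (hypothetical helper
   imitating a vector width of 2). The additions happen in a different
   order than in sum_in_order. */
float sum_two_lanes(const float *a, size_t n) {
    float lane0 = 0.0f, lane1 = 0.0f;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        lane0 += a[i];
        lane1 += a[i + 1];
    }
    if (i < n)
        lane0 += a[i]; /* scalar remainder for odd n */
    return lane0 + lane1;
}

/* Sample input chosen so that rounding makes the two orders disagree:
   1e8f + 1.0f rounds back to 1e8f in single precision. */
static const float sample[4] = {1e8f, 1.0f, -1e8f, 1.0f};
```

Under the stated assumption, sum_in_order(sample, 4) yields 1.0f while sum_two_lanes(sample, 4) yields 2.0f. Because the two orders can disagree, a compiler that must preserve the source semantics cannot pick the second form on its own; a flag such as -ffast-math gives it permission to reassociate.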

[LLVMdev] Packed instructions generated by LoopVectorize?

2013 Apr 04

Thanks, that did it! Are there any plans to enable the loop vectorizer by default? From: Nadav Rotem [mailto:nrotem at apple.com] Sent: Wednesday, April 03, 2013 13:33 PM To: Nowicki, Tyler Cc: LLVM Developers Mailing List Subject: Re: Packed instructions generated by LoopVectorize? Hi Tyler, Try adding -ffast-math. We can only vectorize reduction variables if it is safe to reorder floating

Separate LoopVectorize LLVM pass

2017 Apr 14

Hello. I am trying to create my own LoopVectorize.cpp pass as a separate pass from the LLVM trunk, as described in http://llvm.org/docs/CMake.html#embedding-llvm-in-your-project. Did anybody try something like this? I added close to the end of the .cpp file: /* this line seems to be required - it allows to run this pass as an embedded pass by giving opt -my-loop-vectorize

LoopVectorize fails to vectorize loops with induction variables with PtrToInt/IntToPtr conversions

2017 Jun 20

On 06/20/2017 03:26 AM, Hal Finkel wrote: > Hi, Adrien, Hello Hal! Thanks for your answer! > Thanks for reporting this. I recommend that you file a bug report at > https://bugs.llvm.org/ Will do! > Whenever I see reports of missed optimization opportunities in the face > of ptrtoint/inttoptr, my first question is: why are these instructions > present in the first place? At

[LLVMdev] E = L->begin() in LoopVectorize

2014 Mar 18

Hi, I'm studying the loop vectorizer. I don't understand the code yet, but it does not look right to assign L->begin() to E. Is it a typo? Thanks, Liang diff --git a/lib/Transforms/Vectorize/LoopVectorize.cpp b/lib/Transforms/Vectorize/LoopVectorize.cpp index 435c005..87b5d79 100644 --- a/lib/Transforms/Vectorize/LoopVectorize.cpp +++ b/lib/Transforms/Vectorize/LoopVectorize.cpp @@

[LLVMdev] E = L->begin() in LoopVectorize

2014 Mar 18

Looking at it now, curious why no tests failed. On Tue, Mar 18, 2014 at 2:48 PM, Jim Grosbach <grosbach at apple.com> wrote: > Almost certainly, yes. Nice catch! > > > On Mar 18, 2014, at 2:38 PM, Liang Wang <netcasper at gmail.com> wrote: > > > Hi, > > > > I'm studying loop vectorizer. I don't understand the code yet. But > > it

LoopVectorize module - some possible enhancements

2016 Aug 21

Hello, Michael, I'd like to ask if we can enhance the LoopVectorize LLVM module (I am currently using a version from Jul 2016). More exactly: - do you envision supporting, in the near future, the LLVM IR gather and scatter intrinsics (as described at http://llvm.org/docs/LangRef.html#llvm-masked-gather-intrinsics and scatter)? I see you have defined some methods that should

[RFC] Make LoopVectorize Aware of SLP Operations

2018 Feb 08

Hi, On 08/02/2018 04:22, Caballero, Diego wrote: > Hi Florian! > > This proposal sounds pretty exciting! Integrating SLP-aware loop vectorization (or the other way around) and SLP into the VPlan framework is definitely aligned with the long term vision and we would prefer this approach to the LoopReroll and InstCombine alternatives that you mentioned. We prefer a generic implementation

[RFC] Make LoopVectorize Aware of SLP Operations

2018 Feb 08

Hi Florian! This proposal sounds pretty exciting! Integrating SLP-aware loop vectorization (or the other way around) and SLP into the VPlan framework is definitely aligned with the long term vision and we would prefer this approach to the LoopReroll and InstCombine alternatives that you mentioned. We prefer a generic implementation that can handle complicated cases to something ad-hoc for some

LoopVectorize fails to vectorize loops with induction variables with PtrToInt/IntToPtr conversions

2017 Jun 17

Hello all, There is a missed vectorization opportunity with clang 4.0 in the attached file. When it is compiled with -O2, the "op_distance" function gets vectorized, but the "op" one does not. For information, this test case has been reduced from a file generated by the Pythran compiler (https://github.com/serge-sans-paille/pythran). If we take a look at the generated

[RFC] Make LoopVectorize Aware of SLP Operations

2018 Feb 06

Hello, We would like to propose making LoopVectorize aware of SLP operations, to improve the generated code for loops operating on struct fields or doing complex math. At the moment, LoopVectorize uses interleaving to vectorize loops that operate on values loaded/stored from consecutive addresses: vector loads/stores are generated to combine consecutive loads/stores and then shufflevector
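A loop of the kind this proposal describes might look like the sketch below. It is not taken from the RFC; the Complex type and the complex_mul function are hypothetical names chosen for illustration. The re and im fields of each element sit at consecutive addresses, so a vectorized version has to load whole elements and then separate the two fields before doing the arithmetic.

```c
#include <stddef.h>

/* Hypothetical struct with two fields stored at consecutive addresses. */
typedef struct {
    float re;
    float im;
} Complex;

/* Element-wise complex multiply: out[i] = x[i] * y[i].
   Each iteration reads both fields of x[i] and y[i] and writes both
   fields of out[i], so memory accesses alternate between re and im. */
void complex_mul(Complex *restrict out, const Complex *restrict x,
                 const Complex *restrict y, size_t n) {
    for (size_t i = 0; i < n; i++) {
        out[i].re = x[i].re * y[i].re - x[i].im * y[i].im;
        out[i].im = x[i].re * y[i].im + x[i].im * y[i].re;
    }
}
```

For example, multiplying 1+2i by 3+4i with this function stores -5 in out[0].re and 10 in out[0].im.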

[LoopVectorizer] Improving the performance of dot product reduction loop

2018 Jul 24

On 07/24/2018 02:58 AM, Nema, Ashutosh wrote: > > > > > > *From:*Hal Finkel <hfinkel at anl.gov> > *Sent:* Tuesday, July 24, 2018 5:05 AM > *To:* Craig Topper <craig.topper at gmail.com>; hideki.saito at intel.com; > estotzer at ti.com; Nemanja Ivanovic <nemanja.i.ibm at gmail.com>; Adam > Nemet <anemet at apple.com>; graham.hunter at

[LLVMdev] LLVM loop vectorizer

2016 Feb 18

Hi Alex, I'm not aware of efforts on loop coalescing in LLVM, but Polly can probably do something like this. Also, one related thought: it might be worth making it a separate pass, not a part of the loop vectorizer. LLVM already has several 'utility' passes (e.g. loop rotation), which primarily aim at enabling other passes. Thanks, Michael > On Feb 15, 2016, at 6:44 AM, RCU

[LLVMdev] LLVM loop vectorizer

2016 Jun 04

Hi Alex, I think the changes you want are actually not vectorizer related. The vectorizer just uses data provided by other passes. What you probably want is to look into the routine Loop::getStartLoc() (see lib/Analysis/LoopInfo.cpp). If you find a way to improve it, patches are welcome:) Thanks, Michael > On Jun 3, 2016, at 6:13 PM, Alex Susu <alex.e.susu at gmail.com> wrote: >

[LLVMdev] LLVM loop vectorizer

2016 Jun 07

Hi Alex, This has been very recently fixed by Hal. See http://reviews.llvm.org/rL270771 Adam > On Jun 4, 2016, at 3:13 AM, Alex Susu via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > Hello. > Mikhail, I come back to this older thread. > I need to do a few changes to LoopVectorize.cpp. > > One of them is related to figuring out the exact C source line

[LoopVectorizer] Improving the performance of dot product reduction loop

2018 Jul 23

On 07/23/2018 06:23 PM, Hal Finkel via llvm-dev wrote: > > On 07/23/2018 05:22 PM, Craig Topper wrote: >> Hello all, >> >> This code https://godbolt.org/g/tTyxpf is a dot product reduction >> loop multiplying sign-extended 16-bit values to produce a 32-bit >> accumulated result. The x86 backend is currently not able to optimize >> it as well as gcc and icc.

[LoopVectorizer] Improving the performance of dot product reduction loop

2018 Jul 23

Hello all, This code https://godbolt.org/g/tTyxpf is a dot product reduction loop multiplying sign-extended 16-bit values to produce a 32-bit accumulated result. The x86 backend is currently not able to optimize it as well as gcc and icc. The IR we are getting from the loop vectorizer has several v8i32 adds and muls inside the loop. These are fed by v8i16 loads and sexts from v8i16 to v8i32. The
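The code behind the godbolt link is not reproduced in this snippet. As a minimal sketch of the loop shape being described, assuming a plain C formulation, the function below (dot_i16 is a hypothetical name) sign-extends each pair of 16-bit inputs to 32 bits, multiplies them, and accumulates into a 32-bit result.

```c
#include <stddef.h>
#include <stdint.h>

/* Dot product of two int16_t arrays with a 32-bit accumulator.
   The casts make the sign extension from 16 to 32 bits explicit;
   C's integer promotions would perform it implicitly as well. */
int32_t dot_i16(const int16_t *a, const int16_t *b, size_t n) {
    int32_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += (int32_t)a[i] * (int32_t)b[i];
    return acc;
}
```

With a = {1, -2, 3} and b = {4, 5, -6}, the function returns 4 - 10 - 18 = -24. Since this is an integer reduction, reordering the additions does not change the result as long as the accumulator does not overflow, so no fast-math style flag is needed to vectorize it.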

LLVM Loop vectorizer - 2 vector.body blocks appear

2016 Aug 01

Hello. Mikhail, with the more recent version of the LoopVectorize.cpp code (retrieved at the beginning of July 2016) I ran the following piece of C code: void foo(long *A, long *B, long *C, long N) { for (long i = 0; i < N; ++i) { C[i] = A[i] + B[i]; } } The vectorized LLVM program I obtain contains 2 vector.body blocks - one named
