similar to: Loop vectorizer doesn't try to align vectors on preferred vector alignment

Displaying 20 results from an estimated 3000 matches similar to: "Loop vectorizer doesn't try to align vectors on preferred vector alignment"

2018 Apr 12
0
Loop vectorizer doesn't try to align vectors on preferred vector alignment
On Thu, Apr 12, 2018, 8:40 AM Benoit Meister <meister at reservoir.com> wrote: > Thank you, Ayal! And thanks for the quote, Mehdi. I believe it says that > it would be a normal thing for the Loop Vectorizer to conform to the > backend's preferred alignment constraints as given by the datalayout. > > On Thu, Apr 12, 2018, 3:24 AM Zaks, Ayal <ayal.zaks at intel.com>
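For context, the "preferred vector alignment" discussed in this thread comes from the module's DataLayout. A minimal standalone sketch, assuming an LLVM of roughly this era (where the alignment queries still return a plain unsigned) and an x86-64-style layout string chosen only for illustration:

    #include "llvm/IR/DataLayout.h"
    #include "llvm/IR/DerivedTypes.h"
    #include "llvm/IR/LLVMContext.h"
    #include "llvm/Support/raw_ostream.h"
    using namespace llvm;

    int main() {
      LLVMContext Ctx;
      // Layout string is an illustrative x86-64 example, not taken from the thread.
      DataLayout DL("e-m:e-i64:64-f80:128-n8:16:32:64-S128");
      Type *VecTy = VectorType::get(Type::getFloatTy(Ctx), 4); // <4 x float>
      // ABI alignment is what the target requires; preferred alignment is what
      // the datalayout says the target would like (the two can differ).
      errs() << "ABI align:  " << DL.getABITypeAlignment(VecTy) << "\n";
      errs() << "Pref align: " << DL.getPrefTypeAlignment(VecTy) << "\n";
      return 0;
    }
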
2013 Nov 15
4
[LLVMdev] Limit loop vectorizer to SSE
Something like: index 6db7f68..68564cb 100644 --- a/lib/Transforms/Vectorize/LoopVectorize.cpp +++ b/lib/Transforms/Vectorize/LoopVectorize.cpp @@ -1208,6 +1208,8 @@ void InnerLoopVectorizer::vectorizeMemoryInstruction(Instr Type *DataTy = VectorType::get(ScalarDataTy, VF); Value *Ptr = LI ? LI->getPointerOperand() : SI->getPointerOperand(); unsigned Alignment = LI ?
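The snippet above is cut off before the lines the patch adds; going by the follow-ups later in the thread (which settle on DL->getABITypeAlignment(ScalarDataTy)), one plausible shape of the change is to fall back to the ABI alignment of the scalar element type when the load/store carries no explicit alignment. A hedged sketch only; getMemInstAlignment is a made-up helper name, not the actual patch:

    #include "llvm/IR/DataLayout.h"
    #include "llvm/IR/Instructions.h"
    using namespace llvm;

    // Hypothetical helper: if the memory instruction has no explicit alignment
    // (reported as 0), use the ABI alignment of the scalar type instead.
    static unsigned getMemInstAlignment(const DataLayout &DL, LoadInst *LI,
                                        StoreInst *SI, Type *ScalarDataTy) {
      unsigned Alignment = LI ? LI->getAlignment() : SI->getAlignment();
      if (Alignment == 0)
        Alignment = DL.getABITypeAlignment(ScalarDataTy);
      return Alignment;
    }
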
2017 Mar 14
10
[Proposal][RFC] Epilog loop vectorization
Summarizing the discussion on the implementation approaches. We discussed two approaches: first, running 'InnerLoopVectorizer' again on the epilog loop immediately after vectorizing the original loop, within the same vectorization pass; second, re-running the vectorization pass and limiting the vectorization factor of the epilog loop by metadata. <Approach-2> Challenges with
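For the metadata-based approach, the usual vehicle for capping the vectorization factor of a specific loop is the llvm.loop.vectorize.width hint in the loop's llvm.loop metadata. A sketch of attaching such a hint; addVectorizeWidthHint is a hypothetical helper written for illustration, not code from the proposal:

    #include "llvm/IR/Constants.h"
    #include "llvm/IR/Instructions.h"
    #include "llvm/IR/LLVMContext.h"
    #include "llvm/IR/Metadata.h"
    using namespace llvm;

    // Hypothetical helper: attach !{!"llvm.loop.vectorize.width", i32 Width}
    // to the loop whose latch terminator is LatchTerm.
    static void addVectorizeWidthHint(Instruction *LatchTerm, unsigned Width) {
      LLVMContext &Ctx = LatchTerm->getContext();
      Metadata *WidthOps[] = {
          MDString::get(Ctx, "llvm.loop.vectorize.width"),
          ConstantAsMetadata::get(
              ConstantInt::get(Type::getInt32Ty(Ctx), Width))};
      MDNode *Hint = MDNode::get(Ctx, WidthOps);
      // A loop ID node's first operand is a self-reference; the hints follow.
      MDNode *LoopID = MDNode::getDistinct(Ctx, {nullptr, Hint});
      LoopID->replaceOperandWith(0, LoopID);
      LatchTerm->setMetadata("llvm.loop", LoopID);
    }
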
2016 Sep 25
5
RFC: New intrinsics masked.expandload and masked.compressstore
Hi Elena, Technically speaking, this seems straightforward. I wonder, however, how target-independent this is in a practical sense; will there be an efficient lowering when targeting any other ISA? I don't want to get into the territory where, because the vectorizer is supposed to be architecture independent, we need to add target-independent intrinsics for all
2013 Nov 15
0
[LLVMdev] Limit loop vectorizer to SSE
----- Original Message ----- > From: "Arnold Schwaighofer" <aschwaighofer at apple.com> > To: "Joshua Klontz" <josh.klontz at gmail.com> > Cc: "LLVM Dev" <llvmdev at cs.uiuc.edu> > Sent: Friday, November 15, 2013 4:05:53 PM > Subject: Re: [LLVMdev] Limit loop vectorizer to SSE > > > Something like: > > index
2017 Mar 14
2
[Proposal][RFC] Epilog loop vectorization
On 03/14/2017 11:58 AM, Michael Kuperstein wrote: > I'm still not sure about this, for a few reasons: > > 1) I'd like to try to treat epilogue loops the same way regardless of > whether the main loop was vectorized by hand or automatically. So if > someone hand-wrote an avx-512 16-wide loop, with alias checks, and we > decide it's profitable to vectorize the
2013 Nov 15
2
[LLVMdev] Limit loop vectorizer to SSE
Yes, I was just about to send out: DL->getABITypeAlignment(ScalarDataTy); The question is: does "… ABI alignment for the target …" mean getPrefTypeAlignment or getABITypeAlignment? I would have thought the latter. On Nov 15, 2013, at 4:12 PM, Hal Finkel <hfinkel at anl.gov> wrote: > ----- Original Message ----- >> From: "Arnold Schwaighofer"
2016 Aug 21
2
LoopVectorize module - some possible enhancements
Hello, Michael, I'd like to ask if we can enhance the LoopVectorize LLVM module (I am currently using a version from Jul 2016). More exactly: - do you envision to support in the near future LLVM IR gather and scatter intrinsics (as described at http://llvm.org/docs/LangRef.html#llvm-masked-gather-intrinsics and scatter)? I see you have defined some methods that should
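For reference, the access pattern those intrinsics target is an indexed (non-contiguous) load or store. A tiny illustrative example (names are placeholders) of a loop whose read of B could, in principle, be lowered to llvm.masked.gather on targets that support it:

    // Gather-style access: B is read through an index array, so consecutive
    // iterations touch non-consecutive addresses.
    void gather_like(float *A, const float *B, const int *Idx, int N) {
      for (int i = 0; i < N; ++i)
        A[i] = B[Idx[i]];
    }
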
2017 Mar 14
2
[Proposal][RFC] Epilog loop vectorization
On 03/14/2017 11:21 AM, Adam Nemet wrote: > >> On Mar 14, 2017, at 6:00 AM, Nema, Ashutosh <Ashutosh.Nema at amd.com >> <mailto:Ashutosh.Nema at amd.com>> wrote: >> >> Summarizing the discussion on the implementation approaches. >> Discussed about two approaches, first running 'InnerLoopVectorizer' >> again on the epilog loop immediately after
2017 Feb 28
3
[Proposal][RFC] Epilog loop vectorization
I have tried running both gvn and newgvn, but it did not help in hoisting the alias checks; please check, maybe I have missed something. <TestCase> void foo (char *A, char *B, char *C, int len) { int i = 0; for (i=0 ; i< len; i++) A[i] = B[i] + C[i]; } <Command> $ opt -O3 -gvn test.ll -o test.opt.ll $ opt -O3 -newgvn test.ll -o test.opt.ll "test.ll" is attached, it
2017 Mar 14
4
[Proposal][RFC] Epilog loop vectorization
On 03/14/2017 12:11 PM, Adam Nemet wrote: > >> On Mar 14, 2017, at 9:49 AM, Hal Finkel <hfinkel at anl.gov >> <mailto:hfinkel at anl.gov>> wrote: >> >> >> On 03/14/2017 11:21 AM, Adam Nemet wrote: >>> >>>> On Mar 14, 2017, at 6:00 AM, Nema, Ashutosh <Ashutosh.Nema at amd.com >>>> <mailto:Ashutosh.Nema at
2016 Sep 19
2
RFC: New intrinsics masked.expandload and masked.compressstore
Hi all, AVX-512 ISA introduces new vector instructions VCOMPRESS and VEXPAND in order to allow vectorization of the following loops with two specific types of cross-iteration dependencies: Compress: for (int i=0; i<N; ++i) If (t[i]) *A++ = expr; Expand: for (i=0; i<N; ++i) If (t[i]) X[i] = *A++; else
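Restated as compilable code for clarity (the element types and the expr array are placeholders; the else branch of the expand example is cut off in the snippet above):

    // Compress: conditionally append elements to a densely packed output.
    void compress(const char *t, int *A, const int *expr, int N) {
      for (int i = 0; i < N; ++i)
        if (t[i])
          *A++ = expr[i];
    }

    // Expand: conditionally consume elements from a densely packed input.
    void expand(const char *t, int *X, const int *A, int N) {
      for (int i = 0; i < N; ++i)
        if (t[i])
          X[i] = *A++;
    }
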
2016 Sep 26
2
RFC: New intrinsics masked.expandload and masked.compressstore
How would this work in this case? The result would need to affect the legality and cost of the memory instruction. From your poster, it looks like we're talking about loops with constructs like this: for (i = 0; i < N; i++) { if (topVal > b[i]) { *dst = a[i]; dst++; } } Is this loop vectorizable at all without these constructs? Good
2016 Jun 04
4
[LLVMdev] LLVM loop vectorizer
Hi Alex, I think the changes you want are actually not vectorizer related. Vectorizer just uses data provided by other passes. What you probably might want is to look into routine Loop::getStartLoc() (see lib/Analysis/LoopInfo.cpp). If you find a way to improve it, patches are welcome:) Thanks, Michael > On Jun 3, 2016, at 6:13 PM, Alex Susu <alex.e.susu at gmail.com> wrote: >
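For anyone following that pointer, a minimal sketch of what querying the start location looks like from a pass, assuming a Loop* obtained from LoopInfo (printLoopStartLoc is a made-up helper name):

    #include "llvm/Analysis/LoopInfo.h"
    #include "llvm/IR/DebugLoc.h"
    #include "llvm/Support/raw_ostream.h"
    using namespace llvm;

    // Print the source location LoopInfo associates with a loop, if any.
    static void printLoopStartLoc(Loop *L) {
      DebugLoc Loc = L->getStartLoc();
      if (Loc)
        Loc.print(errs());            // e.g. "file.c:12:3"
      else
        errs() << "<no debug location>";
      errs() << "\n";
    }
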
2013 Nov 15
0
[LLVMdev] Limit loop vectorizer to SSE
Nadav, I believe aligned accesses to unaligned pointers is precisely the issue. Consider the function `add_u8S` before[1] and after[2] the loop vectorizer pass. There is no alignment assumption associated with %kernel_data prior to vectorization. I can't tell if it's the loop vectorizer or the codegen at fault, but the alignment assumption seems to sneak in somewhere. v/r, Josh [1]
2020 Feb 24
4
ORC JIT Weekly #6 -- General initializer support and JITLink optimizations
Hi All, The general initializer support patch has landed (see 85fb997659b plus follow up fixes). Some quick background: Until now ORC, like MCJIT, has handled static initializer discovery by searching for llvm.global_ctors and llvm.global_dtors arrays in the IR added to the JIT. This approach suffers from several drawbacks: 1) It provides no built-in support for other program representations:
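For background on the arrays mentioned above: a file-scope C++ object with a non-trivial constructor is the classic way such an entry gets created; the front end emits an initializer function and lists it in llvm.global_ctors, which is what the old discovery scheme scanned for. A small illustrative example:

    #include <cstdio>

    struct Greeter {
      // The call to this constructor ends up in an initializer function that
      // the front end registers in the module's llvm.global_ctors array.
      Greeter() { std::puts("constructed before main"); }
    };

    static Greeter TheGreeter; // static storage duration triggers registration

    int main() { return 0; }
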
2016 Aug 01
2
LLVM Loop vectorizer - 2 vector.body blocks appear
Hello. Mikhail, with the more recent version of the LoopVectorize.cpp code (retrieved at the beginning of July 2016) I ran the following piece of C code: void foo(long *A, long *B, long *C, long N) { for (long i = 0; i < N; ++i) { C[i] = A[i] + B[i]; } } The vectorized LLVM program I obtain contains 2 vector.body blocks - one named
2013 Oct 14
3
[LLVMdev] A weird, reproducable problem with MCJIT
I switched my Common Lisp compiler to use MCJIT on the weekend and ran into a weird problem compiling one particular function. It crashes with an EXC_BAD_ACCESS error in MCJIT::finalizeObject when calling processFDE. The weird part is that the function does not appear to do anything special and I've whittled it down to the minimum size that still causes the crash. If I remove even one
2013 Oct 14
0
[LLVMdev] A weird, reproducable problem with MCJIT
Hi Christian, Thanks for sharing this. Yaron Keren has been investigating some problems in the EH frame registration code recently, and I think this may be related. It at least sounds similar to the type of variations in behavior based on code size that Yaron was seeing. -Andy -----Original Message----- From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of
2016 Nov 30
5
[RFC] Enable "#pragma omp declare simd" in the LoopVectorizer
Dear all, I have just created a couple of differential reviews to enable the vectorisation of loops that have function calls to routines marked with “#pragma omp declare simd”. They can be (re)viewed here: * https://reviews.llvm.org/D27249 * https://reviews.llvm.org/D27250 The current implementation allows the loop vectorizer to generate vector code for source file as: #pragma omp declare
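The snippet is cut off right after the pragma; a typical source file of the kind described (function and variable names here are illustrative, not taken from the patches) looks like this, compiled with OpenMP support enabled (e.g. -fopenmp):

    // "#pragma omp declare simd" asks the compiler to emit vector variants of
    // bar in addition to the scalar one, so the loop vectorizer can call a
    // vector variant from the vectorized loop in foo.
    #pragma omp declare simd
    float bar(float x) { return x * x + 1.0f; }

    void foo(float *A, const float *B, int N) {
      for (int i = 0; i < N; ++i)
        A[i] = bar(B[i]); // candidate for a call to a vector variant of bar
    }
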