similar to: Vectorizers code ownership

2016 Nov 08
2
Vectorizers code ownership
+1 Thanks Nadav for your help over the last few years! Andrea On Mon, Nov 7, 2016 at 9:20 PM, Matthew Simpson via llvm-dev < llvm-dev at lists.llvm.org> wrote: > +1 > > -- Matt > > On Sun, Nov 6, 2016 at 1:00 AM, Nadav Rotem via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> It's been a while since I worked on the vectorizers and I think that it's
2016 Nov 09
3
Vectorizers code ownership
Hi Quentin, Thank you for bringing this up. I planned to finish the discussion on the vectorizer before starting the discussion on the X86 backend code ownership, but now is a good time. Simon, Sanjay, Craig, Elena, Bruno, Michael, Andrea, and Chandler have made significant contributions to the X86 backend in the last few years. I think that Craig Topper would be a great code owner, assuming he wants
2014 Mar 12
4
[LLVMdev] Autovectorization questions
In order to vectorize code like this LLVM needs to prove that “A[i*7]” does not wrap in the address space. It fails to do so and so LLVM doesn’t vectorize this loop even if we try to force it. The following loop will be vectorized if we force it: int foo(int * A, int * B, int n, int k) { for (int i = 0; i < 1024; ++i) A[i] += B[i*k]; } So will this loop: int foo(int * restrict A, int
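For reference, the first loop quoted in this snippet can be written as a small runnable function. This is only a sketch: the name `add_strided` and the `TRIP_COUNT` constant are introduced here for illustration, the quoted function's unused `n` parameter is dropped, and a non-negative `k` is assumed. It shows the access pattern under discussion and says nothing about whether a particular compiler will vectorize it.

```c
/* Fixed trip count taken from the quoted loop. */
enum { TRIP_COUNT = 1024 };

/* Runnable form of the quoted loop (hypothetical name). The read from B is
   strided by k, so B must hold at least 1023*k + 1 elements; k >= 0 is an
   assumption of this sketch. */
void add_strided(int *A, const int *B, int k) {
    for (int i = 0; i < TRIP_COUNT; ++i)
        A[i] += B[i * k];
}
```

The write to `A` is contiguous while the read from `B` jumps by `k` elements per iteration, which is why the snippet talks about forcing vectorization rather than expecting it by default.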
2016 Jun 16
2
[RFC] Allow loop vectorizer to choose vector widths that generate illegal types
Some thoughts: o To determine the VF for a loop with mixed data sizes, choosing the smallest data size ensures each vector register used is full, while choosing the largest minimizes the number of vector registers used. Whether one of these or some size in between is better depends on the target’s costs for the vector operations, the availability of registers, and possibly control/memory divergence and trip count. “This is
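The trade-off in this snippet comes down to simple arithmetic. The sketch below is an illustration only, not LLVM code: `registers_needed` is a hypothetical helper, and the 128-bit register width and the i8/i32 mix used in the note that follows are assumed values.

```c
#include <assert.h>

/* Hypothetical helper, not an LLVM API: number of vector registers needed to
   hold vf elements of elem_bits bits each when one register is reg_bits wide. */
unsigned registers_needed(unsigned vf, unsigned elem_bits, unsigned reg_bits) {
    assert(reg_bits != 0);
    unsigned total_bits = vf * elem_bits;
    return (total_bits + reg_bits - 1) / reg_bits; /* round up to whole registers */
}
```

Assuming 128-bit registers and a loop that mixes i8 and i32 data: a VF of 16 (derived from the smallest type) fills one register with i8 values but needs `registers_needed(16, 32, 128)`, that is 4 registers, for the i32 values. A VF of 4 (derived from the largest type) needs one register for each type, but the i8 register is only a quarter full.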
2016 Jun 15
8
[RFC] Allow loop vectorizer to choose vector widths that generate illegal types
Hello, Currently the loop vectorizer will, by default, not consider vectorization factors that would make it generate types that do not fit into the target platform's vector registers. That is, if the widest scalar type in the scalar loop is i64, and the platform's largest vector register is 256-bit wide, we will not consider a VF above 4. We have a command line option (-mllvm
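The limit described in this snippet is a single division. The following is a minimal sketch under stated assumptions: `max_legal_vf` is a hypothetical name, not a function in LLVM, and it models only the "vector fits in one register" rule as worded above.

```c
#include <assert.h>

/* Hypothetical helper, not an LLVM API: the largest VF for which a vector of
   the loop's widest scalar type still fits in one vector register. */
unsigned max_legal_vf(unsigned reg_bits, unsigned widest_scalar_bits) {
    assert(widest_scalar_bits != 0 && widest_scalar_bits <= reg_bits);
    return reg_bits / widest_scalar_bits;
}
```

With the numbers from the snippet, `max_legal_vf(256, 64)` evaluates to 4, matching the statement that no VF above 4 is considered for an i64 loop on a 256-bit target.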
2016 Jun 16
2
[RFC] Allow loop vectorizer to choose vector widths that generate illegal types
Hi Michael, Thank you for working on this. The loop vectorizer tries a bunch of different vectorization factors and stops at the widest word size mostly because of compile time concerns. For every vectorization factor that we check, we have to scan all of the instructions in the loop and make multiple calls into TTI. If you decide to increase the VF enumeration space then you will linearly
2016 Nov 10
2
X86 backend code ownership
Alright, works for me then! Q. > On Nov 10, 2016, at 10:43 AM, Michael Kuperstein via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > +1 > > On Thu, Nov 10, 2016 at 7:15 AM, Sanjay Patel via llvm-dev <llvm-dev at lists.llvm.org> wrote: > +1 - especially since I think Craig convinced Intel that LLVM isn't just a hobby
2012 Nov 17
2
[LLVMdev] [cfe-dev] !!! 3.2 Release branch patching and the Code Owners
----- Original Message ----- > From: "Joe Abbey" <joe.abbey at gmail.com> > To: "Nadav Rotem" <nrotem at apple.com> > Cc: llvmdev at cs.uiuc.edu > Sent: Saturday, November 17, 2012 1:25:04 PM > Subject: Re: [LLVMdev] [cfe-dev] !!! 3.2 Release branch patching and the Code Owners > > > On Nov 17, 2012, at 12:57 PM, Nadav Rotem <nrotem at
2016 Sep 26
2
RFC: New intrinsics masked.expandload and masked.compressstore
| |How would this work in this case? The result would need to affect the |legality and cost of the memory instruction. From your poster, it looks |like we're talking about loops with constructs like this: | |for (i =0; i < N; i++) { | if (topVal > b[i]) { | *dst = a[i]; | dst++; | } |} | |is this loop vectorizable at all without these constructs? Good
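The loop quoted in this snippet can be written as a self-contained function to make its store pattern explicit. This is a sketch for illustration: the name `compress_copy`, the parameter names, and the returned count are introduced here and do not come from the thread.

```c
#include <stddef.h>

/* Scalar form of the quoted loop (hypothetical name): copy a[i] to the next
   free slot of dst whenever top_val > b[i]. Returns the number of elements
   written. dst must have room for up to n elements. */
size_t compress_copy(int *dst, const int *a, const int *b, size_t n, int top_val) {
    size_t written = 0;
    for (size_t i = 0; i < n; i++) {
        if (top_val > b[i]) {
            dst[written] = a[i]; /* store position depends on earlier iterations */
            written++;
        }
    }
    return written;
}
```

The position of each store depends on how many earlier iterations passed the test, which is the cross-iteration dependence that makes the snippet ask whether the loop is vectorizable without a compressing store.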
2012 Sep 09
1
[LLVMdev] isSafeToSpeculativelyExecute() for CallInst
Hi Nadav, On 08/09/12 22:51, Nadav Rotem wrote: > > On Aug 19, 2012, at 2:55 PM, "Kuperstein, Michael M" > <michael.m.kuperstein at intel.com> wrote: > >> Hello, >> Currently, llvm::isSafeToSpeculativelyExecute() always returns false for Call >> instructions. >> This has actual performance
2016 Sep 25
5
RFC: New intrinsics masked.expandload and masked.compressstore
| |Hi Elena, | |Technically speaking, this seems straightforward. | |I wonder, however, how target-independent this is in a practical |sense; will there be an efficient lowering when targeting any other |ISA? I don't want to get into the territory where, because the |vectorizer is supposed to be architecture independent, we need to |add target-independent intrinsics for all
2013 Nov 15
6
[LLVMdev] Limit loop vectorizer to SSE
On Nov 15, 2013, at 12:36 PM, Renato Golin <renato.golin at linaro.org> wrote: > On 15 November 2013 20:24, Joshua Klontz <josh.klontz at gmail.com> wrote: > Agreed, is there a pass that will insert a runtime alignment check? Also, what's the easiest way to get at TargetTransformInfo::getRegisterBitWidth() so I don't have to hard code 32? Thanks! > > I think
2014 Mar 12
2
[LLVMdev] Autovectorization questions
Hi, I'm reading http://llvm.org/docs/Vectorizers.html and have a few questions. I hope someone can answer them. The Loop Vectorizer can vectorize code that becomes a sequence of scalar instructions that scatter/gather memory (http://llvm.org/docs/Vectorizers.html#scatter-gather). int foo(int *A, int *B, int n, int k) { for (int i = 0; i < n; ++i) A[i*7] += B[i*k]; } I
2016 Nov 10
2
X86 backend code ownership
+1 - especially since I think Craig convinced Intel that LLVM isn't just a hobby project for him. :) On Thu, Nov 10, 2016 at 5:08 AM, Andrea Di Biagio via llvm-dev < llvm-dev at lists.llvm.org> wrote: > Fwiw, I also think that Craig would be a good code owner. So, my +1 goes > to him :-) > > @Nadav, thanks again for all your kind help and contributions to the x86 >
2013 Nov 15
4
[LLVMdev] Limit loop vectorizer to SSE
Something like:
index 6db7f68..68564cb 100644
--- a/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -1208,6 +1208,8 @@ void InnerLoopVectorizer::vectorizeMemoryInstruction(Instr
   Type *DataTy = VectorType::get(ScalarDataTy, VF);
   Value *Ptr = LI ? LI->getPointerOperand() : SI->getPointerOperand();
   unsigned Alignment = LI ?
2012 Aug 19
2
[LLVMdev] isSafeToSpeculativelyExecute() for CallInst
Hello, Currently, llvm::isSafeToSpeculativelyExecute() always returns false for Call instructions. This has actual performance implications, because loop-invariant code motion makes this check, and will never hoist instructions that are not safe to speculatively execute. Unfortunately, there is currently no way to signal to LICM that a function is safe to speculatively execute. The
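What hoisting a call out of a loop means, and why it counts as speculative, can be shown by hand in plain C. The sketch below is not LICM or any LLVM API: `scale_factor`, `sum_unhoisted`, `sum_hoisted`, and the `call_count` counter are all hypothetical names used only to make the effect observable.

```c
/* Counts how often the callee runs, so the effect of hoisting is observable. */
int call_count = 0;

/* Hypothetical callee: no side effects apart from the demonstration counter. */
int scale_factor(int x) {
    call_count++;
    return x * 2;
}

/* Original form: the loop-invariant call is made on every iteration. */
int sum_unhoisted(const int *data, int n, int x) {
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += data[i] * scale_factor(x);
    return sum;
}

/* Hoisted form: the call runs once before the loop, even when n == 0.
   That unconditional execution is what "speculative" refers to here. */
int sum_hoisted(const int *data, int n, int x) {
    int factor = scale_factor(x);
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += data[i] * factor;
    return sum;
}
```

Both functions return the same sum, but for a three-element array the unhoisted form calls `scale_factor` three times and the hoisted form once. With `n == 0` the hoisted form still makes one call that the original never made, so the transformation is only valid when the callee is safe to execute speculatively.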
2012 Nov 17
0
[LLVMdev] [cfe-dev] !!! 3.2 Release branch patching and the Code Owners
> ----- Original Message ----- >> From: "Joe Abbey" <joe.abbey at gmail.com> >> To: "Nadav Rotem" <nrotem at apple.com> >> Cc: llvmdev at cs.uiuc.edu >> Sent: Saturday, November 17, 2012 1:25:04 PM >> Subject: Re: [LLVMdev] [cfe-dev] !!! 3.2 Release branch patching and the Code Owners >> >> >> On Nov 17, 2012, at
2013 Nov 15
0
[LLVMdev] Limit loop vectorizer to SSE
Nadav, I believe aligned accesses to unaligned pointers are precisely the issue. Consider the function `add_u8S` before[1] and after[2] the loop vectorizer pass. There is no alignment assumption associated with %kernel_data prior to vectorization. I can't tell if it's the loop vectorizer or the codegen at fault, but the alignment assumption seems to sneak in somewhere. v/r, Josh [1]
2012 Sep 08
0
[LLVMdev] isSafeToSpeculativelyExecute() for CallInst
On Aug 19, 2012, at 2:55 PM, "Kuperstein, Michael M" <michael.m.kuperstein at intel.com> wrote: > Hello, > > Currently, llvm::isSafeToSpeculativelyExecute() always returns false for Call instructions. > This has actual performance implications, because loop-invariant code motion makes this check, and will never hoist instructions that are not safe to speculatively
2012 Dec 06
2
[LLVMdev] [RFC] "noclone" function attribute
Hi Michael, After some head-scratching and discussion with our tame Khronos member, I agree with you. It comes down to the interpretation of the ambiguous spec. It refers to "the barrier", implying there is some sort of equivalence relation over barriers. The question is, what is that equivalence relation? In your example code: > void f(int foo) { > if (foo) > b();