similar to: [LLVMdev] Vector ops out of loops

Displaying 20 results from an estimated 1000 matches similar to: "[LLVMdev] Vector ops out of loops"

2011 Dec 10
0
[LLVMdev] Types inference in tblgen: Multiple exceptions
On Fri, Dec 9, 2011 at 4:12 PM, Ivan Llopard <ivanllopard at gmail.com> wrote: > Hi Eli, > Thanks for your response. Please see my responses below. > > > On 10/12/2011 00:28, Eli Friedman wrote: >> >> On Fri, Dec 9, 2011 at 4:46 AM, Llopard Ivan<ivanllopard at gmail.com> >>  wrote: >>> >>> Hi all, >>> >>> I am writing
2011 Dec 10
1
[LLVMdev] Types inference in tblgen: Multiple exceptions
On 10/12/2011 01:32, Eli Friedman wrote: > On Fri, Dec 9, 2011 at 4:12 PM, Ivan Llopard<ivanllopard at gmail.com> wrote: >> Hi Eli, >> Thanks for your response. Please see my responses below. >> >> >> On 10/12/2011 00:28, Eli Friedman wrote: >>> On Fri, Dec 9, 2011 at 4:46 AM, Llopard Ivan<ivanllopard at gmail.com> >>> wrote:
2011 Dec 09
0
[LLVMdev] Types inference in tblgen: Multiple exceptions
On Fri, Dec 9, 2011 at 4:46 AM, Llopard Ivan <ivanllopard at gmail.com> wrote: > Hi all, > > I am writing a back-end for a processor that has complex type registers. > It has two functional units to perform complex multiplications. >  From clang, I emulate a complex multiplication using vectors and, at > the IR, I got this tblgen-friendly pattern (real component) : >
2011 Dec 09
2
[LLVMdev] Types inference in tblgen: Multiple exceptions
Hi all, I am writing a back-end for a processor that has complex type registers. It has two functional units to perform complex multiplications. From clang, I emulate a complex multiplication using vectors and, at the IR, I got this tblgen-friendly pattern (real component) : (set RARegs:$dst, (insertelt RARegs:$src, (i16 (trunc (add (ncmul (sext (i16
2011 Dec 10
0
[LLVMdev] Types inference in tblgen: Multiple exceptions
On Dec 9, 2011, at 4:12 PM, Ivan Llopard wrote: > Hi Eli, > Thanks for your response. Please see my responses below. > > On 10/12/2011 00:28, Eli Friedman wrote: >> On Fri, Dec 9, 2011 at 4:46 AM, Llopard Ivan<ivanllopard at gmail.com> wrote: >>> Hi all, >>> >>> I am writing a back-end for a processor that has complex type registers.
2020 Jan 11
2
[RFC][SDAG] Convert build_vector of ops on extractelts into ops on input vectors
Thanks so much for your feedback Simon. I am not sure that what I am proposing here is at odds with what you're referring to (here and in the PR you linked). The key difference AFAICT is that the pattern I am referring to is probably more aptly described as "reducing scalarization" than as "vectorization". The reason I say that is that the inputs are vectors and the output
2011 Dec 10
5
[LLVMdev] Types inference in tblgen: Multiple exceptions
Hi Eli, Thanks for your response. Please see my responses below. On 10/12/2011 00:28, Eli Friedman wrote: > On Fri, Dec 9, 2011 at 4:46 AM, Llopard Ivan<ivanllopard at gmail.com> wrote: >> Hi all, >> >> I am writing a back-end for a processor that has complex type registers. >> It has two functional units to perform complex multiplications. >> From clang,
2020 Jan 11
2
[RFC][SDAG] Convert build_vector of ops on extractelts into ops on input vectors
Absolutely. We do it for scalars, so it would likely be a matter of just extending it. But that is one example. The issue of extracting elements, performing an operation on each element individually and then rebuilding the vector is likely more prevalent than that. At least I think that is the case, but I'll do some analysis to see if it is so or not. On Sat, Jan 11, 2020 at 6:15 PM Craig
2012 Feb 28
1
[LLVMdev] How to vectorize a vector type cast?
Since Clang does not seem to allow type casts, such as uchar4 to float4, between vector types, it seems it is necessary to write them as element by element conversions, such as typedef float float4 __attribute__((ext_vector_type(4))); typedef unsigned char uchar4 __attribute__((ext_vector_type(4))); float4 to_float4(uchar4 in) { float4 out = {in.x, in.y, in.z, in.w}; return out; } Running
2009 Dec 02
1
[LLVMdev] More AVX Advice Needed
On Wednesday 02 December 2009 17:24, Eli Friedman wrote: > On Wed, Dec 2, 2009 at 3:08 PM, David Greene <dag at cray.com> wrote: > > On Wednesday 02 December 2009 16:51, Eli Friedman wrote: > >> On Wed, Dec 2, 2009 at 2:44 PM, David Greene <dag at cray.com> wrote: > >> > I'm working on some of the AVX insert/extract instructions.  They're >
2020 Jan 10
2
[RFC][SDAG] Convert build_vector of ops on extractelts into ops on input vectors
I have added a few PPC-specific DAG combines in the past that follow this pattern on specific operations. Now that it appears that this would be useful to do on yet another operation, I'm wondering what people think about doing this in the target-independent DAG Combiner for any legal/custom operation on the target. TL; DR; The generic pattern would look like this: (build_vector (op
2013 Nov 24
2
[LLVMdev] Strange i386 cross build error.
As part of my ELLCC project (http://ellcc.org), I build clang/LLVM on my native x86_64 Linux box and then use it to compile itself. For further sanity checking, I then use that copy to cross compile for other targets (arm, armeb, i386, microblaze, mips, mipsel, ppc, and x86_64). After updating to a recent TOT revision, r195452, I get a strange error when cross compiling for the i386:
2008 Sep 30
4
[LLVMdev] Generalizing shuffle vector
Hi, The current definition of shuffle vector is <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <n x i32> <mask> ; yields <n x <ty>> The first two operands of a 'shufflevector' instruction are vectors with types that match each other and types that match the result of the instruction. The third
2009 Sep 04
3
[LLVMdev] TOT opt does not terminate!
The following code causes opt to not terminate! With TOT this morning, and of a week ago: clang foo.c and clang -O1 foo.c work fine. clang -O2 foo.c and clang -O3 foo.c do not terminate. (At least after 10 minutes) If I generate the bit code (clang-cc -emit-llvmbc) and then run: opt -O3 foo.bc it does not terminate. //foo.c int get_id(int); typedef short
2014 Sep 18
2
[LLVMdev] [Vectorization] Mis match in code generated
Hi, I am trying to understand LLVM vectorization implementation and was looking into both loop and SLP vectorization. test case 1: *int foo(int *a) {int sum = 0,i;for(i=0; i<16; i++) sum += a[i];return sum;}* This code is vectorized by loop vectorizer where we calculate scalar loop cost as 4 and vector loop cost as 2. Since vector loop cost is less and above reduction is legal to
2012 Feb 17
0
[LLVMdev] Folding an insertelt chain
On Feb 17, 2012, at 12:50 AM, Ivan Llopard wrote: > Hello, > > I've added a little combining operation in DAGCombiner to fold a chain of insertelt nodes if that chain is proved to fully overwrite the very first source vector. In which case, I supposed a build_vector is better. It seems to be safe but I don't know if it is correctly implemented or if it is already done somewhere
2014 Sep 18
2
[LLVMdev] [Vectorization] Mis match in code generated
Hi Nadav, Thanks for the quick reply !! Ok, so as of now we are lacking capability to handle flat large reductions. I did go through function vectorizeChainsInBlock() (line number 2862). In this function, we try to vectorize if we have phi nodes in the IR (several if's check for phi nodes) i.e we try to construct tree that starts at chains. Any pointers on how to join multiple trees? I
2012 Feb 17
3
[LLVMdev] Folding an insertelt chain
Hello, I've added a little combining operation in DAGCombiner to fold a chain of insertelt nodes if that chain is proved to fully overwrite the very first source vector. In which case, I supposed a build_vector is better. It seems to be safe but I don't know if it is correctly implemented or if it is already done somewhere else. Please find attached the patch. Regards, Ivan
2014 Nov 10
2
[LLVMdev] [Vectorization] Mis match in code generated
Hi Suyog, Thanks for looking at this. This has recently got itself onto my TODO list too. > I am not sure how much all this will improve the code quality for horizontal reduction > (donno how frequently such pattern of horizontal reduction from same array occurs in real world/SPECS). Actually the main loop of 470.lbm can be SLP vectorized like this. We have three parts to it: A fully
2014 Sep 19
3
[LLVMdev] [Vectorization] Mis match in code generated
Hi Arnold, Thanks for your reply. I tried test case as suggested by you. *void foo(int *a, int *sum) {*sum = a[0]+a[1]+a[2]+a[3]+a[4]+a[5]+a[6]+a[7]+a[8]+a[9]+a[10]+a[11]+a[12]+a[13]+a[14]+a[15];}* so that it has a 'store' in its IR. *IR before vectorization :*target datalayout = "e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128" target triple =