search for: 4xi32

Displaying 15 results from an estimated 15 matches for "4xi32".

2009 Jul 23
1
[LLVMdev] Case where VSETCC DAGCombiner hack doesn't work
...eeding into it, giving us really atrocious code. IMO, the solution to this is to have a legalize-types action for vectors that corresponds to "promote" on scalars. In this case, since X86 supports VSETCC, the 4 x i1 SETCC should "vector promote" to a VSETCC node with a 4xi32 result, the and should vector promote to 4xi32, and the sext should vector promote as a vector sext_inreg. I don't think that implementing this is particularly hard, but I have plenty of other things I'm working on right now. Is anyone else interested in working on this? -Chris
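As a rough illustration of the promoted form described in this snippet (a comparison whose result is a 4xi32 mask, followed by an `and` on 4xi32), here is a minimal C sketch using the GCC/Clang `vector_size` extension. The type name `v4i32` and the function name `masked_select_lt` are hypothetical and do not come from the thread. In these extensions a vector comparison yields all-ones (-1) in lanes where the predicate holds and 0 elsewhere, which behaves like a sign-extended i1 per lane.

```c
#include <stdint.h>

/* Four 32-bit lanes; a C-level stand-in for <4 x i32> (name is hypothetical). */
typedef int32_t v4i32 __attribute__((vector_size(16)));

/* Keep x[i] where a[i] < b[i], otherwise produce 0.
   The comparison result is itself a four-lane i32 mask (-1 or 0 per lane),
   so the AND operates directly on full-width lanes. */
v4i32 masked_select_lt(v4i32 a, v4i32 b, v4i32 x) {
    v4i32 mask = a < b;   /* lane-wise compare, -1 for true, 0 for false */
    return mask & x;      /* lane-wise and on the 4xi32 mask */
}
```

Whether a given compiler lowers this to a single vector compare plus an `and` depends on the target and optimization level; the sketch only shows the data shapes involved.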
2016 Mar 18
2
generate vectorized code
> On Mar 18, 2016, at 1:47 PM, Rail Shafigulin <rail at esenciatech.com> wrote: > > Yes this IR does not build or shuffle any vector. Try to write a function that takes 8 ints and a pointer to a <4xi32>, builds two vectors with the 8 ints, > > This might sound like a dumb question, but how does one build a vector of ints out of regular ints in IR? See: http://llvm.org/docs/LangRef.html#vector-operations In short, the IR has "insertelement", which maps to "INSERT_VECTOR...
2016 Mar 18
2
generate vectorized code
...ECTOR, MVT::v4i32, Expand); > setOperationAction(ISD::EXTRACT_VECTOR_ELT, MVT::v4i32, Expand); > setOperationAction(ISD::VECTOR_SHUFFLE, MVT::v4i32, Expand); Yes, this IR does not build or shuffle any vector. Try to write a function that takes 8 ints and a pointer to a <4xi32>, builds two vectors with the 8 ints, sums them, and stores the result to the pointer. > > In other words I left the code as is. > > However if I use a .c code and run it through clang, I don't see any vector instructions. I'm puzzled. What am I doing wrong? There seems to...
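The test function suggested in this snippet could be written along the following lines. This is a minimal C sketch, assuming the GCC/Clang `vector_size` extension; the names `v4i32` and `sum8_store` are hypothetical and are not taken from the thread. Building the two vectors from scalars is the part that would be expected to exercise vector-build lowering in a backend, though the exact nodes produced depend on the compiler and target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Four 32-bit lanes; a C-level stand-in for <4 x i32> (name is hypothetical). */
typedef int32_t v4i32 __attribute__((vector_size(16)));

/* Takes 8 ints and a pointer to a 4 x i32 vector, builds two vectors
   from the ints, adds them lane by lane, and stores the sum. */
void sum8_store(int32_t a0, int32_t a1, int32_t a2, int32_t a3,
                int32_t b0, int32_t b1, int32_t b2, int32_t b3,
                v4i32 *out) {
    assert(out != NULL);
    v4i32 a = {a0, a1, a2, a3};   /* first vector built from scalars */
    v4i32 b = {b0, b1, b2, b3};   /* second vector built from scalars */
    *out = a + b;                 /* lane-wise add, then vector store */
}
```

Compiling this with something like `clang -O2 -S -emit-llvm` and inspecting the output is one way to see which vector operations the front end emits for a particular target.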
2017 Sep 15
2
What should a truncating store do?
...111111011111110111111101111111. Or it can be written as a packed vector which I think would resemble 0b00001111111111111111111111111111. This would mean the memory layout changes depending on how/whether the legaliser breaks large vectors down into smaller types. Is this the case? For example, <4xi32> => <4 x i31> converts to two <2 x i32> => <2 x i31> stores on a target with <2 x i32> legal but would not be split if <4 x i32> were declared legal. Thanks Jon On Fri, Sep 15, 2017 at 7:41 PM, Friedman, Eli <efriedma at codeaurora.org> wrote: >...
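The two bit patterns quoted in this snippet are consistent with four all-ones 7-bit elements written either one per byte or end to end. That reading is an assumption, since the start of the snippet is truncated. Under that assumption, the following C sketch (with hypothetical function names `store_padded` and `store_packed`) shows how the two layouts differ for the same element values.

```c
#include <assert.h>
#include <stdint.h>

/* Padded layout: each 7-bit element keeps its own byte,
   so the top bit of every byte is unused padding. */
uint32_t store_padded(const uint8_t elems[4]) {
    uint32_t word = 0;
    for (int i = 0; i < 4; ++i) {
        assert(elems[i] < 128);                        /* must fit in 7 bits */
        word |= (uint32_t)(elems[i] & 0x7Fu) << (8 * i);
    }
    return word;
}

/* Packed layout: elements are laid end to end, 7 bits each,
   so the 4 unused bits collect at the top of the word. */
uint32_t store_packed(const uint8_t elems[4]) {
    uint32_t word = 0;
    for (int i = 0; i < 4; ++i) {
        assert(elems[i] < 128);                        /* must fit in 7 bits */
        word |= (uint32_t)(elems[i] & 0x7Fu) << (7 * i);
    }
    return word;
}
```

With all four elements set to 127, `store_padded` yields 0x7F7F7F7F and `store_packed` yields 0x0FFFFFFF, matching the two binary strings in the snippet. The sketch says nothing about which layout LLVM actually uses; that is the open question in the thread.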
2018 Mar 20
1
Polly -polly-prevect-width
I'm using Polly with vec-width=16. By default my IR emits <16xi32>, with the remainder as <4xi32>. I want my IR to emit <16xi32>, with the remainder as <8xi32>. How can I do this? I'm trying to use -polly-prevect-width. Please help.
2018 Mar 22
1
TargetOpcode::KILL confusion
...explain the semantics of TargetOpcode::KILL? Specifically, in this example, which register is killed? Would it be legal for operands 0 and 1 to refer to different registers? 128B %R3<def> = KILL %R3, %R3_1<imp-use>, %R3_23<imp-use> (In my out-of-tree target, %R3 is a <4xi32> register, %R3_1 is an i32 sub-register of %R3, and %R3_23 is a <2xi32> sub-register of %R3). Thanks, Nick
2015 Jan 16
3
[LLVMdev] Overloaded intrinsics: name explosion
Philip Reames wrote: >> 1. Introduce aAny. > > Having a generic any type seems fine. I assume you'd create something like > an llvm_any_type in Intrinsics.td? That's precisely what ifavpAny is about: integer, float, array, vector, pointer Any. aAny is meant for array-Any, and I wonder why so few people care about arrays. I'll go ahead with Any and llvm_any_type.
2010 May 31
1
[LLVMdev] Error with instruction selection
...ary register. The physical return register is then overwritten in the next call. (This is visible when calling "llc -view-isel-dags -view-sched-dags". The first graph is OK, the second is not.) The problem goes away if I: -have the getPtr return anything other than <4xfloat>* or <4xi32>* (e.g. <4xfloat> or float* work just fine) -do not load from or store to the pointer - e.g. just returning the pointer works. -target any other processor than CellSPU (ok, some backends assert on this code, and the PIC assembly I didn't understand :) ) Any explanation on what is going...
2016 Mar 18
4
generate vectorized code
...di.amini at apple.com> > wrote: > >> >> On Mar 18, 2016, at 1:47 PM, Rail Shafigulin <rail at esenciatech.com> >> wrote: >> >> Yes this IR does not build or shuffle any vector. Try to write a function >>> that takes 8 ints and a pointer to a <4xi32>, builds two vectors with the 8 >>> ints, >>> >> >> This might sound like a dumb question, but how does one build a vector of >> ints out of regular ints in IR? >> >> >> See: http://llvm.org/docs/LangRef.html#vector-operations >> >>...
2017 Sep 15
2
What should a truncating store do?
...written as a packed >> vector which I think would resemble 0b00001111111111111111111111111111. >> >> This would mean the memory layout changes depending on how/whether the >> legaliser breaks large vectors down into smaller types. Is this the case? >> For example, <4xi32> => <4 x i31> converts to two <2 x i32> => <2 x i31> >> stores on a target with <2 x i32> legal but would not be split if <4 x i32> >> were declared legal. >> > > Vectors get complicated; I don't recall all the details of what the c...
2018 Jul 23
2
KNL Vectorization with larger vector width
...is. Each time I debug it, it returns vectorized IR in gdb. My goal is simple: when I mention my target name in opt, it should vectorize while keeping the vector width at the highest supported by my target, which is 2048. So $ opt -O3 -mytarget 1.ll -o 1_opt.ll 1_opt.ll should emit <2048xi32>, <1024xi32>, ..., <32xi32>, etc. How to achieve this? Please help. Thank You Regards On Fri, Jul 13, 2018 at 12:40 AM, Hal Finkel <hfinkel at anl.gov> wrote: > > On 07/12/2018 02:32 PM, hameeza ahmed via llvm-dev wrote: > > Hello, > > If we pass march=knl,...
2016 Mar 18
2
generate vectorized code
> On Mar 18, 2016, at 12:52 PM, Mehdi Amini <mehdi.amini at apple.com> wrote: > >> >> On Mar 18, 2016, at 12:45 PM, Rail Shafigulin <rail at esenciatech.com <mailto:rail at esenciatech.com>> wrote: >> >> On Thu, Mar 17, 2016 at 2:41 PM, Rail Shafigulin <rail at esenciatech.com <mailto:rail at esenciatech.com>> wrote: >> On Thu,
2010 May 31
0
[LLVMdev] Finding Merge nodes in CFG (ambika@cse.iitb.ac.in)
...al return register is then overwritten in the next > call. (This is visible when calling "llc -view-isel-dags > -view-sched-dags". The first graph is OK, the second is not.) > > The problem goes away if I: > -have the getPtr return anything other than <4xfloat>* or <4xi32>* (e.g. > <4xfloat> or float* work just fine) > -do not load from or store to the pointer - e.g. just returning the > pointer works. > -target any other processor than CellSPU (ok, some backends assert on > this code, and the PIC assembly I didn't understand :) ) > >...
2008 Sep 30
4
[LLVMdev] Generalizing shuffle vector
...ype(4) )) float float4; typedef __attribute__(( ext_vector_type(8) )) float float8; float8 f8; float4 f4a, f4b, f4c; f4a = f8.hi; f8.hi = f4b; f8.lo = f4c; where hi and lo represent the high half and low half of the vector. The outgoing IR is %f4a = shufflevector <8xf32>%f8, undef, <4xi32> <0, 1, 2, 3> %f8 = shufflevector <4xf32>%f4b, <4xf32>%f4c, <8xi32> <0, 1, 2, 3, 4, 5, 6, 7> The problem with generating insert and extracts is that we can generate poor code %tmp16 = extractelement <4 x float> %f4b, i32 0 %f8a = insert...
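The generalized shuffle in this snippet (inputs and result with different element counts) can be modeled in scalar code. The following C sketch, with the hypothetical name `shuffle_f32`, assumes the usual shufflevector convention that mask indices select from the concatenation of the two inputs. It is only a reference model of the semantics, not how a backend would implement it.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar model of a generalized shuffle:
   result[i] = concat(a, b)[mask[i]], where a and b each hold n_in elements
   and the result holds n_out elements (n_out may differ from n_in).
   b may be NULL (standing in for undef) if no mask index refers to it. */
void shuffle_f32(const float *a, const float *b, size_t n_in,
                 const int *mask, size_t n_out, float *result) {
    for (size_t i = 0; i < n_out; ++i) {
        assert(mask[i] >= 0 && (size_t)mask[i] < 2 * n_in);
        size_t idx = (size_t)mask[i];
        result[i] = (idx < n_in) ? a[idx] : b[idx - n_in];
    }
}
```

With this model, concatenating two 4-element vectors into an 8-element one uses `n_in = 4`, `n_out = 8` and the mask 0 through 7, while extracting four elements from an 8-element vector uses `n_in = 8`, `n_out = 4`, a NULL second input, and a mask such as 0, 1, 2, 3.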
2017 Sep 25
0
What should a truncating store do?
...111111011111110111111101111111. Or it can be written as a packed vector which I think would resemble 0b00001111111111111111111111111111. This would mean the memory layout changes depending on how/whether the legaliser breaks large vectors down into smaller types. Is this the case? For example, <4xi32> => <4 x i31> converts to two <2 x i32> => <2 x i31> stores on a target with <2 x i32> legal but would not be split if <4 x i32> were declared legal. Vectors get complicated; I don't recall all the details of what the code generator currently does/is supp...