thr3ads.net - search: "v4i32"

Displaying 20 results from an estimated 163 matches for "v4i32".

2016 Mar 18

generate vectorized code

...(I don't see an equivalent option for the loop vectorizer though). Well, it sort of worked. I added a getRegisterBitWidth(...) but then I got this error:
fatal error: error in backend: Cannot select: 0x5e949a8: v4i32 = BUILD_VECTOR 0x5e91ae8, 0x5e91ae8, 0x5e91ae8, 0x5e91ae8 [ORD=16] [ID=16]
0x5e91ae8: i32 = Constant<0> [ID=5]
0x5e91ae8: i32 = Constant<0> [ID=5]
0x5e91ae8: i32 = Constant<0> [ID=5]
0x5e91ae8: i32 = Constant...

generate vectorized code

2016 Mar 17

...lang: -mllvm -slp-max-reg-size -mllvm 512 (I don't see an equivalent option for the loop vectorizer though). Well, it sort of worked. I added a getRegisterBitWidth(...) but then I got this error:
fatal error: error in backend: Cannot select: 0x5e949a8: v4i32 = BUILD_VECTOR 0x5e91ae8, 0x5e91ae8, 0x5e91ae8, 0x5e91ae8 [ORD=16] [ID=16]
0x5e91ae8: i32 = Constant<0> [ID=5]
0x5e91ae8: i32 = Constant<0> [ID=5]
0x5e91ae8: i32 = Constant<0> [ID=5]
0x5e91ae8: i32 = Constant<0> [ID=5]
What am I mis...

generate vectorized code

2016 Mar 18

...-slp-max-reg-size -mllvm 512 (I don't see an equivalent option for the loop vectorizer though). Well, it sort of worked. I added a getRegisterBitWidth(...) but then I got this error:
fatal error: error in backend: Cannot select: 0x5e949a8: v4i32 = BUILD_VECTOR 0x5e91ae8, 0x5e91ae8, 0x5e91ae8, 0x5e91ae8 [ORD=16] [ID=16]
0x5e91ae8: i32 = Constant<0> [ID=5]
0x5e91ae8: i32 = Constant<0> [ID=5]
0x5e91ae8: i32 = Constant<0> [ID=5]
0x5e91ae8: i32 = Constant<0> [ID...

interpretation of dag output

2016 Mar 23

...graph below? I'm sure once I see one I will be able to plot my own. Any help is appreciated. Here is the graph:
Type-legalized selection DAG: BB#3 'foo:middle.block26'
SelectionDAG has 19 nodes:
0x26438b0: ch = EntryToken [ID=-3]
0x26438b0: <multiple use>
0x2672810: v4i32 = Register %vreg4 [ID=-3]
0x2672a20: v4i32,ch = CopyFromReg 0x26438b0, 0x2672810 [ORD=5] [ID=-3]
0x26761c8: v4i32 = undef [ID=-3]
0x2672a20: <multiple use>
0x2672a20: <multiple use>
0x26761c8: <multiple use>
0x2674b88: v4i32 = vector_shuffle 0x2672a20, 0...

[LLVMdev] Possible CellSPU Bug?

2011 Jan 29

I'm working on enhancing TableGen's type checking and it triggered on a problem in CellSPU's specification:
XSHWv4i32: (set VECREG:v8i16:$rDest, (sext:v8i16 VECREG:v4i32:$rSrc))
It's complaining that v4i32 is not smaller than v8i16, which is true in the sense of vector bit size, and true in the sense of vector element size. To me, a sign extension from i32 to i16 makes no sense. From the .td file, it l...
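
To make the complaint concrete: a sign extension only makes sense when each result element is wider than each source element. The following is a minimal C sketch of that direction (i16 lanes to i32 lanes). The helper name `sext_v4i16_to_v4i32` is hypothetical, and the code says nothing about what CellSPU's XSHW instruction actually does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: sign-extend four 16-bit lanes into four 32-bit lanes.
   Each source lane is narrower than its result lane; going from i32 to i16
   would be a truncation, not a sign extension. */
static void sext_v4i16_to_v4i32(const int16_t src[4], int32_t dst[4]) {
    assert(src != 0 && dst != 0);
    for (int i = 0; i < 4; ++i) {
        /* Value-preserving conversion: the upper 16 bits copy the sign bit. */
        dst[i] = (int32_t)src[i];
    }
}
```

Note that four i16 lanes occupy 64 bits while four i32 lanes occupy 128 bits, so a lane-wise sext changes the total vector size when the lane count stays fixed.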

[LLVMdev] [AArch64] Should we restrict to the pointer type used in ldN/stN intrinsics?

2015 May 05

...td) of such intrinsics use 'LLVMAnyPointerType', which means we can pass any pointer type to such intrinsics. E.g. I tried the following case, ld2.ll:
define { <4 x i32>, <4 x i32> } @test(float* %ptr) {
  %vld2 = call { <4 x i32>, <4 x i32> } @llvm.aarch64.neon.ld2.v4i32.p0f32(float* %ptr)
  ret { <4 x i32>, <4 x i32> } %vld2
}
declare { <4 x i32>, <4 x i32> } @llvm.aarch64.neon.ld2.v4i32.p0f32(float*)
It can pass and generate ld2 with "llc -march=aarch64 < ld2.ll". I just think it's strange that the pointer has n...
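
As a rough illustration of why the pointee type is not needed to define the result, here is a portable C model of a two-register de-interleaving load of 32-bit elements. The type `v4i32x2` and the function `ld2_v4i32_model` are hypothetical names for this sketch; it models the semantics only and is not the intrinsic's implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result type: two registers of four 32-bit lanes each. */
typedef struct { int32_t val[2][4]; } v4i32x2;

/* Model of an LD2-style load: read 8 consecutive 32-bit elements and
   de-interleave them, so even-indexed elements land in val[0] and
   odd-indexed elements land in val[1]. The pointer is untyped on purpose:
   the element width comes from the result type, not from the pointee. */
static v4i32x2 ld2_v4i32_model(const void *ptr) {
    assert(ptr != 0);
    const unsigned char *bytes = (const unsigned char *)ptr;
    v4i32x2 out;
    for (int lane = 0; lane < 4; ++lane) {
        for (int reg = 0; reg < 2; ++reg) {
            int32_t elem;
            memcpy(&elem, bytes + (size_t)(2 * lane + reg) * sizeof elem, sizeof elem);
            out.val[reg][lane] = elem;
        }
    }
    return out;
}
```

Calling it on a buffer holding 0..7 yields {0, 2, 4, 6} in the first register and {1, 3, 5, 7} in the second, whether the caller declared the buffer as `int32_t` or `float`.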

[LLVMdev] vector type legalization

2013 Aug 12

Hi, I am trying to understand how vector type legalization works. In particular, I'm looking at i8 vector types on x86 (with sse42 features). v3i8 gets widened to v4i8 and then operations get unrolled (scalarized) because v4i8 is not a legal type, whereas v4i8 gets promoted to v4i32. Why doesn't v3i8 (or even v4i8) get widened to v16i8? Alternatively, v3i8 could be widened to v4i8 then promoted to v4i32, but this doesn't happen either. Can anyone provide some insight into why vector type legalization works the way it does? Thanks, paul
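
The "widen, then promote" path the poster expected can be modelled in C with the GCC/Clang vector extension. This is only a sketch of the arithmetic, assuming a compiler that supports `vector_size` and vector subscripting; `widen_and_promote` and `add_v3i8` are hypothetical names, and nothing here reflects how LLVM's legalizer is actually implemented.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t v4i32 __attribute__((vector_size(16)));

/* Widen 3 lanes to 4 (the padding lane is set to 0 here; a legalizer would
   treat it as undefined) and promote each i8 lane to an i32 lane. */
static v4i32 widen_and_promote(const int8_t v[3]) {
    v4i32 r = { v[0], v[1], v[2], 0 };
    return r;
}

/* v3i8 add carried out as a single v4i32 add, then narrowed back to i8. */
static void add_v3i8(const int8_t a[3], const int8_t b[3], int8_t out[3]) {
    assert(a != 0 && b != 0 && out != 0);
    v4i32 sum = widen_and_promote(a) + widen_and_promote(b);
    for (int i = 0; i < 3; ++i) {
        /* Keep only the low 8 bits, matching wrap-around i8 arithmetic.
           The final cast to int8_t relies on two's-complement conversion. */
        out[i] = (int8_t)(uint8_t)sum[i];
    }
}
```

The low 8 bits of each 32-bit sum equal the 8-bit wrap-around sum, which is why promotion is a valid way to legalize an integer add.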

[SelectionDAG] Assertion due to MachineMemOperand flags difference.

2017 Oct 13

...ut different MMO flags is possible, so the Flags should be added to the FoldingSetNodeID. 3) Something else I haven't considered. I have a patch posted implementing 2, but don't know if I should look at fixing 1 as well (or perhaps instead). The loads that trigger the assertion are:
t47: v4i32,ch = load<LD16[%0+80](align=8)(dereferenceable)> t20, t46, undef:i64
t69: v4i32,ch = load<LD16[FixedStack1+80](align=8)> t50, t46, undef:i64
I would expect that the second load should also be marked dereferenceable since it's loading from one of the TargetFrames. Am I on the right track...

[LLVMdev] X86 sub_ss and sub_sd sub-register indexes

2012 Jul 26

...and sub_sd indexes used to play a role in selecting the right register class, but not any longer. That is all derived from the instruction descriptions now. As far as I can tell, all sub-register operations involving sub_ss and sub_sd can simply be replaced with COPY_TO_REGCLASS:
def : Pat<(v4i32 (X86Movsd VR128:$src1, VR128:$src2)),
          (VMOVSDrr VR128:$src1, (EXTRACT_SUBREG (v4i32 VR128:$src2), sub_sd))>;
Becomes:
def : Pat<(v4i32 (X86Movsd VR128:$src1, VR128:$src2)),
          (VMOVSDrr VR128:$src1, (COPY_TO_REGCLASS...

Return value from TargetLowering::LowerOperation?

2016 Jan 25

>> ...replace the SDValue with itself.
> I think this error can only happen during type legalization, so my guess is that you are returning a node that has an illegal type. Can you provide more information about the node this is failing with?
On my target v2i16, v4i16, v2i32, v4i32, v2f32, v4f32 are legal and all other vector types are not. Vectors of i16 are a bit special and we need to custom lower bitcasts to/from them. Therefore we do setOperationAction(ISD::BITCAST, VT, Custom); on all MVTs, and in our target's LowerOperation/LowerBitcast we specifically...

[LLVMdev] vector type legalization

2013 Aug 12

>> ...8 vector types on x86 (with sse42 features)
>> v3i8 gets widened to v4i8 and then operations get unrolled (scalarized) because v4i8 is not a legal type whereas v4i8 gets
> This does not sound right. v3i8 -> v4i8 is okay. But the next step should be v4i8 -> v4i32. The operation may be scalarized in the vector legalization phase.
What I'm looking at is a v3i8 add. In DAGTypeLegalizer::WidenVecRes_Binary the operation gets scalarized (DAG.UnrollVector). The input N is "0x51c1d60: v3i8 = add 0x51c1860, 0x51c1c60 [ORD=5] [ID=0]" and the Wide...

[LLVMdev] vector type legalization

2013 Aug 12

> ...icular, I'm looking at i8 vector types on x86 (with sse42 features)
> v3i8 gets widened to v4i8 and then operations get unrolled (scalarized) because v4i8 is not a legal type whereas v4i8 gets
This does not sound right. v3i8 -> v4i8 is okay. But the next step should be v4i8 -> v4i32. The operation may be scalarized in the vector legalization phase.
> promoted to v4i32. Why doesn't v3i8 (or even v4i8) get widened to v16i8? Alternatively, v3i8 could be widened to v4i8 then promoted to v4i32 but this doesn't happen either.
> Can anyone provide some insight...

[LLVMdev] X86 sub_ss and sub_sd sub-register indexes

2012 Jul 26

...... I'm confused. Below you note that they are used in patterns, so they are certainly mentioned more than just in the code above.
> As far as I can tell, all sub-register operations involving sub_ss and sub_sd can simply be replaced with COPY_TO_REGCLASS:
> def : Pat<(v4i32 (X86Movsd VR128:$src1, VR128:$src2)),
>           (VMOVSDrr VR128:$src1, (EXTRACT_SUBREG (v4i32 VR128:$src2), sub_sd))>;
> Becomes:
> def : Pat<(v4i32 (X86Movsd VR128:$src1, VR128:$src2)),
>           (VMOVSDr...

the as-if rule / perf vs. security

2016 Mar 16

> ...power usage of anything. I guess there would be a minor "bad" side effect in that a memory read watchpoint would trigger with the 128 bit load that wouldn't be there with the 32-bit loads. I think it is semantically very similar to this situation as well...
> v4i32 first_call(int *x) { //use all of the array
>   int f0 = x[0];
>   int f1 = x[1];
>   int f2 = x[2];
>   int f3 = x[3];
>   return (v4i32) { f0, f1, f2, f3 };
> }
> v4i32 second_call(int *x) { //use some of the array
>   int s0 = x[0];
>   int s1 = x[1];
>   in...

[SDAG] Recovering pointer types

2016 Dec 26

...destination types are the same (i64 according to the DataLayout). So I end up with this as the initial SDAG:
Initial selection DAG: BB#0 'test:entry'
SelectionDAG has 9 nodes:
  t0: ch = EntryToken
  t3: i64 = Constant<0>
  t2: i64,ch = CopyFromReg t0, Register:i64 %vreg0
  t5: v4i32,ch = load<LD16[%0](tbaa=<0x10038f18a98>)> t0, t2, undef:i64
  t7: ch,glue = CopyToReg t0, Register:v4i32 %V2, t5
  t8: ch = PPCISD::RET_FLAG t7, Register:v4i32 %V2, t7:1
What I would like to do is emit efficient code for cases where the parameter pointer has the same alignment requirem...

Question about quad-register

2017 Sep 10

Hi All, If the target supports a quad-register R0:R1:R2:R3 (Rn is a 32-bit register), is it possible to map the quad-register to v4i32 so that the following example works?
typedef int v4si __attribute__ ((vector_size (16)));
void foo(v4si i) { v4si j = i; }
I don't know how to write CallingConv.td to represent the concept of occupying quad-register R0:R1:R2:R3 once seeing v4i32. Any example that I can refer...

[LLVMdev] initialize register attributes in instruction definition

2014 Jul 31

...instructions?
let Constraints = "$dst.SC_X =1, $src.SC_Y =0" in {
  def GENri : my_instr <op, 0, (outs GPR_V4_R32:$dst), (ins GPR_V4_R32:$src),
                        !strconcat(asmstr, " $dst, ""$src"),
                        [(set v4i32:$dst, (node v4i32:$src))]>;
}
tks kevin

infer correct types from the pattern

2016 Mar 30

I'm getting a "Could not infer all types in pattern!" error in my backend. It is happening on the following instruction: VGETITEM: (set GPR:{i32:f32}:$rD, (extractelt:{i32:f32} VR:{v4i32:v4f32}:$rA, GPR:i32:$rB)). How do I make it use the appropriate types? In other words, if it is f32 then use v4f32, and if it is i32 then use v4i32. I'm not even sure where to start. Any help is appreciated. -- Rail Shafigulin Software Engineer Esencia Technologies...

the as-if rule / perf vs. security

2016 Mar 16

...es you have rightfully cited. Let's assume we're not dealing with volatiles, atomics, or FP operands. We'll even guarantee that the extra loaded value is never used. This is, in fact, the scenario that http://reviews.llvm.org/rL263446 is concerned with. Related C example:
typedef int v4i32 __attribute__((__vector_size__(16)));
// Load some almost-consecutive ints as a vector.
v4i32 foo(int *x) {
  int x0 = x[0];
  int x1 = x[1];
  // int x2 = x[2]; // U can't touch this?
  int x3 = x[3];
  return (v4i32) { x0, x1, 0, x3 };
}
For x86, we notice that we have nearly a v4i32 ve...
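
To make the question in this thread concrete, the sketch below places the three scalar loads next to a single 16-byte load that also reads x[2] and then overwrites that lane. `foo_scalar` and `foo_wide` are hypothetical names, and the code assumes a GCC/Clang-style vector extension and a 32-bit `int`. It is an illustration of the kind of rewrite being debated, not the code from the referenced commit.

```c
#include <assert.h>
#include <string.h>

typedef int v4i32 __attribute__((__vector_size__(16)));

/* Source-level form: x[2] is never read. */
static v4i32 foo_scalar(const int *x) {
    assert(x != 0);
    int x0 = x[0];
    int x1 = x[1];
    int x3 = x[3];
    return (v4i32){ x0, x1, 0, x3 };
}

/* Wide form: one 16-byte read that also touches x[2], then the extra lane
   is discarded. This is only safe if x[2] is readable, which is exactly
   the point under discussion. */
static v4i32 foo_wide(const int *x) {
    assert(x != 0);
    v4i32 v;
    memcpy(&v, x, sizeof v);  /* reads x[0] through x[3] */
    v[2] = 0;                 /* drop the value that was never requested */
    return v;
}
```

Both functions return the same vector for any readable 4-element array. They differ only in which memory locations are accessed, which is what a read watchpoint or a protected page would observe.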

[SDAG] Recovering pointer types

2016 Dec 26

> ...rding to the DataLayout).
> So I end up with this as the initial SDAG:
> Initial selection DAG: BB#0 'test:entry'
> SelectionDAG has 9 nodes:
>   t0: ch = EntryToken
>   t3: i64 = Constant<0>
>   t2: i64,ch = CopyFromReg t0, Register:i64 %vreg0
>   t5: v4i32,ch = load<LD16[%0](tbaa=<0x10038f18a98>)> t0, t2, undef:i64
>   t7: ch,glue = CopyToReg t0, Register:v4i32 %V2, t5
>   t8: ch = PPCISD::RET_FLAG t7, Register:v4i32 %V2, t7:1
> What I would like to do is emit efficient code for cases where the parameter pointer has the sa...
