Displaying 12 results from an estimated 12 matches for "vceq".

2010 Sep 27
2
[LLVMdev] Vectors in structures
...hat 2x smaller, I had a special case that was not a fair comparison. But I recently found out that the polyNxN_t vector type can destroy everything, as it appears to LLVM as <8 x i8>, and is identical to an intNxN_t for base instructions, so an "icmp eq <8 x i8>" always becomes VCEQ.I8 and never a VCEQ.P8, even though that's what Clang generates. Putting them into structures doesn't help because of the type names being irrelevant, both names become %struct.__simd64_int8_t %struct.__simd64_int8_t = type { <8 x i8> } %struct.__simd64_poly8_t = type { <8 x i8>...
2010 Sep 27
0
[LLVMdev] Vectors in structures
Support for NEON intrinsics in clang is not complete. Poly types in general are known to be an issue, and the vceq_p8 in your example definitely needs an intrinsic. It should work with llvm-gcc. Can you clarify ARM's position on those structure types? It sounds like you are advocating that we get rid of them. The only reason we've been using them in llvm-gcc and clang is for compatibility for ARM...
2010 Sep 27
0
[LLVMdev] Vectors in structures
On Sep 27, 2010, at 2:58 AM, Renato Golin wrote: > On 22 September 2010 03:43, Bob Wilson <bob.wilson at apple.com> wrote: >> But regardless they are still structures, right? What does it mean for them to map onto other types? Is the parser supposed to treat them as if they _were_ those other types? If so, I think you need to define a type system for those fundamental vector
2010 Sep 27
2
[LLVMdev] Vectors in structures
On 22 September 2010 03:43, Bob Wilson <bob.wilson at apple.com> wrote: > But regardless they are still structures, right?  What does it mean for them to map onto other types?  Is the parser supposed to treat them as if they _were_ those other types?  If so, I think you need to define a type system for those fundamental vector types.  I had read those statements to say something about the
2010 Nov 12
2
[LLVMdev] Simple NEON optimization
Hi folks, me again, So, I want to implement a simple optimization in a NEON case I've seen these days, mostly as an exercise, but it also simplifies (just a bit) the code generated. The case is simple: uint32x2_t x, res; res = vceq_u32(x, vcreate_u32(0)); This will generate the following code: ; zero d16 vmov.i32 d16, #0x0 ; load a into d17 movw r0, :lower16:a movt r0, :upper16:a vld1.32 {d17}, [r0] ; compare two registers vceq.i32 d17, d17,...
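To make the case in this snippet concrete, here is a minimal portable sketch in plain C (no NEON headers needed) of why comparing against an explicitly zeroed vector is redundant. It assumes the usual lane-wise semantics of vceq_u32, where a lane becomes all ones on equality and zero otherwise, and it relies on VCEQ also having a compare-with-#0 form. The snippet is truncated, so whether that is exactly the fold the thread goes on to implement is an assumption. The type u32x2_model and both model_* functions are hypothetical stand-ins, not part of arm_neon.h.

```c
#include <stdint.h>

/* Hypothetical scalar stand-in for a 64-bit NEON vector with two 32-bit lanes. */
typedef struct {
    uint32_t lane[2];
} u32x2_model;

/* Models the two-register compare: a lane is all ones where the inputs
 * are equal and zero where they differ. */
static u32x2_model model_vceq_u32(u32x2_model a, u32x2_model b)
{
    u32x2_model r;
    for (int i = 0; i < 2; ++i)
        r.lane[i] = (a.lane[i] == b.lane[i]) ? UINT32_MAX : 0u;
    return r;
}

/* Models a compare-with-zero form: same result as comparing against a
 * zeroed vector, but without materialising that vector first. */
static u32x2_model model_vceq_zero_u32(u32x2_model a)
{
    u32x2_model r;
    for (int i = 0; i < 2; ++i)
        r.lane[i] = (a.lane[i] == 0u) ? UINT32_MAX : 0u;
    return r;
}
```

Because both functions return the same lanes for every input, the separate zeroing instruction (the vmov.i32 in the snippet) adds nothing once the comparison itself can test against zero.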
2010 Nov 12
0
[LLVMdev] Simple NEON optimization
...Hi folks, me again, > > So, I want to implement a simple optimization in a NEON case I've seen > these days, mostly as an exercise, but it also simplifies (just > a bit) the code generated. > > The case is simple: > > uint32x2_t x, res; > res = vceq_u32(x, vcreate_u32(0)); > > This will generate the following code: > > ; zero d16 > vmov.i32 d16, #0x0 > ; load a into d17 > movw r0, :lower16:a > movt r0, :upper16:a > vld1.32 {d17}, [r0] > ; compare two...
2010 Sep 27
2
[LLVMdev] Vectors in structures
...I also noticed that Clang's arm_neon.h is completely different from armcc's, another non-compatible choice that has no impact in the final object code generated. As far as I can see, there is no gain in adding the wrapping structures to the vector types. I'll add the intrinsic to the VCEQ.P8 locally and test. If that works, I'll be sending patches to NEON.td for all ambiguities I find... -- cheers, --renato http://systemcall.org/ Reclaim your digital rights, eliminate DRM, learn more at http://www.defectivebydesign.org/what_is_drm
2010 Sep 27
0
[LLVMdev] Vectors in structures
...Each compiler can have its own version of arm_neon.h. llvm-gcc's is quite different from clang. That is an internal implementation issue. > > As far as I can see, there is no gain in adding the wrapping > structures to the vector types. > > I'll add the intrinsic to the VCEQ.P8 locally and test. If that works, > I'll be sending patches to NEON.td for all ambiguities I find... Wait a minute.... VCEQ does not have a special polynomial version. There is only VCEQ.I8. What I said about support for polynomial types in Clang is still true, but for this particular...
2010 Nov 12
2
[LLVMdev] Simple NEON optimization
On 12 November 2010 17:52, Bob Wilson <bob.wilson at apple.com> wrote: > I recommend implementing this as a target-specific DAG combine optimization.  We already have target-specific DAG nodes for the relevant NEON comparison operations (ARMISD::VCEQ, etc. -- see ARMISelLowering.h) as well as the vmov (ARMISD::VMOVIMM).  You just need to teach the DAG combiner how to fold them together.  Here's what you need to do (all of this code is in ARMISelLowering.cpp): Hi Bob, I thought so... I'll get cracked and see if I can generate some simp...
2010 Nov 12
0
[LLVMdev] Simple NEON optimization
..., at 10:42 AM, Renato Golin wrote: > On 12 November 2010 17:52, Bob Wilson <bob.wilson at apple.com> wrote: >> I recommend implementing this as a target-specific DAG combine optimization. We already have target-specific DAG nodes for the relevant NEON comparison operations (ARMISD::VCEQ, etc. -- see ARMISelLowering.h) as well as the vmov (ARMISD::VMOVIMM). You just need to teach the DAG combiner how to fold them together. Here's what you need to do (all of this code is in ARMISelLowering.cpp): > > Hi Bob, > > I thought so... I'll get cracked and see if I ca...
2010 Sep 28
2
[LLVMdev] Vectors in structures
...the structures from Clang, since I don't know enough about the vector types (and all other back-ends that use it) and what problems you had with gcc/armcc compatibility. Maybe, because of the way vectors are implemented in LLVM, there is no other way... maybe not. > Wait a minute....  VCEQ does not have a special polynomial version.  There is only VCEQ.I8.  What I said about support for polynomial types in Clang is still true, but for this particular case, there is no difference between vceq_s8, vceq_u8, and vceq_p8 (aside from the types of the intrinsic arguments). Sorry, bad examp...
2010 Nov 12
1
[LLVMdev] Simple NEON optimization
..., Renato Golin wrote: > >> On 12 November 2010 17:52, Bob Wilson <bob.wilson at apple.com> wrote: >>> I recommend implementing this as a target-specific DAG combine optimization. We already have target-specific DAG nodes for the relevant NEON comparison operations (ARMISD::VCEQ, etc. -- see ARMISelLowering.h) as well as the vmov (ARMISD::VMOVIMM). You just need to teach the DAG combiner how to fold them together. Here's what you need to do (all of this code is in ARMISelLowering.cpp): >> >> Hi Bob, >> >> I thought so... I'll get cracked...