thr3ads.net - llvm dev - [LLVMdev] Vectors in structures [Sep 2010]

If this information is useful, please help other people find it:
Share via:

Renato Golin

2010-Sep-27 21:51 UTC

[LLVMdev] Vectors in structures

On 27 September 2010 18:19, Bob Wilson <bob.wilson at apple.com>
wrote:> I'm not sure what you mean by this.  The llvm intrinsics and built-in
vector operations use plain vectors regardless of the front-end.  The structures
are only relevant for things like argument passing and copying -- you can't
do anything else with them.  Can you post an example of the 5X IR code size that
you're seeing with clang?  I'd like to understand the issue that
you're seeing.

I mean that I could remove all structure boilerplate and it still
works, plus you don't have to define any type (as LLVM uses the vector
types), as per the discussion we're having about needing to use
structures. Make that 2x smaller, I had a special case that was not a
fair comparison.

But I recently found out that the polyNxN_t vector type can destroy
everything, as it appears to LLVM as <8 x i8>, and is identical to a
intNxN_t for base instructions, so an "icmp eq <8 x i8>" always
become
VCEQ.I8 and never a VCEQ.P8, even though that's what Clang generates.

Putting them into structures doesn't help because of the type names
being irrelevant, both names become %struct.__simd64_int8_t

%struct.__simd64_int8_t = type { <8 x i8> }
%struct.__simd64_poly8_t = type { <8 x i8> }
%struct.__simd64_uint8_t = type { <8 x i8> }

@u8d = common global %struct.__simd64_int8_t zeroinitializer, align 8
@i8d = common global %struct.__simd64_int8_t zeroinitializer, align 8
@p8d = common global %struct.__simd64_int8_t zeroinitializer, align 8

The difference between uint8x8 and int8x8 is done via 'nsw' (which,
unless it's really generating a trap value, it's a misleading tag),
but there's nothing that will flag this type as poly8x8.

When I try to compile this with Clang:

=== comp.c ==#define __ARM_NEON__
#include <arm_neon.h>

int8x8_t    i8d;
uint8x8_t   u8d;
poly8x8_t   p8d;

void vceq() {
  u8d  = vceq_s8(i8d, i8d);
  u8d  = vceq_p8(p8d, p8d);
}
=== end ==
It generates exactly the same instruction for both calls:

$ clang -ccc-host-triple armv7a-none-eabi -ccc-gcc-name
arm-none-eabi-gcc -mfloat-abi=hard -w -S comp.c -o - | grep vceq
	.globl	vceq
	.type	vceq,%function
vceq:
	vceq.i8	d0, d1, d0
	vceq.i8	d0, d1, d0
	.size	vceq, .Ltmp0-vceq


Isn't that a call to use intrinsics?

-- 
cheers,
--renato

http://systemcall.org/

Reclaim your digital rights, eliminate DRM, learn more at
http://www.defectivebydesign.org/what_is_drm

Bob Wilson

2010-Sep-27 22:03 UTC

head link

[LLVMdev] Vectors in structures

Support for NEON intrinsics in clang is not complete.  Poly types in general are
known to be an issue, and the vceq_p8 in your example definitely needs an
intrinisic.  It should work with llvm-gcc.

Can you clarify ARM's position on those structure types?  It sounds like you
are advocating that we get rid of them.  The only reason we've been using
them in llvm-gcc and clang is for compatibility for ARM's specifications and
with ARM's RVCT compiler.  If ARM does not care about those things, I'd
love to remove the struct wrappers from llvm.

On Sep 27, 2010, at 2:51 PM, Renato Golin wrote:
> On 27 September 2010 18:19, Bob Wilson <bob.wilson at apple.com>
wrote:
>> I'm not sure what you mean by this.  The llvm intrinsics and
built-in vector operations use plain vectors regardless of the front-end.  The
structures are only relevant for things like argument passing and copying -- you
can't do anything else with them.  Can you post an example of the 5X IR code
size that you're seeing with clang?  I'd like to understand the issue
that you're seeing.
> 
> 
> I mean that I could remove all structure boilerplate and it still
> works, plus you don't have to define any type (as LLVM uses the vector
> types), as per the discussion we're having about needing to use
> structures. Make that 2x smaller, I had a special case that was not a
> fair comparison.
> 
> But I recently found out that the polyNxN_t vector type can destroy
> everything, as it appears to LLVM as <8 x i8>, and is identical to a
> intNxN_t for base instructions, so an "icmp eq <8 x i8>"
always become
> VCEQ.I8 and never a VCEQ.P8, even though that's what Clang generates.
> 
> Putting them into structures doesn't help because of the type names
> being irrelevant, both names become %struct.__simd64_int8_t
> 
> %struct.__simd64_int8_t = type { <8 x i8> }
> %struct.__simd64_poly8_t = type { <8 x i8> }
> %struct.__simd64_uint8_t = type { <8 x i8> }
> 
> @u8d = common global %struct.__simd64_int8_t zeroinitializer, align 8
> @i8d = common global %struct.__simd64_int8_t zeroinitializer, align 8
> @p8d = common global %struct.__simd64_int8_t zeroinitializer, align 8
> 
> The difference between uint8x8 and int8x8 is done via 'nsw' (which,
> unless it's really generating a trap value, it's a misleading tag),
> but there's nothing that will flag this type as poly8x8.
> 
> When I try to compile this with Clang:
> 
> === comp.c ==> #define __ARM_NEON__
> #include <arm_neon.h>
> 
> int8x8_t    i8d;
> uint8x8_t   u8d;
> poly8x8_t   p8d;
> 
> void vceq() {
>  u8d  = vceq_s8(i8d, i8d);
>  u8d  = vceq_p8(p8d, p8d);
> }
> === end ==> 
> It generates exactly the same instruction for both calls:
> 
> $ clang -ccc-host-triple armv7a-none-eabi -ccc-gcc-name
> arm-none-eabi-gcc -mfloat-abi=hard -w -S comp.c -o - | grep vceq
> 	.globl	vceq
> 	.type	vceq,%function
> vceq:
> 	vceq.i8	d0, d1, d0
> 	vceq.i8	d0, d1, d0
> 	.size	vceq, .Ltmp0-vceq
> 
> 
> Isn't that a call to use intrinsics?
> 
> -- 
> cheers,
> --renato
> 
> http://systemcall.org/
> 
> Reclaim your digital rights, eliminate DRM, learn more at
> http://www.defectivebydesign.org/what_is_drm

Renato Golin

2010-Sep-27 22:23 UTC

head link

[LLVMdev] Vectors in structures

On 27 September 2010 23:03, Bob Wilson <bob.wilson at apple.com>
wrote:> Can you clarify ARM's position on those structure types?  It sounds
like you are advocating that we get rid of them.  The only reason we've been
using them in llvm-gcc and clang is for compatibility for ARM's
specifications and with ARM's RVCT compiler.  If ARM does not care about
those things, I'd love to remove the struct wrappers from llvm.
As Al said earlier, you definitely don't need the structures for
compatibility with armcc.

As far as the LLVM back-end is concerned, with or without structures,
the instruction selection works a treat and generates correct NEON
instructions. If the final object has the correct instructions and
follows ARM ABIs, there is no point in keeping IR compatibility.

I also noticed that Clang's arm_neon.h is completely different from
armcc's, another non-compatible choice that has no impact in the final
object code generated.

As far as I can see, there is no gain in adding the wrapping
structures to the vector types.

I'll add the intrinsic to the VCEQ.P8 locally and test. If that works,
I'll be sending patches to NEON.td for all ambiguities I find...

-- 
cheers,
--renato

http://systemcall.org/

Reclaim your digital rights, eliminate DRM, learn more at
http://www.defectivebydesign.org/what_is_drm

Apparently Analagous Threads

Search for more apparently analagous threads

llvm dev - Sep 2010 - [LLVMdev] Vectors in structures

[LLVMdev] Vectors in structures

[LLVMdev] Vectors in structures

[LLVMdev] Vectors in structures

Apparently Analagous Threads