On 27 September 2010 18:19, Bob Wilson <bob.wilson at apple.com>
wrote:> I'm not sure what you mean by this. The llvm intrinsics and built-in
vector operations use plain vectors regardless of the front-end. The structures
are only relevant for things like argument passing and copying -- you can't
do anything else with them. Can you post an example of the 5X IR code size that
you're seeing with clang? I'd like to understand the issue that
you're seeing.
I mean that I could remove all structure boilerplate and it still
works, plus you don't have to define any type (as LLVM uses the vector
types), as per the discussion we're having about needing to use
structures. Make that 2x smaller, I had a special case that was not a
fair comparison.
But I recently found out that the polyNxN_t vector type can destroy
everything, as it appears to LLVM as <8 x i8>, and is identical to a
intNxN_t for base instructions, so an "icmp eq <8 x i8>" always
become
VCEQ.I8 and never a VCEQ.P8, even though that's what Clang generates.
Putting them into structures doesn't help because of the type names
being irrelevant, both names become %struct.__simd64_int8_t
%struct.__simd64_int8_t = type { <8 x i8> }
%struct.__simd64_poly8_t = type { <8 x i8> }
%struct.__simd64_uint8_t = type { <8 x i8> }
@u8d = common global %struct.__simd64_int8_t zeroinitializer, align 8
@i8d = common global %struct.__simd64_int8_t zeroinitializer, align 8
@p8d = common global %struct.__simd64_int8_t zeroinitializer, align 8
The difference between uint8x8 and int8x8 is done via 'nsw' (which,
unless it's really generating a trap value, it's a misleading tag),
but there's nothing that will flag this type as poly8x8.
When I try to compile this with Clang:
=== comp.c ==#define __ARM_NEON__
#include <arm_neon.h>
int8x8_t i8d;
uint8x8_t u8d;
poly8x8_t p8d;
void vceq() {
u8d = vceq_s8(i8d, i8d);
u8d = vceq_p8(p8d, p8d);
}
=== end ==
It generates exactly the same instruction for both calls:
$ clang -ccc-host-triple armv7a-none-eabi -ccc-gcc-name
arm-none-eabi-gcc -mfloat-abi=hard -w -S comp.c -o - | grep vceq
.globl vceq
.type vceq,%function
vceq:
vceq.i8 d0, d1, d0
vceq.i8 d0, d1, d0
.size vceq, .Ltmp0-vceq
Isn't that a call to use intrinsics?
--
cheers,
--renato
http://systemcall.org/
Reclaim your digital rights, eliminate DRM, learn more at
http://www.defectivebydesign.org/what_is_drm