thr3ads.net - search: "qadd"

Displaying 5 results from an estimated 5 matches for "qadd".


[LLVMdev] RFC: Integer saturation intrinsics

2011 Jun 18

> ...gal. The expansion should be fairly straightforward and produce code that is
> at least as good as what LLVM is currently generating for these code sequences.

SSAT/USAT may set the Q (saturation) flag, which might cause problems for anyone relying on explicitly using saturating operations (QADD etc.) and testing the Q flag. So there are several possibilities:

- you could do liveness analysis on Q and only introduce SSAT/USAT if Q is not live
- you could fall back to not introducing SSAT/USAT if you could tell there was no test of Q in the function (the ARM ABI says Q is not def...

[LLVMdev] RFC: Integer saturation intrinsics

2011 Jun 17

Hi all, I'm proposing integer saturation intrinsics.

def int_ssat : Intrinsic<[llvm_anyint_ty], [LLVMMatchType<0>, llvm_i32_ty]>;
def int_usat : Intrinsic<[llvm_anyint_ty], [LLVMMatchType<0>, llvm_i32_ty]>;

The first operand is the integer value being saturated, and the second is the saturation bit position. For scalar integer types, the semantics are: int_ssat: x <
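The snippet cuts off before the proposed semantics are spelled out, so the following is only a sketch under a stated assumption: that the "saturation bit position" n behaves like the operand of the ARM SSAT/USAT instructions, clamping to a signed n-bit range and an unsigned n-bit range respectively. The names ssat32 and usat32 are hypothetical and do not come from the RFC, whose exact definition may differ.

```c
#include <stdint.h>

/* Assumption: mirrors ARM SSAT. Clamps x to the signed n-bit range
 * [-2^(n-1), 2^(n-1) - 1]. Valid for 1 <= n <= 32. Hypothetical name. */
static inline int32_t ssat32(int32_t x, unsigned n)
{
    int64_t max = ((int64_t)1 << (n - 1)) - 1;
    int64_t min = -max - 1;
    if (x > max) return (int32_t)max;
    if (x < min) return (int32_t)min;
    return x;
}

/* Assumption: mirrors ARM USAT. Clamps x to the unsigned n-bit range
 * [0, 2^n - 1]. Valid for 0 <= n <= 31. Hypothetical name. */
static inline int32_t usat32(int32_t x, unsigned n)
{
    int64_t max = ((int64_t)1 << n) - 1;
    if (x < 0) return 0;
    if (x > max) return (int32_t)max;
    return x;
}
```

With n = 8, for example, ssat32 clamps to [-128, 127] and usat32 clamps to [0, 255]. This sketch does not model the Q flag, which the hardware instructions may set when they saturate.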

[LLVMdev] Patch for V7M

2010 Nov 27

Hello All, Attached is a patch for the ARM target. The ARM v7M profile does not have the signed most-significant-word multiply instruction so SMMUL, for instance, is not valid on Cortex-M3 and Cortex-M4. The attached patch adds an additional attribute, +mmul, which controls most-significant word multiplies on v6T2+ targets. This is especially important for me now that I've

[R] performance reflections

2006 Oct 17

...happening in "do_cov", but the underlying issue is the use of "long double" computations. First the results: The timings I get (on 2xG5 2.7GHz) are:

gcc3: 0.8s
gcc4: 4.5s (dynamic libgcc)
gcc4: 4.2s (static libgcc)

Basically any calls that use long double will be affected:

qadd: 4.5s (gcc3 opt), 6.7s (Agcc4 opt), 7.4s (gcc3), 7.9s (gcc4 opt +dyngcc), 10.5s (Agcc4), 10.6s (gcc4 dyngcc)

(this test basically runs 500x 1M long double additions on an array - it's even more extreme if you run it on short arrays: 250kx1k will give 2s on gcc3 and 7.7s on gcc4) Now, th...
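The benchmark source is not included in the snippet. As a rough sketch of the kind of loop described (repeated long double additions over an array), something like the following could be timed under different compilers. The function name ld_sum_repeat and its parameters are hypothetical and are not the code from the original post.

```c
#include <stddef.h>

/* Hypothetical helper: adds every element of arr into a long double
 * accumulator, and repeats the whole pass reps times. With n = 1000000
 * and reps = 500 this performs the "500x 1M" additions mentioned above. */
static long double ld_sum_repeat(const long double *arr, size_t n, size_t reps)
{
    long double acc = 0.0L;
    for (size_t r = 0; r < reps; r++) {
        for (size_t i = 0; i < n; i++) {
            acc += arr[i];
        }
    }
    return acc;
}
```

Returning the accumulator gives the caller a value to use, which helps keep an optimizing compiler from discarding the loop entirely when the call is timed.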

[Patch]01-Add ARM5E macros

2013 May 17

...2 * c32) >> 16) */
+#define silk_SMLAWW(a32, b32, c32) silk_MLA(silk_SMLAWB((a32), (b32), (c32)), (b32), silk_RSHIFT_ROUND((c32), 16))
+
+/* add/subtract with output saturated */
+static inline opus_int32 silk_ADD_SAT32(opus_int32 a, opus_int32 b)
+{
+    int res;
+    __asm__(
+        "qadd %0, %1, %2;\n"
+        : "=&r"(res)
+        : "r"(a), "r"(b)
+    );
+    return res;
+}
+
+static inline opus_int32 silk_SUB_SAT32(opus_int32 a, opus_int32 b)
+{
+    int res;
+    __asm__(
+        "qsub %0, %1, %2;\n"
+        : "=&r"(res)
+...
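For readers without an ARM toolchain, the arithmetic result that the qadd and qsub instructions in this patch produce can be sketched in portable C. This is not part of the patch; the names sat_add32 and sat_sub32 are hypothetical, and the sketch widens to 64 bits to detect overflow rather than using any instruction-level feature.

```c
#include <stdint.h>

/* Portable sketch of a signed 32-bit saturating add: the sum is computed
 * in 64 bits and clamped to [INT32_MIN, INT32_MAX]. Hypothetical name. */
static inline int32_t sat_add32(int32_t a, int32_t b)
{
    int64_t sum = (int64_t)a + (int64_t)b;
    if (sum > INT32_MAX) return INT32_MAX;
    if (sum < INT32_MIN) return INT32_MIN;
    return (int32_t)sum;
}

/* Portable sketch of a signed 32-bit saturating subtract (a - b). */
static inline int32_t sat_sub32(int32_t a, int32_t b)
{
    int64_t diff = (int64_t)a - (int64_t)b;
    if (diff > INT32_MAX) return INT32_MAX;
    if (diff < INT32_MIN) return INT32_MIN;
    return (int32_t)diff;
}
```

Unlike the hardware instructions, these helpers do not set the Q flag when they saturate, so they only match the returned value, not the processor state.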
