Displaying 5 results from an estimated 5 matches for "qadd".
2011 Jun 18
0
[LLVMdev] RFC: Integer saturation intrinsics
...gal. The expansion should
> be fairly straightforward and produce code that is at least as good as
> what LLVM is currently generating for these code sequences.
SSAT/USAT may set the Q (saturation) flag, which might cause problems
for anyone relying on explicitly using saturating operations (QADD etc.)
and testing the Q flag. So there are several possibilities:
- you could do liveness analysis on Q and only introduce SSAT/USAT if
Q is not live
- you could fall back to not introducing SSAT/USAT if you could tell there
was no test of Q in the function (the ARM ABI says Q is not def...
2011 Jun 17
5
[LLVMdev] RFC: Integer saturation intrinsics
Hi all,
I'm proposing integer saturation intrinsics.
def int_ssat : Intrinsic<[llvm_anyint_ty], [LLVMMatchType<0>, llvm_i32_ty]>;
def int_usat : Intrinsic<[llvm_anyint_ty], [LLVMMatchType<0>, llvm_i32_ty]>;
The first operand is the integer value being saturated, and the second is the saturation bit position.
For scalar integer types, the semantics are:
int_ssat: x <
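
The snippet is cut off before the semantics are spelled out, so the following is not the RFC's definition. As a rough illustration of what saturating to a bit position usually means, here is a minimal C sketch modeled on the behaviour of the ARM SSAT/USAT instructions. The helper names `ssat32` and `usat32`, and the convention that the second argument is a bit width, are assumptions made for this example only.

```c
#include <stdint.h>

/* Hypothetical helper: clamp x to the signed range of a `bits`-bit integer,
 * i.e. [-2^(bits-1), 2^(bits-1) - 1]. Assumes 1 <= bits <= 32. */
static int32_t ssat32(int32_t x, unsigned bits)
{
    int64_t max = ((int64_t)1 << (bits - 1)) - 1;
    int64_t min = -max - 1;
    if (x > max) return (int32_t)max;
    if (x < min) return (int32_t)min;
    return x;
}

/* Hypothetical helper: clamp x to the unsigned range [0, 2^bits - 1].
 * Assumes 0 <= bits <= 31. */
static uint32_t usat32(int32_t x, unsigned bits)
{
    int64_t max = ((int64_t)1 << bits) - 1;
    if (x < 0) return 0;
    if (x > max) return (uint32_t)max;
    return (uint32_t)x;
}
```

Under these assumptions, `ssat32(300, 8)` clamps to 127 and `usat32(-1, 8)` clamps to 0. The RFC may count the bit position differently, so treat the off-by-one convention here as illustrative.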
2010 Nov 27
1
[LLVMdev] Patch for V7M
Hello All,
Attached is a patch for the ARM target. The ARM v7M profile does not
have the signed most-significant-word multiply instruction so SMMUL,
for instance, is not valid on Cortex-M3 and Cortex-M4.
The attached patch adds an additional attribute, +mmul, which
controls most-significant word multiplies on v6T2+ targets.
This is especially important for me now that I've
2006 Oct 17
0
[R] performance reflections
...happening in "do_cov", but the underlying issue is
the use of "long double" computations. First the results:
The timings I get (on 2xG5 2.7GHz) are:
gcc3: 0.8s
gcc4: 4.5s (dynamic libgcc)
gcc4: 4.2s (static libgcc)
Basically any calls that use long double will be affected:
qadd: 4.5s (gcc3 opt), 6.7s (Agcc4 opt), 7.4s (gcc3), 7.9s (gcc4 opt
+dyngcc), 10.5s (Agcc4), 10.6s (gcc4 dyngcc)
(this test basically runs 500x 1M long double additions on an array -
it's even more extreme if you run it on short arrays : 250kx1k will
give 2s on gcc3 and 7.7s on gcc4)
Now, th...
2013 May 17
1
[Patch]01-Add ARM5E macros
...2 * c32) >> 16) */
+#define silk_SMLAWW(a32, b32, c32) silk_MLA(silk_SMLAWB((a32), (b32), (c32)), (b32), silk_RSHIFT_ROUND((c32), 16))
+
+/* add/subtract with output saturated */
+static inline opus_int32 silk_ADD_SAT32(opus_int32 a, opus_int32 b)
+{
+ int res;
+ __asm__(
+ "qadd %0, %1, %2;\n"
+ : "=&r"(res)
+ : "r"(a), "r"(b)
+ );
+ return res;
+}
+
+static inline opus_int32 silk_SUB_SAT32(opus_int32 a, opus_int32 b)
+{
+ int res;
+ __asm__(
+ "qsub %0, %1, %2;\n"
+ : "=&r"(res)
+...
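
For comparison with the inline-assembly version in the patch excerpt above, the arithmetic result that QADD computes (a signed 32-bit add clamped to the representable range) can be sketched in portable C. This is a minimal illustration, not code from the patch: the name `add_sat32_portable` is hypothetical, and plain `int32_t` stands in for `opus_int32`.

```c
#include <stdint.h>

/* Hypothetical portable counterpart of a QADD-based saturating add.
 * The sum is formed in 64 bits so it cannot overflow, then clamped
 * to the int32_t range. */
static int32_t add_sat32_portable(int32_t a, int32_t b)
{
    int64_t sum = (int64_t)a + (int64_t)b;
    if (sum > INT32_MAX) return INT32_MAX;
    if (sum < INT32_MIN) return INT32_MIN;
    return (int32_t)sum;
}
```

One difference worth keeping in mind: on ARM, QADD also sets the Q flag when it saturates, whereas this sketch only reproduces the returned value.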