search for: silk_add_sat32

Displaying 9 results from an estimated 9 matches for "silk_add_sat32".

2015 Nov 19
2
[Aarch64 00/11] Patches to enable Aarch64
> On Nov 16, 2015, at 4:42 PM, Jonathan Lennox <jonathan at vidyo.com> wrote: > > I haven?t yet tried replacing SIG2WORD16 (or silk_ADD_SAT32/silk_SUB_SAT32) with Neon intrinsics. That?s an obvious next step. This doesn?t show any appreciable speed difference in my tests, but the code is obviously better by inspection (all three of these map directly to a single Aarch64 instruction and a single Neon intrinsic) so my code paths may just...
2015 Nov 19
0
[PATCH 2/3] Add Aarch64 intrinsics for saturated add/subtract.
...GENCE OR OTHERWISE) +ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +POSSIBILITY OF SUCH DAMAGE. +***********************************************************************/ + +#ifndef SILK_MACROS_ARM64_H +#define SILK_MACROS_ARM64_H + +#include <arm_neon.h> + +#undef silk_ADD_SAT32 +#define silk_ADD_SAT32(a, b) (vqadds_s32((a), (b))) + +#undef silk_SUB_SAT32 +#define silk_SUB_SAT32(a, b) (vqsubs_s32((a), (b))) + +#endif /* SILK_MACROS_ARM64_H */ diff --git a/silk/macros.h b/silk/macros.h index 7cefedc..d3ca347 100644 --- a/silk/macros.h +++ b/silk/macros.h @@ -151,5 +151,9 @@...
2015 Nov 19
3
[PATCH 1/3] Add configure check for Aarch64-specific Neon intrinsics.
--- configure.ac | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/configure.ac b/configure.ac index 90a06c8..adcb969 100644 --- a/configure.ac +++ b/configure.ac @@ -503,6 +503,26 @@ AS_IF([test x"$enable_intrinsics" = x"yes"],[ [rtcd_support="$rtcd_support (NE10)"]) ]) + OPUS_CHECK_INTRINSICS( +
2015 Nov 13
2
[Aarch64 00/11] Patches to enable Aarch64
Thanks, I look forward to seeing what you find out. BTW, I was wondering if you tried replacing the SIG2WORD16 macro using the vqmovns_s32 intrinsic? I'm sure it would be faster than the C code, but in the grand scheme of things it might not make much difference. On 11/13/2015 12:15 PM, Jonathan Lennox wrote: >> On Nov 13, 2015, at 1:51 PM, John Ridges <jridges at masque.com>
2015 Nov 19
0
[Aarch64 00/11] Patches to enable Aarch64
...code process. But I think you really want SIG2WORD16 to be (vqmovns_s32(PSHR32((x), SIG_SHIFT))) On 11/19/2015 2:52 PM, Jonathan Lennox wrote: >> On Nov 16, 2015, at 4:42 PM, Jonathan Lennox <jonathan at vidyo.com> wrote: >> >> I haven?t yet tried replacing SIG2WORD16 (or silk_ADD_SAT32/silk_SUB_SAT32) with Neon intrinsics. That?s an obvious next step. > This doesn?t show any appreciable speed difference in my tests, but the code is obviously better by inspection (all three of these map directly to a single Aarch64 instruction and a single Neon intrinsic) so my code paths may...
2015 Nov 21
8
[Aarch64 v2 10/18] Clean up some intrinsics-related wording in configure.
--- configure.ac | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/configure.ac b/configure.ac index f52d2c2..e1a6e9b 100644 --- a/configure.ac +++ b/configure.ac @@ -190,7 +190,7 @@ AC_ARG_ENABLE([rtcd], [enable_rtcd=yes]) AC_ARG_ENABLE([intrinsics], - [AS_HELP_STRING([--disable-intrinsics], [Disable intrinsics optimizations for ARM(float) X86(fixed)])],, +
2015 Nov 20
2
[Aarch64 00/11] Patches to enable Aarch64
...version (oh, the embarrassment!) Fixed forthcoming once the tests actually run. > On 11/19/2015 2:52 PM, Jonathan Lennox wrote: >>> On Nov 16, 2015, at 4:42 PM, Jonathan Lennox <jonathan at vidyo.com> wrote: >>> >>> I haven?t yet tried replacing SIG2WORD16 (or silk_ADD_SAT32/silk_SUB_SAT32) with Neon intrinsics. That?s an obvious next step. >> This doesn?t show any appreciable speed difference in my tests, but the code is obviously better by inspection (all three of these map directly to a single Aarch64 instruction and a single Neon intrinsic) so my code paths...
2015 Nov 16
0
[Aarch64 00/11] Patches to enable Aarch64
...r OPUS_FAST_INT64 to celt/arch.h, and I?ve found that this is indeed comparable in speed, if not a touch faster, than my inline assembly. I?ll submit patches for this. The inline assembly parts of my aarch64 patch set can thus be considered withdrawn. I haven?t yet tried replacing SIG2WORD16 (or silk_ADD_SAT32/silk_SUB_SAT32) with Neon intrinsics. That?s an obvious next step. On Nov 13, 2015, at 2:47 PM, John Ridges <jridges at masque.com<mailto:jridges at masque.com>> wrote: Thanks, I look forward to seeing what you find out. BTW, I was wondering if you tried replacing the SIG2WORD16 macr...
2013 May 17
1
[Patch]01-Add ARM5E macros
...LA(silk_SMULWB((a32), (b32)), (a32), silk_RSHIFT_ROUND((b32), 16)) + +/* a32 + ((b32 * c32) >> 16) */ +#define silk_SMLAWW(a32, b32, c32) silk_MLA(silk_SMLAWB((a32), (b32), (c32)), (b32), silk_RSHIFT_ROUND((c32), 16)) + +/* add/subtract with output saturated */ +static inline opus_int32 silk_ADD_SAT32(opus_int32 a, opus_int32 b) +{ + int res; + __asm__( + "qadd %0, %1, %2;\n" + : "=&r"(res) + : "r"(a), "r"(b) + ); + return res; +} + +static inline opus_int32 silk_SUB_SAT32(opus_int32 a, opus_int32 b) +{ + int res; + __asm__( +...