search for: sig2word16

Displaying 12 results from an estimated 12 matches for "sig2word16".

2015 Nov 20
2
[PATCH] Add Aarch64 intrinsic for SIG2WORD16.
Fixed definition of SIG2WORD16, thanks to John Ridges. --- celt/arch.h | 4 +++- celt/arm/fixed_arm64.h | 35 +++++++++++++++++++++++++++++++++++ celt_headers.mk | 1 + 3 files changed, 39 insertions(+), 1 deletion(-) create mode 100644 celt/arm/fixed_arm64.h diff --git a/celt/arch.h b/celt/arch.h index 6...
2015 Nov 19
0
[PATCH 3/3] Add Aarch64 intrinsic for SIG2WORD16.
...LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING + NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS + SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +*/ + +#ifndef FIXED_ARM64_H +#define FIXED_ARM64_H + +#include <arm_neon.h> + +#undef SIG2WORD16 +#define SIG2WORD16(x) (vqmovns_s32((x))) + +#endif diff --git a/celt_headers.mk b/celt_headers.mk index 0eca6e6..c9df94b 100644 --- a/celt_headers.mk +++ b/celt_headers.mk @@ -36,6 +36,7 @@ celt/static_modes_fixed_arm_ne10.h \ celt/arm/armcpu.h \ celt/arm/fixed_armv4.h \ celt/arm/fixed_armv5e.h...
2015 Nov 21
0
[PATCH] Add Aarch64 intrinsic for SIG2WORD16.
Jonathan Lennox wrote: > Fixed definition of SIG2WORD16, thanks to John Ridges. To answer your earlier question from IRC, yes, resending the whole series would probably be helpful (I'm assuming this replaces on of the previous patches).
2015 Nov 19
2
[Aarch64 00/11] Patches to enable Aarch64
> On Nov 16, 2015, at 4:42 PM, Jonathan Lennox <jonathan at vidyo.com> wrote: > > I haven?t yet tried replacing SIG2WORD16 (or silk_ADD_SAT32/silk_SUB_SAT32) with Neon intrinsics. That?s an obvious next step. This doesn?t show any appreciable speed difference in my tests, but the code is obviously better by inspection (all three of these map directly to a single Aarch64 instruction and a single Neon intrinsic) so my...
2015 Nov 13
2
[Aarch64 00/11] Patches to enable Aarch64
Thanks, I look forward to seeing what you find out. BTW, I was wondering if you tried replacing the SIG2WORD16 macro using the vqmovns_s32 intrinsic? I'm sure it would be faster than the C code, but in the grand scheme of things it might not make much difference. On 11/13/2015 12:15 PM, Jonathan Lennox wrote: >> On Nov 13, 2015, at 1:51 PM, John Ridges <jridges at masque.com> wrote: >...
2015 Nov 19
0
[Aarch64 00/11] Patches to enable Aarch64
Any speedup from the intrinsics may just be swamped by the rest of the encode/decode process. But I think you really want SIG2WORD16 to be (vqmovns_s32(PSHR32((x), SIG_SHIFT))) On 11/19/2015 2:52 PM, Jonathan Lennox wrote: >> On Nov 16, 2015, at 4:42 PM, Jonathan Lennox <jonathan at vidyo.com> wrote: >> >> I haven?t yet tried replacing SIG2WORD16 (or silk_ADD_SAT32/silk_SUB_SAT32) with Neon intrinsics....
2015 Nov 20
2
[Aarch64 00/11] Patches to enable Aarch64
> On Nov 19, 2015, at 5:47 PM, John Ridges <jridges at masque.com> wrote: > > Any speedup from the intrinsics may just be swamped by the rest of the encode/decode process. But I think you really want SIG2WORD16 to be (vqmovns_s32(PSHR32((x), SIG_SHIFT))) Yes, you?re right. I forgot to run the vectors under qemu with my previous version (oh, the embarrassment!) Fixed forthcoming once the tests actually run. > On 11/19/2015 2:52 PM, Jonathan Lennox wrote: >>> On Nov 16, 2015, at 4:42 PM, Jona...
2015 Nov 16
0
[Aarch64 00/11] Patches to enable Aarch64
...ding support for OPUS_FAST_INT64 to celt/arch.h, and I?ve found that this is indeed comparable in speed, if not a touch faster, than my inline assembly. I?ll submit patches for this. The inline assembly parts of my aarch64 patch set can thus be considered withdrawn. I haven?t yet tried replacing SIG2WORD16 (or silk_ADD_SAT32/silk_SUB_SAT32) with Neon intrinsics. That?s an obvious next step. On Nov 13, 2015, at 2:47 PM, John Ridges <jridges at masque.com<mailto:jridges at masque.com>> wrote: Thanks, I look forward to seeing what you find out. BTW, I was wondering if you tried replacing...
2015 Nov 19
3
[PATCH 1/3] Add configure check for Aarch64-specific Neon intrinsics.
--- configure.ac | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/configure.ac b/configure.ac index 90a06c8..adcb969 100644 --- a/configure.ac +++ b/configure.ac @@ -503,6 +503,26 @@ AS_IF([test x"$enable_intrinsics" = x"yes"],[ [rtcd_support="$rtcd_support (NE10)"]) ]) + OPUS_CHECK_INTRINSICS( +
2015 Nov 21
8
[Aarch64 v2 10/18] Clean up some intrinsics-related wording in configure.
--- configure.ac | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/configure.ac b/configure.ac index f52d2c2..e1a6e9b 100644 --- a/configure.ac +++ b/configure.ac @@ -190,7 +190,7 @@ AC_ARG_ENABLE([rtcd], [enable_rtcd=yes]) AC_ARG_ENABLE([intrinsics], - [AS_HELP_STRING([--disable-intrinsics], [Disable intrinsics optimizations for ARM(float) X86(fixed)])],, +
2015 Nov 13
2
[Aarch64 00/11] Patches to enable Aarch64
Hi Jonathan, I'm sorry to bring this up again, and I don't want to beat a dead horse, but I was very surprised by your benchmarks so I took a little closer look. I think what's happening is that it's a little unfair to compare the ARM64 inline assembly to the C code, because looking at the C macros in "fixed_generic.h" for MULT16_32_Q16 and MULT16_32_Q15 you find
2015 Nov 21
12
[Aarch64 v2 00/18] Patches to enable Aarch64 (version 2)
..._INT64 macros back to opus_int32. Add OPUS_FAST_INT64 definition of silk_SMULWT. Clean up formatting of configure output for ARM intrinsics detection. Add configure check for Aarch64-specific Neon intrinsics. Add Aarch64 intrinsics for saturated add/subtract. Add Aarch64 intrinsic for SIG2WORD16. Makefile.am | 9 +-- celt/arch.h | 9 ++- celt/arm/arm_celt_map.c | 22 ++++++- celt/arm/celt_neon_intr.c | 61 ++++++++++++++++++- celt/arm/fixed_arm64.h | 35 +++++++++++ celt/arm/pitch_arm.h | 62 +++++++++++++++++++-...