search for: armcpu

Displaying 20 results from an estimated 64 matches for "armcpu".

2015 Jan 20
0
[RFC PATCH v1 1/2] Optimize repeated calls to opus_select_arch
...ge many function signatures in the call stack to make this happen. Instead, just optimize the opus_select_arch() such that only the first call to it takes more time, but subsequent calls to it are much faster. This helps avoid needing to make too many changes to function signatures. --- celt/arm/armcpu.c | 19 +++++++++++++++---- celt/x86/x86cpu.c | 22 ++++++++++++++-------- 2 files changed, 29 insertions(+), 12 deletions(-) diff --git a/celt/arm/armcpu.c b/celt/arm/armcpu.c index 1768525..26aae09 100644 --- a/celt/arm/armcpu.c +++ b/celt/arm/armcpu.c @@ -151,24 +151,35 @@ opus_uint32 opus_...
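The idea described in this patch (pay the detection cost once, then answer later calls from a stored value) can be sketched as below. This is only an illustration under stated assumptions: `example_slow_detect_arch`, `example_select_arch`, and the returned value `3` are hypothetical names and placeholders, not the code from the patch, whose body is truncated above.

```c
/* Counts how often the expensive probe actually runs (for illustration). */
static int example_detect_calls = 0;

/* Hypothetical stand-in for an expensive CPU feature probe. */
static int example_slow_detect_arch(void)
{
    example_detect_calls++;
    return 3; /* placeholder arch index, not a real Opus value */
}

/* The first call runs the probe; later calls return the cached result. */
static int example_select_arch(void)
{
    static int cached_arch = -1; /* -1 means "not detected yet" */
    if (cached_arch < 0) {
        cached_arch = example_slow_detect_arch();
    }
    return cached_arch;
}
```

Note that a plain static like this is not synchronized. If several threads can make the first call at once, a real implementation would need to consider that, for example by running detection once during initialization.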
2015 Nov 02
0
[PATCH 2/2] Fix unit tests on ARM without RTCD (e.g. aarch64 or iOS).
...X86_MAY_HAVE_SSE2) || defined(OPUS_X86_MAY_HAVE_SSE4_1) # include "x86/x86cpu.c" -#elif defined(OPUS_HAVE_RTCD) && \ - (defined(OPUS_ARM_ASM) || defined(OPUS_ARM_MAY_HAVE_NEON_INTR)) +#elif defined(OPUS_ARM_ASM) || defined(OPUS_ARM_MAY_HAVE_NEON_INTR) # include "arm/armcpu.c" # include "celt_lpc.c" # include "pitch.c" diff --git a/celt/tests/test_unit_mathops.c b/celt/tests/test_unit_mathops.c index 5b446b7..fd3319d 100644 --- a/celt/tests/test_unit_mathops.c +++ b/celt/tests/test_unit_mathops.c @@ -63,8 +63,7 @@ # include "x86/celt_...
2015 Nov 02
0
[PATCH 2/2] Fix unit tests on ARM without RTCD (e.g. aarch64 or iOS).
...X86_MAY_HAVE_SSE2) || defined(OPUS_X86_MAY_HAVE_SSE4_1) # include "x86/x86cpu.c" -#elif defined(OPUS_HAVE_RTCD) && \ - (defined(OPUS_ARM_ASM) || defined(OPUS_ARM_MAY_HAVE_NEON_INTR)) +#elif defined(OPUS_ARM_ASM) || defined(OPUS_ARM_MAY_HAVE_NEON_INTR) # include "arm/armcpu.c" # include "celt_lpc.c" # include "pitch.c" diff --git a/celt/tests/test_unit_mathops.c b/celt/tests/test_unit_mathops.c index 5b446b7..fd3319d 100644 --- a/celt/tests/test_unit_mathops.c +++ b/celt/tests/test_unit_mathops.c @@ -63,8 +63,7 @@ # include "x86/celt_...
2015 Nov 02
1
[PATCH 1/2] Declare silk_warped_LPC_analysis_filter_FIX_c in silk/fixed/main_FIX.h.
Fixes build failure on platforms with MAY_HAVE_SSE4_1 (but not PRESUME_SSE4_1) with --enable-intrinsics. --- silk/fixed/main_FIX.h | 11 +++++++++++ silk/x86/x86_silk_map.c | 2 ++ 2 files changed, 13 insertions(+) diff --git a/silk/fixed/main_FIX.h b/silk/fixed/main_FIX.h index ffeb4f3..375b5eb 100644 --- a/silk/fixed/main_FIX.h +++ b/silk/fixed/main_FIX.h @@ -97,6 +97,17 @@ void
2015 Nov 02
2
[PATCH 1/2] Declare silk_warped_LPC_analysis_filter_FIX_c in silk/fixed/main_FIX.h.
Fixes build failure on platforms with MAY_HAVE_SSE4_1 (but not PRESUME_SSE4_1) with --enable-intrinsics. --- silk/fixed/main_FIX.h | 11 +++++++++++ silk/x86/x86_silk_map.c | 2 ++ 2 files changed, 13 insertions(+) diff --git a/silk/fixed/main_FIX.h b/silk/fixed/main_FIX.h index ffeb4f3..375b5eb 100644 --- a/silk/fixed/main_FIX.h +++ b/silk/fixed/main_FIX.h @@ -97,6 +97,17 @@ void
2013 May 23
2
ASM runtime detection and optimizations
...diff --git a/opus_sources.mk b/opus_sources.mk index e4eeb91..1e9791b 100644 --- a/opus_sources.mk +++ b/opus_sources.mk @@ -4,7 +4,8 @@ src/opus_encoder.c \ src/opus_multistream.c \ src/opus_multistream_encoder.c \ src/opus_multistream_decoder.c \ -src/repacketizer.c +src/repacketizer.c \ +src/armcpu.c OPUS_SOURCES_FLOAT = \ src/analysis.c \ diff --git a/src/armcpu.c b/src/armcpu.c new file mode 100644 index 0000000..10a2905 --- /dev/null +++ b/src/armcpu.c @@ -0,0 +1,160 @@ +/* Copyright (c) 2010 Xiph.Org Foundation + * Copyright (c) 2013 Parrot */ +/* + Redistribution and use in source...
2015 Dec 20
2
[Aarch64 v2 05/18] Add Neon intrinsics for Silk noise shape quantization.
Jonathan Lennox wrote: > +opus_int32 silk_noise_shape_quantizer_short_prediction_neon(const opus_int32 *buf32, const opus_int32 *coef32) > +{ > + int32x4_t coef0 = vld1q_s32(coef32); > + int32x4_t coef1 = vld1q_s32(coef32 + 4); > + int32x4_t coef2 = vld1q_s32(coef32 + 8); > + int32x4_t coef3 = vld1q_s32(coef32 + 12); > + > + int32x4_t a0 = vld1q_s32(buf32 -
2015 Dec 23
6
[AArch64 neon intrinsics v4 0/5] Rework Neon intrinsic code for Aarch64 patchset
...on. Apply Neon short prediction optimization to silk_noise_shape_quantizer_del_dec. Add Neon intrinsics for Silk noise shape feedback loop. Add Neon fixed-point implementation of xcorr_kernel. Makefile.am | 5 +- celt/arm/arm_celt_map.c | 17 ++++++ celt/arm/armcpu.c | 35 +++++++---- celt/arm/armcpu.h | 6 ++ celt/arm/celt_neon_intr.c | 61 ++++++++++++++++++- celt/arm/pitch_arm.h | 31 +++++++++- silk/NSQ.c | 57 ++++++----------- silk/NSQ.h | 97 ++++++++++++++++++++++++...
2018 May 24
2
NEON detection under iOs
...work. Opus codec works great under many platforms. I have found a small performance issue under iOS platform. If the macro OPUS_HAVE_RTCD is not set, then encoder doesn't use some _neon functions at low bitrates (up to 64k). If the macro is set, then the compiler hits the error at opus/celt/arm/armcpu.c:153 (a function for detection missed). Being compared to Android version performance degradation is ~30%. Detection NEON under iOS is a bit tricky, because there is no API for it. I added compiler-time detection. I made a commit into github repository. Attached please find patch for it. Regards,...
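A compile-time check of the kind mentioned in this message might look like the sketch below. It assumes the compiler defines `__ARM_NEON` (the ACLE macro) or the older `__ARM_NEON__` spelling when NEON is enabled for the target. `EXAMPLE_HAVE_NEON` and `example_presume_neon` are hypothetical names, and this is not the attached patch, which is not shown here.

```c
/* Decide at compile time whether NEON is available for this target.
 * __ARM_NEON is the ACLE feature macro; __ARM_NEON__ is an older
 * spelling still defined by GCC and Clang on 32-bit ARM. */
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
# define EXAMPLE_HAVE_NEON 1
#else
# define EXAMPLE_HAVE_NEON 0
#endif

/* Returns 1 when the build target guarantees NEON, 0 otherwise.
 * No runtime probing happens: the answer is fixed when compiling. */
static int example_presume_neon(void)
{
    return EXAMPLE_HAVE_NEON;
}
```

The trade-off is that the answer reflects the build flags rather than the device the binary runs on, which is acceptable only when every supported device for that target has NEON.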
2017 Jul 21
2
[PATCH] Fix celt_pitch_xcorr ARM jump table compiling error
Hi, Attached is a fix related to ARM optimization jump table compiling error. Thanks, Linfeng Zhang
2015 Dec 21
0
[Aarch64 v2 05/18] Add Neon intrinsics for Silk noise shape quantization.
...hard to track down and update. I see your point - what do you suggest instead? As the comment mentions, this can't use the usual IMPL table implementation, because the parameters are different (due to the transformed coefficients.) Would something like an OPUS_ARCH_ARM_NEON #define in celt/arm/armcpu.h be okay? It'd be a bit confusing with OPUS_CPU_ARM_NEON in armcpu.c, but I could probably organize it to be sensible. > (also, I realize libopus doesn't have a line-length restriction, but a > few newlines in here might be a mercy to those of us who work in > 80-column terminals)...
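For readers unfamiliar with the "IMPL table" approach mentioned above, a generic sketch of runtime dispatch through a function-pointer table indexed by an arch value follows. All names here (`example_sum_fn`, `EXAMPLE_SUM_IMPL`, `EXAMPLE_ARCH_MASK`) are hypothetical, and the "optimized" variant is plain unrolled C standing in for a NEON routine. The sketch also shows why the table needs every entry to share one signature, which is the constraint the message refers to.

```c
/* Every implementation in the table must have the same signature. */
typedef int (*example_sum_fn)(const int *x, int n);

/* Portable reference implementation. */
static int example_sum_c(const int *x, int n)
{
    int i, acc = 0;
    for (i = 0; i < n; i++) acc += x[i];
    return acc;
}

/* Stand-in for an optimized variant (unrolled by four, plus a tail loop). */
static int example_sum_unrolled(const int *x, int n)
{
    int i, acc = 0;
    for (i = 0; i + 4 <= n; i += 4)
        acc += x[i] + x[i + 1] + x[i + 2] + x[i + 3];
    for (; i < n; i++) acc += x[i];
    return acc;
}

#define EXAMPLE_ARCH_MASK 3

/* One entry per arch level; lower levels fall back to the C version. */
static const example_sum_fn EXAMPLE_SUM_IMPL[EXAMPLE_ARCH_MASK + 1] = {
    example_sum_c, example_sum_c, example_sum_c, example_sum_unrolled
};

/* Dispatch on the arch value detected at runtime. */
static int example_sum(const int *x, int n, int arch)
{
    return EXAMPLE_SUM_IMPL[arch & EXAMPLE_ARCH_MASK](x, n);
}
```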
2018 Aug 31
1
NEON detection under iOs
...great under many platforms. > I have found a small performance issue under iOS platform. > If the macro OPUS_HAVE_RTCD is not set, then encoder doesn't use some > _neon functions at low bitrates (up to 64k). If the macro is set, then the > compiler hits the error at opus/celt/arm/armcpu.c:153 (a function for > detection missed). Being compared to Android version performance > degradation is ~30%. > Detection NEON under iOS is a bit tricky, because there is no API for it. > I added compiler-time detection. > I made a commit into github repository. Attached please...
2015 May 15
0
[RFC V3 7/8] armv7, armv8: Optimize fixed point fft using NE10 library
...0ba5995 100644 --- a/celt/dump_modes/Makefile +++ b/celt/dump_modes/Makefile @@ -19,7 +19,8 @@ INCLUDES += -I$(NE10_INCDIR) -DHAVE_ARM_NE10 -DOPUS_ARM_PRESUME_NEON_INTR LIBDIR = -l:$(NE10_LIBDIR)/libNE10.so SOURCES += ../arm/celt_ne10_fft.c \ dump_modes_arm_ne10.c \ - ../arm/armcpu.c + ../arm/armcpu.c \ + ../entcode.c endif all: dump_modes diff --git a/celt/dump_modes/dump_modes_arch.h b/celt/dump_modes/dump_modes_arch.h index 1436926..59073ee 100644 --- a/celt/dump_modes/dump_modes_arch.h +++ b/celt/dump_modes/dump_modes_arch.h @@ -28,11 +28,17 @@ #i...
2016 Jul 14
0
[PATCH 2/5] Optimize fixed-point celt_fir_c() for ARM NEON
...THEORY OF + LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING + NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS + SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +*/ + +#if !defined(CELT_LPC_ARM_H) +# define CELT_LPC_ARM_H + +# include "armcpu.h" + +# if defined(FIXED_POINT) + +# if defined(OPUS_ARM_MAY_HAVE_NEON) +void celt_fir_neon( + const opus_val16 *_x, + const opus_val16 *num, + opus_val16 *_y, + int N, + int ord, + int arch); +# endif + +# if !defined(OPUS_HAVE_RTCD) +# def...
2015 May 15
0
[RFC V3 4/8] aarch64: Enable intrinsics for aarch64
...t_unit_dft.c @@ -45,8 +45,7 @@ #include "mathops.c" #include "entcode.c" -#if defined(OPUS_HAVE_RTCD) && \ - (defined(OPUS_ARM_ASM) || defined(OPUS_ARM_MAY_HAVE_NEON_INTR)) +#if defined(OPUS_ARM_MAY_HAVE_NEON_INTR) || defined(OPUS_ARM_ASM) #include "arm/armcpu.c" #if !defined(FIXED_POINT) #if defined(HAVE_ARM_NE10) diff --git a/celt/tests/test_unit_mathops.c b/celt/tests/test_unit_mathops.c index a1cf2f7..2e43e07 100644 --- a/celt/tests/test_unit_mathops.c +++ b/celt/tests/test_unit_mathops.c @@ -65,17 +65,18 @@ #include "x86/celt_lpc_sse.c&...
2016 Jun 17
5
ARM NEON optimization -- celt_fir()
Hi all, This is Linfeng Zhang from Google. I'll work on ARM NEON optimization in the next few months. I'm submitting 2 patches in the following couple of emails, which have the new created celt_fir_neon(). I revised celt_fir_c() to not pass in argument "mem" in Patch 1. If there are concerns to this change, please let me know. Many thanks to your comments. Linfeng Zhang
2015 Mar 13
1
[RFC PATCH v3] Intrinsics/RTCD related fixes. Mostly x86.
...th compilers with non-GCC-flavor flags for enabling architecture options. * Hopefully makes the configuration and ifdef's easier to follow and understand. Reviewed-by: Viswanath Puttagunta <viswanath.puttagunta at linaro.org> --- Makefile.am | 38 ++-- celt/arm/armcpu.c | 6 +- celt/arm/pitch_arm.h | 4 +- celt/bands.c | 6 +- celt/celt.c | 16 +- celt/celt.h | 12 +- celt/celt_decoder.c | 6 +- celt/celt...
2015 Mar 12
1
[RFC PATCHv2] Intrinsics/RTCD related fixes. Mostly x86.
...th compilers with non-GCC-flavor flags for enabling architecture options. * Hopefully makes the configuration and ifdef's easier to follow and understand. Reviewed-by: Viswanath Puttagunta <viswanath.puttagunta at linaro.org> --- Makefile.am | 38 ++-- celt/arm/armcpu.c | 6 +- celt/arm/pitch_arm.h | 4 +- celt/bands.c | 6 +- celt/celt.c | 16 +- celt/celt.h | 12 +- celt/celt_decoder.c | 6 +- celt/celt...
2015 Nov 19
0
[PATCH 3/3] Add Aarch64 intrinsic for SIG2WORD16.
...D_ARM64_H + +#include <arm_neon.h> + +#undef SIG2WORD16 +#define SIG2WORD16(x) (vqmovns_s32((x))) + +#endif diff --git a/celt_headers.mk b/celt_headers.mk index 0eca6e6..c9df94b 100644 --- a/celt_headers.mk +++ b/celt_headers.mk @@ -36,6 +36,7 @@ celt/static_modes_fixed_arm_ne10.h \ celt/arm/armcpu.h \ celt/arm/fixed_armv4.h \ celt/arm/fixed_armv5e.h \ +celt/arm/fixed_arm64.h \ celt/arm/kiss_fft_armv4.h \ celt/arm/kiss_fft_armv5e.h \ celt/arm/pitch_arm.h \ -- 2.4.9 (Apple Git-60)
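The `vqmovns_s32` intrinsic used in this patch performs a signed saturating narrow from 32 bits to 16 bits. A portable scalar equivalent of that one operation can be sketched as follows. The function name `example_sat_narrow_s32` is hypothetical, and the sketch covers only the intrinsic's saturation behaviour, not whatever else the surrounding macro may do in the full source.

```c
#include <stdint.h>

/* Clamp a 32-bit signed value into the int16_t range, then narrow it.
 * Values above 32767 become 32767; values below -32768 become -32768. */
static int16_t example_sat_narrow_s32(int32_t x)
{
    if (x > INT16_MAX) return INT16_MAX;
    if (x < INT16_MIN) return INT16_MIN;
    return (int16_t)x;
}
```

On AArch64 the intrinsic does this in a single instruction, which is presumably why the patch swaps it in for the generic macro.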
2018 Aug 30
0
NEON detection under iOs
...codec works great under many platforms. > I have found a small performance issue under iOS platform. > If the macro OPUS_HAVE_RTCD is not set, then encoder doesn't use some _neon functions at low bitrates (up to 64k). If the macro is set, then the compiler hits the error at opus/celt/arm/armcpu.c:153 (a function for detection missed). Being compared to Android version performance degradation is ~30%. > Detection NEON under iOS is a bit tricky, because there is no API for it. I added compiler-time detection. > I made a commit into github repository. Attached please find patch for it....