thr3ads.net - similar to: "[Patch]01-Add ARM5E macros"

[Fast Int64 3/4] Explicitly cast results of silk OPUS_FAST_INT64 macros back to opus_int32.

2015 Nov 16

0

[Fast Int64 3/4] Explicitly cast results of silk OPUS_FAST_INT64 macros back to opus_int32.

--- silk/macros.h | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/silk/macros.h b/silk/macros.h index 1ba614a..e1e05b9 100644 --- a/silk/macros.h +++ b/silk/macros.h @@ -48,14 +48,14 @@ POSSIBILITY OF SUCH DAMAGE. /* (a32 * (opus_int32)((opus_int16)(b32))) >> 16 output have to be 32bit int */ #if OPUS_FAST_INT64 -#define silk_SMULWB(a32, b32)

[PATCH] Create OPUS_FAST_INT64 macro, to abstract conditions where opus_int64 should be used.

2015 Aug 04

0

[PATCH] Create OPUS_FAST_INT64 macro, to abstract conditions where opus_int64 should be used.

This patch adds a macro abstracting the condition under which the silk math macros use opus_int64-based calculations rather than opus_int32. No substantive change, but will make it easier to adjust if additional such platforms are found in the future. --- silk/macros.h | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/silk/macros.h b/silk/macros.h index

[Fast Int64 1/4] Move OPUS_FAST_INT64 definition to celt/arch.h.

2015 Nov 16

3

[Fast Int64 1/4] Move OPUS_FAST_INT64 definition to celt/arch.h.

--- celt/arch.h | 5 +++++ silk/macros.h | 4 +--- 2 files changed, 6 insertions(+), 3 deletions(-) diff --git a/celt/arch.h b/celt/arch.h index 9f74ddd..670527b 100644 --- a/celt/arch.h +++ b/celt/arch.h @@ -78,6 +78,11 @@ static OPUS_INLINE void _celt_fatal(const char *str, const char *file, int line) #define UADD32(a,b) ((a)+(b)) #define USUB32(a,b) ((a)-(b)) +/* Set this if opus_int64

[PATCH 5/8] Arm64 assembly for Silk math.

2015 Aug 05

0

[PATCH 5/8] Arm64 assembly for Silk math.

--- silk/SigProc_FIX.h | 4 +++ silk/arm/SigProc_FIX_arm64.h | 46 ++++++++++++++++++++++++++++++ silk/arm/macros_arm64.h | 66 ++++++++++++++++++++++++++++++++++++++++++++ silk/macros.h | 4 +++ silk_headers.mk | 2 ++ 5 files changed, 122 insertions(+) create mode 100644 silk/arm/SigProc_FIX_arm64.h create mode 100644 silk/arm/macros_arm64.h diff

[RFC PATCHv1] armv7: celt_pitch_xcorr: Introduce ARM neon intrinsics

2014 Nov 21

0

[RFC PATCHv1] armv7: celt_pitch_xcorr: Introduce ARM neon intrinsics

Optimize celt_pitch_xcorr function (for floating point) using ARM NEON intrinsics for SoCs that have NEON VFP unit. As initial step, targeting ARMv7 NEON (VFP3+) based SoCs. To enable this optimization, use --enable-arm-neon-intrinsics configure option. This flag is not enabled by default. Compile time and runtime checks are also supported to make sure this optimization is only enabled when the

[Fast Int64 4/4] Add OPUS_FAST_INT64 definition of silk_SMULWT.

2015 Nov 16

0

[Fast Int64 4/4] Add OPUS_FAST_INT64 definition of silk_SMULWT.

--- silk/macros.h | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/silk/macros.h b/silk/macros.h index e1e05b9..7cefedc 100644 --- a/silk/macros.h +++ b/silk/macros.h @@ -61,7 +61,11 @@ POSSIBILITY OF SUCH DAMAGE. #endif /* (a32 * (b32 >> 16)) >> 16 */ +#if OPUS_FAST_INT64 +#define silk_SMULWT(a32, b32) ((opus_int32)(((a32) * (opus_int64)((b32) >> 16))

opus-1.3.1 patch for ARM Cortex-M4F (single precision)

2019 May 27

0

opus-1.3.1 patch for ARM Cortex-M4F (single precision)

The patch prevents KEIL MDK compile warnings, like: warning: #1035-d: single-precision operand implicitly converted to double-precision Actually ARM Cortex-M4F has only a *single precision* (float) FPU. It's suit for all platforms. See the comment at the begin of patch file. Sincerely Forrest Zhang -------------- next part -------------- Specify the floating point constant with single

[Aarch64 v2 10/18] Clean up some intrinsics-related wording in configure.

2015 Nov 21

8

[Aarch64 v2 10/18] Clean up some intrinsics-related wording in configure.

--- configure.ac | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/configure.ac b/configure.ac index f52d2c2..e1a6e9b 100644 --- a/configure.ac +++ b/configure.ac @@ -190,7 +190,7 @@ AC_ARG_ENABLE([rtcd], [enable_rtcd=yes]) AC_ARG_ENABLE([intrinsics], - [AS_HELP_STRING([--disable-intrinsics], [Disable intrinsics optimizations for ARM(float) X86(fixed)])],, +

[PATCH 4/8] Arm64 assembly for Celt fixed-point math.

2015 Aug 05

0

[PATCH 4/8] Arm64 assembly for Celt fixed-point math.

--- celt/arch.h | 2 ++ celt/arm/fixed_arm64.h | 75 ++++++++++++++++++++++++++++++++++++++++++++++++++ celt_headers.mk | 1 + 3 files changed, 78 insertions(+) create mode 100644 celt/arm/fixed_arm64.h diff --git a/celt/arch.h b/celt/arch.h index 9f74ddd..219569b 100644 --- a/celt/arch.h +++ b/celt/arch.h @@ -122,6 +122,8 @@ static OPUS_INLINE opus_int16 SAT16(opus_int32 x)

[Aarch64 06/11] Add aarch64 assembly for Celt fixed-point math.

2015 Nov 07

0

[Aarch64 06/11] Add aarch64 assembly for Celt fixed-point math.

--- celt/arch.h | 2 ++ celt/arm/fixed_arm64.h | 75 ++++++++++++++++++++++++++++++++++++++++++++++++++ celt_headers.mk | 1 + 3 files changed, 78 insertions(+) create mode 100644 celt/arm/fixed_arm64.h diff --git a/celt/arch.h b/celt/arch.h index 9f74ddd..219569b 100644 --- a/celt/arch.h +++ b/celt/arch.h @@ -122,6 +122,8 @@ static OPUS_INLINE opus_int16 SAT16(opus_int32 x)

[PATCH 1/1] Updated opus_types.h to correctly support 8 and 64 bit types.

2015 Jun 03

5

[PATCH 1/1] Updated opus_types.h to correctly support 8 and 64 bit types.

- Replaced blanket #define of 8 & 64 bit types with typedefs for each platform to match 16 & 32 bit types. - Updated existing typedefs for each platform to fix odd values and improve consistency. --- include/opus_types.h | 125 ++++++++++++++++++++++++++++++++++++++++----------- 1 file changed, 100 insertions(+), 25 deletions(-) diff --git a/include/opus_types.h

[RFC PATCH v3] Intrinsics/RTCD related fixes. Mostly x86.

2015 Mar 13

1

[RFC PATCH v3] Intrinsics/RTCD related fixes. Mostly x86.

From: Jonathan Lennox <jonathan at vidyo.com> * Makes ?enable-intrinsics work with clang and other non-GCC compilers * Enables RTCD for the floating-point-mode SSE code in Celt. * Disables use of RTCD in cases where the compiler targets an instruction set by default. * Enables the SSE4.1 Silk optimizations that apply to the common parts of Silk when Opus is built in floating-point mode, not

[RFC PATCHv2] Intrinsics/RTCD related fixes. Mostly x86.

2015 Mar 12

1

[RFC PATCHv2] Intrinsics/RTCD related fixes. Mostly x86.

From: Jonathan Lennox <jonathan at vidyo.com> * Makes ?enable-intrinsics work with clang and other non-GCC compilers * Enables RTCD for the floating-point-mode SSE code in Celt. * Disables use of RTCD in cases where the compiler targets an instruction set by default. * Enables the SSE4.1 Silk optimizations that apply to the common parts of Silk when Opus is built in floating-point mode, not

2017 May 08

0

2 patches related to silk_biquad_alt() optimization

Ping for comments. Thanks, Linfeng On Wed, Apr 26, 2017 at 2:15 PM, Linfeng Zhang <linfengz at google.com> wrote: > On Tue, Apr 25, 2017 at 10:31 PM, Jean-Marc Valin <jmvalin at jmvalin.ca> > wrote: > >> >> > A_Q28 is split to 2 14-bit (or 16-bit, whatever) integers, to make the >> > multiplication operation within 32-bits. NEON can do 32-bit x

2017 May 17

0

2 patches related to silk_biquad_alt() optimization

Hi Jean-Marc, Thanks! Please find the 2 updated patches which only optimize stride 2 case and keep the bit exactness. They have passed our internal tests as usual. Thanks, Linfeng On Mon, May 15, 2017 at 9:36 AM, Jean-Marc Valin <jmvalin at jmvalin.ca> wrote: > Hi Linfeng, > > Sorry for the delay -- I was actually trying to think of the best option > here. For now, my

2017 Apr 26

2

2 patches related to silk_biquad_alt() optimization

On Tue, Apr 25, 2017 at 10:31 PM, Jean-Marc Valin <jmvalin at jmvalin.ca> wrote: > > > A_Q28 is split to 2 14-bit (or 16-bit, whatever) integers, to make the > > multiplication operation within 32-bits. NEON can do 32-bit x 32-bit = > > 64-bit using 'int64x2_t vmull_s32(int32x2_t a, int32x2_t b)', and it > > could possibly be faster and less

[PATCH 8/8] Optimize silk_NSQ_del_dec() for ARM NEON

2016 Aug 23

0

[PATCH 8/8] Optimize silk_NSQ_del_dec() for ARM NEON

Created corresponding unit test, and the optimization is bit exact with C function. This optimization speeds up SILK encoder on NEON as following. Fixed-point: Complexity 0-5: 0% Complexity 6-7: 6% Complexity 8-9: 10% Complexity 10: 8% Got similar results on floating-point. --- silk/NSQ_del_dec.c | 6 +- silk/SigProc_FIX.h | 4

2017 May 15

2

2 patches related to silk_biquad_alt() optimization

Hi Linfeng, Sorry for the delay -- I was actually trying to think of the best option here. For now, my preference would be to keep things bit-exact, but should there be more similar optimizations relying on 64-bit multiplication results, then we could consider having a special option to enable those (even in C). Cheers, Jean-Marc On 08/05/17 12:12 PM, Linfeng Zhang wrote: > Ping for

[PATCH 9/9] Optimize silk_inner_prod_aligned_scale() for ARM NEON

2016 Aug 26

2

[PATCH 9/9] Optimize silk_inner_prod_aligned_scale() for ARM NEON

Created corresponding unit test, and the optimization is bit exact with C function. --- silk/SigProc_FIX.h | 7 ++- silk/arm/arm_silk_map.c | 12 ++++ silk/arm/inner_prod_aligned_arm.h | 58 +++++++++++++++++++ silk/arm/inner_prod_aligned_neon_intr.c | 66 ++++++++++++++++++++++ silk/enc_API.c

[PATCH 1/2] Declare silk_warped_LPC_analysis_filter_FIX_c in silk/fixed/main_FIX.h.

2015 Nov 02

1

[PATCH 1/2] Declare silk_warped_LPC_analysis_filter_FIX_c in silk/fixed/main_FIX.h.

Fixes build failure on platforms with MAY_HAVE_SSE4_1 (but not PRESUME_SSE4_1) with --enable-intrinsics. --- silk/fixed/main_FIX.h | 11 +++++++++++ silk/x86/x86_silk_map.c | 2 ++ 2 files changed, 13 insertions(+) diff --git a/silk/fixed/main_FIX.h b/silk/fixed/main_FIX.h index ffeb4f3..375b5eb 100644 --- a/silk/fixed/main_FIX.h +++ b/silk/fixed/main_FIX.h @@ -97,6 +97,17 @@ void

similar to: [Patch]01-Add ARM5E macros