thr3ads.net - search: "opus

[PATCH] Create OPUS_FAST_INT64 macro, to abstract conditions where opus_int64 should be used.

2015 Aug 04

0

[PATCH] Create OPUS_FAST_INT64 macro, to abstract conditions where opus_int64 should be used.

This patch adds a macro abstracting the condition under which the silk math macros use opus_int64-based calculations rather than opus_int32. No substantive change, but will make it easier to adjust if additional such platforms are found in the future. --- silk/macros.h | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/silk/macros.h b/silk/macros.h index 2f24950...

[Fast Int64 1/4] Move OPUS_FAST_INT64 definition to celt/arch.h.

2015 Nov 16

3

[Fast Int64 1/4] Move OPUS_FAST_INT64 definition to celt/arch.h.

...deletions(-) diff --git a/celt/arch.h b/celt/arch.h index 9f74ddd..670527b 100644 --- a/celt/arch.h +++ b/celt/arch.h @@ -78,6 +78,11 @@ static OPUS_INLINE void _celt_fatal(const char *str, const char *file, int line) #define UADD32(a,b) ((a)+(b)) #define USUB32(a,b) ((a)-(b)) +/* Set this if opus_int64 is a native type of the CPU. */ +/* Assume that all LP64 architectures have fast 64-bit types; also x86_64 (which can be ILP32 for x32) + and Win64 (which is LLP64). */ +#define OPUS_FAST_INT64 (defined(__LP64__) || defined(__x86_64__) || defined(_WIN64)) + #define PRINT_MIPS(file) #ifdef FIX...

[Fast Int64 3/4] Explicitly cast results of silk OPUS_FAST_INT64 macros back to opus_int32.

2015 Nov 16

0

[Fast Int64 3/4] Explicitly cast results of silk OPUS_FAST_INT64 macros back to opus_int32.

....h b/silk/macros.h index 1ba614a..e1e05b9 100644 --- a/silk/macros.h +++ b/silk/macros.h @@ -48,14 +48,14 @@ POSSIBILITY OF SUCH DAMAGE. /* (a32 * (opus_int32)((opus_int16)(b32))) >> 16 output have to be 32bit int */ #if OPUS_FAST_INT64 -#define silk_SMULWB(a32, b32) (((a32) * (opus_int64)((opus_int16)(b32))) >> 16) +#define silk_SMULWB(a32, b32) ((opus_int32)(((a32) * (opus_int64)((opus_int16)(b32))) >> 16)) #else #define silk_SMULWB(a32, b32) ((((a32) >> 16) * (opus_int32)((opus_int16)(b32))) + ((((a32) & 0x0000FFFF) * (opus_int32)((op...

[PATCH 1/1] Updated opus_types.h to correctly support 8 and 64 bit types.

2015 Jun 03

5

[PATCH 1/1] Updated opus_types.h to correctly support 8 and 64 bit types.

...|| defined (HAVE_STDINT_H)) -#include <stdint.h> +# include <stdint.h> + typedef int8_t opus_int8; + typedef uint8_t opus_uint8; typedef int16_t opus_int16; typedef uint16_t opus_uint16; typedef int32_t opus_int32; typedef uint32_t opus_uint32; + typedef int64_t opus_int64; + typedef uint64_t opus_uint64; + #elif defined(_WIN32) # if defined(__CYGWIN__) -# include <_G_config.h> - typedef _G_int32_t opus_int32; - typedef _G_uint32_t opus_uint32; - typedef _G_int16 opus_int16; - typedef _G_uint16 opus_uint16; +# include <stdint...

2017 Apr 26

2

2 patches related to silk_biquad_alt() optimization

...e == 1 ) || ( stride == 2 ) ); if( stride == 1) { opus_int32 out32_Q14; for( k = 0; k < len; k++ ) { /* S[ 0 ], S[ 1 ]: Q12 */ out32_Q14 = silk_LSHIFT( silk_SMLAWB( S[ 0 ], B_Q28[ 0 ], in[ k ] ), 2 ); S[ 0 ] = S[ 1 ] + silk_RSHIFT_ROUND( (opus_int64)out32_Q14 * (-A_Q28[ 0 ]), 30 ); S[ 0 ] = silk_SMLAWB( S[ 0 ], B_Q28[ 1 ], in[ k ] ); S[ 1 ] = silk_RSHIFT_ROUND( (opus_int64)out32_Q14 * (-A_Q28[ 1 ]) , 30 ); S[ 1 ] = silk_SMLAWB( S[ 1 ], B_Q28[ 2 ], in[ k ] ); /* Scale back to Q0 and saturate */...

2017 May 15

2

2 patches related to silk_biquad_alt() optimization

...opus_int32 out32_Q14; > for( k = 0; k < len; k++ ) { > /* S[ 0 ], S[ 1 ]: Q12 */ > out32_Q14 = silk_LSHIFT( silk_SMLAWB( S[ 0 ], B_Q28[ 0 > ], in[ k ] ), 2 ); > > S[ 0 ] = S[ 1 ] + silk_RSHIFT_ROUND( > (opus_int64)out32_Q14 * (-A_Q28[ 0 ]), 30 ); > S[ 0 ] = silk_SMLAWB( S[ 0 ], B_Q28[ 1 ], in[ k ] ); > > S[ 1 ] = silk_RSHIFT_ROUND( (opus_int64)out32_Q14 * > (-A_Q28[ 1 ]) , 30 ); > S[ 1 ] = silk_SMLAWB( S[ 1 ], B_Q28[ 2 ], in[ k ] ); > &g...

[Aarch64 v2 10/18] Clean up some intrinsics-related wording in configure.

2015 Nov 21

8

[Aarch64 v2 10/18] Clean up some intrinsics-related wording in configure.

--- configure.ac | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/configure.ac b/configure.ac index f52d2c2..e1a6e9b 100644 --- a/configure.ac +++ b/configure.ac @@ -190,7 +190,7 @@ AC_ARG_ENABLE([rtcd], [enable_rtcd=yes]) AC_ARG_ENABLE([intrinsics], - [AS_HELP_STRING([--disable-intrinsics], [Disable intrinsics optimizations for ARM(float) X86(fixed)])],, +

2017 May 08

0

2 patches related to silk_biquad_alt() optimization

...if( stride == 1) { > opus_int32 out32_Q14; > for( k = 0; k < len; k++ ) { > /* S[ 0 ], S[ 1 ]: Q12 */ > out32_Q14 = silk_LSHIFT( silk_SMLAWB( S[ 0 ], B_Q28[ 0 ], in[ > k ] ), 2 ); > > S[ 0 ] = S[ 1 ] + silk_RSHIFT_ROUND( (opus_int64)out32_Q14 * > (-A_Q28[ 0 ]), 30 ); > S[ 0 ] = silk_SMLAWB( S[ 0 ], B_Q28[ 1 ], in[ k ] ); > > S[ 1 ] = silk_RSHIFT_ROUND( (opus_int64)out32_Q14 * (-A_Q28[ > 1 ]) , 30 ); > S[ 1 ] = silk_SMLAWB( S[ 1 ], B_Q28[ 2 ], in[ k ] ); > >...

2017 May 17

0

2 patches related to silk_biquad_alt() optimization

...; for( k = 0; k < len; k++ ) { > > /* S[ 0 ], S[ 1 ]: Q12 */ > > out32_Q14 = silk_LSHIFT( silk_SMLAWB( S[ 0 ], B_Q28[ 0 > > ], in[ k ] ), 2 ); > > > > S[ 0 ] = S[ 1 ] + silk_RSHIFT_ROUND( > > (opus_int64)out32_Q14 * (-A_Q28[ 0 ]), 30 ); > > S[ 0 ] = silk_SMLAWB( S[ 0 ], B_Q28[ 1 ], in[ k ] ); > > > > S[ 1 ] = silk_RSHIFT_ROUND( (opus_int64)out32_Q14 * > > (-A_Q28[ 1 ]) , 30 ); > > S[ 1 ] = silk_SMLAWB( S[ 1 ], B_Q28[...

[Fast Int64 2/4] Add OPUS_FAST_INT64 flavors of celt/fixed_generic.h macros.

2015 Nov 16

0

[Fast Int64 2/4] Add OPUS_FAST_INT64 flavors of celt/fixed_generic.h macros.

..._generic.h +++ b/celt/fixed_generic.h @@ -37,16 +37,32 @@ #define MULT16_16SU(a,b) ((opus_val32)(opus_val16)(a)*(opus_val32)(opus_uint16)(b)) /** 16x32 multiplication, followed by a 16-bit shift right. Results fits in 32 bits */ +#if OPUS_FAST_INT64 +#define MULT16_32_Q16(a,b) ((opus_val32)SHR((opus_int64)((opus_val16)(a))*(b),16)) +#else #define MULT16_32_Q16(a,b) ADD32(MULT16_16((a),SHR((b),16)), SHR(MULT16_16SU((a),((b)&0x0000ffff)),16)) +#endif /** 16x32 multiplication, followed by a 16-bit shift right (round-to-nearest). Results fits in 32 bits */ +#if OPUS_FAST_INT64 +#define MULT16_32...

Problem with opusfile & ndk

2019 Jun 07

1

Problem with opusfile & ndk

Hi Xiph.org Team. We are using opusfile library <https://github.com/xiph/opusfile> for streaming *.opus* audio in our projects. But now we have a problem with building opusfile library for android with *ndk-build*. In particular, with arm64-v8a platform: Google removed <sys/timeb.h> from android. And now building opusfile with nkd-build crashes with error "fatal error:

2017 Apr 25

2

2 patches related to silk_biquad_alt() optimization

On Mon, Apr 24, 2017 at 5:52 PM, Jean-Marc Valin <jmvalin at jmvalin.ca> wrote: > On 24/04/17 08:03 PM, Linfeng Zhang wrote: > > Tested on my chromebook, when stride (channel) == 1, the optimization > > has no gain compared with C function. > > You mean that the Neon code is the same speed as the C code for > stride==1? This is not terribly surprising for an IIRC

[PATCH 5/8] Arm64 assembly for Silk math.

2015 Aug 05

0

[PATCH 5/8] Arm64 assembly for Silk math.

...files changed, 122 insertions(+) create mode 100644 silk/arm/SigProc_FIX_arm64.h create mode 100644 silk/arm/macros_arm64.h diff --git a/silk/SigProc_FIX.h b/silk/SigProc_FIX.h index b632994..0a6969d 100644 --- a/silk/SigProc_FIX.h +++ b/silk/SigProc_FIX.h @@ -603,6 +603,10 @@ static OPUS_INLINE opus_int64 silk_max_64(opus_int64 a, opus_int64 b) #include "arm/SigProc_FIX_armv5e.h" #endif +#ifdef OPUS_ARM64_INLINE_ASM +#include "arm/SigProc_FIX_arm64.h" +#endif + #if defined(MIPSr1_ASM) #include "mips/sigproc_fix_mipsr1.h" #endif diff --git a/silk/arm/SigProc_FIX_a...

[PATCH 4/8] Arm64 assembly for Celt fixed-point math.

2015 Aug 05

0

[PATCH 4/8] Arm64 assembly for Celt fixed-point math.

...EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +*/ + +#ifndef FIXED_ARM64_H +#define FIXED_ARM64_H + +/** 16x32 multiplication, followed by a 16-bit shift right. Results fits in 32 bits */ +#undef MULT16_32_Q16 +static OPUS_INLINE opus_val32 MULT16_32_Q16_arm64(opus_val16 a, opus_val32 b) +{ + opus_int64 rd; + __asm__( + "smull %x0, %w1, %w2\n\t" + : "=&r"(rd) + : "%r"(b), "r"(a<<16) + ); + return (rd >> 32); +} +#define MULT16_32_Q16(a, b) (MULT16_32_Q16_arm64(a, b)) + + +/** 16x32 multiplication, followed by a 15-bit shift...

[Aarch64 06/11] Add aarch64 assembly for Celt fixed-point math.

2015 Nov 07

0

[Aarch64 06/11] Add aarch64 assembly for Celt fixed-point math.

...EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +*/ + +#ifndef FIXED_ARM64_H +#define FIXED_ARM64_H + +/** 16x32 multiplication, followed by a 16-bit shift right. Results fits in 32 bits */ +#undef MULT16_32_Q16 +static OPUS_INLINE opus_val32 MULT16_32_Q16_arm64(opus_val16 a, opus_val32 b) +{ + opus_int64 rd; + __asm__( + "smull %x0, %w1, %w2\n\t" + : "=&r"(rd) + : "%r"(b), "r"(a<<16) + ); + return (rd >> 32); +} +#define MULT16_32_Q16(a, b) (MULT16_32_Q16_arm64(a, b)) + + +/** 16x32 multiplication, followed by a 15-bit shift...

[Aarch64 00/11] Patches to enable Aarch64

2015 Nov 13

2

[Aarch64 00/11] Patches to enable Aarch64

...ce only performed one multiply while the ones you did replace performed two. I think it would be very interesting to try the benchmarks again after adding detection for __aarch64__ to OPUS_FAST_INT64, and replacing MULT16_32_Q15 with something like: #define MULT16_32_Q15(a,b) ((opus_val32)SHR((opus_int64)((opus_val16)(a))*(b),15)) and MULT16_32_Q16 with: #define MULT16_32_Q16(a,b) ((opus_val32)SHR((opus_int64)((opus_val16)(a))*(b),16)) There isn't an "opus_val64" or I would have used it (it would be defined as "float" for floating point). I think the problem here is that...

[PATCH 1/1] Updated opus_types.h to correctly support 8 and 64 bit types.

2015 Jun 03

1

[PATCH 1/1] Updated opus_types.h to correctly support 8 and 64 bit types.

...hese changes through properly and was only really considering common desktop systems. I was originally fixing an issue when compiling my code as 64bit where my use of int64_t (defined as long on my platform) was conflicting with the opusfile stream callbacks (e.g. op_seek_func) which were using the opus_int64 type which was defined as long long. When I looked at the header I thought it could do with some work, so instead of putting casts in my code, I modified the header and sent a patch. I didn't even consider building for architectures which could have a 40bit long etc. and am obviously a bit out...

[Patch]01-Add ARM5E macros

2013 May 17

1

[Patch]01-Add ARM5E macros

...%1, %2;\n" + : "=&r"(res) + : "r"(a), "r"(b) + ); + return res; +} + +#endif diff --git a/silk/SigProc_FIX.h b/silk/SigProc_FIX.h index cf1ab36..bccd73d 100644 --- a/silk/SigProc_FIX.h +++ b/silk/SigProc_FIX.h @@ -576,6 +576,10 @@ static inline opus_int64 silk_max_64(opus_int64 a, opus_int64 b) #include "MacroCount.h" #include "MacroDebug.h" +#ifdef ARM5E_ASM +#include "SigProc_FIX_arm5e.h" +#endif + #ifdef __cplusplus } #endif diff --git a/silk/SigProc_FIX_arm5e.h b/silk/SigProc_FIX_arm5e.h new file mode 100644...

[Aarch64 00/11] Patches to enable Aarch64

2015 Nov 13

2

[Aarch64 00/11] Patches to enable Aarch64

...multiply while the ones you did replace performed two. >> >> I think it would be very interesting to try the benchmarks again after adding detection for __aarch64__ to OPUS_FAST_INT64, and replacing MULT16_32_Q15 with something like: >> #define MULT16_32_Q15(a,b) ((opus_val32)SHR((opus_int64)((opus_val16)(a))*(b),15)) >> and MULT16_32_Q16 with: >> #define MULT16_32_Q16(a,b) ((opus_val32)SHR((opus_int64)((opus_val16)(a))*(b),16)) >> There isn't an "opus_val64" or I would have used it (it would be defined as "float" for floating point). >>...

[Aarch64 00/11] Patches to enable Aarch64

2015 Nov 12

2

[Aarch64 00/11] Patches to enable Aarch64

One other minor thing: I notice that in the inline assembly the result (rd) is constrained as an earlyclobber operand. What was the reason for that?

search for: opus_int64