search for: silk_smulwb

Displaying 20 results from an estimated 20 matches for "silk_smulwb".

2017 Nov 20
4
Reg an issue with smoothing factor in VAD implementation
...t > the code where speech probability is calculated based on which SNR is > estimated. Below is the part of the code I am talking about. > > /*********************************/ > /* Speech Probability Estimation */ > /*********************************/ > SA_Q15 = silk_sigm_Q15( silk_SMULWB( VAD_SNR_FACTOR_Q16, pSNR_dB_Q7 ) - > VAD_NEGATIVE_OFFSET_Q5 ); // step1: Calculate speech probability : comment > by me > > /* Power scaling */ > if( speech_nrg <= 0 ) { // step2: update speech probability based on > speech energy : comment by me > SA_Q15 = silk_RSHIFT( SA_...
2017 Nov 27
3
Reg an issue with smoothing factor in VAD implementation
...used in opus. We are looking at the code where speech probability is calculated based on which SNR is estimated. Below is the part of the code I am talking about. /*********************************/ /* Speech Probability Estimation */ /*********************************/ SA_Q15 = silk_sigm_Q15( silk_SMULWB( VAD_SNR_FACTOR_Q16, pSNR_dB_Q7 ) - VAD_NEGATIVE_OFFSET_Q5 ); // step1: Calculate speech probability : comment by me /* Power scaling */ if( speech_nrg <= 0 ) { // step2: update speech probability based on speech energy : comment by me SA_Q15 = silk_RSHIFT( SA_Q15, 1 ); } else if( speech_nr...
2018 Feb 16
1
Reg an issue with smoothing factor in VAD implementation
...t sign-extension > e.g. 0x00008b3b), but that doesn't seem to be the case here. > The problem is even worse when opus_int is defined as 16-bit in the > platform - SA_Q15 overflows to negative right here with a similar effect > speech_nrg = silk_SQRT_APPROX(speech_nrg); > SA_Q15 = silk_SMULWB(32768 + speech_nrg, SA_Q15); > > I can't speak for the logic where the speech energy gets doubled. It > obviously seems intentional but I don't know why. Maybe so that > smoothing is performed at a constant rate regardless of whether > framesize is 10 or 20ms? > > On...
2015 Nov 16
0
[Fast Int64 3/4] Explicitly cast results of silk OPUS_FAST_INT64 macros back to opus_int32.
...+), 5 deletions(-) diff --git a/silk/macros.h b/silk/macros.h index 1ba614a..e1e05b9 100644 --- a/silk/macros.h +++ b/silk/macros.h @@ -48,14 +48,14 @@ POSSIBILITY OF SUCH DAMAGE. /* (a32 * (opus_int32)((opus_int16)(b32))) >> 16 output have to be 32bit int */ #if OPUS_FAST_INT64 -#define silk_SMULWB(a32, b32) (((a32) * (opus_int64)((opus_int16)(b32))) >> 16) +#define silk_SMULWB(a32, b32) ((opus_int32)(((a32) * (opus_int64)((opus_int16)(b32))) >> 16)) #else #define silk_SMULWB(a32, b32) ((((a32) >> 16) * (opus_int32)((opus_int16)(b32))) + ((...
2017 Nov 20
0
Reg an issue with smoothing factor in VAD implementation
...used in opus. We are looking at the code where speech probability is calculated based on which SNR is estimated. Below is the part of the code I am talking about. /*********************************/ /* Speech Probability Estimation */ /*********************************/ SA_Q15 = silk_sigm_Q15( silk_SMULWB( VAD_SNR_FACTOR_Q16, pSNR_dB_Q7 ) - VAD_NEGATIVE_OFFSET_Q5 ); // step1: Calculate speech probability : comment by me /* Power scaling */ if( speech_nrg <= 0 ) { // step2: update speech probability based on speech energy : comment by me SA_Q15 = silk_RSHIFT( SA_Q15, 1 ); } else if( speech_nr...
2017 Nov 27
0
Reg an issue with smoothing factor in VAD implementation
...ed in a 32-bit int without sign-extension e.g. 0x00008b3b), but that doesn't seem to be the case here. The problem is even worse when opus_int is defined as 16-bit in the platform - SA_Q15 overflows to negative right here with a similar effect speech_nrg = silk_SQRT_APPROX(speech_nrg); SA_Q15 = silk_SMULWB(32768 + speech_nrg, SA_Q15); I can't speak for the logic where the speech energy gets doubled. It obviously seems intentional but I don't know why. Maybe so that smoothing is performed at a constant rate regardless of whether framesize is 10 or 20ms? On Sun, Nov 26, 2017 at 8:07 PM, Chand...
2017 Nov 22
0
Reg an issue with smoothing factor in VAD implementation
...eech probability is calculated based on which SNR is >> estimated. Below is the part of the code I am talking about. >> >> /*********************************/ >> /* Speech Probability Estimation */ >> /*********************************/ >> SA_Q15 = silk_sigm_Q15( silk_SMULWB( VAD_SNR_FACTOR_Q16, pSNR_dB_Q7 ) - >> VAD_NEGATIVE_OFFSET_Q5 ); // step1: Calculate speech probability : comment >> by me >> >> /* Power scaling */ >> if( speech_nrg <= 0 ) { // step2: update speech probability based on >> speech energy : comment by me >>...
2015 Aug 04
0
[PATCH] Create OPUS_FAST_INT64 macro, to abstract conditions where opus_int64 should be used.
...86_64__) || defined(__LP64__) || defined(_WIN64)) + /* This is an OPUS_INLINE header file for general platform. */ /* (a32 * (opus_int32)((opus_int16)(b32))) >> 16 output have to be 32bit int */ -#if defined(__x86_64__) || defined(__LP64__) || defined(_WIN64) +#if OPUS_FAST_INT64 #define silk_SMULWB(a32, b32) (((a32) * (opus_int64)((opus_int16)(b32))) >> 16) #else #define silk_SMULWB(a32, b32) ((((a32) >> 16) * (opus_int32)((opus_int16)(b32))) + ((((a32) & 0x0000FFFF) * (opus_int32)((opus_int16)(b32))) >> 16)) #endif /* a32 + (b32 * (opus_int32...
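The two branches of silk_SMULWB quoted in the patch above are meant to produce the same value: the product of a 32-bit word and the sign-extended low 16 bits of the second operand, shifted right by 16. The following is a minimal sketch of both branches as plain C functions, so they can be compared side by side. The function names smulwb_int64 and smulwb_int32 are hypothetical and do not appear in Opus; the sketch also assumes that right-shifting a negative signed value is an arithmetic shift, which is implementation-defined in C but holds on common compilers.

```c
#include <stdint.h>

/* Hypothetical names for illustration only; in silk/macros.h both
 * variants are spelled silk_SMULWB and selected by OPUS_FAST_INT64. */

/* 64-bit path: widen to 64 bits, multiply, shift, narrow back to 32 bits.
 * Assumes >> on a negative signed value is an arithmetic shift. */
static int32_t smulwb_int64(int32_t a32, int32_t b32)
{
    return (int32_t)((a32 * (int64_t)(int16_t)b32) >> 16);
}

/* 32-bit path: split a32 into a signed high half and an unsigned low half,
 * multiply each by the sign-extended low 16 bits of b32, and recombine.
 * Neither partial product exceeds the int32_t range. */
static int32_t smulwb_int32(int32_t a32, int32_t b32)
{
    int32_t b = (int32_t)(int16_t)b32;
    return ((a32 >> 16) * b) + (((a32 & 0x0000FFFF) * b) >> 16);
}
```

For example, with a32 = 100000 and b32 = -7 both functions return -11, since -700000 / 65536 rounds down to -11. Only the low 16 bits of b32 take part in the product, so passing 0x10003 gives the same result as passing 3.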
2015 Nov 16
3
[Fast Int64 1/4] Move OPUS_FAST_INT64 definition to celt/arch.h.
--- celt/arch.h | 5 +++++ silk/macros.h | 4 +--- 2 files changed, 6 insertions(+), 3 deletions(-) diff --git a/celt/arch.h b/celt/arch.h index 9f74ddd..670527b 100644 --- a/celt/arch.h +++ b/celt/arch.h @@ -78,6 +78,11 @@ static OPUS_INLINE void _celt_fatal(const char *str, const char *file, int line) #define UADD32(a,b) ((a)+(b)) #define USUB32(a,b) ((a)-(b)) +/* Set this if opus_int64
2015 Nov 21
8
[Aarch64 v2 10/18] Clean up some intrinsics-related wording in configure.
--- configure.ac | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/configure.ac b/configure.ac index f52d2c2..e1a6e9b 100644 --- a/configure.ac +++ b/configure.ac @@ -190,7 +190,7 @@ AC_ARG_ENABLE([rtcd], [enable_rtcd=yes]) AC_ARG_ENABLE([intrinsics], - [AS_HELP_STRING([--disable-intrinsics], [Disable intrinsics optimizations for ARM(float) X86(fixed)])],, +
2017 Apr 25
2
2 patches related to silk_biquad_alt() optimization
...0x00003FFF; /* lower part */ A0_U_Q28 = silk_RSHIFT( -A_Q28[ 0 ], 14 ); /* upper part */ A1_L_Q28 = ( -A_Q28[ 1 ] ) & 0x00003FFF; /* lower part */ A1_U_Q28 = silk_RSHIFT( -A_Q28[ 1 ], 14 ); /* upper part */ ... S[ 0 ] = S[1] + silk_RSHIFT_ROUND( silk_SMULWB( out32_Q14, A0_L_Q28 ), 14 ); S[ 0 ] = silk_SMLAWB( S[ 0 ], out32_Q14, A0_U_Q28 ); S[ 0 ] = silk_SMLAWB( S[ 0 ], B_Q28[ 1 ], inval); S[ 1 ] = silk_RSHIFT_ROUND( silk_SMULWB( out32_Q14, A1_L_Q28 ), 14 ); S[ 1 ] = silk_SMLAWB( S[ 1 ], out32_Q14, A1_U_Q28 ); S[...
2017 Apr 25
2
2 patches related to silk_biquad_alt() optimization
Hi Jean-Marc, Tested on my chromebook, when stride (channel) == 1, the optimization has no gain compared with C function. When stride (channel) == 2, the optimization is 1.2%-1.8% faster (1.6% at Complexity 8) compared with C function. Please let me know and I can remove the optimization of stride 1 case. If it's allowed to skip the split of A_Q28 and replace by 32-bit multiplication
2017 Apr 26
2
2 patches related to silk_biquad_alt() optimization
...SHIFT_ROUND( (opus_int64)out32_Q14[ {0,1,0,1} ] * (-A_Q28[ {0,0,1,1} ]), 30 ) */ *S_s32x4 = vaddq_s32(*S_s32x4, t_s32x4); /* S[ {0,1,2,3} ] = {S[ {2,3} ],0,0} + silk_RSHIFT_ROUND( ); */ t_s32x4 = vqdmulhq_s32(inval_s32x4, B_Q28_s32x4); /* silk_SMULWB(B_Q28[ {1,1,2,2} ], in[ k * 2 + {0,1,0,1} ] ) */ *S_s32x4 = vaddq_s32(*S_s32x4, t_s32x4); /* S[ {0,1,2,3} ] = silk_SMLAWB( S[ {0,1,2,3} ], B_Q28[ {1,1,2,2} ], in[ k * 2 + {0,1,0,1} ] ); */ } Thanks, Linfeng -------------- next part -------------- An HTML attach...
2017 May 15
2
2 patches related to silk_biquad_alt() optimization
...> *S_s32x4 = vaddq_s32(*S_s32x4, t_s32x4); > /* S[ {0,1,2,3} ] = {S[ {2,3} ],0,0} + silk_RSHIFT_ROUND( ); > */ > t_s32x4 = vqdmulhq_s32(inval_s32x4, B_Q28_s32x4); > /* silk_SMULWB(B_Q28[ {1,1,2,2} ], in[ k * 2 + {0,1,0,1} ] ) > */ > *S_s32x4 = vaddq_s32(*S_s32x4, t_s32x4); > /* S[ {0,1,2,3} ] = silk_SMLAWB( S[ {0,1,2,3} ], B_Q28[ > {1,1,2,2} ], in[ k * 2 + {0,1,0,1} ] ); */ >...
2013 May 17
1
[Patch]01-Add ARM5E macros
...VEN IF ADVISED OF THE +POSSIBILITY OF SUCH DAMAGE. +***********************************************************************/ + +#ifndef SILK_MACROS_ARM5E_H +#define SILK_MACROS_ARM5E_H + +/* (a32 * (opus_int32)((opus_int16)(b32))) >> 16 output have to be 32bit int */ +static inline opus_int32 silk_SMULWB(opus_int32 a, opus_int16 b) +{ + int res; + __asm__( + "smulwb %0, %1, %2;\n" + : "=&r"(res) + : "r"(a), "r"(b) + ); + return res; +} + +/* a32 + (b32 * (opus_int32)((opus_int16)(c32))) >> 16 output have to be 32bit int */ +st...
2016 Aug 26
2
[PATCH 9/9] Optimize silk_inner_prod_aligned_scale() for ARM NEON
...arch /* I Architecture */ ) { opus_int n, is10msFrame, denom_Q16, delta0_Q13, delta1_Q13; @@ -98,8 +99,8 @@ void silk_stereo_LR_to_MS( SILK_FIX_CONST( STEREO_RATIO_SMOOTH_COEF, 16 ); smooth_coef_Q16 = silk_SMULWB( silk_SMULBB( prev_speech_act_Q8, prev_speech_act_Q8 ), smooth_coef_Q16 ); - pred_Q13[ 0 ] = silk_stereo_find_predictor( &LP_ratio_Q14, LP_mid, LP_side, &state->mid_side_amp_Q0[ 0 ], frame_length, smooth_coef_Q16 ); - pred_Q13[ 1 ] = silk_stereo_find_predictor( &HP_ratio_Q14,...
2017 May 08
0
2 patches related to silk_biquad_alt() optimization
..._Q14[ {0,1,0,1} ] * (-A_Q28[ > {0,0,1,1} ]), 30 ) */ > *S_s32x4 = vaddq_s32(*S_s32x4, t_s32x4); > /* S[ {0,1,2,3} ] = {S[ {2,3} ],0,0} + silk_RSHIFT_ROUND( ); > */ > t_s32x4 = vqdmulhq_s32(inval_s32x4, B_Q28_s32x4); > /* silk_SMULWB(B_Q28[ {1,1,2,2} ], in[ k * 2 + {0,1,0,1} ] ) > */ > *S_s32x4 = vaddq_s32(*S_s32x4, t_s32x4); > /* S[ {0,1,2,3} ] = silk_SMLAWB( S[ {0,1,2,3} ], B_Q28[ {1,1,2,2} ], in[ > k * 2 + {0,1,0,1} ] ); */ > } > > Thanks, > Linfeng > ----------...
2017 May 17
0
2 patches related to silk_biquad_alt() optimization
...> *S_s32x4 = vaddq_s32(*S_s32x4, t_s32x4); > > /* S[ {0,1,2,3} ] = {S[ {2,3} ],0,0} + silk_RSHIFT_ROUND( ); > > */ > > t_s32x4 = vqdmulhq_s32(inval_s32x4, B_Q28_s32x4); > > /* silk_SMULWB(B_Q28[ {1,1,2,2} ], in[ k * 2 + {0,1,0,1} ] ) > > */ > > *S_s32x4 = vaddq_s32(*S_s32x4, t_s32x4); > > /* S[ {0,1,2,3} ] = silk_SMLAWB( S[ {0,1,2,3} ], B_Q28[ > > {1,1,2,2} ], in[ k * 2 + {0,1,0,1} ] ); */...
2016 Aug 23
0
[PATCH 8/8] Optimize silk_NSQ_del_dec() for ARM NEON
...IFT( LTP_pred_Q14, 1 ); /* Q13 -> Q14 */ + pred_lag_ptr++; + } else { + LTP_pred_Q14 = 0; + } + + /* Long-term shaping */ + if( lag > 0 ) { + /* Symmetric, packed FIR coefficients */ + n_LTP_Q14 = silk_SMULWB( silk_ADD32( shp_lag_ptr[ 0 ], shp_lag_ptr[ -2 ] ), HarmShapeFIRPacked_Q14 ); + n_LTP_Q14 = silk_SMLAWT( n_LTP_Q14, shp_lag_ptr[ -1 ], HarmShapeFIRPacked_Q14 ); + n_LTP_Q14 = silk_SUB_LSHIFT32( LTP_pred_Q14, n_LTP_Q14, 2 ); /* Q12 -> Q14 */ +...
2016 Aug 23
2
[PATCH 7/8] Update NSQ_LPC_BUF_LENGTH macro.
NSQ_LPC_BUF_LENGTH is independent of DECISION_DELAY. --- silk/define.h | 4 ---- 1 file changed, 4 deletions(-) diff --git a/silk/define.h b/silk/define.h index 781cfdc..1286048 100644 --- a/silk/define.h +++ b/silk/define.h @@ -173,11 +173,7 @@ extern "C" #define MAX_MATRIX_SIZE MAX_LPC_ORDER /* Max of LPC Order and LTP order */ -#if( MAX_LPC_ORDER >