Chandrakala Madhira
2017-Nov-20 06:21 UTC
[opus] Reg an issue with smoothing factor in VAD implementation
Hi,

We are looking at the VAD implementation used in Opus, specifically the code where the speech probability is calculated from the estimated SNR. Below is the part of the code I am talking about.

    /*********************************/
    /* Speech Probability Estimation */
    /*********************************/
    SA_Q15 = silk_sigm_Q15( silk_SMULWB( VAD_SNR_FACTOR_Q16, pSNR_dB_Q7 ) - VAD_NEGATIVE_OFFSET_Q5 ); // step1: Calculate speech probability : comment by me

    /* Power scaling */
    if( speech_nrg <= 0 ) { // step2: update speech probability based on speech energy : comment by me
        SA_Q15 = silk_RSHIFT( SA_Q15, 1 );
    } else if( speech_nrg < 32768 ) {
        if( psEncC->frame_length == 10 * psEncC->fs_kHz ) {
            speech_nrg = silk_LSHIFT_SAT32( speech_nrg, 16 ); // Energy is doubled here : comment by me
        } else {
            speech_nrg = silk_LSHIFT_SAT32( speech_nrg, 15 );
        }

        /* square-root */
        speech_nrg = silk_SQRT_APPROX( speech_nrg );
        SA_Q15 = silk_SMULWB( 32768 + speech_nrg, SA_Q15 );
    }

    /* Smoothing coefficient */
    smooth_coef_Q16 = silk_SMULWB( VAD_SNR_SMOOTH_COEF_Q18, silk_SMULWB( (opus_int32)SA_Q15, SA_Q15 ) ); // step3: Update the smoothing factor based on speech probability : comment by me

    if( psEncC->frame_length == 10 * psEncC->fs_kHz ) {
        smooth_coef_Q16 >>= 1;
    }

In step 1, the speech probability is calculated; its value is expected to lie within [0, 1) in Q15 format. In step 2, the probability is updated based on the speech energy level, and the result should also lie within [0, 1). In step 3, the smoothing coefficient is calculated from that probability.

This code has no issue when the frame size is 20 ms or more. But if the frame size is 10 ms, the energy is doubled in step 2 (presumably because the original SILK code assumes 20 ms frames, so the energy of a 10 ms frame is doubled to make it comparable). When this is done, the probability updated in step 2 can become more than 1, i.e. SA_Q15 can exceed 32767.
When this value is then used in the multiplication in step 3, it is treated as a negative number because silk_SMULWB is a 32x16 multiplication, so its second operand is interpreted as a signed 16-bit value. This results in a negative smoothing coefficient.

Please let me know if this is a bug.

Thank you,
Chandrakala
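The arithmetic described in steps 2 and 3 can be reproduced outside the codec with a small sketch. This is not the Opus source: `smulwb`, `to_int16`, `update_probability` and `smooth_coef_q16` are hypothetical stand-ins written for illustration, the square-root term is passed in directly instead of calling `silk_SQRT_APPROX`, and the value 4096 for `VAD_SNR_SMOOTH_COEF_Q18` is an assumption.

```c
#include <stdint.h>

/* Assumed value of VAD_SNR_SMOOTH_COEF_Q18; not taken from the post. */
#define SMOOTH_COEF_Q18 4096

/* Interpret the low 16 bits of x as a signed 16-bit value. */
static int32_t to_int16(int32_t x) {
    int32_t lo = x & 0xFFFF;
    return lo >= 0x8000 ? lo - 0x10000 : lo;
}

/* Hypothetical stand-in for a 32x16 multiply: (a32 * (int16)b32) >> 16,
 * written with floor division so it does not rely on how the compiler
 * shifts negative numbers. */
static int32_t smulwb(int32_t a32, int32_t b32) {
    int64_t prod = (int64_t)a32 * to_int16(b32);
    int64_t q = prod / 65536;
    if (prod % 65536 < 0) {
        q -= 1;
    }
    return (int32_t)q;
}

/* Step 2: scale the Q15 probability by (32768 + sqrt_nrg).
 * sqrt_nrg is the square root of the shifted energy: it stays below
 * 32768 for a 15-bit shift but can reach about 46340 for a 16-bit shift. */
static int32_t update_probability(int32_t sa_q15, int32_t sqrt_nrg) {
    return smulwb(32768 + sqrt_nrg, sa_q15);
}

/* Step 3: smoothing coefficient from the squared probability
 * (the extra halving for 10 ms frames is left out; it does not change the sign). */
static int32_t smooth_coef_q16(int32_t sa_q15) {
    return smulwb(SMOOTH_COEF_Q18, smulwb(sa_q15, sa_q15));
}
```

Under these assumptions, `update_probability(32767, 32767)` (the 20 ms case at its upper bound) stays within the Q15 range, while `update_probability(32767, 46340)` (the 10 ms case) exceeds 32767, and feeding that result to `smooth_coef_q16` yields a negative coefficient, which matches the behaviour described above.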