thr3ads.net - opus - [opus] Reg an issue with smoothing factor in VAD implementation [Nov 2017]


Chandrakala Madhira

2017-Nov-20 06:21 UTC

[opus] Reg an issue with smoothing factor in VAD implementation

Hi, 

We are looking at the VAD implementation used in Opus, in particular the code
where the speech probability, on which the SNR estimate is based, is calculated.
Below is the part of the code I am talking about.

/*********************************/
/* Speech Probability Estimation */
/*********************************/
// step1: Calculate speech probability : comment by me
SA_Q15 = silk_sigm_Q15( silk_SMULWB( VAD_SNR_FACTOR_Q16, pSNR_dB_Q7 ) - VAD_NEGATIVE_OFFSET_Q5 );

/* Power scaling */
// step2: update speech probability based on speech energy : comment by me
if( speech_nrg <= 0 ) {
    SA_Q15 = silk_RSHIFT( SA_Q15, 1 );
} else if( speech_nrg < 32768 ) {
    if( psEncC->frame_length == 10 * psEncC->fs_kHz ) {
        // Energy is doubled here : comment by me
        speech_nrg = silk_LSHIFT_SAT32( speech_nrg, 16 );
    } else {
        speech_nrg = silk_LSHIFT_SAT32( speech_nrg, 15 );
    }

    /* square-root */
    speech_nrg = silk_SQRT_APPROX( speech_nrg );
    SA_Q15 = silk_SMULWB( 32768 + speech_nrg, SA_Q15 );
}

/* Smoothing coefficient */
// step3: Update the smoothing factor based on speech probability : comment by me
smooth_coef_Q16 = silk_SMULWB( VAD_SNR_SMOOTH_COEF_Q18, silk_SMULWB( (opus_int32)SA_Q15, SA_Q15 ) );

if( psEncC->frame_length == 10 * psEncC->fs_kHz ) {
    smooth_coef_Q16 >>= 1;
}

Here, in step 1, the speech probability is calculated; its value is expected to
lie within [0, 1) in Q15 format. In step 2, the probability is updated based on
the speech energy level, and the result should also lie within [0, 1). In step 3,
the smoothing coefficient is calculated. This code has no issue when the frame
size is 20 ms or more. But if the frame size is 10 ms, the energy is doubled in
step 2 (possibly because the original SILK code was written for 20 ms frames, and
doubling converts the 10 ms energy to its 20 ms equivalent). When this is done,
the probability updated in step 2 can become greater than 1. When that value is
used in the multiplication in step 3, it is treated as a negative number, because
silk_SMULWB is a 32x16 multiplication that interprets its second operand as a
signed 16-bit value. This results in a negative smoothing coefficient. Please let
me know if this is a bug.
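
To make the wrap-around concrete, here is a minimal standalone sketch of steps 2
and 3. It is not the Opus source: smulwb() is a stand-in that mirrors the 32x16
behaviour of silk_SMULWB, isqrt32() is an exact integer square root used in place
of silk_SQRT_APPROX, and the value 4096 for VAD_SNR_SMOOTH_COEF_Q18 is assumed
here for illustration. All function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for silk_SMULWB: (a * (int16)b) >> 16, rounding toward -infinity.
 * Only the low 16 bits of b are used, reinterpreted as a signed value.
 * (The narrowing cast wraps on common compilers; that is assumed here.) */
static int32_t smulwb(int32_t a, int32_t b)
{
    int64_t prod = (int64_t)a * (int16_t)b;
    int64_t q = prod / 65536;          /* truncates toward zero */
    if (prod % 65536 < 0) {
        q -= 1;                        /* adjust to floor for negative products */
    }
    return (int32_t)q;
}

/* Exact integer square root, standing in for silk_SQRT_APPROX. */
static int32_t isqrt32(int32_t x)
{
    int32_t r = 0;
    for (int32_t bit = 1 << 15; bit > 0; bit >>= 1) {
        int64_t t = (int64_t)r + bit;
        if (t * t <= x) {
            r += bit;
        }
    }
    return r;
}

/* Step 2, middle branch only: requires 0 < speech_nrg < 32768. */
static int32_t power_scale(int32_t sa_q15, int32_t speech_nrg, int is_10ms)
{
    assert(speech_nrg > 0 && speech_nrg < 32768);
    /* Left shift by 16 for 10 ms frames, by 15 otherwise; neither overflows
     * int32 for inputs below 32768, so no saturation is needed here. */
    int32_t scaled = speech_nrg * (is_10ms ? 65536 : 32768);
    return smulwb(32768 + isqrt32(scaled), sa_q15);
}

/* Step 3, before the extra >>= 1 applied to 10 ms frames. */
static int32_t smooth_coef_q16(int32_t sa_q15)
{
    const int32_t smooth_coef_q18 = 4096;  /* assumed VAD_SNR_SMOOTH_COEF_Q18 */
    return smulwb(smooth_coef_q18, smulwb(sa_q15, sa_q15));
}
```

Calling power_scale(32767, 32767, 1) models a 10 ms frame with a probability and
energy near the top of their ranges: the result exceeds 32767 (roughly 1.2 in
Q15), and passing it to smooth_coef_q16() gives a negative value. With the last
argument set to 0 (20 ms), the result stays within Q15 range and the coefficient
stays positive.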


Thank you, 
Chandrakala 

