thr3ads.net - search: "speech

Reg an issue with smoothing factor in VAD implementation

2017 Nov 20

4

...*********/ /* Speech Probability Estimation */ /*********************************/ SA_Q15 = silk_sigm_Q15( silk_SMULWB( VAD_SNR_FACTOR_Q16, pSNR_dB_Q7 ) - VAD_NEGATIVE_OFFSET_Q5 ); // step1: Calculate speech probability : comment by me /* Power scaling */ if( speech_nrg <= 0 ) { // step2: update speech probability based on speech energy : comment by me SA_Q15 = silk_RSHIFT( SA_Q15, 1 ); } else if( speech_nrg < 32768 ) { if( psEncC->frame_length == 10 * psEncC->fs_kHz ) { speech_nrg = silk_LSHIFT_SAT32( speech_nrg, 16 ); // Ener...

Reg an issue with smoothing factor in VAD implementation

2018 Feb 16

1

...; where the 16-bit value is stored in a 32-bit int without sign-extension e.g. 0x00008b3b), but that doesn't seem to be the case here. The problem is even worse when opus_int is defined as 16-bit in the platform - SA_Q15 overflows to negative right here with a similar effect speech_nrg = silk_SQRT_APPROX(speech_nrg); SA_Q15 = silk_SMULWB(32768 + speech_nrg, SA_Q15); I can't speak for the logic where the speech energy gets doubled. It obviously seems intentional but I don't know why. Maybe so that smoothing is performed at a constant rate regardle...

Reg an issue with smoothing factor in VAD implementation

2017 Nov 27

3

...about. /*********************************/ /* Speech Probability Estimation */ /*********************************/ SA_Q15 = silk_sigm_Q15( silk_SMULWB( VAD_SNR_FACTOR_Q16, pSNR_dB_Q7 ) - VAD_NEGATIVE_OFFSET_Q5 ); // step1: Calculate speech probability : comment by me /* Power scaling */ if( speech_nrg <= 0 ) { // step2: update speech probability based on speech energy : comment by me SA_Q15 = silk_RSHIFT( SA_Q15, 1 ); } else if( speech_nrg < 32768 ) { if( psEncC->frame_length == 10 * psEncC->fs_kHz ) { speech_nrg = silk_LSHIFT_SAT32( speech_nrg, 16 ); // Energy is doubled here :...

Reg an issue with smoothing factor in VAD implementation

2017 Nov 20

0

...about. /*********************************/ /* Speech Probability Estimation */ /*********************************/ SA_Q15 = silk_sigm_Q15( silk_SMULWB( VAD_SNR_FACTOR_Q16, pSNR_dB_Q7 ) - VAD_NEGATIVE_OFFSET_Q5 ); // step1: Calculate speech probability : comment by me /* Power scaling */ if( speech_nrg <= 0 ) { // step2: update speech probability based on speech energy : comment by me SA_Q15 = silk_RSHIFT( SA_Q15, 1 ); } else if( speech_nrg < 32768 ) { if( psEncC->frame_length == 10 * psEncC->fs_kHz ) { speech_nrg = silk_LSHIFT_SAT32( speech_nrg, 16 ); // Energy is doubled here :...

Reg an issue with smoothing factor in VAD implementation

2017 Nov 27

0

...he sign bit (in cases where the 16-bit value is stored in a 32-bit int without sign-extension e.g. 0x00008b3b), but that doesn't seem to be the case here. The problem is even worse when opus_int is defined as 16-bit in the platform - SA_Q15 overflows to negative right here with a similar effect speech_nrg = silk_SQRT_APPROX(speech_nrg); SA_Q15 = silk_SMULWB(32768 + speech_nrg, SA_Q15); I can't speak for the logic where the speech energy gets doubled. It obviously seems intentional but I don't know why. Maybe so that smoothing is performed at a constant rate regardless of whether framesize i...
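
The sign-bit case mentioned in the snippet above (a 16-bit value such as 0x00008b3b held in a 32-bit int without sign-extension) can be sketched in C. This is a minimal illustration, not the Opus source: `smulwb_sketch` and `smulwb_selfcheck` are hypothetical names, and the sketch assumes that silk_SMULWB multiplies its first 32-bit operand by the low 16 bits of the second operand read as a signed value, then discards the low 16 bits of the product.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for silk_SMULWB (assumed semantics, see lead-in):
 * result = (a32 * signed_low16(b32)) >> 16, rounded toward minus infinity. */
static int32_t smulwb_sketch(int32_t a32, int32_t b32)
{
    /* Take the low 16 bits and reinterpret them as a signed 16-bit value
     * without relying on implementation-defined narrowing casts. */
    int32_t low = b32 & 0xFFFF;
    if (low >= 0x8000) {
        low -= 0x10000;
    }

    int64_t product = (int64_t)a32 * low;

    /* Floor division by 65536, written out so it does not depend on how
     * the compiler shifts negative numbers. */
    if (product >= 0) {
        return (int32_t)(product / 65536);
    }
    return (int32_t)(-((-product + 65535) / 65536));
}

/* Small self-check showing when the second operand flips sign. */
static void smulwb_selfcheck(void)
{
    /* 0x4000 fits in 15 bits, so it stays positive. */
    assert(smulwb_sketch(65536, 0x4000) == 16384);

    /* 0x00008b3b has bit 15 set, so its low 16 bits read as -29893. */
    assert(smulwb_sketch(65536, 0x00008b3b) == -29893);

    /* 32768 (one past the largest Q15 value) reads as -32768. */
    assert(smulwb_sketch(65536, 32768) == -32768);
}
```

Under these assumptions, any second operand with bit 15 set turns a scaling step into a sign flip, which is the kind of effect the poster describes before noting that it does not seem to apply to this particular code path.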

Reg an issue with smoothing factor in VAD implementation

2017 Nov 22

0

...obability Estimation */ /*********************************/ SA_Q15 = silk_sigm_Q15( silk_SMULWB( VAD_SNR_FACTOR_Q16, pSNR_dB_Q7 ) - VAD_NEGATIVE_OFFSET_Q5 ); // step1: Calculate speech probability : comment by me /* Power scaling */ if( speech_nrg <= 0 ) { // step2: update speech probability based on speech energy : comment by me SA_Q15 = silk_RSHIFT( SA_Q15, 1 ); } else if( speech_nrg < 32768 ) { if( psEncC->frame_length == 10 * psEncC->fs_kHz ) { speech_nrg = silk_LSHIFT_SAT32( speec...