From 1.2rc1 source code:

preprocess.c:

   /* FIXME: This VAD is a kludge */
   st->speech_prob = Pframe;
   if (st->vad_enabled)
   {
      if (st->speech_prob > st->speech_prob_start || (st->was_speech && st->speech_prob > st->speech_prob_continue))
      {
         st->was_speech=1;
         return 1;
      } else
      {
         st->was_speech=0;
         return 0;
      }
   } else {
      return 1;
   }

AND

   case SPEEX_PREPROCESS_SET_VAD:
      speex_warning("The VAD has been replaced by a hack pending a complete rewrite");
      st->vad_enabled = (*(spx_int32_t*)ptr);
      break;

As you can see, it is a hack and is not supposed to be very good. Still, it is weird that you get this result. If you want a sample to test it, check the speexenc.c file in the Speex source code package. I usually implement this kind of feature outside of Speex because I want better control over how it happens and when to do something about it :)

Yanick Bourbeau

On 11-08-29 04:16 PM, Clifton Craig wrote:
> +1 on this question as I wanted to start playing with it as well.
>
> On Aug 29, 2011, at 9:41 AM, Shridhar, Vasant wrote:
>
>> I have been trying to understand how to get the VAD algorithm
>> working. I sent an input stream of all zeros into the preprocessor
>> but still got a return value of 1 indicating that speech was
>> detected. Is this feature not available with the latest release? I
>> thought at the very least it would detect this as silence and return
>> 0 but that does not seem to be the case.
>> Does anyone have any information on how to use this or some example
>> code to set this up I might try.
>> Thanks,
>> Vas
>> _______________________________________________
>> Speex-dev mailing list
>> Speex-dev at xiph.org
>> http://lists.xiph.org/mailman/listinfo/speex-dev
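The decision in the quoted preprocess.c snippet is a two-threshold hysteresis: a frame counts as speech if its probability exceeds the "start" threshold, or if the previous frame was speech and the probability still exceeds the lower "continue" threshold. The following is a minimal standalone sketch of that logic only, not the Speex implementation. `VadState` and `vad_decide` are hypothetical names, the struct is a stand-in for the handful of state fields used above, and how `Pframe` is computed is not covered.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the state fields used by the quoted snippet. */
typedef struct {
    int   vad_enabled;          /* 0 = VAD off, every frame reported as speech */
    int   was_speech;           /* decision for the previous frame */
    float speech_prob_start;    /* threshold to enter the speech state */
    float speech_prob_continue; /* lower threshold to stay in the speech state */
} VadState;

/* Returns 1 for speech, 0 for non-speech, mirroring the quoted control flow. */
static int vad_decide(VadState *st, float speech_prob)
{
    assert(st != NULL);

    /* With the VAD disabled the snippet always returns 1. */
    if (!st->vad_enabled)
        return 1;

    if (speech_prob > st->speech_prob_start ||
        (st->was_speech && speech_prob > st->speech_prob_continue)) {
        st->was_speech = 1;
        return 1;
    }

    st->was_speech = 0;
    return 0;
}
```

With illustrative thresholds of 0.35 (start) and 0.20 (continue), the probability sequence 0.10, 0.40, 0.25, 0.15, 0.25 yields 0, 1, 1, 0, 0: the third frame stays in the speech state because of the lower "continue" threshold, while the last frame does not re-enter it. Note also that a disabled VAD reports 1 for every frame, so a constant return value of 1 is what this code path produces when `vad_enabled` is 0.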
I am using speexenc as my testbed app. I have also looked at this code, but I am uncertain why an all-zero input would pass this check. I am not sure what statistics it is looking for. Do you have information you can share on how you normally perform this test?

Vas

________________________________________
From: speex-dev-bounces at xiph.org [speex-dev-bounces at xiph.org] On Behalf Of Yanick Bourbeau [ybourbeau at mrgtech.ca]
Sent: Monday, August 29, 2011 4:36 PM
To: speex-dev at xiph.org
Subject: Re: [Speex-dev] Speex VAD always returning 1
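As a generic illustration of doing this kind of check outside Speex (this is not Yanick's method, which the thread does not describe), a simple energy threshold is enough to classify an all-zero frame as silence. The sketch below assumes 16-bit PCM frames. `energy_vad` and its `ms_threshold` parameter are hypothetical names, and the threshold would need tuning for real input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns 1 if the mean-square energy of the frame exceeds ms_threshold,
 * 0 otherwise. An all-zero frame has zero energy and is reported as silence
 * for any non-negative threshold.
 */
static int energy_vad(const int16_t *frame, size_t n, double ms_threshold)
{
    double sum = 0.0;
    size_t i;

    assert(frame != NULL && n > 0);

    for (i = 0; i < n; i++) {
        double s = (double)frame[i];
        sum += s * s;   /* accumulate squared sample values */
    }

    return (sum / (double)n) > ms_threshold;
}
```

A plain energy gate like this cannot tell speech from loud noise; it only gives direct control over what counts as silence, which is the case raised in the question.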
For what it's worth, I've had significantly better luck with the WebRTC VAD. It's pretty good.

http://code.google.com/p/webrtc/source/browse/#svn%2Ftrunk%2Fsrc%2Fcommon_audio%2Fvad%2Fmain%2Fsource

In general, the WebRTC voice engine seems to be more sophisticated and mature than the Speex preprocessor (as opposed to the Speex codec, which is pretty good).

Ken Smith
Cell: 425-443-2359
Email: ken at alanta.com
Blog: http://blog.wouldbetheologian.com/
This works well. Thanks for the link. However, I am curious about something. Speex requires a VAD for VBR mode and for comfort noise preservation. Is the VAD that Speex uses for these different from the VAD result returned by the preprocessor call?

Vas

________________________________
From: speex-dev-bounces at xiph.org [speex-dev-bounces at xiph.org] On Behalf Of Ken Smith [ken at alanta.com]
Sent: Monday, August 29, 2011 6:38 PM
To: speex-dev at xiph.org
Subject: Re: [Speex-dev] Speex VAD always returning 1