Thorvald Natvig
2005-Sep-18 19:23 UTC
[Speex-dev] Adjustable parameters for VAD in preprocessor
*Sigh* Some day I'll learn to set the right sender address when posting to membership-restricted mailing addresses. Was wondering why this hadn't arrived. Reposted message as follows:

Hi,

Would a patch to change the constants on lines 454-462 in the preprocessor into variables be of general interest? At the moment, whether or not "is_speech" is 1 is hardcoded: it is 1 if speech_prob > .35, or if speech_prob > .1 within the last 20 frames, or if fewer than 20 frames have passed without either.

I'd like to turn those 4 constants into speex_preprocess_ctl tunable variables, most likely through SPEEX_PREPROCESS_GET_VAD_PARAMETERS / SET_VAD_PARAMETERS, with ptr being a pointer to

    typedef struct {
        float trigger_level;
        float keep_level;
        int keep_window;
        int tail_length;
    } SpeexPreprocessVadParameters;

I'm a bit unsure whether keep_window (used with prob > .1) should be the same as tail_length. At the moment, I have an external counter in my program that keeps track of how many frames it's been since the VAD returned 1, and if it's less than a user-set amount, the frame is treated as if it's still voice. This works well for users who have slightly long pauses between sentences or words.
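To make the proposal concrete, here is a minimal sketch of the decision rule as described above, with the four constants pulled out into the proposed struct. It is reconstructed from the prose description only, not copied from the Speex source. `VadState`, `vad_state_init`, `vad_decide` and `vad_defaults` are hypothetical names for illustration, and the exact off-by-one behaviour at the window boundaries is an assumption.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Parameter struct as proposed in the message. */
typedef struct {
    float trigger_level; /* probability that starts speech (hardcoded .35 today) */
    float keep_level;    /* lower probability that sustains speech (hardcoded .1) */
    int   keep_window;   /* frames during which keep_level applies (hardcoded 20) */
    int   tail_length;   /* frames still reported as speech afterwards (hardcoded 20) */
} SpeexPreprocessVadParameters;

/* Hypothetical per-stream state: frames since a threshold was last passed. */
typedef struct {
    int last_speech;
} VadState;

/* Defaults matching the constants quoted in the message. */
static const SpeexPreprocessVadParameters vad_defaults = { 0.35f, 0.1f, 20, 20 };

/* Start "far away" from speech so the keep/tail rules do not fire at startup. */
static void vad_state_init(VadState *st)
{
    assert(st != NULL);
    st->last_speech = INT_MAX;
}

/* Returns 1 if the frame should be treated as speech, 0 otherwise. */
static int vad_decide(VadState *st, const SpeexPreprocessVadParameters *p,
                      float speech_prob)
{
    assert(st != NULL && p != NULL);

    /* Strong evidence, or weaker evidence shortly after the last hit. */
    if (speech_prob > p->trigger_level ||
        (st->last_speech < p->keep_window && speech_prob > p->keep_level)) {
        st->last_speech = 0;
        return 1;
    }

    /* No evidence this frame: count it, saturating to avoid overflow. */
    if (st->last_speech < INT_MAX)
        st->last_speech++;

    /* Tail: keep reporting speech for a while after the last hit. */
    return st->last_speech < p->tail_length;
}
```

With this shape, the external counter mentioned above corresponds to raising `tail_length`, while `keep_window` only controls how long the lower `keep_level` threshold stays active.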
Jean-Marc Valin
2005-Sep-24 03:18 UTC
[Speex-dev] Adjustable parameters for VAD in preprocessor
Hi,

If you can come up with a clean (non-intrusive) patch, I'm willing to apply it. Ideally, it should go through int parameters to speex_preprocess_ctl(). I think someone sent a similar patch to the list a while ago, but I didn't have time to do the little cleanup it needed.

Jean-Marc

-- 
Jean-Marc Valin <Jean-Marc.Valin@USherbrooke.ca>
Université de Sherbrooke
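One possible reading of "int parameters" is one request code per tunable, each taking a pointer to a single int, rather than one request taking a struct. The sketch below illustrates that shape only. It is not Speex code: `vad_ctl`, `VadTunables` and every `VAD_*` request code are hypothetical, the (state, request, void *ptr) form is assumed to mirror speex_preprocess_ctl(), and expressing the probability levels as integer percentages is an assumption made to keep everything an int.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request codes; not taken from the Speex headers. */
enum {
    VAD_SET_TRIGGER_LEVEL = 1, /* int, percent */
    VAD_GET_TRIGGER_LEVEL,
    VAD_SET_KEEP_LEVEL,        /* int, percent */
    VAD_GET_KEEP_LEVEL,
    VAD_SET_KEEP_WINDOW,       /* int, frames */
    VAD_GET_KEEP_WINDOW,
    VAD_SET_TAIL_LENGTH,       /* int, frames */
    VAD_GET_TAIL_LENGTH
};

/* Hypothetical storage for the four tunables, all as ints. */
typedef struct {
    int trigger_percent;
    int keep_percent;
    int keep_window;
    int tail_length;
} VadTunables;

/* Returns 0 on success, -1 on an unknown request or a NULL value pointer. */
static int vad_ctl(VadTunables *t, int request, void *ptr)
{
    int *v = (int *)ptr;

    assert(t != NULL);
    if (v == NULL)
        return -1;

    switch (request) {
    case VAD_SET_TRIGGER_LEVEL: t->trigger_percent = *v; break;
    case VAD_GET_TRIGGER_LEVEL: *v = t->trigger_percent; break;
    case VAD_SET_KEEP_LEVEL:    t->keep_percent = *v;    break;
    case VAD_GET_KEEP_LEVEL:    *v = t->keep_percent;    break;
    case VAD_SET_KEEP_WINDOW:   t->keep_window = *v;     break;
    case VAD_GET_KEEP_WINDOW:   *v = t->keep_window;     break;
    case VAD_SET_TAIL_LENGTH:   t->tail_length = *v;     break;
    case VAD_GET_TAIL_LENGTH:   *v = t->tail_length;     break;
    default:
        return -1;
    }
    return 0;
}
```

Splitting the struct into separate int requests keeps the ctl interface free of a new public type, at the cost of four set/get pairs instead of one.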