This must be a simple issue, but I cannot figure it out.

I want to use VAD, but I don't know how to check whether the current frame has voice in it or not.

So, in my code, I do:

    int tmp = 1;
    speex_preprocess_ctl(preprocess_state, SPEEX_PREPROCESS_SET_VAD, &tmp);
    speex_preprocess_ctl(preprocess_state, SPEEX_PREPROCESS_SET_DENOISE, &tmp);

then later, for each frame:

    speex_preprocess_run(preprocess_state, shortPointer);

But how do I know if the frame contained voice? I tried

    if (preprocess_state->was_voice == 1)
    {
        ...Do voice present code...
    }

But the compiler complains that was_voice is not defined, which, I assume, comes from the fact that preprocess_state is declared in speex_preprocess.h as struct SpeexPreprocessState_;

How do I check the preprocessor for the presence of voice in a frame?

Thanks,

Evgueni
Just use the return value of speex_preprocess_run()

Cheers,

Jean-Marc
Hey, sorry to hijack this thread, but I just remembered a request I wanted to make to the Speex devs.

I tried using the activity detector, but I just couldn't get it working well. I ended up using my own, which I think just considered voice to be on if the level passed a certain threshold (I know, pretty primitive). I also tried one that checked for a signal, for example whether the strongest frequency was above a threshold. I don't remember what function it was, but it was very simple, not an FFT, more like an autocorrelation or something, and it didn't work any better than loudness detection. So I would like to use Speex's.

Anyway, my request is: can you build a pre and post buffer into the VAD? In mine, if I detect voice any time between now and, say, a quarter second later, I start sending, and then I keep sending for a half second or so after I stop detecting. You pretty much have to have this, or people start getting anxious talking over an internet stream. They have to enunciate expressions like "ya probably" because the "ya" isn't detected, only the "probably". Sending a bit of padding around the detection also prevents the detector from dropping out mid-sentence. It takes it from being a screaming contest over a walkie-talkie to a normal telephone conversation.

You might be reluctant to do this, because you have to add some state information instead of just focusing on the current buffer, but the quality improvement is enormous. I'd just like to be able to pass a pre and post value to the VAD in milliseconds, defaulting either to 0 or to values similar to what I quoted above. And I realize this can add some delay, but even detecting a single extra syllable makes a world of difference.

Well, thanks for your time,

--Zack
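For readers who want this padding today, it can be layered on top of any per-frame voice decision in application code. The following is only a rough sketch of the idea described above, not part of Speex: the names VadGate, vad_gate_init, vad_gate_push and count_frame are hypothetical, and the frame size (160 samples, i.e. 20 ms at 8 kHz) and the padding lengths (about a quarter second before, half a second after) are assumptions chosen to match the figures in the message.

```c
#include <assert.h>
#include <string.h>

/* All names and constants below are illustrative assumptions, not Speex API. */
#define FRAME_SAMPLES 160  /* assumed: 20 ms frames at 8 kHz */
#define PRE_FRAMES    12   /* about 240 ms of audio kept from before detection */
#define POST_FRAMES   25   /* about 500 ms sent after the last voiced frame */

typedef void (*send_fn)(const short *frame, void *user);

typedef struct {
    short pre[PRE_FRAMES][FRAME_SAMPLES]; /* ring buffer of recent unsent frames */
    int pre_head;                         /* next slot to write */
    int pre_count;                        /* number of valid frames in the ring */
    int hang;                             /* post-padding frames still to send */
} VadGate;

void vad_gate_init(VadGate *g)
{
    memset(g, 0, sizeof *g);
}

/* Feed one frame plus its voice decision (1 = voice, 0 = no voice).
 * Calls send() for every frame that should be transmitted and returns
 * how many frames were sent during this call. */
int vad_gate_push(VadGate *g, const short *frame, int is_voice,
                  send_fn send, void *user)
{
    int sent = 0;
    assert(g && frame && send);

    if (is_voice) {
        /* Flush the buffered pre-padding, oldest frame first. */
        int start = (g->pre_head - g->pre_count + PRE_FRAMES) % PRE_FRAMES;
        for (int i = 0; i < g->pre_count; i++) {
            send(g->pre[(start + i) % PRE_FRAMES], user);
            sent++;
        }
        g->pre_count = 0;
        send(frame, user);
        sent++;
        g->hang = POST_FRAMES;            /* re-arm the post-padding */
    } else if (g->hang > 0) {
        send(frame, user);                /* still inside the post-padding */
        sent++;
        g->hang--;
    } else {
        /* Silence: remember the frame in case voice starts soon. */
        memcpy(g->pre[g->pre_head], frame, sizeof g->pre[0]);
        g->pre_head = (g->pre_head + 1) % PRE_FRAMES;
        if (g->pre_count < PRE_FRAMES)
            g->pre_count++;
    }
    return sent;
}

/* Example sender that only counts frames; a real one would encode and transmit. */
void count_frame(const short *frame, void *user)
{
    (void)frame;
    ++*(int *)user;
}
```

In use, is_voice would be the return value of speex_preprocess_run() for the frame, and the send callback would wrap speex_encode_int() and the network write. The buffered frames go out late, so this trades a little delay at the start of each utterance for not losing its first syllable.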
Thanks for your reply. I changed my code to:

    if (speex_preprocess_run(preprocess_state, shortPointer) == 1)
    {
        speex_encode_int(enc_state, shortPointer, &enc_bits);
    }

In the mobile version of the software, compiled against the mobile build of Speex, I get 1 and 0 depending on whether speech is detected. In the version of the software compiled against the Win32 build of Speex, speex_preprocess_run always returns 0. If I remove the if statement, then I get voice transmission, so the mic is working.

Is it something to do with VAD sensitivity?

Thanks,

Evgueni Tsygankov
www.sqlanswers.com

-----Original Message-----
From: Jean-Marc Valin [mailto:jean-marc.valin@usherbrooke.ca]
Sent: Friday, February 15, 2008 6:16 AM
To: Evgueni Tsygankov
Cc: speex-dev@xiph.org
Subject: Re: [Speex-dev] Voice activity detection

Just use the return value of speex_preprocess_run()

Cheers,

Jean-Marc