thr3ads.net - Speex dev - [Speex-dev] Speech detection in preprocessor with echo [Jun 2005]

Tom Grandgent

2005-Jun-20 11:40 UTC

[Speex-dev] Speech detection in preprocessor with echo

I think you'll have to modify Speex to get the functionality you're 
looking for.  I've made a few simple modifications to the AGC so that I 
can 1) cap the amplification at a specified maximum and 2) enable and 
disable adaptation, which lets me freeze the gain at a certain level 
while speech is not detected.  It's mostly just a matter of doing this 
at the end of speex_compute_agc():

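   if (!st->agc_frozen)
   {
      /* Adapting: derive the gain from the target level and the loudness estimate */
      agc_gain = st->agc_level/st->loudness2;
      /*fprintf (stderr, "%f %f %f %f\n", active_bands, st->loudness, st->loudness2, agc_gain);*/
      /* Cap the amplification at the configurable maximum */
      if (agc_gain>st->agc_max_gain)   /* was 200 */
         agc_gain = st->agc_max_gain;  /* was 200 */
   }
   else
   {
      /* Frozen: keep the gain from the previous frame */
      agc_gain = st->agc_gain;
   }
   st->agc_gain = agc_gain;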

and adding a few items to speex_preprocess_ctl() and the state struct.  
(I control these things at the application level; you may wish to 
control them from within the preprocessor if you're using the 
preprocessor's VAD.)
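
To make the moving parts concrete, here is a minimal, self-contained sketch of the change described above. It is not the actual patch: the struct is a cut-down stand-in for the preprocessor state, and the request codes and function names (SKETCH_SET_AGC_FROZEN, SKETCH_SET_AGC_MAX_GAIN, agc_sketch_ctl, agc_sketch_update_gain) are hypothetical, made up for illustration only. The real request numbers added to speex_preprocess_ctl() are not shown in this thread.

```c
/* Hypothetical, cut-down stand-in for the preprocessor state.
   agc_frozen and agc_max_gain are the two fields added by the modification. */
typedef struct {
    float agc_level;     /* target output level */
    float loudness2;     /* running loudness estimate */
    float agc_gain;      /* gain applied to the current frame */
    float agc_max_gain;  /* upper bound on the gain (replaces a fixed 200) */
    int   agc_frozen;    /* non-zero: hold the gain, do not adapt */
} AgcSketch;

/* Hypothetical request codes; not the real speex_preprocess_ctl() values. */
enum { SKETCH_SET_AGC_FROZEN = 1, SKETCH_SET_AGC_MAX_GAIN = 2 };

/* ctl-style setter: returns 0 on success, -1 for an unknown request. */
static int agc_sketch_ctl(AgcSketch *st, int request, void *ptr)
{
    switch (request) {
    case SKETCH_SET_AGC_FROZEN:
        st->agc_frozen = *(int *)ptr;
        return 0;
    case SKETCH_SET_AGC_MAX_GAIN:
        st->agc_max_gain = *(float *)ptr;
        return 0;
    default:
        return -1;
    }
}

/* Same logic as the snippet above: adapt and clamp, or hold when frozen. */
static float agc_sketch_update_gain(AgcSketch *st)
{
    float agc_gain;
    if (!st->agc_frozen) {
        agc_gain = st->agc_level / st->loudness2;
        if (agc_gain > st->agc_max_gain)
            agc_gain = st->agc_max_gain;
    } else {
        agc_gain = st->agc_gain;
    }
    st->agc_gain = agc_gain;
    return agc_gain;
}
```

An application that runs its own speech detection would set the frozen flag to 1 while no speech is detected and back to 0 once speech returns, so the gain cannot creep upward on residual echo in between.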

Anyway, if you can figure out what's going on with the variables you 
named, I'm sure you can make the necessary modifications to do what 
you've asked for.  I think the preprocessor in general needs a little 
tweaking like this to work well in various real-world situations, but 
I'm not sure how much of this Jean-Marc wants to incorporate into 
Speex vs. leave to application developers.

Tom

Thorvald Natvig <speex@natvig.com> wrote:
> 
> Echo cancellation works like a charm, but it seems to confuse the 
> preprocessor a bit.
> 
> If listening to background music (properly fed through the echo 
> canceller), the music is removed but the result is still detected as 
> speech even if almost silence remains in the signal.
> 
> Also, the AGC keeps adjusting to the minute remains in the signal, meaning 
> that sooner or later it will amplify the remains enough that they are 
> clearly audible on the other side. If I cough or say a word, the AGC 
> readjusts and all is fine.
> 
> Looking at the members of the speex_preprocess structure, I see that 
> during these long periods of "silence" (only the background music or
> only the other end talking while I shut up):
> 
> - Zlast (which looks like an SNR variable) is at 0.05-0.2, but jumps up
>    above 1.0 if I actually say something.
> - loudness2 keeps decreasing from the "normal" of ~6000 to 1000 or so, at
>    which point the residual echo is amplified enough that it's clearly
>    audible at the other end. If I say something, it adjusts.
> - speech_prob is at 0.999 or 1.000 as long as the other end talks.
> - speech_prob is at 0.999 or 1.000 as long as the other end talks.
> 
> This is all with an up-to-date SVN version of Speex, and in a fairly noisy 
> environment (it's hot, so I have the window open, so passing cars on the
> nearby road are quite audible, as is my air cleaner).
> 
> Is there something I can do to tune this away, a way to tell the AGC to 
> never go that low, and a way to tell the speech detector that echo remains 
> are not speech?
> 
> _______________________________________________
> Speex-dev mailing list
> Speex-dev@xiph.org
> http://lists.xiph.org/mailman/listinfo/speex-dev

Jean-Marc Valin

2005-Jun-22 01:25 UTC

[Speex-dev] Speech detection in preprocessor with echo

Just curious, why are you freezing agc_gain instead of freezing
st->loudness2?

Jean-Marc
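
For readers weighing the two options in this question, here is a minimal sketch of how they could differ. It rests on assumptions that the thread does not confirm: that loudness2 is a running estimate updated every frame, and that it follows the simple smoothing rule in track_loudness(). That rule and all names below are hypothetical and are not taken from Speex.

```c
/* Hypothetical, cut-down state: just the three values the comparison needs. */
typedef struct {
    float agc_level;  /* target output level */
    float loudness2;  /* running loudness estimate */
    float agc_gain;   /* gain applied to the current frame */
} FreezeSketch;

/* Assumed tracker (not the Speex update rule): move the estimate halfway
   toward the loudness of the current frame. */
static void track_loudness(FreezeSketch *st, float frame_loudness)
{
    st->loudness2 += 0.5f * (frame_loudness - st->loudness2);
}

/* Variant A: freeze only the gain. The estimate keeps adapting while frozen,
   so the first unfrozen frame computes its gain from a drifted estimate. */
static void frame_freeze_gain(FreezeSketch *st, float frame_loudness, int frozen)
{
    track_loudness(st, frame_loudness);
    if (!frozen)
        st->agc_gain = st->agc_level / st->loudness2;
}

/* Variant B: freeze the estimate. The gain derived from it stays put as a
   side effect, and adaptation resumes from the pre-freeze estimate. */
static void frame_freeze_loudness(FreezeSketch *st, float frame_loudness, int frozen)
{
    if (!frozen)
        track_loudness(st, frame_loudness);
    st->agc_gain = st->agc_level / st->loudness2;
}
```

Under these assumptions both variants hold the gain steady while frozen. They differ in what happens on the first frame after unfreezing: variant A starts from an estimate that has followed the residual signal, while variant B starts from the last estimate taken during speech.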


On Monday, 20 June 2005 at 14:40 -0400, Tom Grandgent wrote:
> I think you'll have to modify Speex to get the functionality you're
> looking for.  I've made a few simple modifications to the AGC so that I
> can 1) cap the amplification at a specified maximum and 2) enable and
> disable adaptation, which lets me freeze the gain at a certain level
> while speech is not detected.  It's mostly just a matter of doing this
> at the end of speex_compute_agc():
> 
>    if (!st->agc_frozen)
>    {
>       agc_gain = st->agc_level/st->loudness2;
>       /*fprintf (stderr, "%f %f %f %f\n", active_bands, st->loudness, st->loudness2, agc_gain);*/
>       if (agc_gain>st->agc_max_gain)   /* was 200 */
>          agc_gain = st->agc_max_gain;  /* was 200 */
>    }
>    else
>       agc_gain = st->agc_gain;
>    st->agc_gain = agc_gain;
> 
> and adding a few items to speex_preprocess_ctl() and the state struct.  
> (I control these things at the application level; you may wish to 
> control them from within the preprocessor if you're using the 
> preprocessor's VAD.)
> 
> Anyway, if you can figure out what's going on with the variables you 
> named, I'm sure you can make the necessary modifications to do what 
> you've asked for.  I think the preprocessor in general needs a little 
> tweaking like this to work well in various real-world situations, but 
> I'm not sure how much of this Jean-Marc wants to incorporate into 
> Speex vs. leave to application developers.
> 
> Tom
> 
> Thorvald Natvig <speex@natvig.com> wrote:
> > 
> > 
> > Echo cancellation works like a charm, but it seems to confuse the 
> > preprocessor a bit.
> > 
> > If listening to background music (properly fed through the echo 
> > canceller), the music is removed but the result is still detected as
> > speech even if almost silence remains in the signal.
> > 
> > Also, the AGC keeps adjusting to the minute remains in the signal,
> > meaning that sooner or later it will amplify the remains enough that
> > they are clearly audible on the other side. If I cough or say a word,
> > the AGC readjusts and all is fine.
> > 
> > Looking at the members of the speex_preprocess structure, I see that 
> > during these long periods of "silence" (only the background music or
> > only the other end talking while I shut up):
> > 
> > - Zlast (which looks like an SNR variable) is at 0.05-0.2, but jumps up
> >    above 1.0 if I actually say something.
> > - loudness2 keeps decreasing from the "normal" of ~6000 to 1000 or so,
> >    at which point the residual echo is amplified enough that it's
> >    clearly audible at the other end. If I say something, it adjusts.
> > - speech_prob is at 0.999 or 1.000 as long as the other end talks.
> > 
> > This is all with an up-to-date SVN version of Speex, and in a fairly
> > noisy environment (it's hot, so I have the window open, so passing
> > cars on the nearby road are quite audible, as is my air cleaner).
> > 
> > Is there something I can do to tune this away, a way to tell the AGC
> > to never go that low, and a way to tell the speech detector that echo
> > remains are not speech?
