thr3ads.net - Speex dev - [Speex-dev] Speech switching in speakerphone? [Jun 2009]


Johan Nilsson

2009-Jun-22 07:02 UTC

[Speex-dev] Speech switching in speakerphone?

Hi Jean-Marc
> Can you explain what you mean here by "speech switching" 
By speech switching I mean the adaptation of "gain2" when the near end or
the far end is talking. What matters is that the timing is good and that the
gain is low while the far end is talking and high while the near end is
talking. By timing I mean that "gain2" should remain low until all far-end
talk has finished, and that the gain should rise quickly when the near end
starts talking.
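
The timing requirement described above can be sketched as a small per-frame gain controller. This is only an illustration of the desired behaviour, not how Speex computes "gain2": the struct, the function name, the attack/release constants and the `far_active`/`near_active` flags (assumed to come from some external detector) are all hypothetical.

```c
#include <assert.h>

/* Hypothetical state for a switched-gain controller; none of these
 * names come from Speex. */
typedef struct {
    float gain;         /* current gain, 0..1 */
    int   hold_frames;  /* frames left before the gain may rise again */
} SwitchState;

/* Advance the controller by one frame and return the new gain. */
float switch_gain_update(SwitchState *s, int far_active, int near_active,
                         float low_gain, int hangover_frames)
{
    assert(s != 0 && low_gain >= 0.0f && low_gain <= 1.0f);

    if (far_active && !near_active) {
        /* Far-end only: drop to the low gain and re-arm the hangover. */
        s->gain = low_gain;
        s->hold_frames = hangover_frames;
    } else if (near_active) {
        /* Near-end talk: cancel the hangover and open quickly. */
        s->hold_frames = 0;
        s->gain += 0.5f * (1.0f - s->gain);   /* fast attack (assumed value) */
    } else if (s->hold_frames > 0) {
        /* Silence right after far-end talk: stay low while the echo decays. */
        s->hold_frames--;
    } else {
        /* Idle: relax slowly toward unity gain (assumed value). */
        s->gain += 0.05f * (1.0f - s->gain);
    }
    return s->gain;
}
```

With `low_gain = 0.1` and a three-frame hangover, the gain stays at 0.1 through the frames that follow far-end talk and jumps to 0.55 on the first frame in which the near end is flagged as active. The hard part in practice, as the rest of this thread discusses, is obtaining a reliable `near_active` decision.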
> There's also a parameter to control the maximum amount of suppression
> allowed:
> SPEEX_PREPROCESS_SET_NOISE_SUPPRESS : noise suppression
> SPEEX_PREPROCESS_SET_ECHO_SUPPRESS : echo suppression when there is no
> local talk
> SPEEX_PREPROCESS_SET_ECHO_SUPPRESS_ACTIVE: echo suppression in double-talk
Yes, I am aware of these parameters and am familiar with how they affect the
gain. However, they do not affect the timing of speech switching very much.

The important parameter for speech switching is Pframe. Pframe is, as you
know, based on the SNR estimation. However, when the near-end signal is low
compared to the far-end signal (coming from the nearby loudspeaker element),
the SNR does not increase distinctly when the near end talks.
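
A small worked example shows why the SNR barely moves in this situation. It assumes the near-end speech and the echo are uncorrelated, so that their powers add; the function name is hypothetical and is not part of Speex.

```c
#include <assert.h>

/* Ratio of observed power with near-end talk to observed power without it,
 * when near-end speech of power near_pow is added on top of echo plus
 * noise of power echo_pow. Assumes the two are uncorrelated. */
double observed_power_ratio(double near_pow, double echo_pow)
{
    assert(near_pow >= 0.0 && echo_pow > 0.0);
    return (echo_pow + near_pow) / echo_pow;
}
```

If the near end is 10 dB below the echo (`near_pow = 0.1`, `echo_pow = 1.0`), the ratio is 1.1, which is a rise of only about 0.4 dB. With equal powers the ratio is 2, about 3 dB. A detector driven by this quantity therefore sees very little change when a quiet near-end talker starts speaking over a loud echo.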
> Can you explain what ... problem you've encountered? 
Our main problem is that we cannot rely on "gain2" being high when the
near end is talking, so conversation is lost in one direction. Some
improvement can be made by modifying the Qcurve function, but it is very
sensitive.

A secondary problem is that the residual echo during the decay of far-end
talk is not suppressed very well. This is probably caused by the strong
echo coupling plus a fairly reverberant room. We have been able to solve this
by adding a weighting factor and some accumulation on the residual_echo and
echo_noise. This modification works perfectly for the far-end problem but
worsens the main problem even more.

Best Regards
Johan

Jean-Marc Valin

2009-Jun-23 11:30 UTC


[Speex-dev] Speech switching in speakerphone?

Johan Nilsson wrote:
>> There's also a parameter to control the maximum amount of
>> suppression allowed:
>> SPEEX_PREPROCESS_SET_NOISE_SUPPRESS : noise suppression
>> SPEEX_PREPROCESS_SET_ECHO_SUPPRESS : echo suppression when there is no
>> local talk
>> SPEEX_PREPROCESS_SET_ECHO_SUPPRESS_ACTIVE: echo suppression in
>> double-talk
> 
> Yes, I am aware of these parameters and are familiar with how they
> affect the gain. However they do not affect the timing of speech
> switching very much.
What happens if you make SPEEX_PREPROCESS_SET_ECHO_SUPPRESS_ACTIVE less
aggressive? Do you end up with too much echo, or does it just not realise
that it's in double-talk conditions?
> The important parameter for the speech switching is the Pframe.
> Pframe is as you know based on the SNR estimation. However when the
> near-end signal is low compared to the far-end signal (coming from
> the close speaker element) the SNR is not distinctly increased when
> near-end talks.
Yes, Pframe estimation is one of the main problems I was having and I'm
not entirely sure how to solve that. I suspect that the residual echo
estimation also doesn't help.
> Our main problem is that it is hard to have good reliance on a high
> "gain2" when near-end is talking, resulting in missing conversation
> in one direction. Some improvement can be made by modifying the
> Qcurve function but it is very sensitive.
> 
> A secondary problem we also have is that the residual echo during
> decay of far-end talk is not suppressed very well. This is probably
> caused by the strong echo coupling plus a fairly reverberant room. We
> have been able to solve this by adding a weighting factor and some
> accumulation on the residual_echo and echo_noise. This modification
> works perfect on the far-end-problem but worsen the main problem even
> more.
This is probably the effect of reverberation and can probably be solved
by tuning/improving the current recursive averaging of the echo estimate.
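
One possible way to tune such an average for a reverberant room is to make it asymmetric: follow increases at once but limit how fast the estimate may fall. The sketch below is a generic technique under that assumption; it is not the averaging Speex uses, and `echo_track` and `decay` are hypothetical names.

```c
#include <assert.h>

/* Track residual-echo power with an asymmetric recursive rule:
 * rises follow the instantaneous estimate immediately, while falls
 * are limited to a factor `decay` per frame so that the reverberation
 * tail stays covered. */
float echo_track(float prev, float inst, float decay)
{
    assert(prev >= 0.0f && inst >= 0.0f && decay > 0.0f && decay < 1.0f);
    float slowest_fall = decay * prev;   /* lowest value allowed this frame */
    return inst > slowest_fall ? inst : slowest_fall;
}
```

A `decay` closer to 1 holds the estimate up for longer after far-end talk stops. The cost is that an inflated echo estimate lowers the apparent SNR of near-end speech that starts during the tail, which is consistent with the report above that this kind of change made the main problem worse.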

	Jean-Marc

Johan Nilsson

2009-Jun-23 14:22 UTC


[Speex-dev] Speech switching in speakerphone?

>What happens if you make SPEEX_PREPROCESS_SET_ECHO_SUPPRESS_ACTIVE less
>aggressive. Does it end up with too much echo or it just doesn't realise
>that it's in double-talk conditions?
My impression is that making this parameter less aggressive does not change
the timing much. Depending on how loudly the near end is talking, it may
detect double talk, but most often it does not, and the near end is then
suppressed by the amount set with ECHO_SUPPRESS.
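
The behaviour described above is what one would expect if the effective suppression limit is a Pframe-weighted blend of the two settings. The sketch below assumes a simple linear blend; that is an assumption about the implementation and is not stated anywhere in this thread. The function name and the dB values in the note are illustrative only.

```c
#include <assert.h>

/* Blend the two echo-suppression limits (both in dB, negative values)
 * by the speech-presence probability pframe in [0, 1].
 * Assumed model: linear interpolation between the two limits. */
float effective_echo_suppress_db(float pframe, float suppress_db,
                                 float suppress_active_db)
{
    assert(pframe >= 0.0f && pframe <= 1.0f);
    return (1.0f - pframe) * suppress_db + pframe * suppress_active_db;
}
```

With limits of -40 dB (no local talk) and -15 dB (double talk), a Pframe of 0.1 gives -37.5 dB, while a Pframe of 0.9 gives -17.5 dB. Under this model, changing the double-talk limit has little effect as long as Pframe stays low during near-end talk, so the timing is governed by Pframe rather than by the limits themselves.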

>> The important parameter for the speech switching is the Pframe.
>> Pframe is as you know based on the SNR estimation. However when the
>> near-end signal is low compared to the far-end signal (coming from
>> the close speaker element) the SNR is not distinctly increased when
>> near-end talks.
>
>Yes, Pframe estimation is one of the main problems I was having and I'm
>not entirely sure how to solve that. I suspect that the residual echo
>estimation also doesn't help.
I think the residual echo estimation is fairly reliable, but I do not know
how to use it to improve Pframe and thereby solve our main problem with the
gain during near-end talk.

>> Our main problem is that it is hard to have good reliance on a high
>> "gain2" when near-end is talking, resulting in missing conversation
>> in one direction. Some improvement can be made by modifying the
>> Qcurve function but it is very sensitive.
>> 
>> A secondary problem we also have is that the residual echo during
>> decay of far-end talk is not suppressed very well. This is probably
>> caused by the strong echo coupling plus a fairly reverberant room. We
>> have been able to solve this by adding a weighting factor and some
>> accumulation on the residual_echo and echo_noise. This modification
>> works perfect on the far-end-problem but worsen the main problem even
>> more.
>
>This is probably the effect of reverberation and can probably be solved
>by tuning/improving the current recursive averaging of the echo estimate.
Yes, I have basically solved this. 

What we need to solve now is our main problem: the poor reliability of the
gain during near-end talk.

Best Regards
Johan
