Hi Jean-Marc

> Can you explain what you mean here by "speech switching"

By speech switching I mean the adaptation of "gain2" when near-end or far-end is talking. What is important is that the timing is good and that the gain is low while the far end is talking and high while the near end is talking. By timing I mean that "gain2" should remain low until all far-end talk has finished, and that the gain should quickly go high when the near end is talking.

> There's also a parameter to control the maximum amount of suppression
> allowed:
> SPEEX_PREPROCESS_SET_NOISE_SUPPRESS : noise suppression
> SPEEX_PREPROCESS_SET_ECHO_SUPPRESS : echo suppression when there is no
> local talk
> SPEEX_PREPROCESS_SET_ECHO_SUPPRESS_ACTIVE: echo suppression in double-talk

Yes, I am aware of these parameters and am familiar with how they affect the gain. However, they do not affect the timing of speech switching very much. The important parameter for the speech switching is the Pframe. Pframe is, as you know, based on the SNR estimation. However, when the near-end signal is low compared to the far-end signal (coming from the close speaker element), the SNR is not distinctly increased when the near end talks.

> Can you explain what ... problem you've encountered?

Our main problem is that it is hard to reliably get a high "gain2" when the near end is talking, resulting in missing conversation in one direction. Some improvement can be made by modifying the Qcurve function, but it is very sensitive.

A secondary problem we also have is that the residual echo during the decay of far-end talk is not suppressed very well. This is probably caused by the strong echo coupling plus a fairly reverberant room. We have been able to solve this by adding a weighting factor and some accumulation on the residual_echo and echo_noise. This modification works perfectly on the far-end problem but worsens the main problem even more.

Best Regards
Johan
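
The timing requirement described above (gain held low until far-end talk has fully decayed, gain raised quickly once the near end talks) can be illustrated with a small sketch. This is not code from Speex: `SwitchState`, `switch_gain_update`, the `far_active`/`near_active` flags and the two rate constants are all hypothetical names chosen for illustration, and the sketch assumes reliable activity flags already exist, which is exactly the hard part discussed in this thread.

```c
#include <assert.h>

/* Illustrative constants only; not taken from Speex. */
#define ATTACK_RATE  0.8f   /* fraction of the remaining gap closed per frame when near-end talks */
#define RELEASE_RATE 0.1f   /* slower recovery when nobody is talking */

/* Hypothetical state for the sketch. */
typedef struct {
    float gain;      /* current smoothed gain, between 0 and 1 */
    int   hangover;  /* frames left to keep the gain low after far-end talk */
} SwitchState;

/* Update the gain for one frame and return it.
 * far_active / near_active are assumed to come from some detector. */
float switch_gain_update(SwitchState *s, int far_active, int near_active,
                         float low_gain, int hang_frames)
{
    assert(low_gain >= 0.0f && low_gain <= 1.0f);
    assert(hang_frames >= 0);

    if (far_active && !near_active) {
        /* Far-end only: clamp low and re-arm the hangover counter. */
        s->hangover = hang_frames;
        s->gain = low_gain;
    } else if (near_active) {
        /* Near-end talk (including double-talk): rise quickly. */
        s->hangover = 0;
        s->gain += ATTACK_RATE * (1.0f - s->gain);
    } else if (s->hangover > 0) {
        /* Far-end just stopped: keep the gain low while the echo tail decays. */
        s->hangover--;
        s->gain = low_gain;
    } else {
        /* Silence after the hangover: drift back up slowly. */
        s->gain += RELEASE_RATE * (1.0f - s->gain);
    }
    return s->gain;
}
```

With `hang_frames` set to cover the room's echo tail, the gain stays at `low_gain` for that many frames after the far end stops, while a single near-end frame already closes most of the gap to unity gain.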
Johan Nilsson wrote:

>> There's also a parameter to control the maximum amount of
>> suppression allowed:
>> SPEEX_PREPROCESS_SET_NOISE_SUPPRESS : noise suppression
>> SPEEX_PREPROCESS_SET_ECHO_SUPPRESS : echo suppression when there is
>> no local talk
>> SPEEX_PREPROCESS_SET_ECHO_SUPPRESS_ACTIVE: echo suppression in
>> double-talk
>
> Yes, I am aware of these parameters and am familiar with how they
> affect the gain. However, they do not affect the timing of speech
> switching very much.

What happens if you make SPEEX_PREPROCESS_SET_ECHO_SUPPRESS_ACTIVE less aggressive? Does it end up with too much echo, or does it just not realise that it's in double-talk conditions?

> The important parameter for the speech switching is the Pframe.
> Pframe is, as you know, based on the SNR estimation. However, when the
> near-end signal is low compared to the far-end signal (coming from
> the close speaker element), the SNR is not distinctly increased when
> the near end talks.

Yes, Pframe estimation is one of the main problems I was having and I'm not entirely sure how to solve that. I suspect that the residual echo estimation also doesn't help.

> Our main problem is that it is hard to reliably get a high
> "gain2" when the near end is talking, resulting in missing conversation
> in one direction. Some improvement can be made by modifying the
> Qcurve function, but it is very sensitive.
>
> A secondary problem we also have is that the residual echo during
> the decay of far-end talk is not suppressed very well. This is probably
> caused by the strong echo coupling plus a fairly reverberant room. We
> have been able to solve this by adding a weighting factor and some
> accumulation on the residual_echo and echo_noise. This modification
> works perfectly on the far-end problem but worsens the main problem even
> more.

This is probably the effect of reverberation and can probably be solved by tuning/improving the current recursive averaging of the echo estimate.

Jean-Marc
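
As a rough illustration of what tuning a recursive average of the echo estimate could look like, the sketch below smooths a per-band echo power estimate with separate coefficients for rising and falling input, so that the estimate follows the onset of echo quickly but decays slowly enough to cover a reverberant tail. This is a generic sketch, not the averaging used in the Speex preprocessor: the function name `smooth_echo_estimate` and the `attack`/`release` parameters are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: asymmetric recursive averaging of a per-band
 * echo power estimate.
 *   smoothed[i] = a * smoothed[i] + (1 - a) * inst[i]
 * where a = attack when the input rises above the current estimate
 * and a = release when it falls below it.
 * A small attack tracks echo onsets quickly; a release close to 1
 * lets the estimate decay slowly, as in a reverberant room. */
void smooth_echo_estimate(float *smoothed, const float *inst, int nbands,
                          float attack, float release)
{
    int i;

    assert(attack >= 0.0f && attack < 1.0f);
    assert(release >= 0.0f && release < 1.0f);

    for (i = 0; i < nbands; i++) {
        float a = (inst[i] > smoothed[i]) ? attack : release;
        smoothed[i] = a * smoothed[i] + (1.0f - a) * inst[i];
    }
}
```

Called once per frame, with `release` chosen to roughly match the decay of the room, the smoothed estimate stays above the instantaneous estimate while the far-end echo tail dies away. Too slow a release would also hold the suppression on longer than needed, which is the same trade-off reported above for the near-end direction.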
>What happens if you make SPEEX_PREPROCESS_SET_ECHO_SUPPRESS_ACTIVE less
>aggressive? Does it end up with too much echo, or does it just not realise
>that it's in double-talk conditions?

My impression is that setting this parameter less aggressive does not make much difference to the timing. Depending on how loud the near end is talking it may detect double-talk, but most often it does not detect double-talk and the near end is suppressed by the amount of ECHO_SUPPRESS.

>> The important parameter for the speech switching is the Pframe.
>> Pframe is, as you know, based on the SNR estimation. However, when the
>> near-end signal is low compared to the far-end signal (coming from
>> the close speaker element), the SNR is not distinctly increased when
>> the near end talks.
>
>Yes, Pframe estimation is one of the main problems I was having and I'm
>not entirely sure how to solve that. I suspect that the residual echo
>estimation also doesn't help.

I think the residual echo estimation is fairly reliable, but I do not know how to use it to improve Pframe and in that way solve our main problem with the gain during near-end talk.

>> Our main problem is that it is hard to reliably get a high
>> "gain2" when the near end is talking, resulting in missing conversation
>> in one direction. Some improvement can be made by modifying the
>> Qcurve function, but it is very sensitive.
>>
>> A secondary problem we also have is that the residual echo during
>> the decay of far-end talk is not suppressed very well. This is probably
>> caused by the strong echo coupling plus a fairly reverberant room. We
>> have been able to solve this by adding a weighting factor and some
>> accumulation on the residual_echo and echo_noise. This modification
>> works perfectly on the far-end problem but worsens the main problem even
>> more.
>
>This is probably the effect of reverberation and can probably be solved
>by tuning/improving the current recursive averaging of the echo estimate.

Yes, I have basically solved this.

What we need to find a solution to now is our main problem: the poor reliability of the gain during near-end talk.

Best Regards
Johan