thr3ads.net - Speex dev - [Speex-dev] Anyone knows how microsoft AEC can deal with mismatches between clocks of capture and render streams? [Apr 2011]


Li Maoquan

2011-Apr-12 18:48 UTC

[Speex-dev] Anyone knows how microsoft AEC can deal with mismatches between clocks of capture and render streams?

Hi Shridhar,

Sample rate conversion is not enough to solve this problem. I tried this
method several months ago. The first step is to measure the difference
between the capture and render sample rates. The next is to resample one
signal (by what you called "sinc interpolation") to eliminate the
difference. The frequency step in my experiment was less than 0.1 Hz. I
tried the Speex AEC after resampling, and much more echo was cancelled
than without resampling, but some echo could still be heard.
After all, the frequency step of the sample rate conversion is limited, so
some mismatch still exists after resampling. Someone told me that the
capture and render codecs have different clock generators, which drift
independently, and the LMS algorithm is very sensitive to any difference
between the sample rates.

Sincerely
Maoquan
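
A rough sketch of the measurement step described above, for illustration only. It assumes every audio block can be timestamped against one common timer, and it fits a straight line through the cumulative sample counts. The function names, the 160-sample block size, the +/-2 ms timestamp jitter and the two device rates are all made up for the example; none of them come from Speex or from the Microsoft AEC.

```python
import random

def estimate_rate(timestamps, sample_counts):
    """Least-squares slope of cumulative sample count vs. time, in samples/s."""
    n = len(timestamps)
    mean_t = sum(timestamps) / n
    mean_c = sum(sample_counts) / n
    num = sum((t - mean_t) * (c - mean_c)
              for t, c in zip(timestamps, sample_counts))
    den = sum((t - mean_t) ** 2 for t in timestamps)
    return num / den

def simulate_blocks(true_rate, seconds, block=160, jitter=0.002, seed=0):
    """Hypothetical device: fixed-size blocks delivered with jittered timestamps."""
    rng = random.Random(seed)
    times, counts = [], []
    total = 0
    while total / true_rate < seconds:
        total += block
        times.append(total / true_rate + rng.uniform(-jitter, jitter))
        counts.append(total)
    return times, counts

# Assumed example rates: both devices are nominally 8000 samples/s.
capture_rate = estimate_rate(*simulate_blocks(8000.4, 60.0, seed=1))
render_rate = estimate_rate(*simulate_blocks(7999.7, 60.0, seed=2))
ratio = capture_rate / render_rate  # resampling ratio to apply to one stream
print(f"capture ~{capture_rate:.3f}/s, render ~{render_rate:.3f}/s, ratio {ratio:.7f}")
```

In this simulation a minute of data is enough for the fit to settle well inside 0.1 Hz. A much shorter window leaves the timestamp jitter dominating the estimate.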


At 2011-04-12 21:46:26, "Shridhar, Vasant" <vasant.shridhar at harman.com> wrote:


I would imagine that it is handled through basic asynchronous sample rate
conversion.  There is a lot of literature out there on the different techniques
for doing this.  A common method is sinc interpolation.  This is how I have
handled these types of things in the past.

 

Vasant Shridhar
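
For readers who have not met the technique, here is a minimal sketch of sinc interpolation used as a fractional resampler. It is an illustration under assumptions, not the resampler from Speex or from anyone's product: the kernel is a Hann-windowed sinc with 16 taps on each side, the two rates are assumed close enough that no extra anti-aliasing cutoff is needed, and all names are hypothetical.

```python
import math

def windowed_sinc(d, half_width):
    """Hann-windowed sinc kernel evaluated at offset d (in input samples)."""
    if abs(d) >= half_width:
        return 0.0
    if d == 0.0:
        return 1.0
    hann = 0.5 * (1.0 + math.cos(math.pi * d / half_width))
    return math.sin(math.pi * d) / (math.pi * d) * hann

def resample_sinc(x, in_rate, out_rate, half_width=16):
    """Resample list x from in_rate to out_rate by sinc interpolation."""
    n_out = int(len(x) * out_rate / in_rate)
    step = in_rate / out_rate          # input samples per output sample
    y = []
    for m in range(n_out):
        p = m * step                   # fractional read position in x
        k0 = math.floor(p)
        acc = 0.0
        for k in range(k0 - half_width + 1, k0 + half_width + 1):
            if 0 <= k < len(x):        # samples outside x count as zero
                acc += x[k] * windowed_sinc(p - k, half_width)
        y.append(acc)
    return y

# Example: nudge a 440 Hz tone from 8000 samples/s to 8003 samples/s.
tone = [math.sin(2 * math.pi * 440 * n / 8000.0) for n in range(2000)]
shifted = resample_sinc(tone, 8000.0, 8003.0)
```

A real asynchronous converter would also let the ratio change while running and would use a precomputed kernel table instead of calling `math.sin` per tap.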

 

From: speex-dev-bounces at xiph.org [mailto:speex-dev-bounces at xiph.org] On Behalf Of LiMaoquan2000
Sent: Tuesday, April 12, 2011 12:36 AM
To: speex-dev
Subject: [Speex-dev] Anyone knows how microsoft AEC can deal with mismatches between clocks of capture and render streams?

 

Hi all,

We all know that a mismatch between the ADC clocks of the far-end and near-end
voice signals is not tolerated by a time-domain or frequency-domain LMS-based
AEC system. This means that the capture and render audio streams must be
synchronized to the same sample rate. However, I found that this restriction
was removed in the Microsoft AEC as of Windows XP SP1. Does anyone know how the
Microsoft AEC does it? This technology would be very helpful for implementing
AEC on a common PC. We know that most low-cost sound cards have different sample
rates for capture and rendering, which prevents LMS-based AEC from being used on
most computers.

http://msdn.microsoft.com/en-us/library/ff536174(VS.85).aspx
In Windows XP, the clock rate must be matched between the capture and render
streams. The AEC system filter implements no mechanism for matching sample rates
across devices. ............. In Windows XP SP1, Windows Server 2003, and later,
this limitation does not exist. The AEC system filter correctly handles
mismatches between the clocks for the capture and render streams, and separate
devices can be used for capture and rendering.

Maoquan

Steve Underwood

2011-Apr-13 01:13 UTC

[Speex-dev] Anyone knows how microsoft AEC can deal with mismatches between clocks of capture and render streams?

On 04/13/2011 02:58 AM, Shridhar, Vasant wrote:
> I am doing this right now with no problem.  I am not using speex for this
> at the moment though.  Group delay is the biggest problem.  I implemented a
> version where the input and output sample rates are known up front.  The
> routine then interpolates between the jitter.  This should solve the
> problem.  The crystals used to clock the input and output have very fine
> tolerances on most standard audio cards.

Do you mean the group delay of your interpolation filter? I don't see
why that is an issue. At the echo cancellation point it just looks like
a bit more echo delay. I also don't know why you use the word jitter in
relation to interpolation. The jitter you have is in the reception time
of blocks of samples, which makes the assessment of sampling rates hard,
but doesn't affect the actual interpolation.

We are talking about two clocks, which are not synchronised, and which
may drift in frequency significantly over fairly short periods of time.
The issue is accurately assessing the sampling rate difference, to
phase-locked levels of accuracy, so the resampling is precise. You can find
sampling rates like 8000/s and 8100/s, which is a disaster for most echo
cancellers. If the clock rate difference is assessed to 0.1Hz accuracy,
and the 8100/s sampled signal is resampled to 8000.1/s, the echo still
slips by a whole sample every 10s, so you would still need to totally
readapt the canceller every 10s, including periods of double talk. That is
too fast for the canceller to ever be working well. You really need a very
accurate assessment of the sampling rate difference, so you can essentially
eliminate all difference between the two rates.
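
The arithmetic behind the 10s figure above can be checked with a few lines. This is only a worked example: the function names are hypothetical, and the drift budget passed to the last function is an arbitrary illustration, not a figure from the thread.

```python
def seconds_per_sample_slip(rate_a, rate_b):
    """Time for two streams (rates in samples/s) to drift apart by one sample."""
    return 1.0 / abs(rate_a - rate_b)

def offset_ppm(rate, reference):
    """Rate offset relative to the reference, in parts per million."""
    return (rate - reference) / reference * 1e6

def max_rate_error(max_drift_samples, hold_seconds):
    """Largest rate error (samples/s) that keeps the drift within
    max_drift_samples over hold_seconds."""
    return max_drift_samples / hold_seconds

# Unresampled 8000/s against 8100/s: one sample of slip every 10 ms.
print(seconds_per_sample_slip(8000.0, 8100.0))

# After resampling to 8000.1/s: about 10 s per sample of slip.
print(seconds_per_sample_slip(8000.0, 8000.1))

# The same residual expressed as a relative offset: about 12.5 ppm.
print(offset_ppm(8000.1, 8000.0))

# Hypothetical budget: at most 0.1 sample of drift over 10 minutes.
print(max_rate_error(0.1, 600.0))
```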

Assessing the sampling rate difference accurately is not hard, if you
have plenty of time. Doing it in a shorter period is where the challenge
lies. You are decoupled from a precise real time view of the sampling
process. All you can base your sampling rate assessment on is long term
assessments of sample rates, or an analysis of how the echo is drifting
through the samples. From the last 10 years you will find a number of
papers published in IEEE and other journals about this problem, as it
pertains to echo cancelling in conferencing, and other distributed
setups. In these systems, synchronisation of various echo-laden signals
is impractical. All the papers I've seen come down to doing basically
the same thing - resampling based on a best assessment of echo drift
rates. It seems like it's still a research topic, and it seems like
existing solutions have their problems. Fraunhofer have recently
released a conferencing echo handler with a vague description of how it
works, but a clear indication that it isn't even trying to cancel the
echo. It is juggling gains, and performing other tricks, to make the
echo perceptually tolerable - an approach which has historically worked
pretty well (e.g. the DSP Group solution from the 90s). At least one
person reported, on this list, that their solution is the best
around.

> Vas
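
One way to picture the echo-drift assessment mentioned in the reply above: measure where the echo sits relative to the render signal at several points in time, then fit a line through those delays. The sketch below is an assumption-laden toy, not the method of any paper or product named in this thread. It uses brute-force cross-correlation on synthetic white noise, whole-sample lag resolution, and hypothetical function names.

```python
import random

def best_lag(reference, captured, max_lag):
    """Lag (in samples) at which `captured` correlates best with `reference`."""
    n = len(reference) - max_lag
    best, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        score = sum(reference[i] * captured[i + lag] for i in range(n))
        if score > best_score:
            best, best_score = lag, score
    return best

def drift_rate(times, lags):
    """Least-squares slope of echo lag versus time, in samples per second."""
    n = len(times)
    mean_t = sum(times) / n
    mean_l = sum(lags) / n
    num = sum((t - mean_t) * (l - mean_l) for t, l in zip(times, lags))
    den = sum((t - mean_t) ** 2 for t in times)
    return num / den

# Synthetic check: the echo delay grows by 2 samples every 20 s (0.1 sample/s).
rng = random.Random(0)
times = [0.0, 20.0, 40.0, 60.0]
true_lags = [10, 12, 14, 16]
measured = []
for lag in true_lags:
    reference = [rng.gauss(0.0, 1.0) for _ in range(512)]
    captured = [0.0] * lag + [0.5 * r for r in reference]  # delayed, attenuated echo
    measured.append(best_lag(reference, captured, max_lag=40))

residual = drift_rate(times, measured)  # samples/s left for the resampler to remove
print(measured, residual)
```

Because each lag is only known to the nearest sample, a small residual offset takes a long observation time to show up, which matches the point that the assessment is easy only when plenty of time is available.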

You have posted the same thing before, but ignored replies because you
didn't like them. The paragraph you quoted can be taken as a clear
statement that MS precisely resamples the signals. However, if you read
the whole page it is less clear. The key thing that paragraph is talking
about is big sampling rate changes - like taking a 48k/s signal and a
16k/s signal, and resampling the 48k/s one to 16k/s, so cancellation can
work. That is the thing which seems to have been added in XP SP1. The
paragraph seems to imply that fine resampling happens, but the rest of
the page it comes from has many vague and unclear things. If they had
brilliantly solved this problem, everyone would be relying on the MS
canceller for their Windows solutions, but that doesn't seem to be the
case. It seems many soft-phones rely on their own echo handling
solutions, and many do not handle echo very well.

Steve
