Jean-Marc Valin wrote:
>> As you can tell, the AAGC integration with speex was really a classic
>> hack. Instead of re-creating the hack, what's probably best here is to
>> integrate AAGC back into speex, and have a proper API.
>>
>
> Agreed here. If you can come up with a clean patch to add that feature,
> it's something I'd like to see in Speex.
>
I hate to be a talker and not a do-er, but I won't be able to write this
myself; someone on the iaxclient team could probably do it.

>> For those of you just tuning in, what I call "AAGC" is an AGC
>> implementation where analog gains are manipulated instead, or in
>> addition to the AGC within speex (where levels are normalized via
>> multiplication). The benefits of AAGC are: (1) (most important),
>> reducing the analog gain can prevent clipping, which can't be done with
>> speex' current AGC, and (2) when raising levels, you get better quality
>> by raising the mixer levels, as opposed to just multiplying.
>>
>
> It's a good thing to do, but you need to be really careful when doing
> that because:
> 1) Any change in the analogue gain automatically de-adapts the echo
> canceller so you only want to do that when really necessary (e.g.
> clipping screws up the EC anyway)
> 2) The processing chain goes "AEC -> noise suppressor -> AGC", but for
> the analog gain, you really want to measure the signal that goes into
> the echo cancellation, not at the AGC. Otherwise, you risk increasing
> the analog gain to a level that creates clipping before the AEC (even if
> the signal at the AGC is lower).
>
> Hmm, or does that mean the analogue AGC is actually completely
> independent from the "real" AGC? Any thoughts?
>
It's actually a bit more complicated, because it's more like "AEC ->
Noise Suppressor -> VAD -> AGC", even if the VAD decision isn't used by
the consumer, right? The VAD decision needs to be used by the AGC
so that it isn't raising the gain of background noise (although it
should probably lower the gain when there's any signal higher than its
threshold).

For AAGC, though, I guess one way to do this would be if you could
somehow "transport" the un-cancelled, un-noise-suppressed energy level
past the VAD decision, and then use that to determine what gain
adjustments to make. In this fashion, you'd be making your adjustments
based on the information you want: (a) the actual signal energy before
processing, and (b) the VAD decision.

You might be able to fake it well enough by putting AAGC before AEC, and
using the VAD decision from frame "n-1" when you're processing frame
"n". You'll probably have enough hysteresis and a bit of history in the
decision making process anyway that it might not matter.
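
To make the "n-1" idea a bit more concrete, here is a minimal sketch, not
tested against speex itself. All of the names (pre_aec_aagc, frame_peak,
and so on) are made up for illustration, and the 30000 peak threshold and
the three-frame count are arbitrary placeholders. The stage looks at the
raw frame before the AEC, and only counts it if the preprocessor reported
speech on the previous frame:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical state for an analog AGC stage that runs before the AEC. */
typedef struct {
    int prev_vad;    /* VAD decision from frame n-1 (0 = no speech) */
    int hot_frames;  /* consecutive speech frames that came close to clipping */
} pre_aec_aagc;

/* Peak absolute sample value of a raw (unprocessed) frame. */
int frame_peak(const int16_t *pcm, int n)
{
    int peak = 0;
    assert(pcm != NULL && n >= 0);
    for (int i = 0; i < n; i++) {
        int v = abs((int)pcm[i]);
        if (v > peak)
            peak = v;
    }
    return peak;
}

/* Inspect raw frame n using the VAD decision stored for frame n-1.
 * Returns -1 to suggest lowering the analog gain, 0 to leave it alone.
 * The thresholds (30000, 3 frames) are assumptions, not tuned values. */
int pre_aec_aagc_inspect(pre_aec_aagc *s, const int16_t *pcm, int n)
{
    assert(s != NULL);
    if (s->prev_vad && frame_peak(pcm, n) > 30000)
        s->hot_frames++;
    else
        s->hot_frames = 0;

    if (s->hot_frames >= 3) {   /* crude hysteresis before acting */
        s->hot_frames = 0;
        return -1;
    }
    return 0;
}

/* Call once the preprocessor has produced its VAD decision for frame n. */
void pre_aec_aagc_store_vad(pre_aec_aagc *s, int vad)
{
    assert(s != NULL);
    s->prev_vad = vad;
}
```

The caller would run pre_aec_aagc_inspect() on each captured frame, then
the usual AEC/preprocessor chain, then pre_aec_aagc_store_vad() with
whatever VAD result the preprocessor reported.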

As far as gain changes messing up the rest of the preprocessing chain:
It would seem to mess up the denoiser, the VAD logic, etc., as well as
the echo canceller. It might be possible (as I wrote earlier) to give
the filter chain some hints about what the effects of the changes are,
but it probably won't be perfect, because it would be difficult or
impossible to predict the exact response of gain adjustments, and the
delay after which they will actually take effect.

The AAGC mechanism I implemented, though, was good enough, for some
measure of good enough. It basically made step-wise adjustments (10% or
20%) every so often, when speex' loudness parameter was above or below
certain thresholds, and it strongly detected speech. If you use this
mechanism, and pre-set the mixers to be at about 80%, it relatively
quickly gets the gain into a reasonable place once speech is detected.
It would probably work just as well when EC is involved, as long as EC
and VAD work together well enough such that you don't get VAD
false-positives from echo. The target "loudness" range here is 4000 <->
8000, but it could be widened a bit to avoid more adjustments.
<snip>
/* Analog AGC: Bring speex AGC gain out to mixer, with lots of hysteresis */
/* use a higher continuation threshold for AAGC than for VAD itself */
if(!silent &&
   (iaxc_silence_threshold != 0) &&
   (iaxc_filters & IAXC_FILTER_AGC) &&
   (iaxc_filters & IAXC_FILTER_AAGC) &&
   (st->speech_prob > .20)
  ) {
    static int i;
    double level;

    i++;
    if((i&0x3f) == 0) {
        float loudness = st->loudness2;
        if((loudness > 8000) || (loudness < 4000)) {
            level = iaxc_input_level_get();
            /* fprintf(stderr, "loudness = %f, level = %f\n", loudness, level); */
            /* lower quickly if we're really too hot */
            if((loudness > 16000) && (level > 0.5)) {
                /* fprintf(stderr, "lowering quickly level\n"); */
                iaxc_input_level_set(level - 0.2);
            }
            /* lower less quickly if we're a bit too hot */
            else if((loudness > 8000) && (level >= 0.15)) {
                /* fprintf(stderr, "lowering slowly level\n"); */
                iaxc_input_level_set(level - 0.1);
            }
            /* raise slowly if we're cold */
            else if((loudness < 4000) && (level <= 0.9)) {
                /* fprintf(stderr, "raising level\n"); */
                iaxc_input_level_set(level + 0.1);
            }
        }
    }
}
</snip>
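
For anyone who wants to play with the step logic outside of iaxclient,
the decision part of the snippet can be pulled out into a pure function.
This is only a sketch: aagc_next_level() is a name invented here, not
something that exists in iaxclient or speex, and it leaves out the speech
probability check and the every-64th-frame rate limit, which the caller
would still need to do.

```c
#include <assert.h>

/* Hypothetical helper mirroring the thresholds in the snippet above.
 * Takes the current loudness estimate and mixer level (0.0 .. 1.0) and
 * returns the level to set, or the unchanged level if no step is due. */
double aagc_next_level(double loudness, double level)
{
    assert(level >= 0.0 && level <= 1.0);

    if (loudness > 16000.0 && level > 0.5)
        return level - 0.2;   /* far too hot: big step down */
    if (loudness > 8000.0 && level >= 0.15)
        return level - 0.1;   /* a bit too hot: small step down */
    if (loudness < 4000.0 && level <= 0.9)
        return level + 0.1;   /* too quiet: small step up */

    return level;             /* inside 4000..8000, or already at a limit */
}
```

Widening the target range, as suggested above, would then just mean
changing the 4000 and 8000 constants in one place.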
>
>> (1) is really the most important reason.
>>
>> Now, the API I'd envision for this would be one where you could tell
>> speex that you would like to use AAGC, and then register some callbacks
>> that speex_preprocess() could call to query or set the input or mixer
>> level. Further, a more intelligent implementation within speex could
>> consider the requested changes in the rest of the preprocessor chain
>> (i.e. it would know that if it asked for a 3dB increase in input gain,
>> to expect that input levels would rise by 3dB within a few frames). The
>> hacky implementation I did inside of iaxclient gave speex no such
>> information.
>>
>
> These are probably things we'll want to consider once we decide where to
> put the AAGC in the first place.
>
> Jean-Marc
>
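
To give the callback idea something concrete to argue about, here is a
rough sketch of what the registration side might look like. None of this
exists in speex: aagc_hooks, aagc_request_change() and the fake_mixer_*
functions are invented names, and I'm assuming the application can express
its mixer gain in dB, which real mixers may not do cleanly. The point is
only that the library asks for a change through the callbacks, reads back
what the mixer actually did, and records the shift so the rest of the
chain could anticipate it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback set an application would register. */
typedef struct {
    double (*get_gain_db)(void *user);
    void   (*set_gain_db)(void *user, double gain_db);
    void   *user;
    double  expected_shift_db;  /* total change later stages should expect */
} aagc_hooks;

/* Ask the application for a gain change; returns the change actually
 * applied, which may differ if the mixer clamps or quantizes. */
double aagc_request_change(aagc_hooks *h, double delta_db)
{
    double before, after;

    assert(h != NULL && h->get_gain_db != NULL && h->set_gain_db != NULL);
    before = h->get_gain_db(h->user);
    h->set_gain_db(h->user, before + delta_db);
    after = h->get_gain_db(h->user);

    h->expected_shift_db += after - before;
    return after - before;
}

/* Stand-in mixer for experimentation: a double clamped to 0..30 dB. */
double fake_mixer_get(void *user)
{
    return *(double *)user;
}

void fake_mixer_set(void *user, double gain_db)
{
    if (gain_db < 0.0)
        gain_db = 0.0;
    if (gain_db > 30.0)
        gain_db = 30.0;
    *(double *)user = gain_db;
}
```

Reading the level back after setting it is deliberate: as noted earlier,
the exact response of a gain adjustment is hard to predict, so the hint
handed to the rest of the chain should at least reflect what the mixer
reports rather than what was requested.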