> I hate to be a talker and not a do-er, but I won't be able to write this
> myself, probably someone on the iaxclient team could do it.

Anyway, let me know if/when someone's working on that.

>> Hmm, or does that mean the analogue AGC is actually completely
>> independent from the "real" AGC. Any thoughts?
>
> It's actually a bit more complicated, because it's more like "AEC ->
> Noise Suppressor -> VAD -> AGC", even if the VAD decision isn't used by
> the consumer, right. Because the VAD decision needs to be used by AGC,
> so that it isn't raising the gain of background noise (although it
> should probably lower the gain when there's any signal higher than its
> threshold).
>
> For AAGC, though, I guess one way to do this would be if you could
> somehow "transport" the un-cancelled, un-noise-suppressed energy level
> past the VAD decision, and then used that to determine what gain
> adjustments to make. In this fashion, you'd be making your adjustments
> based on the information you want: (a) the actual signal energy before
> processing, and (b) VAD decision.

I don't see b) as being that important. Could help a bit, but you really want to use a).

> You might be able to fake it good enough by putting AAGC before AEC, and
> using the VAD decision from frame "n-1" when you're processing frame
> "n". You'll probably have enough hysteresis and a bit of history in the
> decision making process anyway that it might not matter.

sure.

> As far as gain changes messing up the rest of the preprocessing chain:
> It would seem to mess up the denoiser, the VAD logic, etc., as well as
> the echo canceller. It might be possible (as I wrote earlier) to give
> the filter chain some hints about what the effects of the changes are,
> but it probably won't be perfect, because it would be difficult or
> impossible to predict the exact response of gain adjustments, and the
> delay after which they will actually take effect.

Well, I guess you could:
1) say "freeze!" to everyone
2) increase the analogue gain
3) let everyone know by how much the gain was increased
4) wait a little while (e.g. 100 ms)
5) unfreeze everyone

> The AAGC mechanism I implemented, though, was good enough, for some
> measure of good enough. It basically made step-wise adjustments (10% or
> 20%) every so often, when speex' loudness parameter was above or below
> certain thresholds, and it strongly detected speech. If you use this
> mechanism, and pre-set the mixers to be at about 80%, it relatively
> quickly gets the gain into a reasonable place once speech is detected.
> It would probably work just as well when EC is involved, as long as EC
> and VAD work together well enough such that you don't get VAD
> false-positives from echo. The target "loudness" range here is 4000 <->
> 8000, but it could be widened a bit to avoid more adjustments.

You don't want to make small +-10% adjustments. I would go for +-10 dB at *least* (probably even 20 dB). Quantization noise issues at 16 bits per sample aren't worth the trouble of doing smaller steps.

Jean-Marc
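
The five-step "freeze" sequence listed above could be sketched roughly as follows. This is only an illustration under stated assumptions: `LoggingStage`, `change_analogue_gain`, and the `set_mixer_delta_db` callback are hypothetical names, not part of Speex or iaxclient, and a real stage would stop and resume filter adaptation rather than just record events.

```python
import time


class LoggingStage:
    """Stand-in for one preprocessing stage (AEC, denoiser, VAD, AGC).

    Hypothetical interface: it only records what it has been told.
    """

    def __init__(self, name):
        self.name = name
        self.frozen = False
        self.gain_offset_db = 0.0
        self.events = []

    def freeze(self):
        self.frozen = True
        self.events.append("freeze")

    def notify_gain_change(self, delta_db):
        self.gain_offset_db += delta_db
        self.events.append("notify")

    def unfreeze(self):
        self.frozen = False
        self.events.append("unfreeze")


def change_analogue_gain(stages, set_mixer_delta_db, delta_db,
                         settle_seconds=0.1, sleep=time.sleep):
    """Apply one analogue gain step using the five-step sequence."""
    for stage in stages:                  # 1) freeze adaptation everywhere
        stage.freeze()
    try:
        set_mixer_delta_db(delta_db)      # 2) change the analogue gain
        for stage in stages:              # 3) report the size of the change
            stage.notify_gain_change(delta_db)
        sleep(settle_seconds)             # 4) wait for the change to take effect
    finally:
        for stage in stages:              # 5) resume adaptation, even on error
            stage.unfreeze()
```

The `try`/`finally` is a design choice of this sketch, so that a failing mixer call cannot leave the chain frozen. The 0.1 s default mirrors the "e.g. 100 ms" wait in the list.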
Jean-Marc Valin wrote:
>> I hate to be a talker and not a do-er, but I won't be able to write this
>> myself, probably someone on the iaxclient team could do it.
>
> Anyway, let me know if/when someone's working on that.
>
>>> Hmm, or does that mean the analogue AGC is actually completely
>>> independent from the "real" AGC. Any thoughts?
>>
>> It's actually a bit more complicated, because it's more like "AEC ->
>> Noise Suppressor -> VAD -> AGC", even if the VAD decision isn't used by
>> the consumer, right. Because the VAD decision needs to be used by AGC,
>> so that it isn't raising the gain of background noise (although it
>> should probably lower the gain when there's any signal higher than its
>> threshold).
>>
>> For AAGC, though, I guess one way to do this would be if you could
>> somehow "transport" the un-cancelled, un-noise-suppressed energy level
>> past the VAD decision, and then used that to determine what gain
>> adjustments to make. In this fashion, you'd be making your adjustments
>> based on the information you want: (a) the actual signal energy before
>> processing, and (b) VAD decision.
>
> I don't see b) as being that important. Could help a bit, but you really
> want to use a).

I think you want to use both pieces of information, so you're not raising the level of a signal that's not speech. It's _especially_ important when you're doing EC, of course, because you don't want to raise the gain on an echo.

>> As far as gain changes messing up the rest of the preprocessing chain:
>> It would seem to mess up the denoiser, the VAD logic, etc., as well as
>> the echo canceller. It might be possible (as I wrote earlier) to give
>> the filter chain some hints about what the effects of the changes are,
>> but it probably won't be perfect, because it would be difficult or
>> impossible to predict the exact response of gain adjustments, and the
>> delay after which they will actually take effect.
>
> Well, I guess you could:
> 1) say "freeze!" to everyone
> 2) increase the analogue gain
> 3) let everyone know by how much the gain was increased
> 4) wait a little while (e.g. 100 ms)
> 5) unfreeze everyone
>
>> The AAGC mechanism I implemented, though, was good enough, for some
>> measure of good enough. It basically made step-wise adjustments (10% or
>> 20%) every so often, when speex' loudness parameter was above or below
>> certain thresholds, and it strongly detected speech. If you use this
>> mechanism, and pre-set the mixers to be at about 80%, it relatively
>> quickly gets the gain into a reasonable place once speech is detected.
>> It would probably work just as well when EC is involved, as long as EC
>> and VAD work together well enough such that you don't get VAD
>> false-positives from echo. The target "loudness" range here is 4000 <->
>> 8000, but it could be widened a bit to avoid more adjustments.
>
> You don't want to make small +-10% adjustments. I would go for +-10 dB
> at *least* (probably even 20 dB). Quantization noise issues at 16 bits
> per sample aren't worth the trouble of doing smaller steps.

The thing is, I don't know if mixer controls on most platforms give you any idea of by how many dB you're changing things, whether the changes are linear or not, etc -- the mixer controls are just a knob with levels from 0<->1 (mac, I think), 0-100, or 0-255. I raise/lower them by 10% or 20% of their full range, so with the 10% adjustments, you only have 10 steps. That seemed a big enough jump in practice.

-SteveK
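
For illustration, the step-wise mixer adjustment described in this message might look something like the sketch below. It assumes a mixer level normalised to 0.0-1.0 and a speech-probability input; the function name `aagc_step` and the 0.9 speech-confidence threshold are hypothetical, while the 4000-8000 loudness window and the 10% step come from the description above.

```python
def aagc_step(mixer_level, loudness, speech_prob,
              low=4000.0, high=8000.0, step=0.10, min_speech_prob=0.9):
    """Return the new mixer level on a normalised 0.0-1.0 scale.

    mixer_level     current mixer position (0.0-1.0 of full range)
    loudness        loudness estimate from the preprocessor
    speech_prob     confidence that the frame is speech (assumed input)
    min_speech_prob hypothetical threshold for "strongly detected speech"
    """
    if speech_prob < min_speech_prob:
        return mixer_level        # not clearly speech: leave the gain alone
    if loudness > high:
        mixer_level -= step       # too loud: step the mixer down
    elif loudness < low:
        mixer_level += step       # too quiet: step the mixer up
    return min(1.0, max(0.0, mixer_level))   # stay within the mixer's range


# Start from the suggested 80% preset and react to one loud speech frame.
level = aagc_step(0.8, loudness=12000.0, speech_prob=0.95)
```

Because the step is a fraction of the mixer's full range, this sketch says nothing about how many dB each step represents; that depends on the platform's mixer curve.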
>> I don't see b) as being that important. Could help a bit, but you really
>> want to use a).
>
> I think you want to use both pieces of information, so you're not
> raising the level of a signal that's not speech. It's _especially_
> important when you're doing EC, of course, because you don't want to
> raise the gain on an echo.

Keep in mind that this mainly controls the saturation point. Any time you change the analogue gain, you should tell the AGC to decrease its gain by the same amount.

>> You don't want to make small +-10% adjustments. I would go for +-10 dB
>> at *least* (probably even 20 dB). Quantization noise issues at 16 bits
>> per sample aren't worth the trouble of doing smaller steps.
>
> The thing is, I don't know if mixer controls on most platforms give you
> any idea of by how many dB you're changing things, whether the changes
> are linear or not, etc -- the mixer controls are just a knob with levels
> from 0<->1 (mac, I think), 0-100, or 0-255. I raise/lower them by 10%
> or 20% of their full range, so with the 10% adjustments, you only have
> 10 steps. That seemed a big enough jump in practice.

OK, I thought you meant a 10-20% difference in the gain, which is quite small (< 1dB). Anyway, you don't want to be playing with that gain unless either:
1) You have clipping
2) The capture level is ridiculously low (e.g. 16-bit samples don't go above 256 or something).

Jean-Marc
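
As a rough worked example of the points in this message: a linear gain ratio r corresponds to 20*log10(r) dB, so a 10% gain difference is about 0.8 dB and a 20% difference about 1.6 dB, both small next to a 10 dB step (a factor of about 3.16). The sketch below also encodes the two conditions for touching the analogue gain and the matching AGC compensation. All function names are hypothetical, and treating a peak at 32767 as clipping is an assumption of this sketch; only the 256 figure comes from the message.

```python
import math


def ratio_to_db(ratio):
    """Convert a linear amplitude gain ratio to decibels."""
    return 20.0 * math.log10(ratio)


def analogue_gain_action(samples, clip_level=32767, low_peak=256):
    """Decide whether the analogue gain should be touched at all.

    samples is an iterable of signed 16-bit sample values.
    Returns "lower" on clipping, "raise" if the peak is very low,
    and "keep" otherwise (or when there are no samples to judge).
    """
    samples = list(samples)
    if not samples:
        return "keep"
    peak = max(abs(s) for s in samples)
    if peak >= clip_level:
        return "lower"    # 1) the capture is clipping
    if peak < low_peak:
        return "raise"    # 2) the capture level is ridiculously low
    return "keep"


def compensate_agc(agc_gain_db, analogue_delta_db):
    """Decrease the digital AGC gain by the amount the analogue gain rose."""
    return agc_gain_db - analogue_delta_db
```

For instance, after raising the analogue gain by 10 dB, `compensate_agc(12.0, 10.0)` leaves the digital AGC at 2 dB, so the overall level is unchanged while the saturation point moves.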
Alexander Chemeris
2007-May-06 02:26 UTC
[Speex-dev] Re: [Iaxclient-devel] iaxclient & speex
Hello,

On 5/4/07, Jean-Marc Valin <jean-marc.valin@usherbrooke.ca> wrote:
> > As far as gain changes messing up the rest of the preprocessing chain:
> > It would seem to mess up the denoiser, the VAD logic, etc., as well as
> > the echo canceller. It might be possible (as I wrote earlier) to give
> > the filter chain some hints about what the effects of the changes are,
> > but it probably won't be perfect, because it would be difficult or
> > impossible to predict the exact response of gain adjustments, and the
> > delay after which they will actually take effect.
>
> Well, I guess you could:
> 1) say "freeze!" to everyone
> 2) increase the analogue gain
> 3) let everyone know by how much the gain was increased
> 4) wait a little while (e.g. 100 ms)
> 5) unfreeze everyone

A similar (maybe the same) mechanism could be used to keep the AEC from being upset when the user changes the mic or speaker (analog) gain manually. The application could detect all HW mixer changes and report them to the Speex library even if AAGC is not used. The Speex library would then take the necessary actions to prevent an AEC mess. So, there should be some generic API to notify Speex about HW mixer level changes. What do you think?

--
Regards,
Alexander Chemeris.

SIPez LLC.
SIP VoIP, IM and Presence Consulting
http://www.SIPez.com
tel: +1 (617) 273-4000
> Similar (same may be) mechanism may be used to prevent AEC
> upset in case user change mic or speaker (analog) gain manually.
> Application might detect all HW mixer changes and report them to speex
> library even if AAGC would not be used. Speex library will do essential
> actions to prevent AEC mess. So, there should be some generic API
> to notify Speex about HW mixer level changes. What do you think?

Could be done relatively easily -- in theory. The main problem is that you need to be able to know *exactly* (within less than 1 dB) how much the gain changed. If not, the amount of de-adaptation you cause isn't much better than if you didn't try to compensate. In any case, the AEC will take a few seconds to get back to its original behaviour.

Jean-Marc
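
To get a feel for why the accuracy of the reported gain change matters, here is a deliberately simplified model. It assumes the echo canceller was perfectly adapted before the change, that the gain change is a pure scalar, and that the filter is rescaled by the reported amount; none of this describes how Speex's AEC actually adapts. If the true change and the reported change differ by e dB, the residual echo relative to the echo itself is |1 - 10^(-e/20)|. The function name is hypothetical.

```python
import math


def residual_echo_db(gain_error_db):
    """Residual echo relative to the echo itself, in dB (more negative is better).

    gain_error_db is the true analogue gain change minus the change
    reported to the canceller. A perfectly reported change (0 dB error)
    leaves no residual in this idealised model.
    """
    if gain_error_db == 0:
        return float("-inf")
    mismatch = abs(1.0 - 10.0 ** (-gain_error_db / 20.0))
    return 20.0 * math.log10(mismatch)


# Compare a well-reported change, a 1 dB error, and an unreported 10 dB step.
for error_db in (0.1, 1.0, 10.0):
    print(f"{error_db:5.1f} dB error -> {residual_echo_db(error_db):6.1f} dB residual")
```

Under these assumptions, a 1 dB error already limits cancellation to roughly 19 dB until the filter re-adapts, while an unreported 10 dB step leaves only about 3 dB.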