I'm having trouble with the preprocessor's noise reduction feature.
The basic issue is that it simply doesn't work very well.

With my laptop (whose microphone is otherwise quite capable) I
routinely hear transient background noise, typing, and other "quiet"
sounds leaking through to the speex stream. Even worse, the AGC
feature is blowing these things up into just awful explosions and
whines.

Honestly, it works so badly that I wonder if I'm doing something
wrong. I wrote a trivial squelch feature* in 10 minutes that works
basically 100% of the time.

* Zero the sample data if the maximum sample in a frame is less than
  4% of saturation or 20% of the maximum sample yet seen. It's about
  8 lines of code.

I'm using the 32 kHz ultra wide band mode with 16 bit sample data and
am setting all of AGC, VAD, DEREVERB and DENOISE to 1 using
speex_preprocess_ctl(). AGC_LEVEL is set to 20000.0.

Any thoughts?

Andy
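For readers following along, the squelch rule in the footnote above can be sketched roughly as below. This is not Andy's actual code, which was not posted. The names `squelch_state`, `frame_peak` and `squelch_frame` are made up for illustration, and the sketch assumes 16-bit signed samples and that the running maximum is updated before the 20% comparison.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define SATURATION 32767  /* full scale for 16-bit signed samples */

/* Hypothetical state: the largest absolute sample seen so far. */
typedef struct {
    int max_seen;
} squelch_state;

/* Largest absolute sample value in one frame. */
static int frame_peak(const int16_t *frame, size_t n)
{
    int peak = 0;
    for (size_t i = 0; i < n; i++) {
        int v = frame[i];
        int a = v < 0 ? -v : v;
        if (a > peak)
            peak = a;
    }
    return peak;
}

/*
 * Zero the frame if its peak is below 4% of saturation or below 20% of
 * the largest peak seen so far. Returns 1 if the frame was zeroed and
 * 0 if it was left untouched.
 */
static int squelch_frame(squelch_state *st, int16_t *frame, size_t n)
{
    assert(st != NULL && frame != NULL);

    int peak = frame_peak(frame, n);
    if (peak > st->max_seen)
        st->max_seen = peak;  /* assumption: update before comparing */

    /* Integer percentages: peak < 4% of full scale, or < 20% of max. */
    if (peak * 100 < 4 * SATURATION || peak * 100 < 20 * st->max_seen) {
        memset(frame, 0, n * sizeof *frame);
        return 1;
    }
    return 0;
}
```

A caller would keep one `squelch_state` (initialised to zero) per stream and run `squelch_frame()` on each captured frame before handing it to the encoder. Because `max_seen` never decays, a single loud click raises the 20% threshold for the rest of the session, which is one reason a fixed rule like this may not carry over to other recording conditions.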
Andy Ross wrote:
> I'm having trouble with the preprocessor's noise reduction feature.
> The basic issue is that it simply doesn't work very well.
>
> With my laptop (whose microphone is otherwise quite capable) I
> routinely hear transient background noise, typing, and other "quiet"
> sounds leaking through to the speex stream. Even worse, the AGC
> feature is blowing these things up into just awful explosions and
> whines.

It may sound odd to you, but there's actually no way for the noise
suppressor to know whether you want to keep these sounds or not. The
noise suppressor will only attempt to remove stationary noise, such as
thermal noise, fans, ... The AGC can indeed do strange things in these
cases, but it's been improved in svn (compared to 1.2beta1).

> Honestly, it works so badly that I wonder if I'm doing something
> wrong. I wrote a trivial squelch feature* in 10 minutes that works
> basically 100% of the time.
>
> * Zero the sample data if the maximum sample in a frame is less than
>   4% of saturation or 20% of the maximum sample yet seen. It's about
>   8 lines of code.

Congratulations. If it works better on your data, then use it. It'll
just fail miserably in other conditions, but you may not care about
those. In any case, that's the main difficulty with speech enhancement:
you've got all kinds of noise, and you never know what the mic
recording level is.

> I'm using the 32 kHz ultra wide band mode with 16 bit sample data and
> am setting all of AGC, VAD, DEREVERB and DENOISE to 1 using
> speex_preprocess_ctl(). AGC_LEVEL is set to 20000.0.

Are you using a frame size in the 10-20 ms range?

Jean-Marc
Jean-Marc Valin wrote:
> The noise suppressor will only attempt to remove stationary noise,
> such as thermal noise, fans, ... The AGC can indeed do strange
> things in these cases, but it's been improved in svn (compared to
> 1.2beta1).

OK, then the problem is that I misunderstood the feature. I assumed
that dynamic squelch was part of it, but it's really something more
along the lines of active noise cancellation. That's fine, I'll work
on improving my own squelch code.

> Congratulations. If it works better on your data, then use it. It'll
> just fail miserably in other conditions, but you may not care about
> those.

Uh, production applications almost always require squelch, no? This
is no less true today than it was in the days of analog transmitters.
Note that mobile phones don't transmit low-value transients, even if
I'm typing right next to them. While it's certainly true that the
fixed-threshold static peak implementation I banged out isn't going to
work everywhere, some more elaborate variation would be really nice to
have in speex.

> Are you using a frame size in the 10-20 ms range?

I'm using the frame size returned from SPEEX_GET_FRAME_SIZE. It's 640
frames under this implementation, or 20ms.

Andy
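As a quick check on the 20 ms figure: the frame duration is just the frame size divided by the sampling rate. The helper below is a hypothetical illustration (`frame_duration_ms` is not part of the speex API) and assumes the frame size is counted in samples per channel.

```c
#include <assert.h>

/* Duration of one frame in milliseconds.
 * Hypothetical helper, not a speex function. */
static double frame_duration_ms(int frame_size, int sample_rate_hz)
{
    assert(sample_rate_hz > 0);
    return 1000.0 * frame_size / sample_rate_hz;
}
```

With the values from this thread, `frame_duration_ms(640, 32000)` gives 20.0, at the upper end of the 10-20 ms range Jean-Marc asked about.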