Jean-Marc Valin wrote:
> Andy Ross wrote:
> > Uh, production applications almost always require squelch, no?
>
> Some do, some don't. In general, distinguishing between a keyboard
> and a speech transient is next to impossible based only on a few ms
> of speech.

That is true for distinguishing it by waveform, but not by amplitude.
As I mentioned, these transients are objectively tiny.

I guess I'd be curious as to which voice codec applications require no
squelch (other than trivial examples like push-to-talk interfaces).
Especially in the presence of the AGC feature, it seems pretty much
required to me. (I'll try the svn code, though. It may be that a
better AGC would eliminate the need for squelch.)

I'm not saying a general squelch algorithm would be an easy task, just
that it's (1) important to real world applications (IMHO rather more
important than spectrum-based denoise or AGC) and (2) not all *that*
difficult, as evidenced by my really brief experiment. Again: my tiny
chat application is perceptibly better sounding than the bare speex
preprocessor in the presence of anything but total silence in the
environment.

> I wouldn't be surprised if your algo either 1) adds delay or 2) cuts
> onsets

The latter, if anything. There is no state or buffering; it's just a
short-circuit to the preprocess call. But I will say that, given
minimal testing, the output quality doesn't seem to suffer at all. And
the lack of the very loud keyboard pops and squeals is a real
improvement.

I guess I don't understand your resistance to squelch, it's a very
well-tested idiom. Sure, there are sexier algorithms out there, but
there's still room for squelch in a modern application.

Andy
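
The code Andy describes is not shown in the thread, so the following is
only a minimal sketch of what a stateless, amplitude-based squelch that
short-circuits the preprocess call might look like. The function name,
the `threshold` parameter, and the `preprocess` callback are all
hypothetical; `preprocess` stands in for whatever per-frame processing
the application already does and is not the speex API.

```python
def squelch_then_preprocess(frame, threshold, preprocess):
    """Return a silent frame if the peak is below threshold, else preprocess.

    frame      -- list of integer PCM samples for one frame
    threshold  -- hypothetical fixed peak level below which audio is dropped
    preprocess -- hypothetical callable standing in for the real preprocessor
    """
    # Peak absolute sample value; default=0 handles an empty frame.
    peak = max((abs(s) for s in frame), default=0)

    if peak < threshold:
        # Short-circuit: the preprocessor is never called for quiet frames.
        # No state is kept between frames and nothing is buffered, so no
        # delay is added, but a soft speech onset can be cut.
        return [0] * len(frame)

    return preprocess(frame)
```

Because the decision looks only at the current frame, the sketch adds no
latency; the cost is exactly the one conceded above, that a quiet onset
can fall under the threshold and be dropped.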
> That is true for distinguishing it by waveform, but not by amplitude.
> As I mentioned, these transients are objectively tiny.

*Your* transients may be "tiny" and in any case, it doesn't help if you
don't know the level you're recording at.

> I guess I'd be curious as to which voice codec applications require
> no squelch (other than trivial examples like push-to-talk
> interfaces). Especially in the presence of the AGC feature, it seems
> pretty much required to me. (I'll try the svn code, though. It may
> be that a better AGC would eliminate the need for squelch.)

As surprising as it may sound, I might want this lecture recording to
keep the typing, but remove the ventilation noise! I might even be
interested *only* in some sort of clicks and not the stationary
background noise!

> I'm not saying a general squelch algorithm would be an easy task,
> just that it's (1) important to real world applications (IMHO rather
> more important than spectrum-based denoise or AGC)

Again your application != what everyone wants.

> and (2) not all *that* difficult, as evidenced by my really brief
> experiment.

Except that your application assumes a certain input level. Try taking
the samples you test on and then apply a gain of 10 and run your magic
algorithm. Then take the original sample and apply a gain of 0.02
instead. Do you always get the same result from your algorithm? What
about also varying the SNR from 0 dB to 30 dB? Still perfect
performance across the range? If so, please submit to
http://www.ieee.org/organizations/society/sp/infotsa.html ...

> I guess I don't understand your resistance to squelch, it's a very
> well-tested idiom. Sure, there are sexier algorithms out there, but
> there's still room for squelch in a modern application.

As I mentioned before, if you find a good algorithm that'll work across
any (or a good range of) input level and SNR, I'll be quite happy to
consider it.

Jean-Marc
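
The gain experiment proposed above can be sketched in a few lines. This
is an illustration under assumed numbers, not a measurement: the peak
values for the "click" and "speech" frames and the threshold of 1000 are
made up, and the helper names are hypothetical. Samples are treated as
16-bit PCM and clipped after the gain is applied.

```python
def apply_gain(frame, gain):
    """Scale integer PCM samples by gain and clip to the int16 range."""
    return [max(-32768, min(32767, int(round(s * gain)))) for s in frame]


def passes_fixed_squelch(frame, threshold):
    """True if the frame's peak reaches a fixed threshold (gate open)."""
    return max(abs(s) for s in frame) >= threshold


# Assumed example frames: a quiet keyboard click and louder speech.
click = [0, 300, -250, 100]
speech = [0, 4000, -3500, 1200]
THRESHOLD = 1000  # hypothetical value tuned for gain 1.0

if __name__ == "__main__":
    for gain in (1.0, 10.0, 0.02):
        click_open = passes_fixed_squelch(apply_gain(click, gain), THRESHOLD)
        speech_open = passes_fixed_squelch(apply_gain(speech, gain), THRESHOLD)
        print(f"gain={gain}: click passes={click_open}, speech passes={speech_open}")
```

With these assumed numbers the fixed threshold separates the two frames
only at the original level: at a gain of 10 the click gets through, and
at a gain of 0.02 the speech is gated out along with it.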
Jean-Marc Valin wrote:
> *Your* transients may be "tiny" and in any case, it doesn't
> help if you don't know the level you're recording at.

Yes, exactly: my transients are tiny. But in this case we're talking
about transients produced by the keyboard device on a laptop with a
microphone. That's hardly what I would call an unimportant edge case.
And those transients *have* to be squelched somewhere in the
application, because the output sounds like crap if they aren't.

> Again your application != what everyone wants.

I don't believe I ever said it was. I do argue that "many" people want
squelch, however. And I think that if you look beyond the speex
codebase, you'll probably find that many (maybe even most) of your
users are implementing it themselves (I am, at least). And they are
probably not doing so as well or as elaborately as could be done inside
the preprocessor.

> Except that your application assumes a certain input level.

Well, of course it does, because it's 8 lines of code written in 10
minutes. But dynamic thresholds are certainly possible. That's what
the AGC feature is, after all, right? Likewise, a better squelch would
be based on signal energy instead of peak sample, etc...

And your requirement that any feature in speex must work well in all
conditions is specious anyway: speex is filled with optional features
(VAD, de-reverb, echo cancellation) that are only applicable to certain
regimes. Certainly squelch is no different.

Look, I'm not trying to tell you how to write your code. I'm trying to
tell you that there's a real world feature people might like to have.
You don't have to implement it if you don't want to do so. But please
stop yelling at me for asking for it.

Andy
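
One way to read "dynamic thresholds" and "signal energy instead of peak
sample" is a gate that compares each frame's mean energy against a
tracked noise floor rather than a fixed level. The sketch below is an
assumption about what that could look like, not anything from speex or
from either poster: the class name, the `ratio` and `rise` parameters,
and their default values are all hypothetical. It works on float samples
and ignores clipping and quantization, both of which matter for real
16-bit audio.

```python
class AdaptiveSquelch:
    """Energy gate with a threshold relative to a tracked noise floor."""

    def __init__(self, ratio=4.0, rise=1.05):
        self.ratio = ratio  # hypothetical: gate opens at ratio x noise floor
        self.rise = rise    # hypothetical: slow upward drift of the floor
        self.floor = None   # noise-floor energy estimate, set on first frame

    def is_open(self, frame):
        """Return True if this frame should be passed on (not squelched)."""
        energy = sum(s * s for s in frame) / len(frame)

        if self.floor is None or energy < self.floor:
            # Follow quieter frames down immediately.
            self.floor = max(energy, 1e-9)  # avoid a floor stuck at zero
        else:
            # Otherwise let the floor creep up so it can track rising noise.
            self.floor *= self.rise

        return energy > self.ratio * self.floor
```

Because the threshold is a multiple of the estimated floor, multiplying
every sample by a constant gain scales both sides of the comparison and
leaves the decisions unchanged, as long as nothing clips. This does not
answer the SNR question raised earlier in the thread: at low SNR, speech
energy may never exceed `ratio` times the floor, and the gate needs
state, so it is no longer the stateless short-circuit described in the
first message.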