Andy Ross <andy@plausible.org> wrote:
> Tom Grandgent wrote:
> > Andy Ross wrote:
> > > I wrote a trivial squelch feature* in 10 minutes that works
> > > basically 100% of the time.
> >
> > Could you please explain how this differs from VAD?
>
> Not knowing how VAD works, I can't say for sure. But enabling
> VAD wasn't catching the existing transients (see original post),
> and this does. So at the very least it differs in threshold.

Ok, well as far as I can tell, VAD is just a system that detects
whether or not some audio contains voice (or something else that
you want to transmit). There are different ways to do it, but
those are implementation details. I also didn't get good results
with Speex VAD for my VoIP app, so I ended up using a similar
trivial approach - a threshold based on the power level. It
works very well but is still far from perfect. I fully intend to
keep re-evaluating Speex VAD as it develops.
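
For illustration, a power-threshold gate of this general kind
might look like the sketch below. It assumes mono 16-bit PCM
frames. The function names and the threshold parameter are
hypothetical, and this is not the actual code from either app
or anything from Speex itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Mean power (average of squared samples) of one frame of 16-bit PCM.
 * Hypothetical helper, not part of the Speex API. */
static double frame_power(const int16_t *samples, size_t count)
{
    double sum = 0.0;
    size_t i;

    if (count == 0)
        return 0.0;

    for (i = 0; i < count; i++) {
        double s = (double)samples[i];
        sum += s * s;
    }
    return sum / (double)count;
}

/* Returns 1 if the frame should be transmitted, 0 if it should be
 * squelched. The threshold is an assumption: it has to be tuned for
 * the microphone, gain and noise floor in use. */
static int frame_is_active(const int16_t *samples, size_t count,
                           double threshold)
{
    return frame_power(samples, count) >= threshold;
}
```

A caller would run frame_is_active() on each captured frame before
encoding and skip the frames that return 0. A bare threshold like
this tends to clip the start and end of words, which is one reason
such a gate can work well and still be far from perfect.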

So I think what you want is more related to VAD than denoise.
But really, there tend to be relationships between all of the
algorithms that work on the input audio stream - denoise, VAD,
AGC, AEC, etc. - so it's easy to confuse their responsibilities.

I think Jean-Marc is helping out tremendously by providing
implementations of these features, but making them work well in
the general case and handling the huge variety of conditions in
real-world audio is a tough job which should not be underestimated.

Tom