Steve,

The main problem I am having with the system is clipping off the start of someone's speech when they first start talking; the ends of the sentences seem to be handled properly. I am wondering whether this is the fault of the audio playback system or whether this is a Speex issue.

I also get the musical artifacts problem with the denoiser. This seems to be more of a problem on open air mics/speaker combinations. However, this and the issues you pointed out are a minor deal in comparison with the clipping problem.

Tom

At 03:40 PM 5/17/2004, Steve Kann wrote:
>Tom Harper wrote:
>
>>Hi All & Jean Marc,
>>
>>Once again I find myself delving into the pre-processing code to fiddle
>>with the VAD, AGC and denoising code.
>>
>>Where I am at is that I have implemented all of Steve Kann's mods, and
>>they are 90% of the way there in terms of working, except that I am still
>>having issues denoising open air mics. But that is tangential to my
>>question-
>>
>>I was wondering what the following function is supposed to be used for:
>>speex_preprocess_estimate_update(....)
>>
>>I couldn't find anywhere in the code that references it- it appears to update
>>the noise estimate without denoising anything- ? If so, does this need to
>>be called for AGC to work, or?
>
>Beats me :)
>
>I've found the denoiser to work very well on open-air microphones.
>
>One thing I've found in using the preprocessor that I need to find a
>solution for is that sometimes the denoiser is incorrectly detecting
>speech as noise. This happens when a speaker speaks for a while, and the
>denoiser tries to remove their intonation (i.e. the sound of their
>voicebox; the vowels). The result is that towards the ends of sentences,
>their voice gets "thinned" out.
>
>I'd like to find out how to "slow down" the rate of adaptation within the
>denoiser to make this less likely to happen. I assume that the tradeoff
>is that it will adapt more slowly to changing noise patterns (or initial
>noise). I'm just not sure where to do that; maybe if I had the "Cohen
>Paper" it would help :)
>
>-SteveK
>
>>
>>Thanks,
>>Tom

--
Tom Harper - tharper@sightspeed.com
Lead Software Engineer
SightSpeed - A Roda Group Affiliated Company
918 Parker St, Suite A14
Berkeley, CA 94710
Phone: 510.665.2920
Cell: 415.378.3779
http://www.sightspeed.com

--- >8 ----
List archives: http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to 'speex-dev-request@xiph.org'
containing only the word 'unsubscribe' in the body. No subject is needed.
Unsubscribe messages sent to the list will be ignored/filtered.

Tom Harper wrote:

> Steve,
>
> The main problem I am having with the system is clipping off the start
> of someone's speech when they first start talking- the ends of the
> sentences seem to be handled properly. I am wondering whether this is
> the fault of the audio playback system or whether this is a speex issue-

I don't seem to get that. I suppose you've already changed the thresholds to make things more sensitive, though.

This is what I have now in iaxclient. The first number is the "initial" probability needed to go from not speech -> speech. The second is the probability below which it goes from speech -> not speech. [The difference between the two implements hysteresis.] The commented-out line is the previous version of the test:

    /* if (st->speech_prob > .35 || (st->last_speech < 20 && st->speech_prob > .1)) */
    if (st->speech_prob > .30 || (st->last_speech < 20 && st->speech_prob > .07))

> I also get the musical artifacts problem with the denoiser. This
> seems to be more of a problem on open air mics/speaker combinations.
> However, this and the issues you pointed out are a minor deal in
> comparison with the clipping problem.

Yes, I get some of that too, but it isn't really bad. It does obviously get worse as the signal-to-noise ratio gets lower.

The denoiser isn't perfect, but the consensus amongst people using it seems to be that it's good enough to be enabled by default. So far, the only problem it has caused for people without noisy environments is the occasional thinning. For people with noisy environments, even with some artifacts, the result is much better than the input.

-SteveK

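The threshold test above can be pulled out into a small standalone sketch to show how the two numbers interact. This is only an illustration: `vad_state`, `vad_is_speech` and `vad_update` are hypothetical names, not part of Speex or iaxclient, and the way `last_speech` is reset and incremented is an assumption about how such a counter would typically be maintained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two fields used in the test quoted above;
   this is NOT the real SpeexPreprocessState layout. */
typedef struct {
    float speech_prob;  /* per-frame speech probability, 0..1 */
    int   last_speech;  /* frames since the last frame judged to be speech */
} vad_state;

/* Values taken from the message: .30 to enter speech, .07 to stay in it,
   and a 20-frame window during which the lower threshold applies. */
#define VAD_START_PROB    0.30f
#define VAD_CONTINUE_PROB 0.07f
#define VAD_HANGOVER      20
#define VAD_COUNTER_MAX   1000  /* assumed cap so the counter cannot overflow */

/* Same condition as the quoted if-statement. */
static bool vad_is_speech(const vad_state *st)
{
    assert(st != NULL);
    return st->speech_prob > VAD_START_PROB ||
           (st->last_speech < VAD_HANGOVER &&
            st->speech_prob > VAD_CONTINUE_PROB);
}

/* Feed one frame's probability and return the decision.
   Assumption: the counter resets on speech and counts up otherwise. */
static bool vad_update(vad_state *st, float speech_prob)
{
    assert(st != NULL);
    st->speech_prob = speech_prob;
    bool speech = vad_is_speech(st);
    if (speech)
        st->last_speech = 0;
    else if (st->last_speech < VAD_COUNTER_MAX)
        st->last_speech++;
    return speech;
}
```

With these numbers, a frame at probability 0.10 is rejected if the talker has been silent for a while, but accepted if speech was detected within the last 20 frames. Lowering the first threshold is what makes speech onsets easier to catch; lowering the second mainly affects how long the tail of an utterance is kept.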
Tom Harper wrote:

> Hi All & Jean Marc,
>
> Once again I find myself delving into the pre-processing code to fiddle
> with the VAD, AGC and denoising code.
>
> Where I am at is that I have implemented all of Steve Kann's mods, and
> they are 90% of the way there in terms of working, except that I am still
> having issues denoising open air mics. But that is tangential to my
> question-
>
> I was wondering what the following function is supposed to be used for:
> speex_preprocess_estimate_update(....)
>
> I couldn't find anywhere in the code that references it- it appears to
> update the noise estimate without denoising anything- ? If so, does this
> need to be called for AGC to work, or?

Beats me :)

I've found the denoiser to work very well on open-air microphones.

One thing I've found in using the preprocessor that I need to find a solution for is that sometimes the denoiser is incorrectly detecting speech as noise. This happens when a speaker speaks for a while, and the denoiser tries to remove their intonation (i.e. the sound of their voicebox; the vowels). The result is that towards the ends of sentences, their voice gets "thinned" out.

I'd like to find out how to "slow down" the rate of adaptation within the denoiser to make this less likely to happen. I assume that the tradeoff is that it will adapt more slowly to changing noise patterns (or initial noise). I'm just not sure where to do that; maybe if I had the "Cohen Paper" it would help :)

-SteveK

> Thanks,
> Tom

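To make the adaptation-rate tradeoff mentioned above concrete, here is a generic first-order recursive average of a per-band noise estimate. It is only a sketch of the general idea and is not how the Speex denoiser is written; `noise_estimate_update` and its `alpha` parameter are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of Speex.
   noise[i] : running noise power estimate for band i (updated in place)
   power[i] : power observed in band i for the current frame
   alpha    : smoothing factor in [0, 1]; closer to 1 means slower adaptation */
static void noise_estimate_update(float *noise, const float *power,
                                  size_t n, float alpha)
{
    assert(noise != NULL && power != NULL);
    assert(alpha >= 0.0f && alpha <= 1.0f);
    for (size_t i = 0; i < n; i++)
        noise[i] = alpha * noise[i] + (1.0f - alpha) * power[i];
}
```

In a scheme like this, a larger `alpha` means a sustained vowel takes more frames to leak into the noise estimate, which is the kind of "slowing down" asked about. The cost is the one already guessed at: the estimate also takes longer to follow a real change in background noise, or to settle at startup.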