thr3ads.net - Speex dev - [Speex-dev] Audio Speed Variability [Oct 2007]

James Stanton

2007-Oct-04 12:53 UTC

[Speex-dev] Audio Speed Variability

I have a video-conference-like application that I've been working on for 
a while now, and a recent change is causing some odd problems, and I was 
wondering if anyone else had seen problems like this.  The issue I'm 
seeing is that when using the sound card for capture, the audio will 
eventually get about 1-2 seconds out of synch (delayed), from the 
video.  However, if I use USB devices for capture and playback, the 
delay disappears.  To make life more complex, using the sound card only 
causes the delay on some systems, but not all systems.  On my two main 
development boxes I see no problems with either USB or mini-jack on the 
soundboard, but on one of our other test machines only the USB works.  
It feels to me like it might be an audio clocking issue, but it could 
also be a speed of processing issue.  Has anyone seen this at all, and 
if so did anything help it?  Is there anything I can change in the speex 
codec to speed things up, in case it's a speed of processing issue?  I 
am seeing a fair number of problem records coming out of the speex 
jitter buffer, but they might just mean that data isn't being fed in 
fast enough into the buffer.  I'm checking:

speex_jitter_get(&speexJitter, (short *)newData, NULL);
if (speexJitter.valid_bits == 0) //bad record
{
    fprintf(stderr, "Interpolating since nothing happened!!!!\n");
    fflush(stderr);
}

Does this say I'm processing too slowly into the buffer, or that the 
data put in is somehow corrupt?
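
One way to tell a clocking problem from a processing-speed problem is to measure the rate the capture device actually delivers and compare it with the rate that was requested. The following is a minimal sketch, assuming the application can count the samples it has captured and read a wall-clock timer. Both helper names are hypothetical and are not part of Speex or DirectSound.

```c
#include <assert.h>

/* Hypothetical helper: effective sample rate actually delivered by the
 * device, given how many samples arrived over a measured wall-clock span. */
double effective_rate_hz(unsigned long long samples_captured,
                         double elapsed_seconds)
{
    assert(elapsed_seconds > 0.0);
    return (double)samples_captured / elapsed_seconds;
}

/* Hypothetical helper: deviation of the measured rate from the requested
 * rate, in parts per million. Positive means the device runs fast. */
double rate_error_ppm(double requested_hz, double measured_hz)
{
    assert(requested_hz > 0.0);
    return (measured_hz - requested_hz) / requested_hz * 1e6;
}
```

If the measured rate stays close to 16000 Hz over a minute or more on the machines that work and is noticeably off on the machine that fails, that would point toward the device clock or driver rather than the codec.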
The application is Windows based, and I'm using DirectSound for capture 
and playback, specifically DirectSoundFullDuplexCreate8() with 
Notification Positions.  The RTP Library I'm using for transfer is 
JRTPLib 3.7.1, and I'm using the associated JThread 1.2.1 for threading.  I'm 
currently using Speex 1.1.12, in wideband mode with the speex jitter 
buffer, Quality is set to 8, and perceptual enhancement on.  The 
preprocessor state is being set as follows:

preprocessorState = speex_preprocess_state_init(320, 16000);  //640, 32000
int denoise = 0;
int agc = 0;
int vad = 0;
int dreverb = 0;
float agcLevel = 8000;
float dereverb_decay = .5f;
float dereverb_level = .2f;
speex_preprocess_ctl(preprocessorState, SPEEX_PREPROCESS_SET_DENOISE, &denoise);
speex_preprocess_ctl(preprocessorState, SPEEX_PREPROCESS_SET_AGC, &agc);
speex_preprocess_ctl(preprocessorState, SPEEX_PREPROCESS_SET_VAD, &vad);
speex_preprocess_ctl(preprocessorState, SPEEX_PREPROCESS_SET_DEREVERB, &dreverb);
speex_preprocess_ctl(preprocessorState, SPEEX_PREPROCESS_SET_DEREVERB_DECAY, &dereverb_decay);
speex_preprocess_ctl(preprocessorState, SPEEX_PREPROCESS_SET_DEREVERB_LEVEL, &dereverb_level);
speex_preprocess_ctl(preprocessorState, SPEEX_PREPROCESS_SET_AGC_LEVEL, &agcLevel);

Thank you for any input you might have!

Jamie Stanton

John Miles

2007-Oct-04 13:33 UTC

[Speex-dev] Audio Speed Variability

> -----Original Message-----
> From: speex-dev-bounces@xiph.org [mailto:speex-dev-bounces@xiph.org]On
> Behalf Of James Stanton
> Sent: Thursday, October 04, 2007 12:53 PM
> To: speex-dev@xiph.org
> Subject: [Speex-dev] Audio Speed Variability
>
>
> I have a video conference like application that I've been working on for
> a while now, and a recent change is causing some odd problems, and I was
> wondering if anyone else had seen problems like this....

Short answer: don't use output sample rates other than 44100 or 48000.

Longer answer: Sound chips usually run at one of those rates, often either.
Those rates are more or less guaranteed to work properly.  Most chips don't
support other rates directly; a software resampler in the driver is used
instead.  Unfortunately, Microsoft released a horribly-broken reference
resampler implementation to sound hardware OEMs a few years ago, and many of
them still use it.  On their sound cards, if you ask for 11025 Hz, for
example, you're likely to get 11100 Hz or something similarly imprecise.
That obviously causes cumulative latency/slippage problems.
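
To make the cumulative slippage concrete, the arithmetic can be sketched as below. This is an illustrative calculation only: the function names are hypothetical, and the 11025/11100 Hz pair is the example figure from the paragraph above, not a measurement.

```c
#include <assert.h>

/* Seconds of audio gained (positive) or lost (negative) after elapsed_s
 * seconds of wall-clock time, when the device really runs at actual_hz
 * but the application assumes nominal_hz. */
double drift_seconds(double nominal_hz, double actual_hz, double elapsed_s)
{
    assert(nominal_hz > 0.0);
    return elapsed_s * (actual_hz - nominal_hz) / nominal_hz;
}

/* Wall-clock seconds until the accumulated drift reaches target_drift_s.
 * The sign of target_drift_s should match the sign of the rate error. */
double seconds_until_drift(double nominal_hz, double actual_hz,
                           double target_drift_s)
{
    assert(nominal_hz > 0.0 && actual_hz != nominal_hz);
    return target_drift_s * nominal_hz / (actual_hz - nominal_hz);
}
```

With 11025 Hz requested and 11100 Hz delivered, the rate error is about 0.68%, so an offset of 1.5 seconds would build up after roughly 220 seconds of streaming.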

Bottom line: voice codec applications that need to work at lower rates
really need to resample to 44.1K or 48K themselves in order to work robustly
across all hardware platforms.  Neither MS nor sound-hardware OEMs have
shown the slightest interest in fixing this bug, so that's just the way it
goes.

-- john

James Stanton

2007-Oct-04 15:16 UTC

[Speex-dev] Audio Speed Variability

John,

Thanks for the reply!  You mentioned that output sample rates should be 
44100 or 48000; should I worry about input (mic) sample rates as well?  
(Currently I am requesting a sample rate of 16000 samples per second on 
both ends, for ease of passing into the codec.)  Also, do you 
recommend any particular resampler that I should use, or are any of the 
ones out there probably okay, or should I just write my own?  Thanks 
again for your help!

Jamie


Alexander Chemeris

2007-Oct-05 02:40 UTC

[Speex-dev] Audio Speed Variability

Hi,

On 10/5/07, John Miles <jmiles@pop.net> wrote:
> Longer answer: Sound chips usually run at one of those rates, often either.
> Those rates are more or less guaranteed to work properly.  Most chips don't
> support other rates directly; a software resampler in the driver is used
> instead.  Unfortunately, Microsoft released a horribly-broken reference
> resampler implementation to sound hardware OEMs a few years ago, and many of
> them still use it.  On their sound cards, if you ask for 11025 Hz, for
> example you're likely to get 11100 Hz or something similarly-imprecise.
> That obviously causes cumulative latency/slippage problems.

Are there any statistics on which chips operate at which frequency?
I'm wondering whether it would be safe to use 48 kHz rather than 44.1 kHz.
Since 48 kHz is an integer multiple of 16 kHz, converting to and from it
should be easier, faster and less lossy.
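
The integer-ratio point can be illustrated with a deliberately naive sketch. Because 48000 / 16000 = 3, conversion only needs a fixed factor of three in each direction, whereas 44100 / 16000 = 441/160 needs a fractional resampler. The function names below are hypothetical, and the linear interpolation and three-sample averaging stand in for the proper low-pass filter a production-quality resampler would use.

```c
#include <stddef.h>

/* Upsample by 3 (e.g. 16 kHz -> 48 kHz) using linear interpolation.
 * `out` must have room for 3 * n samples. Returns samples written. */
size_t upsample_by_3(const short *in, size_t n, short *out)
{
    for (size_t i = 0; i < n; i++) {
        int a = in[i];
        /* At the end of the block, hold the last sample. */
        int b = (i + 1 < n) ? in[i + 1] : in[i];
        out[3 * i]     = (short)a;
        out[3 * i + 1] = (short)(a + (b - a) / 3);
        out[3 * i + 2] = (short)(a + 2 * (b - a) / 3);
    }
    return 3 * n;
}

/* Downsample by 3 (e.g. 48 kHz -> 16 kHz) by averaging each group of
 * three samples, which acts as a crude low-pass filter.
 * `out` must have room for n / 3 samples. Returns samples written. */
size_t downsample_by_3(const short *in, size_t n, short *out)
{
    size_t m = n / 3;
    for (size_t j = 0; j < m; j++) {
        int sum = in[3 * j] + in[3 * j + 1] + in[3 * j + 2];
        out[j] = (short)(sum / 3);
    }
    return m;
}
```

In this arrangement the device would be opened at 48000 Hz, captured audio passed through the downsampler before encoding, and decoded audio passed through the upsampler before playback, so the driver's own rate conversion is never used.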

-- 
Regards,
Alexander Chemeris.

SIPez LLC.
SIP VoIP, IM and Presence Consulting
http://www.SIPez.com
tel: +1 (617) 273-4000
