thr3ads.net - Speex dev - [Speex-dev] How does the jitter buffer "catch up"? [Sep 2005]

If this information is useful, please help other people find it:
Share via:

Jean-Marc Valin

2005-Sep-18 18:42 UTC

[Speex-dev] How does the jitter buffer "catch up"?

> Err, unless I'm totally wrong, there are a few race conditions.
> 
> Assume the buffer is full of packets newer than the current pointer, and 
> one that is at the current pointer.
> 
> get and put start at the same time.
> 
> get will find the correct buffer index. Now, just after it finds it's 
> index, assume we switch to the put thread.
> 
> Put needs to put a new packet in, so discards the oldest, which is the 
> current packet, and replaces it with the new one (let's call it
newest).[...]

True. It was originally race-safe when I was discarding the incoming
packet in case of buffer full. I changed that behaviour exactly because
of burst issues and the like. Other than that, I don't think there
should be any other possible race (assuming CPU cache coherence).
> > Why would you do that. The idea of interpolating a frame is exactly to
> > get better quality than just putting zeros.
> 
> Actually, I oversimplified a bit. I check if valid_bits has been zero for 
> the last 4 frames or more, because once you interpolate more than 100ms 
> from the last known state, you end up with some weird 
> blipp-blopp-blooiiing sound. Actually it reminds me of the ambient sound 
> of weird aliens in bad 50s scifi movies. At that point, silence is much 
> better :)
Then it would be a problem with the packet loss concealment. It's
actually decreasing the level of the interpolated audio with time. Maybe
it's not decreasing quickly enough?

	Jean-Marc

-- 
Jean-Marc Valin <Jean-Marc.Valin@USherbrooke.ca>
Universit? de Sherbrooke

Thorvald Natvig

2005-Sep-18 19:03 UTC

head link

[Speex-dev] How does the jitter buffer "catch up"?

>> Err, unless I'm totally wrong, there are a few race conditions.
>>
>> Assume the buffer is full of packets newer than the current pointer,
and
>> one that is at the current pointer.
>>
>> get and put start at the same time.
>>
>> get will find the correct buffer index. Now, just after it finds
it's
>> index, assume we switch to the put thread.
>>
>> Put needs to put a new packet in, so discards the oldest, which is the
>> current packet, and replaces it with the new one (let's call it
newest).
> [...]
>
> True. It was originally race-safe when I was discarding the incoming
> packet in case of buffer full. I changed that behaviour exactly because
> of burst issues and the like. Other than that, I don't think there
> should be any other possible race (assuming CPU cache coherence).
Weeeeelll.. Actually, now that you mention it, the histogram shifting is a 
race as well.

In _put, the lines

          jitter->shortterm_margin[i] *= .98;
          jitter->longterm_margin[i] *= .995;
          jitter->shortterm_margin[int_margin] += .02;
          jitter->longterm_margin[int_margin] += .005;

are read-accumulate-write operations.

Let's assume the +40 bin has the shorterm_margin 1.0, and _put just read 
it to do a multiply.

At this point, _get shifts the histogram, so the 1.0 value is copied to 
the +20ms bin.

_put gets the CPU again, and writes 0.98 (the 1.0 it fetched, multiplied 
by 0.98) back to the +40 bin.

The sum of margins is now 1.98, which is Not Good (tm); it'll first shift 
it once more (ontime = 0.9x, +20ms = 0.9x), then shift again, (late =0.9x, 
ontime = 0.9x).. Eeek, packets are late, interpolate.. then they're early 
and so on. This will continue until the margins return to sane values.

The chance of this happening is much less than the other one, but it COULD 
happen ;)
>>> Why would you do that. The idea of interpolating a frame is exactly
to
>>> get better quality than just putting zeros.
>>
>> Actually, I oversimplified a bit. I check if valid_bits has been zero
for
>> the last 4 frames or more, because once you interpolate more than 100ms
>> from the last known state, you end up with some weird
>> blipp-blopp-blooiiing sound. Actually it reminds me of the ambient
sound
>> of weird aliens in bad 50s scifi movies. At that point, silence is much
>> better :)
>
> Then it would be a problem with the packet loss concealment. It's
> actually decreasing the level of the interpolated audio with time. Maybe
> it's not decreasing quickly enough?
I'd say so, my testers kept asking me what planet my room-mate is from :) 
Note that this only happens when you have quite a bit of interpolation, 
meaning there is serious network trouble anyway; don't sacrifice the 
quality of one or two-frame interpolation (which happens quite 
frequently) for these extreme cases.

PS: Regarding the earlier stuff about DTX using the VAD from the 
preprocessor, I ended up switching back to silence during 
non-transmission. The denoiser is good enough that people expect silence 
when others aren't talking, the comfort noise was unwelcome.

Jean-Marc Valin

2005-Sep-18 19:24 UTC

head link

[Speex-dev] How does the jitter buffer "catch up"?

> Weeeeelll.. Actually, now that you mention it, the histogram shifting is a 
> race as well.
Indeed. So I guess it was a good thing that I never felt confident
enough to advertise the jitter buffer as "safe to use without
mutexes" :-)
> The chance of this happening is much less than the other one, but it COULD 
> happen ;)
Exactly, which is why mutexes should be used.
> > Then it would be a problem with the packet loss concealment. It's
> > actually decreasing the level of the interpolated audio with time.
Maybe
> > it's not decreasing quickly enough?
> 
> I'd say so, my testers kept asking me what planet my room-mate is from
:)
> Note that this only happens when you have quite a bit of interpolation, 
> meaning there is serious network trouble anyway; don't sacrifice the 
> quality of one or two-frame interpolation (which happens quite 
> frequently) for these extreme cases.
Actually, I can control the rate at which the level decreases, so I can
make it go down faster after several lost frames without affecting the
"one frame" case. Look at the "attenuation" array near line
1080 of
nb_celp.c. It controls the level of the interpolated frame as a function
of the number of consecutive lost frames. There's something similar for
the high-band too. If you tweak it, please send me the optimal coefs you
found. I guess another reason for the buzz is the way I reconstruct the
excitation. I should use a noise generator a bit more instead of just
continuing the pitch.
> PS: Regarding the earlier stuff about DTX using the VAD from the 
> preprocessor, I ended up switching back to silence during 
> non-transmission. The denoiser is good enough that people expect silence 
> when others aren't talking, the comfort noise was unwelcome.
The DTX code is getting a bit old. Perhaps it could be improved. But if
silence is fine with your users, you can use that too.

	Jean-Marc

-- 
Jean-Marc Valin <Jean-Marc.Valin@USherbrooke.ca>
Universit? de Sherbrooke

Jean-Marc Valin

2005-Oct-01 20:44 UTC

head link

[Speex-dev] How does the jitter buffer "catch up"?

> I'd say so, my testers kept asking me what planet my room-mate is from
:)
> Note that this only happens when you have quite a bit of interpolation, 
> meaning there is serious network trouble anyway; don't sacrifice the 
> quality of one or two-frame interpolation (which happens quite 
> frequently) for these extreme cases.
I just checked in several improvements to the packet loss concealment
code, both for narrowband and wideband. Please test that and let me know
if you still have problems.

	Jean-Marc

Maybe Matching Threads

Search for more maybe matching threads

Speex dev - Sep 2005 - How does the jitter buffer "catch up"?

[Speex-dev] How does the jitter buffer "catch up"?

[Speex-dev] How does the jitter buffer "catch up"?

[Speex-dev] How does the jitter buffer "catch up"?

[Speex-dev] How does the jitter buffer "catch up"?

Maybe Matching Threads