thr3ads.net - ogg dev - [ogg-dev] Seeking to granules in discontinuous streams [Feb 2008]

If this information is useful, please help other people find it:
Share via:

ogg.k.ogg.k@googlemail.com

2008-Feb-07 03:20 UTC

[ogg-dev] Seeking to granules in discontinuous streams

> No particular answers, but I can at least point out that the way
> things were designed for CMML was to work with the existing Ogg
> seeking algorithm. The idea is that a generic seeking routine can work
> on any Ogg file, as long as it knows the granulepos->time mapping for
> the logical bitstreams in the file. That's why all the
> timestamp-related info is crammed into granulepos rather than being
> placed in packet data, where it is no longer visible in the
> container..
I think I'm missing something here, but I'm not quite sure what and
I'd
really like to understand it.

At a basic level, CMML granule-to-time function is (granule>>32)*scale.
This loses the low bits, which encode the high bits of the previous packet's
granulepos. Please stop me there if I'm wrong.

If one wants to seek into a CMML stream, one starts with a time. One does
not have a granulepos, but has to construct one from the time, so one just
generates a "good guess" granulepos: (time/scale<<32). The low
bits may
as well be left to zero, since we have no knowledge what they might be.

And this is the crux of the problem, as far as I understand how it works:
you'll have to do a first bisection to find the page that maps to that
granulepos,
then, and only then, you get to know the low bits of the actual granulepos,
and then you have to do another bisection to find it.

This is *not* handled by Ogg, since the granulepos encoding is wholly out
of its concern, so there's no way it can optimize out hte second biseciton.
If you want to be clever and you know clips are short and can't overlap,
you can get away with manual backward streaming to avoid the bisection,
but that's still a second seek.

Compare this with storing the granulepos of the earliest still active packet,
which is what I'm doing currently. This means a granulepos doesn't
contain
info on the previous packet's granulepos, but since you have to do two seeks
anyway, you don't mind. You'll just have to start decoding the first
packet to
find out the granulepos of the earlier packet, rather than just shift the found
granulepos by 32. This means you have to know the packet format, so it does
depend on the lib/codec. *BUT* this is also the case for CMML, where the
seeking client must know about the fact that CMML encodes a granule like
it does, with (x<<32)|y. Ogg can't do anything from this knowledge
(*), it still
has to be handled via the client code. The only difference I see between the
two methods is that mine requires a lookup inside the packet, which you'll
have anyway since Ogg had to seek to it beforehand.

(*) or can it ? Is this the bit I'm missing ? I'm 99% sure it can't,
but I'd be happy
to be proved wrong as it would resolve the conendrum.
> Assumptions about seeking in Ogg:
> annodex.net/software/liboggz/html/group__seek__semantics.html
>
> which quotes from a mail from Monty in 2002 explaining the seek algorithm.
Ah, quite interesting. That last solution Monty mentions would "fix"
the double
seek problems.
> Ralph's recent talk at LCA, "Seeking is hard: Ogg design
internals":
> linux.conf.au/programme/friday
I'll have a look, seems very topical.
> Hope that clarifies things a little bit. The answer may simply be that
> what you want to do isn't possible within the existing Ogg container +
> seek algorithm.
I don't think so, though it'd require a bit of help from the client
code, who'd
know the semantics of the codec. Unless this is also a consequence of the
bit I'm missing and not managing to get my head around.

Ralph Giles

2008-Feb-08 18:14 UTC

head link

[ogg-dev] Seeking to granules in discontinuous streams

On 2/7/08, ogg.k.ogg.k@googlemail.com <ogg.k.ogg.k@googlemail.com> wrote:
> And this is the crux of the problem, as far as I understand how it works:
> you'll have to do a first bisection to find the page that maps to that
> granulepos,
> then, and only then, you get to know the low bits of the actual granulepos,
> and then you have to do another bisection to find it.
The original seeking algorithm was that: granulepos maps to a
non-decreasing timestamp (in a way defined by the codec) and pages are
strictly ordered by that timestamp. So you seek by bisection searching
for a time. It doesn't matter that you can't compute a granulepos from
a time; all you need is the other direction. What you read from the
Ogg stream is a granulepos from a page header out of some particular
substream, which you *can* convert to a time, and that tells you which
way to jump for your next bisection.

Then since the pages are in order, you just start decoding from a bit
before your seek target *in time* and off you go.

What makes designing Ogg embeddings complicated in working out all the
implications of this for a new style of codec.
> This is *not* handled by Ogg, since the granulepos encoding is wholly out
> of its concern, so there's no way it can optimize out hte second
biseciton.
> If you want to be clever and you know clips are short and can't
overlap,
> you can get away with manual backward streaming to avoid the bisection,
> but that's still a second seek.
What I was trying to address in my talk, and don't know if it came out
clearly, was that, yes there's no way to avoid the second bisection.
That's bad, but not as bad as you might think because *you already
have to do other bisections anyway*.

Video codecs have keyframes, so you usually have to start way before
your seek target. And because of the problem of continued packets,
it's not enough to find a single granulepos which maps to a time
nearest your seek target, you actually have to treat all the streams
independently. Monty wasn't aware of this when he described the seek
algorithm in 2002 (which works fine if there are no continued packets
or only one logical stream).

Now, this is actually an argument for not having 30 different subtitle
streams all muxed together. However, I think the best way to solve
this is to have a profile require some kind of lower bound on the
frequency of subtitle pages. And perhaps also their coincidence with
video frames, &c. so demuxers have some idea how hard to try to find
such streams.
> Compare this with storing the granulepos of the earliest still active
packet,
> which is what I'm doing currently. This means a granulepos doesn't
contain
> info on the previous packet's granulepos, but since you have to do two
seeks
> anyway, you don't mind.
The advantage of storing this in the granulepos field itself, like
theora and CMML do, is that the seek code may already understand how
to handle the back pointer. Right now everything assumes the mapping
is from 'initialized decoder' + 'granulepos from page header'
=>
timestamp, or in the case of theora and CMML => 'timestamp' +
'last
keyframe timestamp.' Having to look in the accompanying page data is
an additional complication.

BTW, I don't think the 'discontinuous codec thing in ogg-multiplex.htm
is a very useful concept. Ogg is conceptually time-continuous media,
and one may trivially make any discontinuous codec continuous by
declaring that no packets mean nothing is to be played during the
interval in question. The real issue is the relative frequency of
pages, and the burden a great asymmetry there places on muxers and
decoders to find all the relevant data. Put another way, start-time
labelling for substreams with low page frequency helps buffering after
they're muxed, but it only goes so far if there's not also a lower
bound on that frequency.

Make any sense?

 -r

ogg.k.ogg.k@googlemail.com

2008-Feb-11 02:46 UTC

head link

[ogg-dev] Seeking to granules in discontinuous streams

> The advantage of storing this in the granulepos field itself, like
> theora and CMML do, is that the seek code may already understand how
> to handle the back pointer. Right now everything assumes the mapping
> is from 'initialized decoder' + 'granulepos from page
header' =>
> timestamp, or in the case of theora and CMML => 'timestamp' +
'last
> keyframe timestamp.' Having to look in the accompanying page data is
> an additional complication.
Since the format of that back pointer is specific to each codec, the seek
code must already know how that codec does its backlink, for each codec
it supports. This is not generic for all codecs. Therefore, picking an extra
64 bits granulepos at the start of the data is a fairly trivial complication, as
the Ogg layer will have the data handy since it just seeked to it. That extra
complication comes with the plus of not chopping off half the granule bit
space for the backlink, leaving only a 4 billionth of the granule space
available (if one assumes no reserved bits).
Or is there something that makes it that much easier to deal with a backlink
in the granulepos itself ?
For reference, my data packets start with start granule, end granule, backlink
granule, all 64 bits, so picking the granule is just a simple constant offset
lookup into the data packet, eg, you replace:

backlink = granulepos<<32;

by

backlink = read64le(op.packet+constant_offset);

Hardly complex.
> decoders to find all the relevant data. Put another way, start-time
> labelling for substreams with low page frequency helps buffering after
> they're muxed, but it only goes so far if there's not also a lower
> bound on that frequency.
If I understand what you want to get at, the frequency would be tied to the
maximum delay between two packets ?
FWIW, we only need to find a packet from the disc stream on the first
bisection (to find the backlink). For the second bisection, it is enough to
seek to a time before that backlink, as streaming in will eventually encounter
the backlinked packet, and that's enough for our purposes. Therefore, the
second bisection can be done timewise on the main continuous codec (eg,
the accompanying theora video).
If there is no accompanying theora video (or similar), then I don't think
the
problem happens, since you'll be swamped by kate packets anyway, since
there'll be nothing else there, so you don't waste time looking for
them.
Still means you get the hit for the first bisection.

Following on your point about a lower bound on the frequency, maybe one
could add "keepalive" packets when no data packets have been issued
for
a long enough time, and such packets would carry no data but the backlink
itself. Hmm, I like the idea. The packets would be very small (well, datawise,
most of the data will be Ogg header stuff), and we wouldn't get that bad
data
duplication that was inherent in the "echo" method described in the
Writ doc.

Mind you, that kind of frequency info is something that might be usefully put
in a new skeleton field, if it could help the seek code seek more efficiently.
Not sure if and how it could though.

Apparently Analagous Threads

Search for more reasonably related threads

ogg dev - Feb 2008 - Seeking to granules in discontinuous streams

[ogg-dev] Seeking to granules in discontinuous streams

[ogg-dev] Seeking to granules in discontinuous streams

[ogg-dev] Seeking to granules in discontinuous streams

Apparently Analagous Threads