thr3ads.net - Vorbis dev - [vorbis-dev] ogg stream-id options [Nov 2000]

Ralph Giles

2000-Nov-16 20:24 UTC

[vorbis-dev] ogg stream-id options

In http://advogato.net/person/rakholh/diary.html?start=165 Ali wrote:
>  For same reason I am arguing with people on vorbis-dev - but I don't
> understand what the argument is about (considering that the vorbis
> developers proposed a solution which mjs and I thought was reasonable,
> and then some developers decided to criticize us again for no reason).
Goodness, get dropped from the list and miss the return of the son of the
favorite flamewar. :^)

I don't know what you're arguing about either. But monty's talking about
adding a toc substream sooner rather than later, so here's how I see our
options:

Right now, the only things we produce are 'degenerate' ogg files. They
*only* contain a single vorbis audio stream. We've been telling people the
file extension is .ogg, to do magic detection on the initial OggS, and
that the mime-type is application/x-ogg.

We reevaluated the extension issue (the answer was no) and the mime-type
issue (we were swayed) and decided we'd recommend multiple mimetypes when
we have the video codec working, and add efficient discrimination to the
requirements for the toc/metadata substream we'd always planned on.

None of this is a pressing issue; the usability arguments are moot until
we actually have both audio and video data.

That's the story so far.

Now, what are our options for implementation?

I'd proposed we combine the toc header with the kitchen-sink metadata
people have requested, and that we use xml-encoded rdf based on the Dublin
Core element set to do it. I still think this is the best option. XML is
the most obvious way to encode text streams (what subtitles should be) so
we can share part of the code, and conceptually the substream type. It
also offers good interoperability with indexing/catalog systems and plenty
of flexibility for future requirements.

Note that this doesn't really allow mime magic detection of the 'sequence
x at offset n' type. What I meant earlier about substring searching is
that you first look for the initial OggS, then search for '<usage>' in
bytes 15-200 and case on whatever comes immediately after it.
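
A minimal sketch of that two-step check, in Python, might look like the
following. The function name, the default tag, and the rule of reading the
value up to the next '<' are assumptions made for illustration; nothing here
is an agreed format.

```python
from typing import Optional


def sniff_usage(head: bytes, tag: bytes = b"<usage>") -> Optional[str]:
    """Return the text that follows `tag` near the start of the file, or None.

    `head` is the first couple of hundred bytes of the file.
    """
    # Step 1: fixed-offset magic. An Ogg page begins with the bytes "OggS".
    if head[:4] != b"OggS":
        return None

    # Step 2: substring search restricted to bytes 15-200, since an
    # XML-based toc cannot promise a fixed offset for the tag.
    window = head[15:200]
    start = window.find(tag)
    if start == -1:
        return None

    # "Case on whatever comes immediately after it": here we assume the
    # value is plain text that runs until the next '<'.
    rest = window[start + len(tag):]
    end = rest.find(b"<")
    value = rest if end == -1 else rest[:end]
    return value.decode("ascii", errors="replace").strip()
```

A caller would then branch on the returned string (for example "audio" or
"video") to pick a handler, which is exactly the algorithmic step that a
plain magic table cannot express.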

But the time for that isn't now. Aside from not having the resources to
implement it, the standards for this sort of thing are very much still in
flux. Rob (of musicbrainz.org) and I couldn't even reach an agreement on
the encoding, and to support the general case is both unwieldy and
expensive, and likely to be obsolete next year. If we can wait 6-12
months, there should be much more of an external standard we can
incorporate. The librarians think this is a hard problem too.

Better, our solution will be a much closer fit if we give ourselves a
chance to evolve the format through usage while we're developing the video
codec.

If we had to do it NOW, I'd suggest something based on the vorbis
comment header, with (possibly hierarchic) text vectors in a set order.
The substream would consist of a head page and an empty tail page.

The first element would be something like "STREAMCLASS=audio". This would
allow mime magic filetype detection if we require that the toc always be
first. Others would follow like so:

general bitstream headers:
        STREAMCLASS=video
        <misc metadata a la vorbis/kitchensink?>
substream 8347929:
        STREAMTYPE=toc	(this example)
substream 2361643:
        STREAMTYPE=tarkin
        LAYER=0		(means this is a primary stream, not an overlay)
        USAGE=default	(not an alternate track)
substream 8293298:
        STREAMTYPE=vorbis
        SUBTYPE=surround	(could be a mapping number instead)
        LAYER=0
        USAGE=default
        LANG=en
        LABEL=English surround audio
substream 0923470:
        STREAMTYPE=vorbis
        SUBTYPE=stereo
        LAYER=0
        USAGE=alternate
        LANG=es
        LABEL=director's commentary
substream 7829372:
        STREAMTYPE=mng		(these would be pre-rendered subtitles)
        LAYER=1
        LANG=jp
        http://advogato.net/person/rakholh/diary.html?start=165
..and so on. The substream numbers refer to the logical substream ids, for
easy correlation. They could be encoded either as separate section
headers, or a part of the tags in a linear arrangement.
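
To make the 'linear arrangement' variant concrete, here is a rough Python
sketch of a reader for it. It assumes a hypothetical SUBSTREAM=<id> tag that
switches the current section; that tag name, the function name, and the
returned structure are illustrative only and not part of any proposal above.

```python
from typing import Dict, List, Tuple


def parse_toc(vectors: List[str]) -> Tuple[Dict[str, str], Dict[int, Dict[str, str]]]:
    """Split KEY=value text vectors into general headers and per-substream tags."""
    general: Dict[str, str] = {}
    substreams: Dict[int, Dict[str, str]] = {}
    current = general  # tags before the first SUBSTREAM belong to the whole bitstream

    for vec in vectors:
        key, sep, value = vec.partition("=")
        if not sep:
            continue  # not a KEY=value vector; skip it rather than fail
        key = key.upper()
        if key == "SUBSTREAM":
            # Hypothetical section marker: following tags describe this logical stream.
            current = substreams.setdefault(int(value), {})
        else:
            current[key] = value

    return general, substreams
```

Because the vectors come in a set order, a reader that only cares about the
stream class can stop after the first element and ignore the rest.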

That's about as general as I can make it right now, and I think something
(particularly a forwards-compatible vorbis-only implementation) could be
written in time for 1.0.

IMHO,
 -ralph

--
giles@ashlu.bc.ca

--- >8 ----
List archives:  http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to
'vorbis-dev-request@xiph.org'
containing only the word 'unsubscribe' in the body.  No subject is
needed.
Unsubscribe messages sent to the list will be ignored/filtered.

Monty

2000-Nov-16 20:59 UTC


[vorbis-dev] ogg stream-id options

> Right now, the only things we produce are 'degenerate' ogg files. They
> *only* contain a single vorbis audio stream. We've been telling people the
> file extension is .ogg, to do magic detection on the initial OggS, and
> that the mime-type is application/x-ogg.
A degenerate ogg file is any meta-header-less stream/link containing only
one logical bitstream.
> We reevaluated the extension issue (the answer was no) and the mime-type
> issue (we were swayed) and decided we'd recommend multiple mimetypes when
> we have the video codec working, and add efficient discrimination to the
> requirements for the toc/metadata substream we'd always planned on.
I'm swayed in that I agree with their functionality arguments
(meaning, I agree with the end goal).  I'm not convinced mime is the
sole way to do this.  Mime (to identify ogg) and magic (to identify
container contents) is still my proposal.  Of course, we'll eventually
agree to something.
> I'd proposed we combine the toc header with the kitchen-sink metadata
> people have requested, and that we use xml-encoded rdf based on the Dublin
> Core element set to do it.
No.  The metaheader is meant to be something *much* simpler.  No XML
there (and I say this because I don't want a full blown XML parser,
again, just to figure out what to do with a stream.  XML is a lot of
weight).  It's to be a single page with very basic arrangement
information.
> Note that this doesn't really allow mime magic detection of the 'sequence
> x at offset n' type. What I meant earlier about substring searching is
> that you first look for the initial OggS, then search for '<usage>' in
> bytes 15-200 and case on whatever comes immediately after it.
> incorporate. The librarians think this is a hard problem too.
Except I want something simpler than what you propose.  Perhaps
something more complex than what I'm thinking of now will become
necessary (actually I expect that to be the case).  This is meant to
be information for applications to use, not as much humans.

Monty


Ali Abdin

2000-Nov-17 10:30 UTC


[vorbis-dev] Re: ogg stream-id options

* Ralph Giles (giles@ashlu.bc.ca) wrote at 16:18 on 17/11/00:
> In http://advogato.net/person/rakholh/diary.html?start=165 Ali wrote:
> 
> >  For same reason I am arguing with people on vorbis-dev - but I don't
> > understand what the argument is about (considering that the vorbis
> > developers proposed a solution which mjs and I thought was reasonable,
> > and then some developers decided to criticize us again for no reason).
> 
> Goodness, get dropped from the list and miss the return of the son of the
> favorite flamewar. :^)
> 
> I don't know what you're arguing about either. But monty's talking about
> adding a toc substream sooner rather than later, so here's how I see our
> options:
Actually, I thought I was just defending myself from criticism :)
> Right now, the only things we produce are 'degenerate' ogg files. They
> *only* contain a single vorbis audio stream. We've been telling people the
> file extension is .ogg, to do magic detection on the initial OggS, and
> that the mime-type is application/x-ogg.
> 
> We reevaluated the extension issue (the answer was no) and the mime-type
> issue (we were swayed) and decided we'd recommend multiple mimetypes when
> we have the video codec working, and add efficient discrimination to the
> requirements for the toc/metadata substream we'd always planned on.
I have no objections to that.
> None of this is a pressing issue; the usability arguments are moot until
> we actually have both audio and video data.
If none of this is a "pressing issue" then why can't I just use audio/x-ogg and
then switch to application/x-ogg when you guys develop a video codec? ;)
(currently we use application/x-ogg as the mime-type so I have been complying
with your requests :)
> That's the story so far.
> 
> Now, what are our options for implementation?
> 
> I'd proposed we combine the toc header with the kitchen-sink metadata
> people have requested, and that we use xml-encoded rdf based on the Dublin
> Core element set to do it. I still think this is the best option. XML is
> the most obvious way to encode text streams (what subtitles should be) so
> we can share part of the code, and conceptually the substream type. It
> also offers good interoperability with indexing/catalog systems and plenty
> of flexibility for future requirements.
That's interesting, there is a project I am related to that is working on the
Dublin Core stuff (Dublin Core == Object Metadata Framework, right?)

XML could be too "heavyweight" to parse (the tags "waste" space :P). This has
consequences for file-size, streaming, embedded devices ;) i.e.,
compare the number of chars:
<title>Lala</title> (19 chars)
Title=Lala (10 chars)

It's not a /huge/ difference I know - but every little bit counts, doesn't it? ;)
> Note that this doesn't really allow mime magic detection of the 'sequence
> x at offset n' type. What I meant earlier about substring searching is
> that you first look for the initial OggS, then search for '<usage>' in
> bytes 15-200 and case on whatever comes immediately after it.
This provides no advantages over the current method, I believe. Basically we
would be still stuck with an algorithmic approach to determining the
file-type's contents.

It is better than the current solution (which doesn't exist) - but is not the
ideal solution (in my humble opinion).

Perhaps the '<usage>' should be the first tag within toc header or something?
(giving you what you want, and giving us what we want)
> But the time for that isn't now. Aside from not having the resources to
> implement it, the standards for this sort of thing are very much still in
> flux. Rob (of musicbrainz.org) and I couldn't even reach an agreement on
> the encoding, and to support the general case is both unwieldy and
> expensive, and likely to be obsolete next year. If we can wait 6-12
> months, there should be much more of an external standard we can
> incorporate. The librarians think this is a hard problem too.
> 
> Better, our solution will be a much closer fit if we give ourselves a
> chance to evolve the format through usage while we're developing the video
> codec.
> 
> If we had to do it NOW, I'd suggest something based on the vorbis
> comment header, with (possibly hierarchic) text vectors in a set order.
> The substream would consist of a head page and an empty tail page.
> 
> The first element would be something like "STREAMCLASS=audio". This would
> allow mime magic filetype detection if we require that the toc always be
> first. Others would follow like so:
> 
> general bitstream headers:
> 	STREAMCLASS=video
> 	<misc metadata a la vorbis/kitchensink?>
> substream 8347929:
> 	STREAMTYPE=toc	(this example)
> substream 2361643:
> 	STREAMTYPE=tarkin
> 	LAYER=0		(means this is a primary stream, not an overlay)
> 	USAGE=default	(not an alternate track)
> substream 8293298:
> 	STREAMTYPE=vorbis
> 	SUBTYPE=surround	(could be a mapping number instead)
> 	LAYER=0
> 	USAGE=default
> 	LANG=en
> 	LABEL=English surround audio
> substream 0923470:
> 	STREAMTYPE=vorbis
> 	SUBTYPE=stereo
> 	LAYER=0
> 	USAGE=alternate
> 	LANG=es
> 	LABEL=director's commentary
> substream 7829372:
> 	STREAMTYPE=mng		(these would be pre-rendered subtitles)
> 	LAYER=1
> 	LANG=jp
> 	http://advogato.net/person/rakholh/diary.html?start=165
> ..and so on. The substream numbers refer to the logical substream ids, for
> easy correlation. They could be encoded either as separate section
> headers, or a part of the tags in a linear arrangement.
> 
> That's about as general as I can make it right now, and I think something
> (particularly a forwards-compatible vorbis-only implementation) could be
> written in time for 1.0.
Regards,
Ali


Ralph Giles

2000-Nov-19 12:48 UTC


[vorbis-dev] ogg stream-id options

On Sat 18 Nov 2000, Michael Smith wrote:
> Yes. I think bits of Dublin Core have been considered, but much of it is
> inappropriate (which isn't a problem. Using the bits which are appropriate
> is fine given the way DC is designed, I think).
Certainly. One of the ideas with Dublin Core is that you can "dumb down"
your metadata format so that foreign parsers can get useful information out
even if they can't tell the difference between a lead guitarist and a 
guest soloist.

Rakholh: what's the other project? (just curious)

If we don't end up using one of the recommended (xml) encodings, then
we really just need to ensure a bijective mapping to the dc element set,
since foreign parsers won't be able to get at the data anyway. For example,
the canonical tags for the Vorbis comment header mostly aren't dc, but
there are clear correspondences: artist->creator, organization->publisher,
copyright->rights.
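
As a small Python illustration of what such a mapping could look like, the
table below holds only the three correspondences named above. The names
VORBIS_TO_DC, DC_TO_VORBIS and to_dc are invented for this sketch, and a real
table would need an entry for every canonical tag before it could be called
bijective.

```python
from typing import Dict

# Only the correspondences mentioned in the text; deliberately incomplete.
VORBIS_TO_DC: Dict[str, str] = {
    "artist": "creator",
    "organization": "publisher",
    "copyright": "rights",
}

# Inverting the table is only safe if no two tags share a DC element.
DC_TO_VORBIS: Dict[str, str] = {dc: tag for tag, dc in VORBIS_TO_DC.items()}
assert len(DC_TO_VORBIS) == len(VORBIS_TO_DC), "mapping is not one-to-one"


def to_dc(comments: Dict[str, str]) -> Dict[str, str]:
    """Translate Vorbis comment tags to DC element names, dropping unmapped tags."""
    return {
        VORBIS_TO_DC[tag.lower()]: value
        for tag, value in comments.items()
        if tag.lower() in VORBIS_TO_DC
    }
```

The length check is the cheap way to notice a mapping that has stopped being
invertible as more tags are added.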
> The size difference is basically irrelevent. Compared to the size of the
> actual data, even 'bloated' metadata will be tiny.
And if every bit *does* count, we can run the stream through gz/bz2 and
remove any difference. :-)
> It doesn't really help you much with getting details out for complex stream
> types (since there can be an arbitrary number of streams, identifiers
> obviously CAN'T be at a fixed offset for all of them. It could use a simple
> table (at a fixed offset) of 'pointers' into the data - you'd always need
> to check each of them, that's unavoidable, but this would make it possible
> to do it quickly and simply).
I think we're talking about general usage classes here, not detailed
capability determination, which could just be the first entry.

Technically the offset+<value at other offset> method is required for
degenerate streams as well, unless you can make an assumption as to
how many lacing values the head packet requires.

We can't really do that with an xml-based toc. There could be arbitrary
whitespace and other tags in front of the <usage> so you really have
to search. OTOH, if we do go the binary or text-vector routes, I think
it's only fair to support magic detection at a specific offset in the 
first packet. (my last example does this)
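
To illustrate why the lacing values matter, here is a rough Python sketch of
the 'offset + value at other offset' lookup. It relies on the Ogg page layout
as I understand it: a 27-byte fixed header whose last byte is the segment
count, followed by that many lacing values, then the packet data. The function
names are invented for this example.

```python
def first_packet_offset(page: bytes) -> int:
    """Return the byte offset where packet data starts in the first page."""
    if len(page) < 27 or page[:4] != b"OggS":
        raise ValueError("not an Ogg page")
    # Byte 26 holds the number of lacing values (segment table entries).
    segment_count = page[26]
    # Packet data begins right after the fixed header and the segment table.
    return 27 + segment_count


def magic_in_first_packet(page: bytes, magic: bytes, rel_offset: int = 0) -> bool:
    """Check for `magic` at a fixed offset relative to the start of the first packet."""
    start = first_packet_offset(page) + rel_offset
    return page[start:start + len(magic)] == magic
```

A plain magic table can only hard-code the absolute offset if it assumes a
particular segment count for the head page; reading byte 26 first removes
that assumption at the cost of one indirection.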

To the nautilus people: how hard is it to do 'if (first magic) { more magic }'
tests? Are you really looking just for magic+offset entries you can stick in
a table, or do you support hierarchic determination in some form?

In another message, Monty wrote:
> No. The metaheader is meant to be something *much* simpler. No XML
> there (and I say this because I don't want a full blown XML parser,
> again, just to figure out what to do with a stream. XML is alot of
> weight). It's to be a single page with very basic arrangement
> information.
Ah, I had misunderstood your position.
> Except I want something simpler than what you propose. Perhaps
> something more complex than what I'm thinking of now will become
> necessary (actually I expect that to be the case). This is meant to
> be information for applications to use, not as much humans.
Simpler than the XML/RDF/DC proposal, or simpler than the text-vector
toc format I suggested in the last message? I don't see any way to
simplify the second and still support any DVD-like features. And if
not that, what do we need it for?

To be fair, I don't think RDF is especially human-intended either. :)

Cheers,
 -ralph


--
giles@ashlu.bc.ca

