In http://advogato.net/person/rakholh/diary.html?start=165 Ali wrote:

> For same reason I am arguing with people on vorbis-dev - but I don't
> understand what the argument is about (considering that the vorbis
> developers proposed a solution which mjs and I thought was reasonable,
> and then some developers decided to criticize us again for no reason).

Goodness, get dropped from the list and miss the return of the son of the
favorite flamewar. :^)

I don't know what you're arguing about either. But monty's talking about
adding a toc substream sooner rather than later, so here's how I see our
options:

Right now, the only things we produce are 'degenerate' ogg files. They
*only* contain a single vorbis audio stream. We've been telling people the
file extension is .ogg, to do magic detection on the initial OggS, and
that the mime-type is application/x-ogg.

We reevaluated the extension issue (the answer was no) and the mime-type
issue (we were swayed) and decided we'd recommend multiple mimetypes when
we have the video codec working, and add efficient discrimination to the
requirements for the toc/metadata substream we'd always planned on.

None of this is a pressing issue; the usability arguments are moot until
we actually have both audio and video data.

That's the story so far.

Now, what are our options for implementation?

I'd proposed we combine the toc header with the kitchen-sink metadata
people have requested, and that we use xml-encoded rdf based on the Dublin
Core element set to do it. I still think this is the best option. XML is
the most obvious way to encode text streams (what subtitles should be) so
we can share part of the code, and conceptually the substream type. It
also offers good interoperability with indexing/catalog systems and plenty
of flexibility for future requirements.

Note that this doesn't really allow mime magic detection of the 'sequence
x at offset n' type. What I meant earlier about substring searching is
that you first look for the initial OggS, then search for '<usage>' in
bytes 15-200 and case on whatever comes immediately after it.

But the time for that isn't now. Aside from not having the resources to
implement it, the standards for this sort of thing are very much still in
flux. Rob (of musicbrainz.org) and I couldn't even reach an agreement on
the encoding, and to support the general case is both unwieldy and
expensive, and likely to be obsolete next year. If we can wait 6-12
months, there should be much more of an external standard we can
incorporate. The librarians think this is a hard problem too.

Better, our solution will be a much closer fit if we give ourselves a
chance to evolve the format through usage while we're developing the video
codec.

If we had to do it NOW, I'd suggest something based on the vorbis
comment header, with (possibly hierarchic) text vectors in a set order.
The substream would consist of a head page and an empty tail page.

The first element would be something like "STREAMCLASS=audio". This would
allow mime magic filetype detection if we require that the toc always be
first. Others would follow like so:

general bitstream headers:
    STREAMCLASS=video
    <misc metadata a la vorbis/kitchensink?>
substream 8347929:
    STREAMTYPE=toc (this example)
substream 2361643:
    STREAMTYPE=tarkin
    LAYER=0 (means this is a primary stream, not an overlay)
    USAGE=default (not an alternate track)
substream 8293298:
    STREAMTYPE=vorbis
    SUBTYPE=surround (could be a mapping number instead)
    LAYER=0
    USAGE=default
    LANG=en
    LABEL=English surround audio
substream 0923470:
    STREAMTYPE=vorbis
    SUBTYPE=stereo
    LAYER=0
    USAGE=alternate
    LANG=es
    LABEL=director's commentary
substream 7829372:
    STREAMTYPE=mng (these would be pre-rendered subtitles)
    LAYER=1
    LANG=jp

..and so on. The substream numbers refer to the logical substream ids, for
easy correlation. They could be encoded either as separate section
headers, or as part of the tags in a linear arrangement.

That's about as general as I can make it right now, and I think something
(particularly a forwards-compatible vorbis-only implementation) could be
written in time for 1.0.

IMHO,
 -ralph

--
giles@ashlu.bc.ca

--- >8 ----
List archives: http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to 'vorbis-dev-request@xiph.org'
containing only the word 'unsubscribe' in the body. No subject is needed.
Unsubscribe messages sent to the list will be ignored/filtered.
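
For readers following the thread, here is a rough sketch of how an
application might read the text-vector toc layout proposed in the message
above. The layout was only a suggestion at this point, so everything here is
an assumption: parse_toc, SECTION_RE and EXAMPLE are hypothetical names, the
parenthetical annotations from the listing are left out of the sample, and
the real thing would live inside an Ogg page rather than a plain string.

```python
import re

# Hypothetical parser for the proposed text-vector toc. Nothing here is an
# actual Ogg or Vorbis API; the section syntax is taken from the listing in
# the message and the rest is illustrative.
SECTION_RE = re.compile(r"^substream\s+(\d+):$")


def parse_toc(text):
    """Return (general_headers, substreams) parsed from a toc listing.

    general_headers is a dict of KEY -> value. substreams maps each logical
    substream id (kept as a string so leading zeros survive) to its own
    dict of tags.
    """
    general = {}
    substreams = {}
    current = general
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "general bitstream headers:":
            continue
        match = SECTION_RE.match(line)
        if match:
            # Start collecting tags for a new logical substream.
            current = substreams.setdefault(match.group(1), {})
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore anything that is not a KEY=value tag
            current[key.strip().upper()] = value.strip()
    return general, substreams


EXAMPLE = """\
general bitstream headers:
    STREAMCLASS=video
substream 2361643:
    STREAMTYPE=tarkin
    LAYER=0
    USAGE=default
substream 0923470:
    STREAMTYPE=vorbis
    SUBTYPE=stereo
    LAYER=0
    USAGE=alternate
    LANG=es
    LABEL=director's commentary
"""

general, substreams = parse_toc(EXAMPLE)
print(general["STREAMCLASS"])
print(substreams["0923470"]["LABEL"])
```

Because STREAMCLASS is required to come first in this layout, a player only
needs the general headers to decide whether it can handle the file at all;
the per-substream tags matter once it starts choosing tracks.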
> Right now, the only things we produce are 'degenerate' ogg files. They
> *only* contain a single vorbis audio stream. We've been telling people the
> file extension is .ogg, to do magic detection on the initial OggS, and
> that the mime-type is application/x-ogg.

A degenerate ogg file is any meta-header-less stream/link containing only
one logical bitstream.

> We reevaluated the extension issue (the answer was no) and the mime-type
> issue (we were swayed) and decided we'd recommend multiple mimetypes when
> we have the video codec working, and add efficient discrimination to the
> requirements for the toc/metadata substream we'd always planned on.

I'm swayed in that I agree with their functionality arguments (meaning, I
agree with the end goal). I'm not convinced mime is the sole way to do
this. Mime (to identify ogg) and magic (to identify container contents)
is still my proposal. Of course, we'll eventually agree to something.

> I'd proposed we combine the toc header with the kitchen-sink metadata
> people have requested, and that we use xml-encoded rdf based on the Dublin
> Core element set to do it.

No. The metaheader is meant to be something *much* simpler. No XML
there (and I say this because I don't want a full-blown XML parser,
again, just to figure out what to do with a stream. XML is a lot of
weight). It's to be a single page with very basic arrangement
information.

> Note that this doesn't really allow mime magic detection of the 'sequence
> x at offset n' type. What I meant earlier about substring searching is
> that you first look for the initial OggS, then search for '<usage>' in
> bytes 15-200 and case on whatever comes immediately after it.

> incorporate. The librarians think this is a hard problem too.

Except I want something simpler than what you propose. Perhaps
something more complex than what I'm thinking of now will become
necessary (actually I expect that to be the case). This is meant to
be information for applications to use, not so much for humans.

Monty
* Ralph Giles (giles@ashlu.bc.ca) wrote at 16:18 on 17/11/00:

> In http://advogato.net/person/rakholh/diary.html?start=165 Ali wrote:
>
> > For same reason I am arguing with people on vorbis-dev - but I don't
> > understand what the argument is about (considering that the vorbis
> > developers proposed a solution which mjs and I thought was reasonable,
> > and then some developers decided to criticize us again for no reason).
>
> Goodness, get dropped from the list and miss the return of the son of the
> favorite flamewar. :^)
>
> I don't know what you're arguing about either. But monty's talking about
> adding a toc substream sooner rather than later, so here's how I see our
> options:

Actually, I thought I was just defending myself from criticism :)

> Right now, the only things we produce are 'degenerate' ogg files. They
> *only* contain a single vorbis audio stream. We've been telling people the
> file extension is .ogg, to do magic detection on the initial OggS, and
> that the mime-type is application/x-ogg.
>
> We reevaluated the extension issue (the answer was no) and the mime-type
> issue (we were swayed) and decided we'd recommend multiple mimetypes when
> we have the video codec working, and add efficient discrimination to the
> requirements for the toc/metadata substream we'd always planned on.

I have no objections to that.

> None of this is a pressing issue; the usability arguments are moot until
> we actually have both audio and video data.

If none of this is a "pressing issue" then why can't I just use
audio/x-ogg and then switch to application/x-ogg when you guys develop a
video codec? ;) (currently we use application/x-ogg as the mime-type, so I
have been complying with your requests :)

> That's the story so far.
>
> Now, what are our options for implementation?
>
> I'd proposed we combine the toc header with the kitchen-sink metadata
> people have requested, and that we use xml-encoded rdf based on the Dublin
> Core element set to do it. I still think this is the best option. XML is
> the most obvious way to encode text streams (what subtitles should be) so
> we can share part of the code, and conceptually the substream type. It
> also offers good interoperability with indexing/catalog systems and plenty
> of flexibility for future requirements.

That's interesting, there is a project I am related to that is working on
the Dublin Core stuff (Dublin Core == Object Metadata Framework, right?)

XML could be too "heavyweight" to parse (the tags "waste" space :P); this
has consequences for file size, streaming, and embedded devices ;)

e.g. compare the number of chars:

    <title>Lala</title> (19 chars)
    Title=Lala (10 chars)

It's not a /huge/ difference, I know - but every little bit counts,
doesn't it? ;)

> Note that this doesn't really allow mime magic detection of the 'sequence
> x at offset n' type. What I meant earlier about substring searching is
> that you first look for the initial OggS, then search for '<usage>' in
> bytes 15-200 and case on whatever comes immediately after it.

This provides no advantages over the current method, I believe. Basically
we would still be stuck with an algorithmic approach to determining the
file's contents. It is better than the current solution (which doesn't
exist) - but is not the ideal solution (in my humble opinion).

Perhaps the '<usage>' should be the first tag within the toc header or
something? (giving you what you want, and giving us what we want)

> But the time for that isn't now. Aside from not having the resources to
> implement it, the standards for this sort of thing are very much still in
> flux. Rob (of musicbrainz.org) and I couldn't even reach an agreement on
> the encoding, and to support the general case is both unwieldy and
> expensive, and likely to be obsolete next year. If we can wait 6-12
> months, there should be much more of an external standard we can
> incorporate. The librarians think this is a hard problem too.
>
> Better, our solution will be a much closer fit if we give ourselves a
> chance to evolve the format through usage while we're developing the video
> codec.
>
> If we had to do it NOW, I'd suggest something based on the vorbis
> comment header, with (possibly hierarchic) text vectors in a set order.
> The substream would consist of a head page and an empty tail page.
>
> The first element would be something like "STREAMCLASS=audio". This would
> allow mime magic filetype detection if we require that the toc always be
> first. Others would follow like so:
>
> general bitstream headers:
>     STREAMCLASS=video
>     <misc metadata a la vorbis/kitchensink?>
> substream 8347929:
>     STREAMTYPE=toc (this example)
> substream 2361643:
>     STREAMTYPE=tarkin
>     LAYER=0 (means this is a primary stream, not an overlay)
>     USAGE=default (not an alternate track)
> substream 8293298:
>     STREAMTYPE=vorbis
>     SUBTYPE=surround (could be a mapping number instead)
>     LAYER=0
>     USAGE=default
>     LANG=en
>     LABEL=English surround audio
> substream 0923470:
>     STREAMTYPE=vorbis
>     SUBTYPE=stereo
>     LAYER=0
>     USAGE=alternate
>     LANG=es
>     LABEL=director's commentary
> substream 7829372:
>     STREAMTYPE=mng (these would be pre-rendered subtitles)
>     LAYER=1
>     LANG=jp
>
> ..and so on. The substream numbers refer to the logical substream ids, for
> easy correlation. They could be encoded either as separate section
> headers, or as part of the tags in a linear arrangement.
>
> That's about as general as I can make it right now, and I think something
> (particularly a forwards-compatible vorbis-only implementation) could be
> written in time for 1.0.

Regards,
Ali
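
The two-step check debated above (match the initial OggS, then search a
fixed byte window for a usage tag and switch on what follows) could look
roughly like this. It is a sketch under stated assumptions: the 15-200
window is the figure quoted in the thread, the '<usage>' tag was never
specified anywhere, sniff_usage is a hypothetical name, and the sample
buffers are fake stand-ins rather than valid Ogg pages.

```python
OGG_MAGIC = b"OggS"        # capture pattern at the start of every Ogg page
USAGE_TAG = b"<usage>"     # assumed tag name; not part of any specification
WINDOW = slice(15, 200)    # byte range suggested in the thread


def sniff_usage(data):
    """Classify a buffer by the text that follows a <usage> tag.

    Returns None if the buffer does not start with OggS, an empty string
    if it does but no usage tag lies inside the window, and otherwise the
    text between the tag and the next '<'.
    """
    if not data.startswith(OGG_MAGIC):
        return None
    window = data[WINDOW]
    start = window.find(USAGE_TAG)
    if start < 0:
        return ""
    value = window[start + len(USAGE_TAG):]
    end = value.find(b"<")
    if end >= 0:
        value = value[:end]
    return value.strip().decode("ascii", errors="replace")


# Fake buffers for illustration only.
fake_page = OGG_MAGIC + bytes(24) + b"<toc> <usage> video </usage></toc>"
print(sniff_usage(fake_page))
print(sniff_usage(b"RIFF" + bytes(40)))
```

This is exactly the "algorithmic" detection objected to in the message: a
table of magic+offset entries cannot express the substring search, which is
why putting the tag first at a known offset keeps coming up.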
On Sat 18 Nov 2000, Michael Smith wrote:

> Yes. I think bits of Dublin Core have been considered, but much of it is
> inappropriate (which isn't a problem. Using the bits which are appropriate
> is fine given the way DC is designed, I think).

Certainly. One of the ideas with Dublin Core is that you can "dumb down"
your metadata format so that foreign parsers can get useful information
out even if they can't tell the difference between a lead guitarist and a
guest soloist.

Rakholh: what's the other project? (just curious)

If we don't end up using one of the recommended (xml) encodings, then we
really just need to ensure a bijective mapping to the dc element set,
since foreign parsers won't be able to get at the data anyway. For
example, the canonical tags for the Vorbis comment header mostly aren't
dc, but there are clear correspondences: artist->creator,
organization->publisher, copyright->rights.

> The size difference is basically irrelevant. Compared to the size of the
> actual data, even 'bloated' metadata will be tiny.

And if every bit *does* count, we can run the stream through gz/bz2 and
remove any difference. :-)

> It doesn't really help you much with getting details out for complex stream
> types (since there can be an arbitrary number of streams, identifiers
> obviously CAN'T be at a fixed offset for all of them. It could use a simple
> table (at a fixed offset) of 'pointers' into the data - you'd always need
> to check each of them, that's unavoidable, but this would make it possible
> to do it quickly and simply).

I think we're talking about general usage classes here, not detailed
capability determination, which could just be the first entry.

Technically the offset+<value at other offset> method is required for
degenerate streams as well, unless you can make an assumption as to how
many lacing values the head packet requires.

We can't really do that with an xml-based toc. There could be arbitrary
whitespace and other tags in front of the <usage>, so you really have to
search. OTOH, if we do go the binary or text-vector routes, I think it's
only fair to support magic detection at a specific offset in the first
packet. (my last example does this)

To the nautilus people: how hard is it to do 'if (first magic) { more
magic }' tests? Are you really looking just for magic+offset entries you
can stick in a table, or do you support hierarchic determination in some
form?

In another message, Monty wrote:

> No. The metaheader is meant to be something *much* simpler. No XML
> there (and I say this because I don't want a full-blown XML parser,
> again, just to figure out what to do with a stream. XML is a lot of
> weight). It's to be a single page with very basic arrangement
> information.

Ah, I had misunderstood your position.

> Except I want something simpler than what you propose. Perhaps
> something more complex than what I'm thinking of now will become
> necessary (actually I expect that to be the case). This is meant to
> be information for applications to use, not so much for humans.

Simpler than the XML/RDF/DC proposal, or simpler than the text-vector toc
format I suggested in the last message? I don't see any way to simplify
the second and still support any DVD-like features. And if not that, what
do we need it for?

To be fair, I don't think RDF is especially human-intended either. :)

Cheers,
 -ralph

--
giles@ashlu.bc.ca
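
As a rough illustration of the "bijective mapping" idea in this message,
the sketch below keeps a tag table and refuses to invert it unless it is
one-to-one. Only the three correspondences named in the message
(artist->creator, organization->publisher, copyright->rights) come from the
thread; the helper names invert and to_dc are hypothetical, and a full
table would need an entry for every tag in use.

```python
# The three correspondences mentioned in the message; anything beyond
# these would be a design decision the thread does not settle.
VORBIS_TO_DC = {
    "artist": "creator",
    "organization": "publisher",
    "copyright": "rights",
}


def invert(mapping):
    """Return the inverse mapping, raising if it is not one-to-one."""
    inverse = {}
    for key, value in mapping.items():
        if value in inverse:
            raise ValueError(
                f"{key!r} and {inverse[value]!r} both map to {value!r}"
            )
        inverse[value] = key
    return inverse


DC_TO_VORBIS = invert(VORBIS_TO_DC)


def to_dc(comments):
    """Translate known Vorbis comment tags to DC names; drop the rest."""
    return {
        VORBIS_TO_DC[tag.lower()]: text
        for tag, text in comments.items()
        if tag.lower() in VORBIS_TO_DC
    }


print(to_dc({"ARTIST": "Example Band", "TITLE": "Example Song"}))
```

Checking invertibility when the table is built means a later addition that
sends two comment tags to the same element fails loudly instead of silently
losing information on the way back.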