Don't have too much to report, but I talked to Michael Smith about things on IRC this evening and we made some progress in reconciling our visions.

On the static and stream identification metadata there was a tentative decision to go with RDF, but in such a way that a limited player could choose not to support it and still be able to play the a/v data. Robert mentioned trying to work his Trackinfo DTD into a set of attributes, but I haven't heard if he's gotten to it. Again, I would push to unify music, film, and video here.

One thing I don't know how to treat is liner notes in multiple languages. It's not uncommon for classical music to be distributed with extensive liner notes in multiple languages, otherwise identical. In the case of lyrics/subtitles we should create separate streams for each translation, but do we want to do that for the static metadata? Does RDF already have a way to handle this?

Most of our discussion was about the timecoded metadata: the proverbial scrolling lyrics, but this must also serve for subtitles (in multiple languages), transcripts, commentary, headlines, guitar tablature, and so on. Our most important concern is to maintain maximum flexibility. As I've said before, I think it's important to have at least some kind of inline text markup. Raw text, even with Unicode, is too limited. So we waved the magic XML buzzword at the problem.

What I wanted to do was allow arbitrary XML streams in Ogg, but specify particular DTDs for compliance with a particular mapping, if that's the right term. Players would ignore unknown DTDs, or optionally try to do something intelligent with them (like feed them to a browser). The problem with this is that there's no way to resume parsing of an XML document after a dropped packet. In fact, you're supposed to stop and error out if you encounter an inconsistency. The audio and video codecs (including MNG) all have a way to restart decoding periodically, so this is a bothersome lack of parity.
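To make the dropped-packet problem concrete, here's a quick Python sketch. The element names and the packet boundaries are made up for illustration; nothing here is a proposed format. It feeds one continuous XML document to an incremental parser, first intact and then with the middle packet missing.

```python
import xml.etree.ElementTree as ET

# Hypothetical lyrics stream: ONE xml document cut into three "packets".
# Element names and split points are invented for this example.
packets = [
    b"<lyrics><line t='0.0'>Hello <em>",
    b"world</em></line><line t='2.5'>second ",
    b"line</line></lyrics>",
]

def parse_stream(chunks):
    """Feed chunks to a single incremental parser; return (ok, message)."""
    parser = ET.XMLParser()
    try:
        for chunk in chunks:
            parser.feed(chunk)
        parser.close()
    except ET.ParseError as err:
        # A conforming parser must stop here; it cannot resynchronize.
        return False, str(err)
    return True, "well-formed"

intact_ok, intact_msg = parse_stream(packets)

# Drop the middle packet, as a lossy transport might.
lossy_ok, lossy_msg = parse_stream([packets[0], packets[2]])

print(intact_ok, intact_msg)
print(lossy_ok, lossy_msg)
```

With the middle packet gone, the open `<em>` meets a `</line>` close tag and the parser gives up, so everything after the gap is lost as well.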
What's new: our compromise was to divide the timecoded XML into "doclets", small packets with all the desired markup, each a self-contained XML document complete with its own header. Neither of us likes it very much, but so far it's the best compromise that fits the requirements we've established. Criticism please. :)

My suggestion was to model the doclets on the Vorbis comment header, encoding each as a byte vector, with the content being the XML doclet. For maximum flexibility, we would add start and stop display timestamps, encoded externally alongside the content. There would still be internal timestamps, but the external ones would mark the boundaries for the whole packet and help with seeking. It also means the content doesn't really have to be XML at all, leaving lots of headroom for future extension.

Internal timestamps are important because for things like karaoke you need to mark each word separately, which makes for something >200% overhead with just one word per vector. You can also do things like change the character set encoding mid-stream, should that ever make sense. Michael's example was mixed English and Klingon. (The Unicode Consortium has so far studiously ignored the Klingon encoding proposal.)

I also think it would be possible to implement some of these features implicitly by always breaking the XML document into Ogg packets at the same level of the document hierarchy. One can also imagine various hacks to include a pseudo-header, perhaps as a comment, to tell the parser where it is.

The disadvantages: It feels complicated to me, introducing another layer between Ogg and the data, but Michael feels this way about my continuous-XML proposal. :-) It's very difficult to do document structure this way, at least without something equivalent to the periodic-pseudoheader hacks. I can live with that; we'll probably just get non-container headings like HTML.
It's also more work to display the timecoded stream in a static format. (I just want to read the lyrics, not listen to the song!)

I guess at this point what we need are some concrete proposals, both for the encoding spec and DTDs, so we can work out the details and experiment practically.

Cheers,
 -ralph

--
giles@ashlu.bc.ca
Subtle mind control? Why do all these HTML buttons say 'Submit'?
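
P.S. As a starting point for those concrete proposals, here's a rough Python sketch of the doclet packing described above. The length-prefixed byte vector follows the Vorbis comment convention (32-bit little-endian length, then the bytes). Everything else is an assumption of mine for illustration only: the field order, the 64-bit millisecond timestamps, and the names `pack_doclet`, `unpack_doclet` and `play`.

```python
import struct
import xml.etree.ElementTree as ET

# ASSUMED layout, not from any spec: start and stop display times as
# unsigned 64-bit little-endian milliseconds, then a 32-bit little-endian
# length, then that many bytes of content (here an xml doclet).
HEADER = struct.Struct("<QQI")

def pack_doclet(start_ms, stop_ms, content):
    """Prefix the content bytes with external timestamps and a length."""
    return HEADER.pack(start_ms, stop_ms, len(content)) + content

def unpack_doclet(packet):
    """Split a packet back into (start_ms, stop_ms, content)."""
    start_ms, stop_ms, length = HEADER.unpack_from(packet)
    content = packet[HEADER.size:HEADER.size + length]
    if len(content) != length:
        raise ValueError("truncated doclet")
    return start_ms, stop_ms, content

def play(packets):
    """Return (start, stop, text) for every doclet that arrives and parses."""
    shown = []
    for packet in packets:
        if packet is None:  # packet dropped in transit
            continue
        start_ms, stop_ms, content = unpack_doclet(packet)
        try:
            root = ET.fromstring(content)  # each doclet is a whole document
        except ET.ParseError:
            continue  # a bad doclet only costs us that one doclet
        shown.append((start_ms, stop_ms, "".join(root.itertext())))
    return shown

stream = [
    pack_doclet(0, 2500, b"<?xml version='1.0'?><line>Hello <em>world</em></line>"),
    None,  # the second doclet never arrived
    pack_doclet(5000, 7500, b"<?xml version='1.0'?><line>third line</line>"),
]
shown = play(stream)
print(shown)
```

Since each doclet carries its own header, losing one doesn't stop the player from parsing the next, which is the restart property the continuous stream lacks. The price is the repeated per-packet header.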