thr3ads.net - Vorbis - [vorbis] xml stream formats [Jun 2000]

Ralph Giles

2000-Jun-18 01:25 UTC

[vorbis] xml stream formats

Speaking of Metadata, how's work going on the definition?

Looking back at the list archives, there seems to be a semi-plan to use
Robert Kaye's DTD from cdindex.org/dtd/TrackInfo.dtd, but I'm
concerned that it's too specialized for video and we'll end up having a
special case for audio-only files.

There are a couple of general issues here. Michael Smith suggested on IRC
that an important distinction to make is between "timecoded" data like
audio, video, and scrolling lyrics, and "timeless" data like the
production notes, or the fact that logical bitstream 12 is the
pop-up-video overlay in bengalese.

My proposal was that each type or instance of timecoded data be
encapsulated in its own logical stream. For a song, that might mean one
vorbis-encoded audio track, an xml track with the lyrics, and another xml
track with the phrasing, keychanges, and other musical markup. For a
video, it might be a video track, three vorbis-encoded audio tracks in
different languages, 3 subtitle overlays, and 5 xml streams duplicating
the subtitles with two additional translations. 

I'd hope we could make a single dtd for the timecoded xml streams, relying
on conventional attributes to generalize the markup; karaoke and musical
annotation of a raga both have very specialized needs. Something like:

<event timestamp="532739" type="chord">E7m</event>

or

<event timestamp="1462" speaker="Robin">No Julia, I don't want to go
to the prom with you.</event>

I guess that doesn't enforce segregation of content. Hmm. Well, we'd want
the player to key on something in the header, not on a quick scan of the
contents. My point was that I wanted to avoid having a tag for every
requested markup (the kitchen sink, in other words) and focus on a
generalized presentation of text synchronized to the media being played.
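
As a rough illustration of the generalized event markup, here is a minimal Python sketch of how a player might read such a stream and pick out the events for a time window. The wrapping <events> root element and the helper names are assumptions made for the example; only the <event> element and its timestamp, type, and speaker attributes come from the examples above, and the unit of the timestamp is left unspecified.

```python
import xml.etree.ElementTree as ET

# Hypothetical timecoded stream: the <events> root is an assumption; the
# <event> elements are modeled on the two examples in the message.
SAMPLE = """<events>
  <event timestamp="1462" speaker="Robin">No Julia, I don't want to go to the prom with you.</event>
  <event timestamp="532739" type="chord">E7m</event>
</events>"""


def parse_events(xml_text):
    """Return (timestamp, other_attributes, text) tuples sorted by timestamp."""
    root = ET.fromstring(xml_text)
    events = []
    for elem in root.iter("event"):
        attrs = dict(elem.attrib)
        timestamp = int(attrs.pop("timestamp"))
        events.append((timestamp, attrs, (elem.text or "").strip()))
    return sorted(events, key=lambda event: event[0])


def events_between(events, start, end):
    """Select events whose timestamp falls in the half-open range [start, end)."""
    return [event for event in events if start <= event[0] < end]
```

Because the player only ever looks at the timestamp plus a handful of conventional attributes, adding a new kind of annotation would not need a new tag, which is the point of avoiding the kitchen-sink DTD.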

This actually moves a lot of the data out of the "kitchen sink" xml
document. We may still want some metadata to tell the player about each of
those streams, though I think it might be possible to get away with just
what's in the comment fields in the individual streams. That just leaves
the static data, like the lyrics if I'm too lazy to time-index them. 

As has been mentioned, it would be nice if we could share as much of the
"metadata" format as possible between ogg (as a file format), icecast (as
a network streaming application), and the cdindex. How difficult would it
be for the cdindex project to use separate records for the timecoded
xml data rather than embedding them in the TrackInfo record?

We also talked about streaming issues a bit. Jack explained that icecast
is going to (does?) insert the three pages of the vorbis header when a new
client connects in the middle of a song. This is necessary to set up the
decoder, but we get the comment page more-or-less for free. Something
similar would have to be done with the timecoded xml streams, since
well-formed xml has a header, and there will probably be a small amount of
metadata associated with each stream: language, who translated it,
revision, and so on. Finally, it might make sense to insert the static
metadata "out of band" like this, again so it's available even if the
player connects in the middle of a file.
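
The header-replay idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how icecast is implemented: StreamRelay is a hypothetical class, pages are opaque byte strings, and each client is modeled as a plain list that collects what it is sent.

```python
class StreamRelay:
    """Toy relay that replays cached setup pages to clients who join late."""

    def __init__(self, header_pages):
        # Pages a decoder needs before live data makes sense, e.g. the three
        # vorbis header pages or the preamble of a timecoded xml stream.
        self.header_pages = list(header_pages)
        self.clients = []

    def connect(self, client):
        # A new client first receives the cached headers, then joins the
        # live feed starting with whatever page is broadcast next.
        for page in self.header_pages:
            client.append(page)
        self.clients.append(client)

    def broadcast(self, page):
        # Live pages go to every connected client.
        for client in self.clients:
            client.append(page)
```

A client that connects after page A has gone out would receive the headers followed by page B onward; the static "out of band" metadata could be cached and replayed the same way.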

The multiple-streams idea does help the streaming server by making it
easier to split out the parts of interest. A client could ask only for the
video and the director's commentary xml, for instance.
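
A minimal sketch of that kind of selection, assuming each page of the multiplexed file can be tagged with an identifier for its logical stream. The Page type and the string identifiers are hypothetical stand-ins chosen for readability.

```python
from collections import namedtuple

# Hypothetical stand-in for a page of a multiplexed file: only the logical
# stream identifier and an opaque payload are modeled.
Page = namedtuple("Page", ["stream_id", "payload"])


def select_streams(pages, wanted):
    """Yield only the pages that belong to the requested logical streams."""
    wanted = set(wanted)
    for page in pages:
        if page.stream_id in wanted:
            yield page
```

With one logical stream per kind of content, the server never has to look inside a payload to decide what to send.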

Thanks for reading so far. I think it's important to work this out soon so
we don't end up limiting ogg to a mostly-audio format.

Cheers,
 -ralph

--- >8 ----
List archives:  xiph.org/archives
Ogg project homepage: xiph.org/ogg
To unsubscribe from this list, send a message to 'vorbis-request@xiph.org'
containing only the word 'unsubscribe' in the body.  No subject is needed.
Unsubscribe messages sent to the list will be ignored/filtered.

Michael Smith

2000-Jun-18 01:51 UTC

[vorbis] xml stream formats

>We also talked about streaming issues a bit. Jack explained that icecast
>is going to (does?) insert the three pages of the vorbis header when a new
>client connects in the middle of a song. This is necessary to set up the
>decoder, but we get the comment page more-or-less for free. Something
>similar would have to be done with the timecoded xml streams, since
>well-formed xml has a header, and there will probably be a small amount of
>metadata associated with each stream: language, who translated it,

Ouch. Reading this made me remember something else that I hadn't thought of
in this context previously - it is NOT possible to stream well-formed XML,
in general. By limiting yourself in certain ways, you can get away with
just sending the start of the 'file' (as you've suggested here), then
streaming - but then you have some subset of XML, rather than XML.
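
The problem is easy to demonstrate with any XML parser. The following Python sketch, with a hypothetical <events> document made up for the example, checks what a listener has in hand at three points: after the whole document, while the stream is still open, and after joining late.

```python
import xml.etree.ElementTree as ET

# Hypothetical complete document; element names are illustrative only.
FULL = (
    '<events>'
    '<event timestamp="1">a</event>'
    '<event timestamp="2">b</event>'
    '</events>'
)


def is_well_formed(text):
    """Return True if the text parses as a complete XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# The whole file, as a non-streaming reader would see it.
whole_document = FULL
# A live stream: the root element has not been closed yet.
still_streaming = FULL[: FULL.index("</events>")]
# A late joiner: the root start tag was sent before the client connected.
joined_late = FULL[FULL.index('<event timestamp="2"'):]
```

Only whole_document passes is_well_formed; the other two are rejected. Resending the root start tag to late joiners fixes the third case but not the second, which is the sense in which a streamed document is only ever a subset of XML.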

Maybe we have to go back and think about this - is XML really what we need?
In fact, if we have separate streams for most stuff, XML really isn't the
most suitable solution, since it's intrinsically tree-structured. If we
have separate streams, isn't each one going to be basically just a sequence
of (whatever)? A lyrics stream might have a series of lines, each keyed to
a time, for example.
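
One way to picture such a flat sequence is a minimal Python sketch with one self-contained record per line. The tab-separated layout and the function names are assumptions invented for this example, not a proposed format.

```python
def encode_line(timestamp, text):
    """Encode one record as: timestamp, a tab, the text, a newline."""
    if "\n" in text:
        raise ValueError("text must not contain newlines")
    return f"{timestamp}\t{text}\n"


def decode_lines(chunk, joined_midstream=False):
    """Decode the complete records found in a chunk of the stream."""
    if joined_midstream:
        # The first line may be the tail of a record we missed; drop it.
        _, _, chunk = chunk.partition("\n")
    # Ignore a trailing record that has not been fully received yet.
    complete, _, _ = chunk.rpartition("\n")
    records = []
    for line in complete.split("\n"):
        timestamp, sep, text = line.partition("\t")
        if sep and timestamp.isdigit():
            records.append((int(timestamp), text))
    return records
```

Since no record depends on an enclosing element, a reader can start or stop at any record boundary, which is exactly what the tree structure of XML makes awkward.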

Am I missing something? Is there a way around this?

Michael

robert@moon.eorbit.net

2000-Jul-05 12:22 UTC

[vorbis] xml stream formats

Sorry for not responding sooner -- my team (FreeAmp) got laid off a
couple weeks back and I've been trying to salvage what's left.

>>We also talked about streaming issues a bit. Jack explained that icecast
>>is going to (does?) insert the three pages of the vorbis header when a new
>>client connects in the middle of a song. This is necessary to set up the
>>decoder, but we get the comment page more-or-less for free. Something
>>similar would have to be done with the timecoded xml streams, since
>>well-formed xml has a header, and there will probably be a small amount of
>>metadata associated with each stream: language, who translated it,
>
>Ouch. Reading this made me remember something else that I hadn't thought of
>in this context previously - it is NOT possible to stream well-formed XML,
>in general. By limiting yourself in certain ways, you can get away with
>just sending the start of the 'file' (as you've suggested here), then
>streaming - but then you have some subset of XML, rather than XML.
>
>Maybe we have to go back and think about this - is XML really what we need?
>In fact, if we have separate streams for most stuff, XML really isn't the
>most suitable solution, since it's intrinsically tree-structured. If we
>have separate streams, isn't each one going to be basically just a sequence
>of (whatever)? A lyrics stream might have a series of lines, each keyed to
>a time, for example.

I just checked out RDF (Resource Description Framework) which intends to
describe resources available on the net. I could see us using RDF as the
format inside the metadata stream, instead of using a full-blown XML
DTD. I haven't used RDF -- Jack, it looks like you've used it. Do you
have any feedback on using RDF? 

In any case, using a complete RDF chunk to describe the time-coded
information seems like a great amount of overhead. Both XML and RDF seem
like inappropriate tools for use with time-coded information that is
in-line with the stream. However, XML and RDF are the best tools that I
can think of for maintaining the non-time-coded-metadata information.

Maybe the best approach is to use XML or RDF as part of the stream
header(s) and then to use some other format for the time-coded
information.
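
To make that split concrete, here is a minimal Python sketch with a small XML header for the per-stream metadata and a compact binary record for each time-coded event. Everything specific in it is an assumption made for illustration: the <stream> element and its attributes, and the record layout of an 8-byte timestamp, a 2-byte length, and UTF-8 text.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical stream header: element and attribute names are illustrative.
HEADER = '<stream kind="lyrics" language="en" revision="1"/>'

# Hypothetical record prefix: big-endian 8-byte timestamp, 2-byte text length.
_PREFIX = struct.Struct(">QH")


def read_header(xml_text):
    """Parse the once-per-stream XML header into a dict of attributes."""
    return dict(ET.fromstring(xml_text).attrib)


def pack_event(timestamp, text):
    """Encode one time-coded event as prefix + UTF-8 text."""
    data = text.encode("utf-8")
    return _PREFIX.pack(timestamp, len(data)) + data


def unpack_events(buf):
    """Decode every complete event record found in buf."""
    events, offset = [], 0
    while offset + _PREFIX.size <= len(buf):
        timestamp, length = _PREFIX.unpack_from(buf, offset)
        start = offset + _PREFIX.size
        if start + length > len(buf):
            break  # trailing record is incomplete
        events.append((timestamp, buf[start:start + length].decode("utf-8")))
        offset = start + length
    return events
```

In this layout the per-event overhead is a fixed 10 bytes, while the descriptive metadata keeps the extensibility of XML (or RDF) because it is sent only once per stream.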

In any case, to answer Ralph's original question, the current TrackInfo
dtd does not take video into account, but it should be easy to extend
for use with video as well. But, it sounds as if we need to answer a few
other questions before we delve into the details of creating the overall
metadata solution.

--ruaok         Freezerburn! All else is only icing. -- Soul Coughing

Robert Kaye -- robert@moon.eorbit.net  moon.eorbit.net/~robert
