What I've gotten out of this discussion so far:

1) we need to introduce a means by which to do captions; this could be done through adding a "caption" element to CMML, or in another time-continuous annotation format; so far I am not sure which would be the better way

2) we need an XML annotation format for audio - in particular for music - that is more structured than vorbiscomment (and this probably applies to video, too)

3) we need a means to describe relationships between different logical bitstreams; we had a discussion about this years ago, but never got to a proper specification of this

4) we need a means to address logical bitstreams by name; this should be an ID attribute to be added to Skeleton

These four things are all very different and separate things - number 2 may even need further structuring IMO. Yes, they interrelate, and there should be means to address one from the other. But IMO they all need a different approach.

Silvia.

On 9/10/07, Ian Malone <ibmalone@gmail.com> wrote:
> On 09/09/2007, Silvia Pfeiffer <silviapfeiffer1@gmail.com> wrote:
> > Daniel,
> >
> > > A similar argument goes for the encoding quality description and digital rights.
> >
> > In contrast, the improved description of the content as in: artist, band name, title, organisation involved, and people involved are things that improve upon vorbiscomment and should probably be included there directly.
>
> There are arguments against simply using vorbiscomment for this, beyond the fact that the Vorbis spec says it should only be used for short 'jotted on the CD' type notes. First, it is based on value pairs, which makes it hard to describe relationships in detail; you must either choose a field name that will not be widely understood ('drummer', 'sound engineer', even 'composer' won't get you far) or use a funny way of refining it in the value (such as artist=(composer) Beethoven). I think cast lists for films present a similar problem.
> Then there is consistency and indexability to be addressed (Ludwig van Beethoven; Beethoven, Ludwig van; Beethoven). Finally, complex relationships are even harder to handle, such as specifying a resource's relationship to the rest of a collection.
>
> > All I ask for is *not* to reinvent the wheel when there are already working, semi-complete metadata formats for Ogg that have been carefully prepared to fit with the existing Ogg framework. It would be a sheer nightmare to create another new one that does not fit with any of the existing ones and is not supported by any media application.
>
> Yes, any attempt to add a metadata format should leverage what exists already. I see it like this: Skeleton describes the technical aspects of the stream, CMML splits it into temporal parts, but describing the contents is left to vorbiscomments, which were by design only sufficient for simple descriptions (and are tied to a logical stream). Looking back through the list archives, it was always intended that there should be a metadata format to go beyond what Vorbis comments could do.
>
> I suppose I should draw attention to some of the stuff I did before I discovered I didn't have enough time to get some kind of working application together:
> <http://www.imalone.co.uk/omd_background/index.html>
> <http://www.imalone.co.uk/omd_questions/index.htm>
> <http://wiki.xiph.org/index.php/Metadata>
> I got a bit sidetracked by the multiplexed-streams compatibility stuff at the time, that and trying to learn XPCOM to produce a FF plugin (something which was going to take far more time than I had). A Rhythmbox plugin is probably the easiest target, though something Windows-based would get a wider audience; Songbird might be good as it's built on top of a platform which is designed to handle XML.
>
> P.S. Does anyone know what XMP actually does? Whenever I look at it, all I can find are descriptions of how the embedding is done.
>
> --
> imalone
> _______________________________________________
> ogg-dev mailing list
> ogg-dev@xiph.org
> http://lists.xiph.org/mailman/listinfo/ogg-dev
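Ian's point about value pairs is easy to make concrete. A vorbiscomment block is flat key=value text, so role refinements have to be smuggled into either the key or the value; a structured format can state the relationship directly. The sketch below is purely illustrative: none of the element or attribute names come from any existing Xiph specification.

```xml
<!-- Hypothetical sketch only: tag and attribute names are invented
     for illustration and appear in no Xiph specification. -->

<!-- What flat vorbiscomment forces you into:
       ARTIST=(composer) Beethoven
       DRUMMER=J. Smith
-->

<!-- What a structured annotation format could express instead: -->
<contributors>
  <contributor role="composer">
    <name sort="Beethoven, Ludwig van">Ludwig van Beethoven</name>
  </contributor>
  <contributor role="drummer">J. Smith</contributor>
</contributors>
```

The `sort` attribute also addresses the indexability problem Ian raises ("Ludwig van Beethoven" vs "Beethoven, Ludwig van"): the display form and the collation form are carried separately instead of by convention.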
Daniel Aleksandersen
2007-Sep-10 15:39 UTC
[ogg-dev] The use for an XML based metadata format
On Monday 10. September 2007 23:39:50 Silvia Pfeiffer wrote:
> 2) we need an XML annotation format for audio - in particular for music
> - that is more structured than vorbiscomment (and this probably
> applies to video, too)

It would have to apply to any kind of media.

By the way, I have been discussing Dublin Core ("DC") with the developers of the Atom 1.0 specification. It seems the reason they created atom:rights instead of using dc:rights was just about what I thought it was: they thought DC was too loosely defined. Their own atom:rights element was designed to define more clearly what the element contained (escaped HTML, plain text, or whatever else).

When it comes to the other dc: elements, the arguments were about the same: what they contain could be more clearly defined, and redundant attributes and child elements removed.

--
Daniel Aleksandersen
Silvia Pfeiffer wrote:
> What I've gotten out of this discussion so far:
>
> 1) we need to introduce a means by which to do captions; this could be
> done through adding a "caption" element to CMML, or in another
> time-continuous annotation format; so far I am not sure which would be
> the better way

There is also the OggWrit draft <http://wiki.xiph.org/index.php/OggWrit>. Certainly a content description metadata format does not need to address this, as there are so many other places it would fit better. As mentioned elsewhere in the thread, it could well describe the relation of the captions to other resources, as in 3) below.

> 2) we need an XML annotation format for audio - in particular for music
> - that is more structured than vorbiscomment (and this probably
> applies to video, too)

While the particular examples I pick tend to be for music, because those are the obvious ones, I think the interesting applications may be for non-musical resources; see the Metadata talk page, where someone asked (a long time ago) about Learning Object Metadata for teaching resources: <http://wiki.xiph.org/index.php/Talk:Metadata>.

Embedding in Ogg is the simple bit; the only point of contention is whether you use a magic number to label it as metadata or just package XML and let the parser sort it out. (With a bit more experience under my belt, I'm persuaded a magic number might be worthwhile; otherwise there'll be someone who hardcodes their app to expect XML to be a metadata stream.)

The missing bit (and the difficult one) is the format. The best thing to do is to nail down a set of XML or RDF that addresses the obvious needs and allows inclusion of further namespaces/schemas by end developers as needed (c.f. LOM). But I do think music and the cast/ensemble problem might be a nice starting point, as this is something classical music fans have been looking for for a while but which has never been provided by other formats.
Coupled with the fact that Ogg audio covers FLAC too, you may win some audiophiles back. Trackbacks and store links are also probably in the scope of 'relatively straightforward'. That said, RDF makes my head hurt. I spent a while looking at how to do this with DC and friend-of-a-friend (FOAF), but nothing really clicked for me; perhaps a simple and cheerful new namespace to tie it together is what's called for. Oh, I realise that we have XML expertise here in the form of the Annodex/CMML group, but there are also the XSPF people, who seem to know a lot about the darker corners of XML and URIs.

> 3) we need a means to describe relationships between different logical
> bitstreams; we had a discussion about this years ago, but never got to
> a proper specification of this
>
> 4) we need a means to address logical bitstreams by name; this should
> be an ID attribute to be added to Skeleton

3 & 4 are separate points, but obviously 3) needs 4). If there's an obvious way to do the URN bit, then 4) is the most straightforward of the lot.

> These four things are all very different and separate things - number
> 2 may even need further structuring IMO. Yes, they interrelate and
> there should be means to address one from the other. But IMO they all
> need a different approach.

Yes. To recap: 1) is covered elsewhere, potentially several times over. 2) is the big one; what's needed depends on the end use, however I think if we have a good foundation people can add the bits they need. Most of my current ideas about use cases have metadata produced /once/ by the content provider or media management, who can potentially supply the interpreter too if the basic tools are there. 3) How separate is this from 2)? If you view the metadata as a manifest for the physical stream, then it describes the collection of the logical streams and their relations.
On the one hand we might have a concert recording where the overall description for the stream would name the artists; on the other, a film where the musicians might be relegated to the description for the soundtrack (depending how fastidious the metadata supplier is). Yes, there are special bits needed for things like captions and multi-track audio, but these are just relationships to the whole. (This suggests ensuring 4) can address the whole bitstream too.)

The model I've got in my head is a tree which describes properties of the physical stream, where necessary (actually, as part of that process) defining how the logical streams relate to it. The deficiency with that is that the logical streams' relationship to each other is only given via their relations to the overall stream. You can describe their individual properties further down their branches.

N.B. In all I've written above, 'physical stream' really should be read as 'single-link or non-chained physical stream'. I believe it would make sense to expect each link to carry its own metadata, and bases to refer to links. It may be necessary for external references and bizarre corner cases to be able to specify an id in a link within a chain. Maybe add an overall link id to 4), but they probably shouldn't share an addressing mechanism with logical stream ids, for obvious reasons.

Wondering if any of the above makes sense.

--
imalone
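Ian's tree model (a manifest for the physical stream, with logical streams related to it and described further down their branches) might be sketched roughly as below. Every name here is invented for illustration; nothing is taken from Skeleton, CMML, or any draft.

```xml
<!-- Hypothetical manifest sketch for a single-link physical stream;
     all element names and ids are invented for illustration. -->
<physicalstream id="concert-2007">
  <!-- Properties of the whole: the overall description names the artists. -->
  <title>Concert recording</title>
  <contributor role="ensemble">...</contributor>

  <!-- Logical streams are defined by their relation to the whole... -->
  <logicalstream ref="audio-0" relation="main-audio">
    <!-- ...and their individual properties live further down the branch. -->
    <contributor role="soloist">...</contributor>
  </logicalstream>
  <logicalstream ref="captions-en" relation="captions" lang="en"/>
</physicalstream>
```

As Ian notes, the weakness of this shape is visible in the sketch: `audio-0` and `captions-en` are related to each other only indirectly, via their shared parent.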
Daniel Aleksandersen wrote:
> On Monday 10. September 2007 23:39:50 Silvia Pfeiffer wrote:
> > 2) we need an XML annotation format for audio - in particular for music
> > - that is more structured than vorbiscomment (and this probably
> > applies to video, too)
>
> It would have to apply to any kind of media.

I'm not quite sure what you mean by this. Some of the things suggested so far (musicians, for example) are obviously going to be music orientated. If you mean it has to be able to describe things you put in an Ogg stream, then yes, it does. However, you have to pick what it is you want to describe about them, or you'll be there forever. (Scholarly analysis and criticism could probably be omitted for the time being...)

--
imalone
Daniel Aleksandersen wrote:
> By the way, I have been discussing Dublin Core ("DC") with the developers of
> the Atom 1.0 specification. It seems the reason they created atom:rights
> instead of using dc:rights was just about what I thought it was: they
> thought DC was too loosely defined. Their own atom:rights element was
> designed to define more clearly what the element contained (escaped HTML,
> plain text, or whatever else).
>
> When it comes to the other dc: elements, the arguments were about the same:
> what they contain could be more clearly defined, and redundant attributes
> and child elements removed.

(Sorry, should have replied to this at the same time as the last.)

I'd be interested to know which ones. DC is a bit nebulous, but that gives you tremendous freedom too. Atom, on the other hand, has a very specific target for the things it describes (but they did take a very pragmatic approach to their problem, from what I understand, which means they're probably good people to be talking to).

--
imalone
On 11/09/2007, Ian Malone <ibmalone@gmail.com> wrote:
> Embedding in Ogg is the simple bit; the only point of contention
> being whether you use a magic number to label it as metadata or
> just package XML and let the parser sort it out. (With a bit
> more experience under my belt I'm persuaded a magic number might
> be worthwhile, otherwise there'll be someone who hardcodes their
> app to expect XML to be a metadata stream.)

Agreed. Having a magic identifier at the beginning of the stream is useful where you only need to decide how to deal with a track (logical bitstream), for example in the demux stage of a player. It is also useful in a server that does stream recomposition or selection without decode (e.g. serving a temporal subset of the stream, or selecting language tracks).

For CMML, we first tried just using raw XML from byte 0, but later opted for using a binary header with a magic identifier in the first (bos) packet for these reasons.

cheers,

Conrad.
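Conrad's point about demux-time identification can be sketched in a few lines. This is an illustrative sketch, not code from any Xiph library: the Vorbis and Theora magic bytes are from their specs, the CMML magic assumes the draft's 8-byte "CMML\0\0\0\0" identifier, and the final branch shows why bare "XML from byte 0" sniffing is fragile.

```python
def identify_bos(packet: bytes) -> str:
    """Classify a logical bitstream from the first bytes of its bos packet.

    Illustrative sketch only. Vorbis ("\\x01vorbis") and Theora
    ("\\x80theora") magics are per their specs; the CMML magic is
    assumed to be the draft's 8-byte "CMML\\0\\0\\0\\0" identifier.
    """
    if packet.startswith(b"\x01vorbis"):
        return "vorbis"
    if packet.startswith(b"\x80theora"):
        return "theora"
    if packet.startswith(b"CMML\x00\x00\x00\x00"):
        return "cmml"
    # The fragile "raw XML from byte 0" fallback discussed above:
    # anything that merely looks like XML gets treated as metadata.
    if packet.lstrip().startswith(b"<"):
        return "xml-unknown"
    return "unknown"
```

A demuxer or recomposing server only ever needs the bos packet to route or drop the whole logical bitstream, which is exactly why the magic-number approach won out for CMML.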
Daniel Aleksandersen
2007-Sep-10 21:08 UTC
[ogg-dev] Working towards an XML based metadata format
On Monday 10. September 2007 23:39:50 Silvia Pfeiffer wrote:
> 2) we need an XML annotation format for audio - in particular for music
> - that is more structured than vorbiscomment (and this probably
> applies to video, too)

There are two possible ways to go regarding the format:

1) Build on XML+RDF and DC[1] with an additional namespace for the more media-specific elements and attributes required.

or 2) Build an all-new format namespace in XML and incorporate elements from DC with RDF thinking.

To argue for the latter: the Atom 1.0 developers chose to abandon the RDF and DC namespaces because they wanted a format that was not cluttered by, and did not require, multiple namespaces out of the box. They also felt DC was too loosely defined. (Note that RDF and DC combined, in the case of a web feed format, would end up being little more than W3C's RSS 1.0 specification. Cluttered and unendorsed.)

[1] See http://dublincore.org/documents/dc-xml-guidelines/ and http://dublincore.org/documents/dcmes-xml/ to get the general idea of what this would look like.

--
Daniel Aleksandersen
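For concreteness, option 1) might look roughly like the sketch below. The `rdf:` and `dc:` namespace URIs are the real ones; everything under `xmlns:media`, and the URN, are invented placeholders for the additional media-specific namespace Daniel describes.

```xml
<!-- Sketch of option 1: DC in RDF/XML plus a hypothetical media
     namespace. Only the rdf: and dc: namespaces are real. -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:media="http://example.org/ns/ogg-media#">
  <rdf:Description rdf:about="urn:example:some-ogg-stream">
    <dc:title>Symphony No. 5</dc:title>
    <dc:creator>Ludwig van Beethoven</dc:creator>
    <dc:rights>Rights statement goes here</dc:rights>
    <!-- Media-specific refinement the plain DC elements can't carry: -->
    <media:contributor media:role="conductor">...</media:contributor>
  </rdf:Description>
</rdf:RDF>
```

The trade-off Daniel identifies is visible even in this toy example: three namespaces before any content, versus option 2)'s single self-contained namespace.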
On 11/09/2007, Daniel Aleksandersen <aleksandersen+xiphlists@runbox.com> wrote:
> On Monday 10. September 2007 23:39:50 Silvia Pfeiffer wrote:

(Meta(!) comment: are you using your email client's reply function? I'm finding the threading for these messages is broken.)

> > 2) we need an XML annotation format for audio - in particular for music
> > - that is more structured than vorbiscomment (and this probably
> > applies to video, too)
>
> There are two possible ways to go regarding the format:
> 1) Build on XML+RDF and DC[1] with an additional namespace for the more
> media-specific elements and attributes required.
> or 2) Build an all-new format namespace in XML and incorporate elements
> from DC with RDF thinking.
>
> To argue for the latter: the Atom 1.0 developers chose to abandon the RDF
> and DC namespaces because they wanted a format that was not cluttered by
> and requiring multiple namespaces out of the box. They also felt DC was too
> loosely defined. (Note that RDF and DC combined in the case of a web feed
> format would end up being little more than W3C's RSS 1.0 specification.
> Cluttered and unendorsed.)

Notably, Atom also ignores RDF. After seeing RDF used to describe Mozilla extensions, I've come to realise that might be its more natural use. In defence of DC, it is meant to describe collections and has incredible flexibility, whereas Atom has a much more limited set of things to describe. This does lead to it being very sparse, though (the problem I ran into when I tried to do this). One of my favourite DC examples comes from 'Using Dublin Core'[2]:

  Title="The Bronco Buster"
  Creator="Frederic Remington"
  Type="Physical object"
  Format="bronze"
  Format="22 in."

This also points up one of its design features, the reason it is less strictly specified: to leave flexibility to the metadata maintainers. DC on its own doesn't give you a complete vocabulary to describe everything; it must be tailored to your needs.
I guess that's essentially what the Atom people have done.

> [1] See http://dublincore.org/documents/dc-xml-guidelines/ and
> http://dublincore.org/documents/dcmes-xml/ to get the general idea of what
> this would look like.

[2] http://dublincore.org/documents/usageguide/elements.shtml

--
imalone
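As an aside, the 'Bronco Buster' example above maps quite directly onto the simple RDF/XML encoding described in the dcmes-xml document Daniel cited. The namespace URIs are real; the empty `rdf:about` is just a placeholder for whatever the record would identify.

```xml
<!-- The 'Using Dublin Core' example in the dcmes-xml style of
     encoding; rdf:about is left as a placeholder. -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="">
    <dc:title>The Bronco Buster</dc:title>
    <dc:creator>Frederic Remington</dc:creator>
    <dc:type>Physical object</dc:type>
    <dc:format>bronze</dc:format>
    <dc:format>22 in.</dc:format>
  </rdf:Description>
</rdf:RDF>
```

Note how repeating dc:format is legal and meaningful, which is exactly the kind of looseness the Atom developers objected to.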
On 10/09/2007, Daniel Aleksandersen <aleksandersen+xiphlists@runbox.com> wrote:
> On Monday 10. September 2007 23:39:50 Silvia Pfeiffer wrote:
> > 2) we need an XML annotation format for audio - in particular for music
> > - that is more structured than vorbiscomment (and this probably
> > applies to video, too)
>
> It would have to apply to any kind of media.

I think we should, as others have suggested, break the metadata problem up. This is in addition to breaking off the support requirements, as we've begun to do. You want to create a metadata format to describe any type of media; that's a really major task to start with. One option would be to split off areas. We need a common set describing things such as rights; this could also include contributors, but that's such a complex topic that it might warrant its own division. The required data for audio, moving pictures and stills could then each be separate regions. This would let you decide what's needed for each type of media. In addition, such a modular approach would make it easy to incorporate other metadata schemes (I think Learning Object Metadata would be interesting).

Anyway, just firing out ideas.

--
imalone