Daniel,

before you step over everything that has been done before, we need to determine what exactly is the use case for your new specification.

What concerns metadata, we currently have:

* vorbiscomment - this is a header at the beginning of a logical bitstream which has metadata that refers to the complete file; there is a specification, which has been public for a long time and is the de-facto standard that is (or should be) used by all software (see http://xiph.org/vorbis/doc/v-comment.html)

* cmml - this is a logical bitstream for time-continuous textual annotations (metadata) for ogg files (see http://wiki.xiph.org/index.php/CMML)

* skeleton - this is an extension to the ogg bitstream format, which has all the encapsulation-specific low-level metadata (see http://wiki.xiph.org/index.php/Ogg_Skeleton)

All of these are supported by xiph and may need further work/extensions or potentially a replacement if they are not fit to provide what is required.

Before throwing out more random specifications, could we please look at what you are trying to achieve with the new format? Can you tell us where the existing technologies are lacking?

Thanks,
Silvia.

On 9/9/07, Ralph Giles <giles@xiph.org> wrote:
> On Sat, Sep 08, 2007 at 07:42:11PM +0200, Daniel Aleksandersen wrote:
>
> > Anyhow, some audio manager software uses Author, some uses Artist, and many
> > use Creator. It just leaves a mess.
>
> Really? What programs aren't using the recommended artist tag?
>
> -r
Daniel Aleksandersen
2007-Sep-09 05:24 UTC
[ogg-dev] The use for an XML based metadata format
On 2007-09-09, Silvia wrote:
> Daniel,

Hi Silvia,

I realise I should have started with this. I got a little carried away with my ideas. Apparently I am no good when it comes to sharing an idea.

Short answer: The format should describe the media content in an Ogg stream and the relations between the media. The intended usage is media management and sorting through search and media manager software.

Long answer: See below.

> before you step over everything that has been done before, we need to
> determine what exactly is the use case for your new specification.
>
> What concerns metadata, we currently have:
>
> * vorbiscomment - this is a header at the beginning of a logical
> bitstream which has metadata that refers to the complete file; there
> is a specification, which has been public for a long time and is the
> de-facto standard that is (or should be) used by all software (see
> http://xiph.org/vorbis/doc/v-comment.html)
>
> * cmml - this is a logical bitstream for time-continuous textual
> annotations (metadata) for ogg files (see
> http://wiki.xiph.org/index.php/CMML)
>
> * skeleton - this is an extension to the ogg bitstream format, which
> has all the encapsulation-specific low-level metadata (see
> http://wiki.xiph.org/index.php/Ogg_Skeleton)
>
> All of these are supported by xiph and may need further
> work/extensions or potentially a replacement if they are not fit to
> provide what is required.
>
> Before throwing out more random specifications, could we please look
> at what you are trying to achieve with the new format? Can you tell us
> where the existing technologies are lacking?

What I want is a format to give a detailed description of the content in an Ogg stream. The usage would be improved searchability on local machines (possibly even on the web and in file sharing clients too) and sorting in media management software such as Apple iPhoto, Amarok, and WinAmp.

Currently only Vorbis comments describe the content. What I aim to do is replace Vorbis comments. Vorbis comments are limited to a few field names for describing content. There is only a poorly developed look-a-like standard for describing audio files, and all other media formats are left alone. End users may indeed slap on additional field names, but no media management software or search engine knows to look for them.

Another thing this format describes is relations between media in an Ogg stream. See the audio:collection:artwork element for instance. (Imagine an audio:lyrics element too.)

This random specification was intended to start development of a real metadata/content description format. This XML based thing I have put together in a few hours might not be the best. But it does provide a better way to detail describe

I have no doubt that others can do this better. But as no one seemed to be working on a description format, I took it upon myself to start working on *something*.

Hope this clarifies things.
--
Daniel Aleksandersen
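For readers unfamiliar with the container being discussed, the following is a minimal sketch of the comment block layout as described in the Vorbis comment specification linked above: a length-prefixed vendor string, a comment count, then length-prefixed UTF-8 strings of the form NAME=value, with all lengths stored as 32-bit little-endian integers. The function names and the sample field LYRICIST are hypothetical, chosen only for illustration, and the sketch deliberately ignores the packet header and framing bit that surround this block in a real Vorbis stream.

```python
import struct


def build_comment_block(vendor, comments):
    """Serialise a vendor string and a list of 'NAME=value' strings."""
    vendor_bytes = vendor.encode("utf-8")
    out = struct.pack("<I", len(vendor_bytes)) + vendor_bytes
    out += struct.pack("<I", len(comments))
    for comment in comments:
        data = comment.encode("utf-8")
        out += struct.pack("<I", len(data)) + data
    return out


def parse_comment_block(block):
    """Return (vendor, [(NAME, value), ...]) from a comment block."""
    offset = 0

    def read_string():
        # Each string is a 32-bit little-endian length followed by UTF-8 bytes.
        nonlocal offset
        (length,) = struct.unpack_from("<I", block, offset)
        offset += 4
        text = block[offset:offset + length].decode("utf-8")
        offset += length
        return text

    vendor = read_string()
    (count,) = struct.unpack_from("<I", block, offset)
    offset += 4
    fields = []
    for _index in range(count):
        name, _sep, value = read_string().partition("=")
        # Field names are case-insensitive, so normalise them.
        fields.append((name.upper(), value))
    return vendor, fields


# Hypothetical sample data: one recommended field and one ad-hoc field.
block = build_comment_block(
    "example encoder", ["ARTIST=Some Band", "lyricist=Some Writer"]
)
vendor, fields = parse_comment_block(block)
print(vendor, fields)
```

The container accepts the ad-hoc LYRICIST field without complaint. The limitation raised in this message is therefore one of convention rather than syntax: nothing tells a media manager that such a field exists or what it means.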
Ivo Emanuel Gonçalves
2007-Sep-09 08:28 UTC
[ogg-dev] The use for an XML based metadata format
I believe what Silvia tried to tell you is that it would probably be a better idea if you analyzed the existing solutions (which are not JUST Vorbis Comments, as you stated) and tried to integrate your ideas, instead of starting a completely new project. Building upon existing work is usually a better way to create a better product. That's what Free Software is all (mostly) about.

I'd suggest you go over Skeleton and see what is lacking there and how your ideas may integrate there. If you cannot succeed in making Skeleton a better format for metadata using your own project, then perhaps it may be time to create a separate standard. Keep in mind, though, that more standards mean more confusion, less support, more headaches for developers and users alike, and less success.

-Ivo
Daniel,

these are all good ideas and worth progressing. However, it may be better not to merge too many goals in one format (MPEG-7 did that and ended up as a big mess). So, I suggest starting by structuring the types of things you want, then finding out which parts belong where in existing formats such as vorbis comment, Skeleton and CMML, and only then starting to develop a new format.

For example: the relationships between the logical bitstreams is a very semantic description - it needs to be broad enough to enable different types of applications to do different things with it. E.g. a video editor will need to know that there are 3 audio channels in a file and how they overlap each other and also the video channel, while in contrast a speech recognizer might just want to know about the one audio channel in there that is speech, and a music player would be totally ignorant of the video channel. Just solving this generically would be a big feat. It would possibly need to find a place in skeleton. A similar argument goes for the encoding quality description and digital rights.

In contrast, the improved description of the content as in: artist, band name, title, organisation involved, and people involved are things that improve upon vorbiscomment and should probably be included there directly.

All I ask for is *not* to reinvent the wheel when there are already working, semi-complete metadata formats for Ogg that have been carefully prepared to fit with the existing Ogg framework. It would be a sheer nightmare to create another new one that does not fit with any of the existing ones and is not supported by any media application.

OTOH, we could undertake this cleaning exercise also at the end of your process when you have all the fields together that you're after. We would then sit down and discuss where they are best suited, if you prefer that. This should be made clear though.

Regards,
Silvia.

than the artist and or organisation descriptions.
Silvia Pfeiffer wrote:
> Daniel,
>
> before you step over everything that has been done before, we need to
> determine what exactly is the use case for your new specification.
>
> What concerns metadata, we currently have:
>
> * vorbiscomment - this is a header at the beginning of a logical
> bitstream which has metadata that refers to the complete file; there
> is a specification, which has been public for a long time and is the
> de-facto standard that is (or should be) used by all software (see
> http://xiph.org/vorbis/doc/v-comment.html)
>
> * cmml - this is a logical bitstream for time-continuous textual
> annotations (metadata) for ogg files (see
> http://wiki.xiph.org/index.php/CMML)
>
> * skeleton - this is an extension to the ogg bitstream format, which
> has all the encapsulation-specific low-level metadata (see
> http://wiki.xiph.org/index.php/Ogg_Skeleton)
>
> All of these are supported by xiph and may need further
> work/extensions or potentially a replacement if they are not fit to
> provide what is required.
>
> Before throwing out more random specifications, could we please look
> at what you are trying to achieve with the new format? Can you tell us
> where the existing technologies are lacking?
>

<So, some of this has already been covered, but it's been sitting on my computer for two days and won't get sent if I have to rewrite it all.>

My feeling has always been that the human-relevant detailed information is missing. Skeleton is capable of describing all format related information (though external metadata descriptions of a resource may want to contain this information too). Vorbiscomment is good for the basics, mainly for rock or pop music which often fits nicely into the artist - trackname format. CMML does the time resolved things and clip descriptions, but not detailed relationships between tracks.
Classical music is a rich source of examples; works are most often associated with a composer but you may still care who the performers or conductors were. A piece may be split across several movements which may be even further broken down into tracks on the source.

However, pop music actually provides some non-trivial cases along the same lines, made more difficult by the fact that they've been generally ignored. The most prominent is the problem of cover versions; you can have separate creators for the music, lyrics and performance. Sampled tracks are also an issue.

Moving away from music we have plays (act, scene, writer, director, any music credits), plus supplementary information for multiplexed audio or video describing the individual streams (for example, which microphone, instrument or camera).

The above are examples of the type of information you might want to put in that wouldn't fit well into existing examples (with the possible exception of the pseudo-temporal act/scene information). But obviously only the obsessive would want to include any of this without use cases. Some possibilities that come to mind:

Improved cataloguing of media, for example listening to a track and being able to get a list of songs with the same writer but not necessarily composer. You could do that with Vorbiscomment, but a proper metadata format allows you to use URIs to avoid false matches, typos etc.

Streaming radio, provision of trackback-like references to connected resources. You could connect to a server and it could put out a metadata packet at intervals describing the current track, rather than having to restart the Vorbis stream to get a new comment header. This richer metadata could give links to the artist's webpage or an online store.

Online album delivery. I'd distinguish this from the streaming case. Here the metadata provides the equivalent of CD liner notes (for the track rather than an entire album, though it may refer to an external resource describing the collection which could, in turn, be a media metadata resource).

Running out of time, if I think of anything else I'll be sure to mention it later.

--
imalone
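The cataloguing point about URIs can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the track titles, the writer names, the example.org URIs and the dictionary keys are all invented and do not come from any proposed format. It simply groups the same three tracks twice, once by the free-text writer name and once by a writer URI.

```python
from collections import defaultdict

# Hypothetical catalogue entries. Two different people share the display
# name "J. Smith", and one person appears under two spellings.
TRACKS = [
    {"title": "Song A", "writer": "J. Smith",
     "writer_uri": "http://example.org/people/jane-smith"},
    {"title": "Song B", "writer": "Jane Smith",
     "writer_uri": "http://example.org/people/jane-smith"},
    {"title": "Song C", "writer": "J. Smith",
     "writer_uri": "http://example.org/people/john-smith"},
]


def group_by(tracks, key):
    """Group track titles by the value stored under `key`."""
    groups = defaultdict(list)
    for track in tracks:
        groups[track[key]].append(track["title"])
    return dict(groups)


by_name = group_by(TRACKS, "writer")
by_uri = group_by(TRACKS, "writer_uri")

# Matching on the name string wrongly pairs Song A with Song C and
# misses the link between Song A and Song B.
print(by_name)
# Matching on the URI pairs Song A with Song B and keeps Song C apart.
print(by_uri)
```

The same grouping could be run over Vorbis comment values, which is the sense in which the lookup is already possible today. The URI only removes the dependence on everyone spelling a name the same way.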
On 10 Sep 2007 at 18:42, Ian Malone wrote:
> Classical music is a rich source of examples; works are
> most often associated with a composer but you may still
> care who the performers or conductors were. A piece may
> be split across several movements which may be even further
> broken down into tracks on the source.

These things are easily solved with vorbis comments ... Provided you eschew the specific instrument(s) performed. Obviously, Daniel's format has that going for it.

--
-:-:- David K. Gasaway
-:-:- Email: dave@gasaway.org
-:-:- Web : dave.gasaway.org
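As a rough illustration of this point, the sketch below tags one track of a classical recording using only ARTIST, PERFORMER, TITLE and TRACKNUMBER, and relies on the fact that the Vorbis comment specification allows a field name to be repeated. The mapping of composer to ARTIST and of conductor, orchestra and soloist to PERFORMER follows one reading of the recommended field list; the performer names and the helper function are hypothetical.

```python
from collections import defaultdict

# Hypothetical comment list for the first track of a classical recording.
# The performer names are placeholders, not real credits.
COMMENTS = [
    "TITLE=Symphony No. 5 in C minor: I. Allegro con brio",
    "ARTIST=Ludwig van Beethoven",
    "PERFORMER=Example Conductor",
    "PERFORMER=Example Symphony Orchestra",
    "PERFORMER=Example Soloist",
    "TRACKNUMBER=1",
]


def to_multidict(comments):
    """Collect repeated field names into lists, preserving their order."""
    fields = defaultdict(list)
    for comment in comments:
        name, _sep, value = comment.partition("=")
        fields[name.upper()].append(value)
    return dict(fields)


tags = to_multidict(COMMENTS)

# All three performers are recorded, but nothing in the flat NAME=value
# form says who conducted and who played which instrument.
print(tags["ARTIST"], tags["PERFORMER"])
```

The composer and the performers are all captured, which supports the claim that the basic case is covered. What the flat list cannot express without an extra naming convention is the role or instrument attached to each PERFORMER value.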