On 11/09/2007, Daniel Aleksandersen <aleksandersen+xiphlists@runbox.com> wrote:
> On Tuesday 11. September 2007 01:34:35 Ian Malone wrote:
> > Daniel Aleksandersen wrote:
> > > By the way, I have been discussing Dublin Core ('DC') with the
> > > developers of the Atom 1.0 specification. It seems the reason they
> > > created atom:rights instead of using dc:rights was just about what I
> > > thought it was: They thought DC was too loosely defined. Their own
> > > atom:rights element was designed to more clearly define what the
> > > element contained (escaped HTML, clear text, or whatever else).
> > >
> > > When it comes to other dc:elements the arguments were about the same:
> > > Could be more clearly defined what they contain and remove redundant
> > > attributes and children elements.
> >
> > (Sorry, should have replied to this at the same time as the last.)
> >
> > I'd be interested which ones. DC is a bit nebulous, but that gives
> > you tremendous freedom too. Atom on the other hand has a very
> > specific target for the things they describe (but they did take a
> > very pragmatic approach to their problem from what I understand,
> > which means they're probably good people to be talking to).
>
> Atom is a syndication format, like RSS, that carries short descriptions of
> content and links to the full content. I only referred to their work
> because I thought it would be relevant.
>
> The Dublin Core Metadata Initiative is great for describing written
> resources such as books and web pages, and indeed it would have worked in
> the case of Atom as well. However it is no good when it comes to describing
> audio and videos. Mostly because you have no method of describing
> what 'role' people and organisations had in the production. Which is
> precisely why I added the poorly defined role attribute to the person and
> organisation elements.

DC has provision for qualifiers; there is a proposed 'agent-role'
<http://dublincore.org/usage/meetings/2002/05/Agent-roles.html>
which, last time I looked, used the MARC relator list:
<http://www.loc.gov/marc/sourcecode/relator/relatorlist.html>

Two things to notice:
1. That is a massively long list.
2. It doesn't appear to do what we want.

But it is there. However, no such scheme can reasonably provide
support for one-off roles such as 'Othello'. This suggests that beyond
simple role-refinement there are a number of mini-metadata specs
required here.

You've said elsewhere in reference to describing all media types:

> There are drafts for including still images in Ogg streams? Surely they
> would have to be described as well. What camera was used and who made it?
> Who owns it? What do we see on this image?

One question is: do we try to spec everything right from the start
without a good idea of all the use cases, or do we start to nail things
down and leave some flexibility? If you want to describe everything
that could go into a media file I think you benefit from something like
DC to do a lot of the basics, but there /are/ bits missing (the library
card vs programme issue).

I think roles /is/ the right place to start, and the difficulty with
the RDF model is that roles refine relationships, and relationships
really need to be standardised to be any use. How about this instead
(made up element names):

<contributor>
  <person>John Smith</person>
  <role type="actor" name="The Doctor" />
</contributor>

It's obviously possible to get into all kinds of contortions about
what properties something like "person" should have and just how
contributor and role &c. should fit around each other, but I think
this allows cumulative, possibly unique, refinement at the same time
as standardisation.

I notice your description (probably intentionally) splits up into
three separate issues:

1. Technical origin data. This is different from the
   bitrate/dimensions issue. In this case there /must/ already be a
   photographic metadata format in existence.
2. Rights. In a way this is the simplest of them all, since 'no
   technical solutions for legal problems'. Owner, license, date.
   It is the responsibility/choice of the publisher to get them right,
   but it has negligible effect on any legal situation if they're not
   comprehensive.
3. What do we see. Subject descriptive metadata is hard. This would
   be a synopsis in other contexts. But most decent metadata formats
   allow for a free-form description of content; from memory all of
   Atom, DC and CMML do. Additionally things like FOAF for people
   are possible.

--
imalone
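
As a rough illustration of the contributor/role idea in the message above, here is a minimal Python sketch that reads such a fragment. The element and attribute names are the made-up ones from the message; the wrapping <metadata> element and the function name parse_contributors are assumptions added only so the example is self-contained, not part of any proposed format.

```python
import xml.etree.ElementTree as ET

# Element names come from the made-up example in the message.
# The <metadata> wrapper is an assumption, added so there is a single root.
SAMPLE = """
<metadata>
  <contributor>
    <person>John Smith</person>
    <role type="actor" name="The Doctor" />
  </contributor>
</metadata>
"""


def parse_contributors(xml_text):
    """Return a list of (person, [(role_type, role_name), ...]) tuples."""
    root = ET.fromstring(xml_text)
    result = []
    for contributor in root.findall("contributor"):
        person = (contributor.findtext("person") or "").strip()
        # A contributor may carry several roles, so the refinement is cumulative.
        roles = [(r.get("type"), r.get("name")) for r in contributor.findall("role")]
        result.append((person, roles))
    return result


if __name__ == "__main__":
    for person, roles in parse_contributors(SAMPLE):
        print(person, roles)
```

Keeping the role as its own element, rather than an attribute on the person, is what lets one contributor hold more than one role.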
On 11/09/2007, Ian Malone <ibmalone@gmail.com> wrote:
> On 11/09/2007, Daniel Aleksandersen <aleksandersen+xiphlists@runbox.com> wrote:
> > On Tuesday 11. September 2007 01:34:35 Ian Malone wrote:
> > > I'd be interested which ones. DC is a bit nebulous, but that gives
> > > you tremendous freedom too. Atom on the other hand has a very
> > > specific target for the things they describe (but they did take a
> > > very pragmatic approach to their problem from what I understand,
> > > which means they're probably good people to be talking to).
> >
> > The Dublin Core Metadata Initiative is great for describing written
> > resources such as books and web pages, and indeed it would have worked in
> > the case of Atom as well. However it is no good when it comes to describing
> > audio and videos. Mostly because you have no method of describing
> > what 'role' people and organisations had in the production. Which is
> > precisely why I added the poorly defined role attribute to the person and
> > organisation elements.
>
> DC has provision for qualifiers; there is a proposed 'agent-role'
> <http://dublincore.org/usage/meetings/2002/05/Agent-roles.html>
> which, last time I looked, used the MARC relator list:
> <http://www.loc.gov/marc/sourcecode/relator/relatorlist.html>
>
> But it is there. However, no such scheme can reasonably provide
> support for one-off roles such as 'Othello'. This suggests that beyond
> simple role-refinement there are a number of mini-metadata specs
> required here.
>

Okay, I'm not sure if this is a revelatory inspiration or a crazy idea,
but this morning it occurred to me that what might work is three
degrees of refinement for a role:

1. General. This would be like the MARC relator list. If someone
   plays guitar it will say musician. Doesn't seem satisfactory?
   Go to 2. This would take us well beyond music alone and I
   think that generality is a good thing, rather than specifying a
   slightly pat and contrived list: 'guitar', 'vocals', 'drums',
   'theremin', 'bowed guitar'.
2. Refined. Where available, pick from a given list of possibilities.
   This is where you'd put type of musical instrument, for example.
   Options such as 'other instrument' or omitting altogether would
   be perfectly valid.
3. Free-form refinement. E.g. the 'bowed guitar' from above refining
   'guitar'. The more refined you get the less machine readable it
   becomes, but that's okay because you can machine-process the above
   refinements and still display the free-form. Would also specify
   characters. May use things such as URIs where defined
   categories exist; for example if someone produced a URN scheme
   for all characters in Shakespeare's works.

Any thoughts or criticisms on this? RDF allows us to specify
the same endpoint for multiple relationships, or we could handle
roles in a more complex way than simple relationships; either
way the above could be made to work.

--
imalone
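
The three degrees of refinement described above could be modelled roughly as follows. This is only a sketch under stated assumptions: the class name Role, its field names, and the tiny vocabularies GENERAL_ROLES and REFINED_ROLES are hypothetical stand-ins (a real general list such as the MARC relators would be far longer), and nothing here is an agreed format.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical controlled vocabularies, kept tiny for illustration.
GENERAL_ROLES = {"musician", "actor", "author"}
REFINED_ROLES = {
    "musician": {"guitar", "vocals", "drums", "theremin", "other instrument"},
}


@dataclass
class Role:
    general: str                     # level 1: from a fixed general list
    refined: Optional[str] = None    # level 2: from a per-role list, may be omitted
    free_form: Optional[str] = None  # level 3: free text, or a URI where a scheme exists

    def __post_init__(self):
        if self.general not in GENERAL_ROLES:
            raise ValueError(f"unknown general role: {self.general!r}")
        allowed = REFINED_ROLES.get(self.general, set())
        if self.refined is not None and self.refined not in allowed:
            raise ValueError(f"{self.refined!r} is not a refinement of {self.general!r}")

    def machine_key(self):
        """Levels 1 and 2 only: the part software can reliably match on."""
        return (self.general, self.refined)

    def display(self):
        """Most specific description available, for showing to a person."""
        return self.free_form or self.refined or self.general


if __name__ == "__main__":
    role = Role("musician", "guitar", "bowed guitar")
    print(role.machine_key(), role.display())
    print(Role("actor", free_form="Othello").display())
```

Software matches on machine_key() while a player shows display(), which mirrors the point that the free-form level can stay human-only without hurting machine processing of the first two levels.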
Daniel Aleksandersen
2007-Sep-18 06:06 UTC
[ogg-dev] The use for an XML based metadata format
On Tuesday 18. September 2007 14:32:45 Ian Malone wrote:
> On 11/09/2007, Ian Malone <ibmalone@gmail.com> wrote:
> > On 11/09/2007, Daniel Aleksandersen <aleksandersen+xiphlists@runbox.com> wrote:
> > > On Tuesday 11. September 2007 01:34:35 Ian Malone wrote:
> > > > I'd be interested which ones. DC is a bit nebulous, but that gives
> > > > you tremendous freedom too. Atom on the other hand has a very
> > > > specific target for the things they describe (but they did take a
> > > > very pragmatic approach to their problem from what I understand,
> > > > which means they're probably good people to be talking to).
> > >
> > > The Dublin Core Metadata Initiative is great for describing written
> > > resources such as books and web pages, and indeed it would have worked
> > > in the case of Atom as well. However it is no good when it comes to
> > > describing audio and videos. Mostly because you have no method of
> > > describing what 'role' people and organisations had in the
> > > production. Which is precisely why I added the poorly defined role
> > > attribute to the person and organisation elements.
> >
> > DC has provision for qualifiers; there is a proposed 'agent-role'
> > <http://dublincore.org/usage/meetings/2002/05/Agent-roles.html>
> > which, last time I looked, used the MARC relator list:
> > <http://www.loc.gov/marc/sourcecode/relator/relatorlist.html>
> >
> > But it is there. However, no such scheme can reasonably provide
> > support for one-off roles such as 'Othello'. This suggests that beyond
> > simple role-refinement there are a number of mini-metadata specs
> > required here.
>
> Okay, I'm not sure if this is a revelatory inspiration or a crazy idea,
> but this morning it occurred to me that what might work is three
> degrees of refinement for a role:
>
> 1. General. This would be like the MARC relator list. If someone
>    plays guitar it will say musician. Doesn't seem satisfactory?
>    Go to 2. This would take us well beyond music alone and I
>    think that generality is a good thing, rather than specifying a
>    slightly pat and contrived list: 'guitar', 'vocals', 'drums',
>    'theremin', 'bowed guitar'.
> 2. Refined. Where available, pick from a given list of possibilities.
>    This is where you'd put type of musical instrument, for example.
>    Options such as 'other instrument' or omitting altogether would
>    be perfectly valid.
> 3. Free-form refinement. E.g. the 'bowed guitar' from above refining
>    'guitar'. The more refined you get the less machine readable it
>    becomes, but that's okay because you can machine-process the above
>    refinements and still display the free-form. Would also specify
>    characters. May use things such as URIs where defined
>    categories exist; for example if someone produced a URN scheme
>    for all characters in Shakespeare's works.
>
> Any thoughts or criticisms on this? RDF allows us to specify
> the same endpoint for multiple relationships, or we could handle
> roles in a more complex way than simple relationships; either
> way the above could be made to work.

I do not know the English word for this, but I will try to explain:
Would it not be satisfactory to give the category of instrument? I mean:
Surely the English language does provide names for various instrument
categories.

As everyone seems so keen to keep Vorbis comments, how could this be
applied using Vorbis comments?

--
Daniel Aleksandersen