thr3ads.net - ogg dev - [ogg-dev] The use for an XML based metadata format [Sep 2007]

Ian Malone

2007-Sep-11 08:28 UTC

[ogg-dev] The use for an XML based metadata format

On 11/09/2007, Daniel Aleksandersen <aleksandersen+xiphlists@runbox.com> wrote:
> On Tuesday 11. September 2007 01:34:35 Ian Malone wrote:
> > Daniel Aleksandersen wrote:
> > > By the way, I have been discussing Dublin Core ('DC') with the
> > > developers of the Atom 1.0 specification. It seems the reason they
> > > created atom:rights instead of using dc:rights was just about what I
> > > thought it was: they thought DC was too loosely defined. Their own
> > > atom:rights element was designed to define more clearly what the
> > > element contains (escaped HTML, clear text, or whatever else).
> > >
> > > When it comes to other dc: elements the arguments were about the same:
> > > they could define more clearly what they contain, and remove redundant
> > > attributes and child elements.
> >
> > (Sorry, should have replied to this at the same time as the last.)
> >
> > I'd be interested which ones. DC is a bit nebulous, but that gives
> > you tremendous freedom too. Atom on the other hand has a very
> > specific target for the things they describe (but they did take a
> > very pragmatic approach to their problem from what I understand,
> > which means they're probably good people to be talking to).
>
> Atom is a syndication format, like RSS, that carries short descriptions of
> content and links to the full content. I only referred to their work
> because I thought it would be relevant.
>
> The Dublin Core Metadata Initiative are great for describing written
> resources such as books and web pages, and indeed it would have worked in the
> case of Atom as well. However, it is no good when it comes to describing
> audio and video, mostly because you have no method of describing
> what 'role' people and organisations had in the production, which is
> precisely why I added the poorly defined role attribute to the person and
> organisation elements.

DC has provision for qualifiers; there is a proposed 'agent-role'
<http://dublincore.org/usage/meetings/2002/05/Agent-roles.html>
which, last time I looked, used the MARC relator list:
<http://www.loc.gov/marc/sourcecode/relator/relatorlist.html>

Two things to notice:
1.  That is a massively long list.
2.  It doesn't appear to do what we want.

But it is there.  However, no such scheme can reasonably provide
support for one-off roles such as 'Othello'; this suggests that beyond
simple role-refinement there are a number of mini-metadata specs
required here.

You've said elsewhere in reference to describing all media types:
> There are drafts for including still images in Ogg streams? Surely they
> would have to be described as well. What camera was used and who made it?
> Who owns it? What do we see on this image?

One question is: do we try to spec everything right from the
start without a good idea of all the use cases, or do we start
to nail things down and leave some flexibility?  If you want
to describe everything that could go into a media file I think
you benefit from something like DC to do a lot of the basics,
but there /are/ bits missing (the library card vs programme
issue).  I think roles /is/ the right place to start, and the
difficulty with the RDF model is that roles refine relationships,
and relationships really need to be standardised to be of any
use.

How about this instead (made up element names):
<contributor>
  <person>John Smith</person>
  <role type="actor" name="The Doctor" />
</contributor>

It's obviously possible to get into all kinds of contortions
about what properties something like "person" should have
and just how contributor and role &c. should fit around each
other, but I think this allows  cumulative, possibly unique,
refinement at the same time as standardisation.
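
As a rough illustration of how software might read the made-up markup
above, here is a minimal Python sketch using the standard library's
xml.etree.ElementTree.  The element and attribute names (contributor,
person, role, type, name) are the invented ones from the example and are
not part of any agreed spec; parse_contributor is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# The made-up example from the message; none of these names are standardised.
SAMPLE = """
<contributor>
  <person>John Smith</person>
  <role type="actor" name="The Doctor" />
</contributor>
"""

def parse_contributor(xml_text):
    """Return (person, [(role_type, role_name), ...]) for one contributor."""
    root = ET.fromstring(xml_text)
    person = root.findtext("person", default="").strip()
    # A contributor may carry several <role> children, so the roles
    # accumulate rather than overwrite each other.
    roles = [(r.get("type"), r.get("name")) for r in root.findall("role")]
    return person, roles

person, roles = parse_contributor(SAMPLE)
print(person, roles)
```

Because roles are collected as a list, a second <role> element on the
same contributor would simply add another entry, which is one way to
read "cumulative" refinement here.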

I notice your description (probably intentionally) splits up
into three separate issues:
1.  Technical origin data.  This is different from the bitrate/
   dimensions issue.  In this case there /must/ already be
   a photographic metadata format in existence.
2.  Rights.  In a way this is the simplest of them all, since
   'no technical solutions for legal problems'.  Owner, license,
   date.  Responsibility/choice of the publisher to get them
   right, but has negligible effect on any legal situation if they're
   not comprehensive.
3.  What do we see.  Subject descriptive metadata is hard.
   This would be a synopsis in other contexts.  But most
   decent metadata formats allow for a free-form description
   of content; from memory all of Atom, DC and CMML do.
   Additionally things like FOAF for people are possible.

-- 
imalone

Ian Malone

2007-Sep-18 05:34 UTC

[ogg-dev] The use for an XML based metadata format

On 11/09/2007, Ian Malone <ibmalone@gmail.com> wrote:
> On 11/09/2007, Daniel Aleksandersen <aleksandersen+xiphlists@runbox.com> wrote:
> > On Tuesday 11. September 2007 01:34:35 Ian Malone wrote:
> > > I'd be interested which ones. DC is a bit nebulous, but that gives
> > > you tremendous freedom too. Atom on the other hand has a very
> > > specific target for the things they describe (but they did take a
> > > very pragmatic approach to their problem from what I understand,
> > > which means they're probably good people to be talking to).
> >
> > The Dublin Core Metadata Initiative are great for describing written
> > resources such as books and web pages, and indeed it would have worked in
> > the case of Atom as well. However, it is no good when it comes to
> > describing audio and video, mostly because you have no method of
> > describing what 'role' people and organisations had in the production,
> > which is precisely why I added the poorly defined role attribute to the
> > person and organisation elements.
>
> DC has provision for qualifiers; there is a proposed 'agent-role'
> <http://dublincore.org/usage/meetings/2002/05/Agent-roles.html>
> which, last time I looked, used the MARC relator list:
> <http://www.loc.gov/marc/sourcecode/relator/relatorlist.html>
>
> But it is there.  However, no such scheme can reasonably provide
> support for one-off roles such as 'Othello'; this suggests that beyond
> simple role-refinement there are a number of mini-metadata specs
> required here.

Okay, I'm not sure if this is a revelatory inspiration or a crazy idea,
but this morning it occurred to me that what might work is three
degrees of refinement for a role:

1. General.  This would be like the MARC relator list.  If someone
   plays guitar it will say musician.  Doesn't seem satisfactory?
   Go to 2.  This would take us well beyond music alone and I
   think that generality is a good thing, rather than specifying a
   slightly pat and contrived list: 'guitar', 'vocals', 'drums',
   'theremin', 'bowed guitar'.
2. Refined.  Where available pick from a given list of possibilities.
   This is where you'd put type of musical instrument for example.
   Options such as 'other instrument' or omitting altogether would
   be perfectly valid.
3. Free-form refinement.  E.g. the 'bowed guitar' from above refining
   'guitar'.  The more refined you get the less machine-readable it
   becomes, but that's okay because software can work from the above
   refinements and still display the free-form text.  This level would
   also specify characters.  It may use things such as URIs where defined
   categories exist; for example if someone produced a URN scheme
   for all characters in Shakespeare's works.

Any thoughts or criticisms on this?  RDF allows us to specify
the same endpoint for multiple relationships, or we could handle
roles in a more complex way than simple relationships, either
way the above could be made to work.
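
To make the three degrees concrete, here is a minimal Python sketch of
one way they could be modelled.  Everything in it is an assumption for
illustration: the Role class, its field names, and the tiny vocabularies
are hypothetical, and a real level-1 list (such as the MARC relator
list) would be far longer.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical controlled vocabularies, for illustration only.
GENERAL_ROLES = {"musician", "actor", "author"}
REFINED_ROLES = {
    "musician": {"guitar", "vocals", "drums", "other instrument"},
    "actor": set(),  # no refined list defined in this sketch
}

@dataclass
class Role:
    general: str                      # level 1: always required
    refined: Optional[str] = None     # level 2: from a list, may be omitted
    free_form: Optional[str] = None   # level 3: free text or a URI

    def __post_init__(self):
        if self.general not in GENERAL_ROLES:
            raise ValueError(f"unknown general role: {self.general!r}")
        allowed = REFINED_ROLES.get(self.general, set())
        if self.refined is not None and self.refined not in allowed:
            raise ValueError(f"unknown refinement: {self.refined!r}")

    def machine_key(self):
        """Levels 1 and 2 only: what software would index or match on."""
        return (self.general, self.refined)

    def display(self):
        """Most specific label available, for showing to a person."""
        return self.free_form or self.refined or self.general

bowed = Role("musician", "guitar", "bowed guitar")
othello = Role("actor", free_form="Othello")
print(bowed.machine_key(), bowed.display())
print(othello.machine_key(), othello.display())
```

The split between machine_key and display mirrors the point in item 3:
software matches on the two controlled levels, while the free-form text
is only shown to people.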

-- 
imalone

Daniel Aleksandersen

2007-Sep-18 06:06 UTC

[ogg-dev] The use for an XML based metadata format

On Tuesday 18. September 2007 14:32:45 Ian Malone wrote:
> On 11/09/2007, Ian Malone <ibmalone@gmail.com> wrote:
> > On 11/09/2007, Daniel Aleksandersen <aleksandersen+xiphlists@runbox.com> wrote:
> > > On Tuesday 11. September 2007 01:34:35 Ian Malone wrote:
> > > > I'd be interested which ones. DC is a bit nebulous, but that gives
> > > > you tremendous freedom too. Atom on the other hand has a very
> > > > specific target for the things they describe (but they did take a
> > > > very pragmatic approach to their problem from what I understand,
> > > > which means they're probably good people to be talking to).
> > >
> > > The Dublin Core Metadata Initiative are great for describing written
> > > resources such as books and web pages, and indeed it would have worked
> > > in the case of Atom as well. However, it is no good when it comes to
> > > describing audio and video, mostly because you have no method of
> > > describing what 'role' people and organisations had in the
> > > production, which is precisely why I added the poorly defined role
> > > attribute to the person and organisation elements.
> >
> > DC has provision for qualifiers; there is a proposed 'agent-role'
> > <http://dublincore.org/usage/meetings/2002/05/Agent-roles.html>
> > which, last time I looked, used the MARC relator list:
> > <http://www.loc.gov/marc/sourcecode/relator/relatorlist.html>
> >
> > But it is there.  However, no such scheme can reasonably provide
> > support for one-off roles such as 'Othello'; this suggests that beyond
> > simple role-refinement there are a number of mini-metadata specs
> > required here.
>
> Okay, I'm not sure if this is a revelatory inspiration or a crazy idea,
> but this morning it occurred to me that what might work is three
> degrees of refinement for a role:
>
> 1. General.  This would be like the MARC relator list.  If someone
>    plays guitar it will say musician.  Doesn't seem satisfactory?
>    Go to 2.  This would take us well beyond music alone and I
>    think that generality is a good thing, rather than specifying a
>    slightly pat and contrived list: 'guitar', 'vocals', 'drums',
>    'theremin', 'bowed guitar'.
> 2. Refined.  Where available pick from a given list of possibilities.
>    This is where you'd put type of musical instrument for example.
>    Options such as 'other instrument' or omitting altogether would
>    be perfectly valid.
> 3. Free-form refinement.  E.g. the 'bowed guitar' from above refining
>    'guitar'.  The more refined you get the less machine-readable it
>    becomes, but that's okay because software can work from the above
>    refinements and still display the free-form text.  This level would
>    also specify characters.  It may use things such as URIs where defined
>    categories exist; for example if someone produced a URN scheme
>    for all characters in Shakespeare's works.
>
> Any thoughts or criticisms on this?  RDF allows us to specify
> the same endpoint for multiple relationships, or we could handle
> roles in a more complex way than simple relationships, either
> way the above could be made to work.

I do not know the English word for this, but I will try to explain: would it
not be satisfactory to give the category of instrument? I mean, surely the
English language does provide names for various instrument categories.

As everyone seems so keen to keep Vorbis comments: how could this be applied
using Vorbis comments?
-- 
Daniel Aleksandersen
