On Mon, Sep 10, 2007 at 01:19:05PM +0100, Ian Malone wrote:

> as artist=(composer) Beethoven), I think cast lists for films present
> a similar problem. There is consistency and indexability to be
> addressed (Ludwig van Beethoven; Beethoven, Ludwig van;
> Beethoven).

ID3 has a concept of "sort" tags, which provide a string for sorting
purposes which is different from the (presumedly full name) of the usual
tag. TSOP="Beethoven", TCOM="Ludwig van Beethoven".

If you want something more precise, you have to link to unique
identifiers for a particular artist, like a musicbrainz id. I
gather that's not what you're interested in here?

> Finally complex relationships are even harder to
> handle such as specifying a resource's relationship to the rest
> of a collection.

Stepping back a bit, there are three levels of metadata models.

1) The first is just untyped attributes. This includes folksonomy[1]
   tag systems like flickr tags, as well as older systems like
   "keywords" or "PACS numbers"[2] used for subject indexing by some
   scientific communities.

     "Beethoven", "Moonlight", "Piano"

2) The second level adds typing to the attributes. This covers all
   (key, value) pair schemes, including Vorbis comments, EXIF and PNG
   metadata, and XML attributes.

     Composer="Beethoven", Title="Moonlight Sonata"

3) The third level is what is usually called the RDF model, where the
   metadata is described by a graph. The nodes are items, and the
   labelled edges describe relationships between nodes.

     Audio title is Moonlight Sonata.
     Moonlight Sonata composed by Ludwig van Beethoven.
     Ludwig van Beethoven has a short name Beethoven.
     Moonlight Sonata was composed in 1801.
     Audio performed by Arthur Rubinstein.
     Arthur Rubinstein born in 1887.

I think we need to decide which of these models are implied by our
requirements. Once we know what we have to encode, it will be easier
to settle the encoding issues.

There are other axes, such as whether the categories are ad hoc, as
in flickr tags, or reference an absolute collection like musicbrainz
ids.

 -r

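To make the difference between the three levels concrete, here is a minimal sketch in Python. The variable and function names (`tags`, `pairs`, `triples`, `objects`, `composer_short_name`) are hypothetical and the triples are plain tuples, not a real RDF library; the data is just the example from the list above.

```python
# Level 1: untyped attributes, just a bag of tags.
tags = {"Beethoven", "Moonlight", "Piano"}

# Level 2: typed attributes, i.e. (key, value) pairs.
pairs = [("Composer", "Beethoven"), ("Title", "Moonlight Sonata")]

# Level 3: a graph, written here as (subject, predicate, object) triples.
# Plain tuples only; this is an illustration, not an RDF implementation.
triples = [
    ("Audio", "title", "Moonlight Sonata"),
    ("Moonlight Sonata", "composed by", "Ludwig van Beethoven"),
    ("Ludwig van Beethoven", "short name", "Beethoven"),
    ("Moonlight Sonata", "composed in", "1801"),
    ("Audio", "performed by", "Arthur Rubinstein"),
    ("Arthur Rubinstein", "born in", "1887"),
]


def objects(subject, predicate):
    """Return every object reachable from subject along one labelled edge."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def composer_short_name(item):
    """Walk item -> title -> composed by -> short name."""
    names = []
    for title in objects(item, "title"):
        for composer in objects(title, "composed by"):
            names.extend(objects(composer, "short name"))
    return names
```

Only the third form can answer a question like "what is the short name of the composer of this audio?" without the application hard-coding how the fields relate; with the first two, that knowledge lives outside the metadata.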
Ralph Giles wrote:
> On Mon, Sep 10, 2007 at 01:19:05PM +0100, Ian Malone wrote:
>
>> as artist=(composer) Beethoven), I think cast lists for films present
>> a similar problem. There is consistency and indexability to be
>> addressed (Ludwig van Beethoven; Beethoven, Ludwig van;
>> Beethoven).
>
> ID3 has a concept of "sort" tags, which provide a string for sorting
> purposes which is different from the (presumedly full name) of the usual
> tag. TSOP="Beethoven", TCOM="Ludwig van Beethoven".
>
> If you want something more precise, you have to link to unique
> identifiers for a particular artist, like a musicbrainz id. I
> gather that's not what you're interested in here?
>

Well, musicbrainz is great, but it won't identify actors (for
example). More generally, for organisation-internal metadata where
you might want to say 'Bob recorded this', musicbrainz is never
going to give you an ID unless Bob also happens to be a successful
recording artist. You could use an organisation-internal ID scheme
of course; a new band releasing a track online still faces a
problem though.

Actually it might be instructive to bring up (some of!) the tags
from a file that I have on my hd:

title=Bad Penny
artist=Rory Gallagher
MUSICBRAINZ_ALBUMID=ca0d3987-5d30-4f29-8863-599cc71b36d0
MUSICBRAINZ_ALBUMSTATUS=official
MUSICBRAINZ_ALBUMTYPE=compilation
MUSICBRAINZ_ARTISTID=933fdeae-ec68-48e9-a752-8bcfd44bc429
MUSICBRAINZ_NONALBUM=0
MUSICBRAINZ_SORTNAME=Gallagher, Rory
MUSICBRAINZ_TRACKID=e7c53164-8d74-4306-87be-e78fac6fb1a8
MUSICBRAINZ_VARIOUSARTISTS=0
RELEASECOUNTRY=GB

All capitalised tags are the result of an MB automatic
tagging tool. The MB one includes all the uuids, but
also pulls in a 'sortname'. (Incidentally the album name
had been added when encoding, I suspect that's why it was
unmodified.)

Despite UUIDs there's an acknowledgement
here that you don't want to have to query the MB database
every time you want to search your music collection, the
UUID's main feature as far as I can see is to help tie
the database together at their end.

What I think would be useful would be a way of specifying
names which makes it easier to do structured sort.
Having separate first name(s) and last name fields would
work for most cases, but a vCard-like sortname field is
probably more realistic. Make the sort name a mandatory
part of the name description and suggest to application
developers how it should be used.

-- 
imalone

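As a rough sketch of why a locally stored sort name is handy: the tags above can be read straight from the file with no database lookup. This assumes Vorbis-comment-style `KEY=value` lines, whose field names compare case-insensitively and may repeat; `parse_comments` and `first` are hypothetical helper names, not part of any tagging library.

```python
from collections import defaultdict


def parse_comments(lines):
    """Parse KEY=value lines into {UPPERCASE_KEY: [values, ...]}."""
    fields = defaultdict(list)
    for line in lines:
        key, sep, value = line.partition("=")
        if sep:  # ignore malformed lines that contain no '='
            fields[key.upper()].append(value)
    return dict(fields)


def first(fields, key):
    """Return the first value stored for key, or None if it is absent."""
    values = fields.get(key.upper(), [])
    return values[0] if values else None


# A subset of the tags quoted in the message above.
tags = parse_comments([
    "title=Bad Penny",
    "artist=Rory Gallagher",
    "MUSICBRAINZ_ARTISTID=933fdeae-ec68-48e9-a752-8bcfd44bc429",
    "MUSICBRAINZ_SORTNAME=Gallagher, Rory",
    "RELEASECOUNTRY=GB",
])
```

Here `first(tags, "musicbrainz_sortname")` gives the sort string directly, while the artist UUID would need a round trip to the MB database to become anything an application can sort on.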
On Wed, Sep 12, 2007 at 11:55:04PM +0100, Ian Malone wrote:

> All capitalised tags are the result of an MB automatic
> tagging tool. The MB one includes all the uuids, but
> also pulls in a 'sortname'. (Incidentally the album name
> had been added when encoding, I suspect that's why it was
> unmodified.)
> Despite UUIDs there's an acknowledgement
> here that you don't want to have to query the MB database
> every time you want to search your music collection, the
> UUID's main feature as far as I can see is to help tie
> the database together at their end.

*nod*

> What I think would be useful would be a way of specifying
> names which makes it easier to do structured sort.
> Having separate first name(s) and last name fields would
> work for most cases, but a vCard-like sortname field is
> probably more realistic. Make the sort name a mandatory
> part of the name description and suggest to application
> developers how it should be used.

I agree a sort name makes more sense. Separate name fields also don't
internationalize well. But I don't think we can make any of this
mandatory. Falling back to sorting on the full name is reasonable.

 -r
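
The fallback described above could look something like the following. This is only a sketch under assumptions: the `Name` record and `sort_key` function are hypothetical, and simple case-folded string comparison stands in for proper locale-aware collation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Name:
    full: str
    sort: Optional[str] = None  # optional sort name, never required


def sort_key(name):
    """Use the sort name when present, otherwise fall back to the full name."""
    return (name.sort or name.full).casefold()


names = [
    Name("Rory Gallagher", "Gallagher, Rory"),
    Name("Ludwig van Beethoven", "Beethoven, Ludwig van"),
    Name("Arthur Rubinstein"),  # no sort name supplied
]

ordered = [n.full for n in sorted(names, key=sort_key)]
```

With this data the entry lacking a sort name files under "A" for "Arthur" rather than "R" for "Rubinstein", which is the price of keeping the field optional.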