Kasper Souren
2003-Jan-25 05:45 UTC
[vorbis-dev] using vorbis for finding structure in music
Hi,

The topic of my PhD is finding structure in a musical piece, starting from audio. The idea is to use audio descriptors to find repetition and possibly also transformations.

Currently I am using descriptors that can be described as a small subset of the FFT of the FFT, which gives me a low-dimensional vector for each one-second window of sound, with a step size of 0.1 second. I am also thinking of using MFCCs, but it might be more interesting to start from the Vorbis data and process that instead, perhaps by doing an FFT on it to reduce the amount of data.

The advantages are that all the sound files I want to process can be stored as .ogg, and that the decoding into raw PCM audio can be skipped. Another advantage could be that the signal has already been stripped of information that is irrelevant to human hearing. This could also turn out to be a drawback, however: sounds that are masked might be heard anyway, and so might still contribute to the cognition of structure. (Maybe I should think of an example of this to make myself clearer.)

So now I am studying the Vorbis specs. I am looking for a simple overview (preferably graphical!) to make things clearer: something less technical than the files in libvorbis-1.0/doc/, but more technical than most web articles about Ogg Vorbis. I checked http://citeseer.nj.nec.com/cs (which is excellent for finding scientific literature) for "ogg" and "vorbis", but unfortunately found nothing.

bye,
Kasper
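For readers who want something concrete, the descriptor mentioned in the message (a small subset of the FFT of the FFT, computed over one-second windows with a 0.1-second step) might be sketched roughly as below. This is only an illustration under several assumptions that the message does not confirm: "FFT of the FFT" is read as the magnitude spectrum of the magnitude spectrum, the "small subset" is taken to be the first 16 values, frames are zero-padded to a power of two, and all function names and the `n_coeffs` parameter are hypothetical.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT. len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def next_pow2(n):
    """Smallest power of two that is >= n (returns 1 for n <= 1)."""
    p = 1
    while p < n:
        p *= 2
    return p


def magnitude_spectrum(frame):
    """Zero-pad to a power of two; return magnitudes of the lower half of the bins."""
    size = next_pow2(len(frame))
    padded = list(frame) + [0.0] * (size - len(frame))
    return [abs(c) for c in fft(padded)[: size // 2]]


def descriptor(frame, n_coeffs=16):
    """Assumed reading of 'FFT of the FFT': spectrum of the magnitude spectrum,
    truncated to the first n_coeffs values (the subset choice is a guess)."""
    return magnitude_spectrum(magnitude_spectrum(frame))[:n_coeffs]


def descriptors(samples, sample_rate, window_s=1.0, step_s=0.1, n_coeffs=16):
    """One descriptor vector per window_s-second window, advancing by step_s seconds."""
    win = int(round(window_s * sample_rate))
    step = int(round(step_s * sample_rate))
    return [
        descriptor(samples[start:start + win], n_coeffs)
        for start in range(0, len(samples) - win + 1, step)
    ]


if __name__ == "__main__":
    rate = 256  # deliberately low sample rate so the pure-Python FFT stays fast
    tone = [math.sin(2 * math.pi * 32 * i / rate) for i in range(2 * rate)]
    vectors = descriptors(tone, rate)
    print(len(vectors), "vectors of length", len(vectors[0]))
```

The pure-Python FFT keeps the sketch free of dependencies; for real audio at 44.1 kHz a numerical library would be the practical choice.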