I need some sort of utility to calculate a checksum of an Ogg file.

Two differently encoded Ogg:s should give different checksums, but the same file with different tags should give the same result.

(The serial number doesn't work here, obviously. I need something that is changed if a bit of the file is lost.)

--
Björn Lindström <bkhl@elektrubadur.se>
http://bkhl.elektrubadur.se/

--- >8 ----
List archives: http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to 'vorbis-request@xiph.org' containing only the word 'unsubscribe' in the body. No subject is needed. Unsubscribe messages sent to the list will be ignored/filtered.
Björn Lindström wrote:
> I need some sort of utility to calculate a checksum of an Ogg file.
>
> Two differently encoded Ogg:s should give different checksums, but the
> same file with different tags should give the same result.
>
> (The serial number doesn't work here, obviously. I need something that
> is changed if a bit of the file is lost.)

md5sum doesn't suit your needs? It's available for most operating systems.

Alan
On Thu, Jan 08, 2004 at 02:07:58AM +0100, Björn Lindström did utter:
>
> Two differently encoded Ogg:s should give different checksums, but the
> same file with different tags should give the same result.
>
> (The serial number doesn't work here, obviously. I need something that
> is changed if a bit of the file is lost.)

Might you be able to adapt spamsum to your purposes?

http://junkcode.samba.org/ftp/unpacked/junkcode/spamsum/README

It is a non-propagating hash, so a change of tags will only affect a very small part of the output hash. It's also robust to alignment changes, which is useful if the two files have comment blocks of different lengths at the beginning.

I don't know how much difference the serial number will make, though...

(OK, so I'm curious now. I go test...)

Result: serial number differences are indeed enough to muss up spamsum. My extremely basic test results are included below.

IMHO you're gonna need some kind of Ogg-aware checksum that can ignore serial numbers and so on... maybe an oggdiff? ;)

.../Nemo
--
------------------------------------------ --------------------------
earth native

# with random serial numbers:
$ oggenc TheBeatles-Yesterday.wav -o output1.ogg
$ oggenc TheBeatles-Yesterday.wav -a "TheBeatles" -G "MelodicPopRock" -d "196x" -t "Yesterday" -l"some beatles album the name of which escapes me right now" -o output2.ogg
$ ls -o *.ogg
-rw-r--r-- 1 nemo 1744426 Jan 8 16:23 output1.ogg
-rw-r--r-- 1 nemo 1744570 Jan 8 16:23 output2.ogg
$ spamsum *.ogg
49152:82W2/HoEDSkohS14GAa6r4uxaBPfZU3+K:NWSoMC/rdUBPf9K
49152:RYc0p1bS5v5IwtR/mM9m6ikvrhk9QtVMQ:+cHIGVk6dzhk9DQ
# you can see the spamsum results are wildly different.

# with identical serial numbers, but still differing comments:
$ oggenc TheBeatles-Yesterday.wav -o output1.ogg -s 6
$ oggenc TheBeatles-Yesterday.wav -a "TheBeatles" -G "MelodicPopRock" -d "196x" -t "Yesterday" -l"some beatles album the name of which escapes me right now" -o output2.ogg -s 6
$ ls -o *.ogg
-rw-r--r-- 1 nemo 1744426 Jan 8 16:16 output1.ogg
-rw-r--r-- 1 nemo 1744570 Jan 8 16:17 output2.ogg
$ spamsum *.ogg
49152:MzkAq7SoPdFjid0QXnLIKCk80uD2w0F8W:Ak7jw5EKJDuD2gW
49152:czkAq7SoPdFjid0QXnLIKCk80uD2w0F8W:wk7jw5EKJDuD2gW
# you can see the spamsum results are almost identical, as expected
On Thursday 08 January 2004 12:07, Björn Lindström wrote:
> I need some sort of utility to calculate a checksum of an Ogg file.
>
> Two differently encoded Ogg:s should give different checksums, but the
> same file with different tags should give the same result.
>
> (The serial number doesn't work here, obviously. I need something that
> is changed if a bit of the file is lost.)

Could you explain why you need this? The Ogg framing layer already has checksums embedded in it, so if part of the file is corrupted or missing, it's easily detected (see, for example, ogginfo: it'll let you know some of this information, though not currently all).

If you really need this for some reason (and I'm assuming you want a tool for Vorbis, not for general-purpose Ogg), then it'd be very simple to write one. Just calculate some sort of hash (perhaps an MD5) over the packet data for all packets in the stream except the headers.

Mike
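Mike's suggestion of hashing only the packet data after the headers could be sketched roughly as follows. This is a minimal illustration in Python, not an existing tool: the names `iter_packets` and `audio_md5` are made up here, and the code assumes a single, unchained Vorbis stream whose first three packets are the identification, comment and setup headers. It does not verify the page CRCs, and serial numbers and page boundaries never enter the hash.

```python
import hashlib


def iter_packets(data):
    """Yield complete packets from one logical Ogg stream held in `data` (bytes).

    Assumes a single, unmultiplexed stream; page CRCs are not checked.
    """
    pos = 0
    partial = bytearray()
    while pos + 27 <= len(data):
        if data[pos:pos + 4] != b"OggS":
            raise ValueError("lost Ogg page sync at offset %d" % pos)
        # Byte 26 of the page header is the number of lacing values.
        n_segments = data[pos + 26]
        lacing = data[pos + 27:pos + 27 + n_segments]
        body = pos + 27 + n_segments
        for size in lacing:
            partial += data[body:body + size]
            body += size
            # A lacing value below 255 marks the end of a packet;
            # a value of 255 means the packet continues.
            if size < 255:
                yield bytes(partial)
                partial = bytearray()
        pos = body


def audio_md5(data, header_packets=3):
    """MD5 over every packet after the first `header_packets` packets.

    The default of 3 is an assumption that the stream is Vorbis
    (identification, comment and setup headers).
    """
    digest = hashlib.md5()
    for index, packet in enumerate(iter_packets(data)):
        if index >= header_packets:
            digest.update(packet)
    return digest.hexdigest()
```

It would be called as `audio_md5(open("output1.ogg", "rb").read())`. Two files that differ only in tags or serial number should give the same digest as long as the encoder produced identical audio packets, which has not been checked here against real oggenc output.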
Björn Lindström wrote:
> I need some sort of utility to calculate a checksum of an Ogg file.
>
> Two differently encoded Ogg:s should give different checksums, but the
> same file with different tags should give the same result.

What about these two projects, which aim to create checksums and bitprints for audio data and other files?

* Bitzi (http://www.bitzi.com) has a utility called bitcollider, which creates all sorts of checksums for a given file. Among the checksums is one called bz:audio_sha1, which ignores the non-audio parts of an audio file and seems to support Ogg Vorbis files.
  Public CVS at http://cvs.sourceforge.net/viewcvs.py/bitcollider
  Source code at http://bitzi.com/developer/code

* MusicBrainz (http://www.musicbrainz.org) uses the mbtagger utility, which creates an audio signature for audio files (including Ogg Vorbis files). I have no idea how this works exactly, but part of the process uses TRM fingerprinting from relatable.com, which is closed source.
  Public CVS at http://cvs.musicbrainz.org/cvs/mb_tagger

Hope this is useful.

Martin