Hi,
maybe this is of interest for someone out there. I found that taking the
difference between an original sample and its decoded Vorbis counterpart
is a cool way to mangle sounds (especially voice samples get a nice
weird touch :] ).
A side effect is that the result contains not only exactly the
information that the encoder omits, but also whatever it adds to the
sample (artifacts). Because the encoder leaves out inaudible data,
especially lots of high frequencies, the resulting sample sounds like it
went through an extreme highpass filter. It also has a swooshy sound
that probably comes from masked sounds (quiet things right after
something loud cannot be perceived well, that's why lossy codecs drop
them... as far as I understood). All this is completely normal and
appreciated, because it saves us all lots of disk space, and with OGG
even at high fidelity. :) As said above, this
experiment also shows what has been added by the encoder, in this case
the rumbling in the middle/lower frequency spectrum. Those who are
interested can now download differences.zip from
ftp://instinct.student.utwente.nl/pub/groups/kolabore/ogg/ . It contains
a single .WAV file.
You can make those difference samples yourself pretty easily, too. All
you need is a wave editor that supports inverting and mixing two samples
plus some tool to decode your .ogg back to .wav. For my experiments, I
used CoolEdit and Winamp's diskwriter.
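For anyone who would rather script this than click through a wave
editor, here is a rough sketch using only Python's standard `wave` and
`array` modules. It assumes both files are uncompressed 16-bit PCM with
the same channel count and sample rate, and that the decoded file lines
up with the original sample for sample. The function names and file
names are made up for illustration.

```python
import array
import sys
import wave


def read_pcm16(path):
    """Return (params, samples) for a 16-bit PCM .wav file."""
    with wave.open(path, "rb") as w:
        if w.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit samples")
        params = w.getparams()
        samples = array.array("h", w.readframes(w.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # .wav sample data is little-endian
    return params, samples


def write_difference(original_path, decoded_path, out_path):
    """Write (original - decoded), sample by sample, to out_path."""
    params, a = read_pcm16(original_path)
    other, b = read_pcm16(decoded_path)
    if (other.nchannels, other.framerate) != (params.nchannels, params.framerate):
        raise ValueError("channel count or sample rate differs")
    n = min(len(a), len(b))  # ignore any extra samples at the end
    # Subtracting is the same as inverting one clip and mixing the two.
    # Clamp to the signed 16-bit range so nothing wraps around.
    diff = array.array(
        "h", (max(-32768, min(32767, a[i] - b[i])) for i in range(n))
    )
    if sys.byteorder == "big":
        diff.byteswap()
    with wave.open(out_path, "wb") as w:
        w.setnchannels(params.nchannels)
        w.setsampwidth(2)
        w.setframerate(params.framerate)
        w.writeframes(diff.tobytes())


# Hypothetical usage (file names are placeholders):
# write_difference("original.wav", "decoded.wav", "difference.wav")
```

The output is not normalized, so it will usually be very quiet.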
Either the original .wav or the decoded .ogg has to be "inverted",
which means that the waveform is flipped about the x-axis. (Flipping it
about the y-axis would make the clip play backwards.) For example, if a
sample in the clip has the value +10000 (for signed 16bit samples), it
has the value -10000 after inverting the clip. This equals a phase shift
of 180°.
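In code, inverting is just negating every sample. A minimal sketch,
assuming the clip is a plain list of signed 16-bit values (the function
name `invert` is made up for this example). The one catch is -32768,
which has no positive counterpart in 16 bits, so it is clamped here.

```python
def invert(samples):
    """Flip each signed 16-bit sample about the x-axis."""
    # -(-32768) would be 32768, which does not fit in 16 bits, so clamp it.
    return [min(-s, 32767) for s in samples]


clip = [10000, -10, 0, -32768]
print(invert(clip))  # [-10000, 10, 0, 32767]
```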
Mixing two samples means adding their values. -10+5=-5 ... easy as that.
If the decoded .ogg and the .wav were 100% the same, the result would be
all zeroes, i.e. silence, if one of the two samples was inverted before.
Since this is not the case with OGG, we can hear the difference between
the two samples now. The lowpass filtering of the encoder cuts off some
high frequencies, so subtracting its lowpassed reproduction from the
original that contains all frequencies gives us a highpassed version of
the original clip. ((lo+hi)-lo=hi) Another example would be when there
are two otherwise identical music clips, one with vocals, the other as an
instrumental version. Inverting one of those and then mixing them
together would result in the vocals without the music.
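The invert-and-mix idea can be sketched in a few lines. This assumes
both clips are equally long lists of signed 16-bit samples; the names
`mix` and `difference` and the sample values are made up, with the
second list standing in for a decoded .ogg.

```python
def mix(a, b):
    """Add two clips sample by sample, clamped to the signed 16-bit range."""
    if len(a) != len(b):
        raise ValueError("clips must have the same length")
    return [max(-32768, min(32767, x + y)) for x, y in zip(a, b)]


def difference(original, decoded):
    """Invert the decoded clip and mix it with the original."""
    return mix(original, [-s for s in decoded])


original = [100, -200, 300, -400]
decoded = [98, -203, 300, -390]  # invented values, not real codec output

print(difference(original, original))  # [0, 0, 0, 0]  (identical clips: silence)
print(difference(original, decoded))   # [2, 3, 0, -10]  (what is left over)
```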
Back to the .wav in differences.zip - after doing what I just described,
I normalized it. No compression or similar, just normalization. The
actual differences are nowhere near this loud before normalizing.
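Normalizing here means peak normalization: multiplying every sample by
the same factor so that the loudest one just reaches full scale. A small
sketch under that assumption (the function name and the `peak` parameter
are made up, and this is not necessarily how CoolEdit does it
internally):

```python
def normalize(samples, peak=32767):
    """Scale a clip so its loudest sample reaches `peak`."""
    loudest = max((abs(s) for s in samples), default=0)
    if loudest == 0:
        return list(samples)  # pure silence: nothing to scale
    gain = peak / loudest
    return [round(s * gain) for s in samples]


quiet = [2, 3, 0, -10]
print(normalize(quiet))  # [6553, 9830, 0, -32767]
```

Every sample gets the same gain, so the relative levels inside the clip
stay as they were.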
Most certainly, this technique is nothing new to the developers, because
as far as I know this is similar to what you do first when making
(lossless) stereo coupling. Anyways, maybe you can even use it to tune
the encoder and/or analyze artifacts etc, just in case you didn't have
that idea already.
Thanks for reading and all that ...
Moritz
P.S.: Unfortunately, this doesn't work with MP3 files. Because of the
short silence that is inserted in the beginning of each decoded MP3, the
samples aren't where their inverted counterparts are. Since I don't know
any decoding tools that create identical-to-the-original filesizes from
MP3s, and I really don't feel like counting wtfknowshowmany samples in
CoolEdit at the maximum zoom level by hand ... you know. SuXx0r. :) I
can't code, that's why I won't do it. :D
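The counting by hand could probably be automated. Below is a rough
brute-force sketch, assuming the decoded clip is simply the original
with an unknown number of extra samples in front (real MP3 decoders may
differ in other ways too, so treat this as a starting point). The
function name and both parameters are invented for this example. It
tries each possible delay and keeps the one where the two clips differ
least, so the compared window should contain actual sound, not silence.

```python
def find_delay(original, decoded, max_delay=5000, window=2000):
    """Guess how many extra samples sit at the start of `decoded`.

    Tries every delay from 0 to max_delay and keeps the one where the
    first `window` samples of `original` differ least from `decoded`.
    """
    best_delay, best_error = 0, None
    for delay in range(max_delay + 1):
        n = min(window, len(original), len(decoded) - delay)
        if n <= 0:
            break  # ran past the end of the decoded clip
        # Mean absolute difference at this candidate delay.
        error = sum(abs(original[i] - decoded[i + delay]) for i in range(n)) / n
        if best_error is None or error < best_error:
            best_delay, best_error = delay, error
    return best_delay


# Made-up test signal with 57 samples of silence put in front.
original = [(i * 37) % 201 - 100 for i in range(300)]
decoded = [0] * 57 + original
print(find_delay(original, decoded, max_delay=100, window=200))  # 57
```

Dropping that many samples from the start of the decoded clip should
line it up with the original again.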