Hello,

I had an idea about judging the quality of Ogg Vorbis (or any other lossy codec).

I took a WAV file and encoded it to Ogg. Then I decoded the Ogg file back to WAV and inverted its phase. When the original WAV is mixed with the phase-inverted decoded file, any identical parts of the compressed and uncompressed audio should cancel out. Of course there is always a residual (a "rest") of sound, because the encoder is not lossless. Could this residual be an indicator of the quality of the encoder?

I know that evaluating the quality of a perceptual encoder with technical equipment is nearly impossible and that it is better to let the ear do this. But I'd like to know what you think about this idea.

Greetings
Stoffke

--- >8 ----
List archives: http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to 'vorbis-request@xiph.org' containing only the word 'unsubscribe' in the body. No subject is needed. Unsubscribe messages sent to the list will be ignored/filtered.
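The procedure described above (often called a null test) can be sketched in a few lines. This is a minimal illustration, assuming the two signals are already time-aligned and equally long; the sine wave and the coarse quantizer are stand-ins for a real recording and a real encode/decode round trip, and the function names are hypothetical.

```python
import math

def null_test_residual(original, decoded):
    """Mix the original with the phase-inverted decode, sample by sample."""
    if len(original) != len(decoded):
        raise ValueError("signals must be time-aligned and equally long")
    return [o + (-d) for o, d in zip(original, decoded)]

def rms_db(samples):
    """RMS level in dB relative to full scale (1.0); -inf for pure silence."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square) if mean_square > 0 else float("-inf")

# Stand-in signals (assumptions): one second of a 440 Hz sine, and a coarsely
# quantized copy that plays the role of the decoded lossy file.
RATE = 44100
original = [0.5 * math.sin(2 * math.pi * 440 * n / RATE) for n in range(RATE)]
decoded = [round(s * 64) / 64 for s in original]

residual = null_test_residual(original, decoded)
print(f"signal level:   {rms_db(original):.1f} dBFS")
print(f"residual level: {rms_db(residual):.1f} dBFS")
```

Identical inputs cancel completely, so the residual level is minus infinity; anything the round trip changed shows up as a finite level. Whether that level says anything about audible quality is a separate question.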
See http://www.hydrogenaudio.org/index.php?act=ST&f=1&t=3314&hl=invert,and,quality&s=419608f687d5452e48fe4c330432c565 for why this is a terrible idea.
The problem with using technical methods to judge the quality of a perceptual codec is that the codec is tuned to the human ear. For example, take an MP3 encoded at a bitrate of 128 kbps and a Vorbis file encoded at quality setting 4, which come out at roughly the same file size. Run both through an application that removes center material from the stereo image, and the leftover material will contain some swishing, caused by an echo or a similar artifact. When the center material is not removed, the human ear does not hear this effect at all.

A difference between the two formats does become apparent when this is done, though. With a vocal remover, the swishing in the Vorbis file has no effect on the rest of the music, whereas in the MP3 file the swishing greatly distorts the surrounding music.

These are all just technical differences, however. You have to let the ear hear what it hears to actually judge the perceptual quality of the encoded file. It just so happens that some difference in quality between these two files is apparent to most listeners: at this bitrate/quality setting, the Vorbis file does sound perceptually much better than the MP3. But this was meant as an example of how much applying technical equipment and operations to decoded files can distort the actual perceptual quality of the encoded file as heard by the human ear.

When comparing two file formats, however, this method may be used to judge the technical aspects of the encoded files. It can give you a good idea of just how much of the actual music is being taken out by the compression scheme and the psychoacoustic models of the encoders. Most likely, the differences you described are caused by the channel coupling of the encoder.
The encoder compares the left and right channels and couples them according to the perception of the listener, in an attempt to achieve greater quality in a smaller file. But when technical applications such as phase inverters and vocal removers are used, the mathematical differences are exposed, and the sound you hear becomes both perceptually and mathematically very different from the original.

Regards,
Lorenzo
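Lorenzo's point about channel coupling and vocal removers can be illustrated with a toy model. The sketch below is an assumption-laden stand-in, not Vorbis's actual coupling scheme: it keeps the sum of the channels exact, quantizes the difference coarsely, and then compares the relative error during normal playback with the relative error after a simple "left minus right" vocal remover. All names and signal parameters are hypothetical.

```python
import math

def quantize(value, step):
    """Round a value to the nearest multiple of step."""
    return round(value / step) * step

def toy_couple(left, right, side_step):
    """Toy mid/side coding: mid is kept exact, side is quantized coarsely."""
    out_left, out_right = [], []
    for l, r in zip(left, right):
        mid = (l + r) / 2
        side = quantize((l - r) / 2, side_step)
        out_left.append(mid + side)
        out_right.append(mid - side)
    return out_left, out_right

def rms(samples):
    """Root mean square of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

RATE = 8000
# A loud "vocal" in the center and a quiet "guitar" in the left channel only.
vocal = [0.5 * math.sin(2 * math.pi * 220 * n / RATE) for n in range(RATE)]
guitar = [0.05 * math.sin(2 * math.pi * 330 * n / RATE) for n in range(RATE)]
left = [v + g for v, g in zip(vocal, guitar)]
right = list(vocal)

dec_left, dec_right = toy_couple(left, right, side_step=0.02)

# Error relative to signal when listening to the left channel as-is.
err_left = [d - o for d, o in zip(dec_left, left)]
ratio_full = rms(err_left) / rms(left)

# Error relative to signal after a "vocal remover" (left minus right).
removed_orig = [l - r for l, r in zip(left, right)]
removed_dec = [l - r for l, r in zip(dec_left, dec_right)]
err_removed = [d - o for d, o in zip(removed_dec, removed_orig)]
ratio_removed = rms(err_removed) / rms(removed_orig)

print(f"error/signal, normal playback: {ratio_full:.4f}")
print(f"error/signal, center removed:  {ratio_removed:.4f}")
```

In this toy setup the same coding error is small next to the full mix but large next to what remains once the center is cancelled, which is consistent with the observation that such tools expose differences the ear would not otherwise notice.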
It seems many people are interpreting this as a way to calculate the actual QUALITY of the encoded file. I don't think that is what he meant. I feel he was referring mainly to a fairly recent thread about trying to find the average bitrate of an encoded file. I think he's saying you could fairly easily and accurately determine the q level the file was encoded at. Also, I don't think this would work, because how well the encoder works varies widely from song to song.
> Then decoded ogg to wav and inverted its phase. When mixing the original
> wav with the phase-inverted decoded ogg-file, any identical parts of compressed and
> uncompressed audio should be eliminated.
> Of course there's always a "rest" of sound because the encoder is not lossless.
> Could this "rest" be an indicator for the quality of the encoder?

The best encoder according to this indicator would be the one creating the least mean square error in either the frequency or the time domain. That encoder is simple to write. But it would sound like shit ;) Hence your indicator must be flawed.

Christian
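One way to see why a single residual-energy number is a weak indicator: two very different errors can produce exactly the same mean square error. The sketch below is only an illustration under assumed signals (a 440 Hz sine as the "original", a faint 3 kHz tone as the "error"); it makes no claim about how either version would actually sound, only that the metric cannot tell them apart.

```python
import math

def mse(a, b):
    """Mean square error between two equally long sample lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

RATE = 8000
N = RATE  # one second of audio
original = [0.5 * math.sin(2 * math.pi * 440 * n / RATE) for n in range(N)]

# Error A: a faint 3 kHz tone spread evenly over the whole second.
err_a = [0.01 * math.sin(2 * math.pi * 3000 * n / RATE) for n in range(N)]

# Error B: the same tone confined to the first 10 ms, scaled to equal energy.
burst = [err_a[n] if n < N // 100 else 0.0 for n in range(N)]
scale = math.sqrt(sum(e * e for e in err_a) / sum(e * e for e in burst))
err_b = [scale * e for e in burst]

decoded_a = [o + e for o, e in zip(original, err_a)]
decoded_b = [o + e for o, e in zip(original, err_b)]

print(f"MSE with error A: {mse(original, decoded_a):.3e}")
print(f"MSE with error B: {mse(original, decoded_b):.3e}")
print(f"peak error A: {max(abs(e) for e in err_a):.3f}")
print(f"peak error B: {max(abs(e) for e in err_b):.3f}")
```

Both versions get the same score even though one error is a steady background component and the other is a short burst with a much higher peak. A perceptual codec is free to place its error where it is least audible, which this metric does not reward.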
On Fri, 2003-01-24 at 16:16, Stoffke wrote:
> Hello
>
> I had an idea about judging the quality of ogg vorbis
> (or any other lossy codec)
>
> I took a wave-file and encoded it to ogg.
> Then decoded ogg to wav and inverted its phase. When mixing the original
> wav with the phase-inverted decoded ogg-file, any identical parts of compressed and uncompressed audio should be eliminated.
> Of course there's always a "rest" of sound because the encoder is not lossless.
> Could this "rest" be an indicator for the quality of the encoder?

No, because Vorbis removes stereo information to compress the file, and this causes big differences between the original and the compressed file. The psychoacoustic model used by Vorbis is much more complex than a simple difference between two numbers.

Bye.
Daniel Schregenberger
2003-Jan-25 13:32 UTC
[vorbis] just an idea about quality evaluation
Lorenzo Prince wrote:
> The problem with using technical methods to judge the quality of a
> perceptual codec is that the codec is designed to be tuned to the human ear.

Hmmm... maybe this is a bad idea too, but couldn't I use it to see the difference between two lossy formats? I mean, encode a WAV with q4 and q5 and see how different the output is, using the method Stoffke described.

Or maybe to find some decent quality settings for converting MP3s to Ogg. If I have a 256k MP3, what quality should I use to
a) not lose too much (converting is bad, I know, but 256k -> q10 shouldn't be that bad)
b) not waste too much space (maybe 256k -> q8 results in almost the same)

--
Daniel

"Not only is this incomprehensible, but the ink is ugly and the paper is from the wrong kind of tree." -- Professor W.
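A comparison like the one Daniel proposes could be scripted with Python's standard `wave` module. This is a rough sketch under several assumptions: both files are 16-bit PCM WAV with the same sample rate and channel count, they are already time-aligned, and the helper names `read_pcm16` and `residual_dbfs` are made up for this example. It reports how large the difference is, not how audible it is.

```python
import array
import math
import sys
import wave

def read_pcm16(path):
    """Read a 16-bit PCM WAV file; return (interleaved) samples in [-1, 1)."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    return [s / 32768 for s in samples]

def residual_dbfs(reference_path, decoded_path):
    """Level of (reference - decoded) in dB relative to full scale."""
    reference = read_pcm16(reference_path)
    decoded = read_pcm16(decoded_path)
    count = min(len(reference), len(decoded))
    if count == 0:
        raise ValueError("nothing to compare")
    # zip stops at the shorter file, so trailing extra samples are ignored.
    mean_square = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / count
    return 10 * math.log10(mean_square) if mean_square > 0 else float("-inf")
```

With hypothetical file names, `residual_dbfs("original.wav", "decoded_q4.wav")` and `residual_dbfs("original.wav", "decoded_q5.wav")` would give one number per quality setting. For the MP3-to-Ogg case the reference would be the decoded MP3 rather than the original WAV.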
Hello,

Thanks for your replies. I see: psychoacoustics might be too complex to be evaluated with some technical tricks.

But I still wonder where these leftover sounds come from. It's interesting that the signal is almost non-tonal. For example, a voice would sound like a whisper. Ogg files encoded with lower '-q' values result in more leftover information (though that says nothing about the subjective sound quality).

Regards
Stoffke
On Fri, Jan 24, 2003 at 05:16:10PM +0100, Stoffke wrote:
> Hello
>
> I had an idea about judging the quality of ogg vorbis
> (or any other lossy codec)
<snip>

Others have commented on why this is a bad test for perceptually lossless encoding; *however*, this test does have its place. If you're not comparing how the codecs sound (which is what most people are interested in), but what data they throw away, this test is very valuable.

I did a linguistics research project on lossy encodings at low bitrates with MP3. We used this technique to compare what energy was lost, and whether that loss was acceptable for linguists who wanted to do phonetics research with lossy compression. Spectral comparisons are also very useful for this kind of "what am I actually throwing out" comparison.

Note, though, that these comparisons really don't tell you anything about how something sounds.

--
Ross Vandegrift
ross@willow.seitz.com

A Pope has a Water Cannon. It is a Water Cannon.
He fires Holy-Water from it. It is a Holy-Water Cannon.
He Blesses it. It is a Holy Holy-Water Cannon.
He Blesses the Hell out of it. It is a Wholly Holy Holy-Water Cannon.
He has it pierced. It is a Holey Wholly Holy Holy-Water Cannon.
He makes it official. It is a Canon Holey Wholly Holy Holy-Water Cannon.
Batman and Robin arrive. He shoots them.
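The kind of spectral "what energy was lost" comparison Ross mentions might look roughly like the sketch below. It is not his actual method: the two-tone test signal, the band edges, and the "decoded" version that simply drops the high tone are all assumptions chosen to keep the example small, and the naive DFT is only practical for short frames.

```python
import cmath
import math

def power_spectrum(frame):
    """Power per bin from a naive DFT (O(n^2), fine for one short frame)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(acc / n) ** 2)
    return spectrum

def band_energy(spectrum, rate, n, lo_hz, hi_hz):
    """Sum the power of all bins whose center frequency is in [lo_hz, hi_hz)."""
    return sum(p for k, p in enumerate(spectrum) if lo_hz <= k * rate / n < hi_hz)

RATE, N = 8000, 256
# Stand-in original: a strong 500 Hz tone plus a weaker 3 kHz tone.
original = [0.5 * math.sin(2 * math.pi * 500 * t / RATE)
            + 0.2 * math.sin(2 * math.pi * 3000 * t / RATE) for t in range(N)]
# Stand-in decode: the high tone has been discarded entirely.
decoded = [0.5 * math.sin(2 * math.pi * 500 * t / RATE) for t in range(N)]

spec_orig = power_spectrum(original)
spec_dec = power_spectrum(decoded)

BANDS = [(0, 2000), (2000, 4000)]
lost = {
    band: band_energy(spec_orig, RATE, N, *band) - band_energy(spec_dec, RATE, N, *band)
    for band in BANDS
}
for (lo, hi), energy in lost.items():
    print(f"{lo:>4}-{hi:<4} Hz: energy lost = {energy:.4f}")
```

Here the low band shows essentially no loss while the high band shows all of the missing tone's energy. That localizes what was thrown away, which is the use case described above; it still says nothing about how the result sounds.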