A couple of points before I look at this too hard:

"For VP3, since you can't do a constant quality encode"

What? That should not be the case. Could be a QT-specific bug. Theora at any rate is certainly capable of doing a constant-quality encode. This probably explains your standard deviation complaint -- VP3 is running a rate control algorithm, where the other codecs are just shooting for consistent PSNR. Apples to bears.

"VP3 preferred rgb"

Very suspicious, as YUV 12 (i.e. planar YUV with U and V each subsampled at 1/2 resolution in both directions) is the native colorspace. These sorts of conversions are a huge source of inaccuracies in PSNR tests.

To be frank, it's great that you went to the trouble of doing this, but the only way to get accurate results would be to work with code modules that are provably devoid of any color conversion, rate control, non-standard quality enhancements (sharpening, gamma), and so on. In this sort of situation, where you have a huge architecture (QT) between you and the codec, I think you would be much better off relying on subjective measurements rather than PSNR.

And as a final note, PSNR is really a terrible way to judge codec quality. Among many other sins, it completely fails to penalize codecs that stomp on frequency response rather than try to approximate the spectral nature of an image. (TBH, the present VP3 encoder is guilty of this, though a future encoder could be much smarter.)

I could go on but it's late.

___
Dan Miller
(++,)
Founder, CTO, On2 Technologies

> -----Original Message-----
> From: Colin Mckellar [mailto:c.mckellar@student.murdoch.edu.au]
> Sent: Sunday, March 23, 2003 12:20 PM
> To: theora@xiph.org
> Subject: Re: [theora] A comparison of VP3, and two MPEG-4 variants
>
> On Monday, March 24, 2003, at 01:14 AM, Christoph Lampert wrote:
>
> > On Mon, 24 Mar 2003, Colin Mckellar wrote:
> >
> >> I recently did a test of different settings of VP3, and two MPEG-4
> >> variants: 3ivx, and Apple MPEG-4. My results are at the following URL.
> >>
> >> I am interested in people's thoughts, and comments on it.
> >>
> >> Colin.
> >
> > It is just me or did you really forget to give the URL ?
>
> woop.
>
> In my defense, it is late, and I am tired :-)
>
> <http://mornmist.2y.net/~blibbler/codecpsnr.html>
>
> Colin.

--- >8 ----
List archives: http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to 'theora-request@xiph.org'
containing only the word 'unsubscribe' in the body. No subject is needed.
Unsubscribe messages sent to the list will be ignored/filtered.
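For reference, PSNR as discussed in this thread is typically computed from the mean squared error between source and decoded samples, and the "standard deviation" is the spread of the per-frame scores over a clip. A minimal Python sketch, assuming 8-bit samples stored as flat lists (the function names and the tiny sample planes are made up for illustration and are not taken from Colin's test):

```python
import math
from statistics import mean, pstdev

def psnr(reference, decoded, peak=255):
    """PSNR in dB between two equal-length sequences of 8-bit samples."""
    if len(reference) != len(decoded):
        raise ValueError("planes must have the same number of samples")
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical planes: no noise at all
    return 10 * math.log10(peak ** 2 / mse)

def clip_stats(ref_frames, dec_frames):
    """Per-frame PSNR scores, their mean, and their population standard deviation."""
    scores = [psnr(r, d) for r, d in zip(ref_frames, dec_frames)]
    return scores, mean(scores), pstdev(scores)

# Hypothetical 4-sample luma planes for two frames (illustration only)
ref = [[16, 128, 200, 235], [16, 128, 200, 235]]
dec = [[17, 127, 201, 234], [18, 126, 202, 233]]
scores, avg, spread = clip_stats(ref, dec)
print(scores, avg, spread)
```

A codec running rate control will tend to show a larger spread than one encoding at a fixed quantizer, which is why the two are hard to compare on this statistic alone.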
Colin Mckellar
2003-Mar-23 19:14 UTC
[theora] A comparison of VP3, and two MPEG-4 variants
On Monday, March 24, 2003, at 10:24 AM, Dan Miller wrote:

> a couple points before I look at this too hard:
>
> "For VP3, since you can't do a constant quality encode"
>
> what? that should not be the case. Could be a QT-specific bug.
> Theora at any rate is certainly capable of doing constant-quality
> encode. This probably explains your standard deviation complaint --
> VP3 is running a rate control alg, where the other codecs are just
> shooting for consistent PSNR. Apples to bears.

This has been the case with every VP3 QuickTime codec. In the very first release (3.2.0.1 beta), the quality slider was enabled, but it did not vary the quality so much as determine how far the actual bitrate could vary from the given bitrate. Since that release, the quality slider has been disabled altogether.

If I could encode with a constant quantizer using QuickTime into VP3, I would be very, very happy. I have long suspected that it was just overlooked in the QuickTime release.

> "VP3 preferred rgb"
>
> very suspicious, as YUV 12 (ie planar YUV with U and V each subsampled
> at 1/2 resolution in both directions) is the native colorspace. These
> sorts of conversions are a huge source of inaccuracies on PSNR tests.

I was surprised too. The difference between the PSNR of the RGB and the YUV was only about 1 or 2, though. I am interested in whether the QuickTime codec accepts YUV video directly and outputs YUV video directly (as opposed to doing the conversion internally).

> To be frank, it's great that you went to the trouble of doing this,
> but the only way to get accurate results would be to work with code
> modules that are provably devoid of any color conversion, rate
> control, non-standard quality enhancements (sharpening, gamma), etc.
> and so on. In this sort of situation, where you have a huge
> architecture (QT) between you and the codec, I think you would be much
> better off relying on subjective measurements rather than PSNR.

Fair enough. The problem with subjective measurements is that they are... well, subjective. I have seen many codec comparisons that look at 5 frames from a clip and comment on which the author thought was the best quality. PSNR is one method of "objectively" testing. I tried not to include any results that were obviously broken (such as the Sorenson 2 and 3 codecs). It would be good if I could test VP3 without the pre-processing.

It would be good to do a test of the quality of encodes in front of a group of random people... but I don't really have the resources or time (not to mention, it is much more difficult to do the pretty graphs).

> And as a final note, PSNR is really a terrible way to judge codec
> quality. Among many other sins, it completely fails to penalize
> codecs that stomp on frequency response rather than try to approximate
> the spectral nature of an image. (TBH, the present VP3 encoder is
> guilty of this, though a future encoder could be much smarter).

I guess that is true. I have not done any studies of PSNR... I am aware that you know more about this than I will ever know. Do you know of any test that a computer can do that takes those into account? I would be very happy if you could point me in the direction of a better test.

> I could go on but it's late.

Please do (if you have time). I am interested in your comments. I have played around with video encoding a lot over the last 2-3 years, but I have never written a codec, or done any other low-level stuff with encoding video. I am interested in your thoughts on this matter.

Colin.
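To illustrate why a hidden colorspace conversion can skew PSNR numbers, here is a small Python sketch of an RGB to YCbCr to RGB round trip with 8-bit rounding. It assumes the full-range BT.601 (JFIF-style) matrix; what QuickTime or the VP3 component actually does internally is not known from this thread, and real pipelines also subsample chroma, which this sketch leaves out. All names and sample pixels are hypothetical.

```python
def clamp8(x):
    """Round to the nearest integer and clamp to the 8-bit range 0..255."""
    return max(0, min(255, int(round(x))))

def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 RGB -> YCbCr (assumed matrix), rounded to 8 bits."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return clamp8(y), clamp8(cb), clamp8(cr)

def ycbcr_to_rgb(y, cb, cr):
    """Inverse of the conversion above, again rounded to 8 bits."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return clamp8(r), clamp8(g), clamp8(b)

def round_trip(pixel):
    """Convert one RGB pixel to YCbCr and straight back."""
    return ycbcr_to_rgb(*rgb_to_ycbcr(*pixel))

def round_trip_mse(pixels):
    """Mean squared error per sample caused by the conversion alone."""
    total = 0
    for p in pixels:
        q = round_trip(p)
        total += sum((a - b) ** 2 for a, b in zip(p, q))
    return total / (3 * len(pixels))

# Hypothetical sample pixels: mid grey survives, saturated red does not
samples = [(128, 128, 128), (255, 0, 0)]
print(round_trip_mse(samples))
```

Any nonzero error here is charged to the codec when PSNR is measured across the conversion, even though the codec itself never saw the original samples.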
Mr. Miller,

It's nice to see somebody of significance in here.

I'm certainly in no position to discuss the technical merits of objective testing. I simply do not know that much about the various methods. But I do have to take exception to your statement:

>I think you would be much better off relying on subjective
>measurements rather than PSNR.

Considering the incredible vagueness in what's considered "good enough", any decent testing method is going to *have* to do some sort of objective, reproducible measurements. (Unless, of course, people are going to be satisfied with some group of 'experts' making declarations of what is 'best'.)

Sure, there will always be some subjectiveness about what's best (which will also depend on the specific situation), but you can't depend on that.

It's not really any different from the audio compression format wars. You still have people arguing over Real vs. Microsoft vs. Ogg vs. Mp3 vs. Mp3pro etc., all because it's totally subjective and everybody is using different sound clips and codecs to base their opinions on.

With video it's even worse. And for somebody like myself, who has eyesight problems, what I would consider to be 'good' would probably be laughed at by others, simply because I have trouble detecting the subtle differences.

A purely subjective comparison is worthless.

Now, I certainly can't discuss the technical merits of the PSNR method (or any other method), but since you seem to be very familiar with that method, perhaps you could help devise a tolerable way to objectively measure the differences, along with subjective comparisons...

When Theora does finally produce a 'shipping' product, people are going to want to know how good it is, and I really don't think most are going to be too satisfied with "it's better than what you think" or some such. They will probably want to know what its strengths and weaknesses are, along with what compression levels give comparable visual quality compared to other methods.
...

[dan:]
> > And as a final note, PSNR is really a terrible way to judge codec
> > quality. Among many other sins, it completely fails to penalize
> > codecs that stomp on frequency response rather than try to approximate
> > the spectral nature of an image. (TBH, the present VP3 encoder is
> > guilty of this, though a future encoder could be much smarter).

[colin:]
> I guess that is true. I have not done any studies of PSNR... I am aware
> that you know more about this than I will ever know. Do you know of any
> test that that a computer do that take those into account? I would be
> very happy if you could point me in the direction of a better test.

See http://www.its.bldrdoc.gov/n3/video/vqmsoftware.htm

The ITS solution is a front-runner for standardization by ITU/VQEG. It does substantially better than PSNR, especially at lower datarates where PSNR really breaks down. Their software can be downloaded free, but only for evaluation and research (i.e. 'non-commercial' products). It is also unfortunately heavily patented. Therefore, while Xiph will probably not use it as part of their core development, it seems perfectly OK for some independent individual to use it to do an evaluation and report the results. Their tool is PC or Linux only, and imports AVIs but not QT files.

There are several competing systems, the best 2 or 3 of which apparently work according to similar principles and have shown very similar results in comparisons with extensive subjective tests on a wide range of materials. I suspect that once this stuff is standardized, PSNR will finally be relegated to a useful but clearly insufficient tool for objective quality measurement.

> > I could go on but it's late.
>
> Please do (if you have time) I am interested in your comments. I have
> played around with video encoding a lot over the last 2-3 years, but I
> have never written a codec, or done any other low level stuff with
> encoding video. I am interested in your thoughts on this matter.
>
> Colin.
Good points, but I think constant Q is pretty much the standard for VBR/storage type applications. Doesn't Vorbis basically use a constant Q model for the most part?

It's only when you have hard limits on transmission speeds that you need to employ rate control algorithms that vary Q over time.

___
Dan Miller
(++,)
Founder, CTO, On2 Technologies

> -----Original Message-----
> From: Marco Al [mailto:marco@simplex.nl]
> Sent: Monday, March 24, 2003 12:43 PM
> To: theora@xiph.org
> Subject: Re: [theora] A comparison of VP3, and two MPEG-4 variants
>
> From: "Freun Laven" <FreunLaven@earthlink.net>
>
> > >I think you would be much better off relying on subjective
> > >measurements rather than PSNR.
> >
> > Considering the incredible vagueness in what's considered "good enough",
> > any decent testing method is going to *have* to do some sort of
> > objective, reproducible measurements. (Unless, of course, people are
> > going to be satisfied with some group of 'experts' making declarations
> > of what is 'best'.)
>
> Not from a group of experts, but a group of laymen, yes. Experts can have
> preconceptions based on objective measures and can tie them to specific
> codecs by recognising specific artifacts.
>
> MOS is the benchmark to which all objective measures are compared. To any
> individual his subjective measure is the only one which counts ... how then
> can you look at the big picture and declare subjective measures meaningless?
> Obviously the average subjective impression is the only measure which has
> any meaning at all ...
>
> > With video it's even worse. And for somebody like myself, who has
> > eyesight problems, what I would consider to be 'good' would probably be
> > laughed at by others, simply because I have trouble detecting the subtle
> > differences.
>
> That is a rather extreme example; on average over all potential users these
> kinds of things even out. Although since with subjective tests you usually
> have a rather small group, your opinion would indeed probably not be useful
> to include :/
>
> > A purely subjective comparison is worthless.
>
> Actually it is the only comparison of value :) Indeed, the value of
> objective measures themselves is measured by how well they correlate with
> subjective scores.
>
> On a related matter, I don't quite see the relevance of constant quantizer
> measurements ... they are useful as micro benchmarks during codec
> development to compare a codec against its previous version, but does anyone
> actually use constant quantizer encoding in practice? If not, how are the
> results relevant for comparing codecs against each other?
>
> I'd find the results more relevant if the codecs were compared as they would
> be used. Which means separate tests for streaming (CBR/ABR) and storage
> applications (VBR/2-pass encoding if available ... CBR/ABR coding with the
> rate set to what is needed for the required size if not).
>
> Marco
2-pass encoding may still imply constant Q. The problem is to find the right Q; it's not necessary to vary it over the whole file.

___
Dan Miller
(++,)
Founder, CTO, On2 Technologies

> -----Original Message-----
> From: Christoph Lampert [mailto:chl@math.uni-bonn.de]
> Sent: Monday, March 24, 2003 2:00 PM
> To: theora@xiph.org
> Subject: RE: [theora] A comparison of VP3, and two MPEG-4 variants
>
> On Mon, 24 Mar 2003, Dan Miller wrote:
>
> > good points, but I think constant Q is pretty much the standard for
> > VBR/storage type applications. Doesn't Vorbis basically use a
> > constant Q model for the most part?
> >
> > it's only when you have hard limits on transmission speeds that you
> > need to employ rate control algorithms that vary Q over time.
>
> Encoding with constant quantizer is almost impossible for storage if file
> size has to be controllable. Even if size is allowed to vary, it's not true
> that fixed quantizer means fixed quality (as you can see from PSNR plots).
> Natural video with different scenes may have visual artefacts in one scene
> and no artefacts in the next with the same quantizer. Rate control is
> difficult business, and it's no wonder that most available high quality
> MPEG2 or MPEG4 material is (at least) two-pass encoded.
>
> Christoph
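Dan's claim that two-pass encoding can still mean constant Q can be sketched as follows: the first pass gathers per-frame statistics, and the second pass uses one quantizer for the whole file, chosen so the total fits the size budget. The Python below is only a toy under loudly stated assumptions: the size model (bits = complexity // q), the 1..63 quantizer range, and all names and numbers are hypothetical and do not describe VP3 or Theora internals. It also says nothing about Christoph's objection that fixed Q does not give fixed visual quality.

```python
def total_bits(complexities, q):
    """Toy size model (an assumption): each frame costs complexity // q bits."""
    return sum(c // q for c in complexities)

def pick_constant_q(complexities, budget_bits, q_min=1, q_max=63):
    """Return the smallest q in [q_min, q_max] whose total size fits the budget.

    Size shrinks as q grows, so binary search works. If even q_max
    overshoots the budget, q_max is returned.
    """
    lo, hi = q_min, q_max
    while lo < hi:
        mid = (lo + hi) // 2
        if total_bits(complexities, mid) <= budget_bits:
            hi = mid       # fits: try a finer (smaller) quantizer
        else:
            lo = mid + 1   # too big: need a coarser quantizer
    return lo

# "First pass" statistics for four hypothetical frames
complexities = [12000, 48000, 6000, 30000]
q = pick_constant_q(complexities, budget_bits=10000)
print(q, total_bits(complexities, q))
```

With these made-up numbers the search settles on q = 10, which uses 9600 of the 10000 bits; q = 9 would overshoot.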
Well, we can agree to disagree. There is no reason why a properly designed multi-pass encoder cannot compress a file to a very close approximation of the desired total bit allocation using a fixed-Q approach. At worst, it may dither between two sequential Q values simply because the codec's Q levels are too coarse.

___
Dan Miller
(++,)
Founder, CTO, On2 Technologies

> -----Original Message-----
> From: Christoph Lampert [mailto:chl@math.uni-bonn.de]
> Sent: Monday, March 24, 2003 5:53 PM
> To: theora@xiph.org
> Subject: RE: [theora] A comparison of VP3, and two MPEG-4 variants
>
> On Mon, 24 Mar 2003, Dan Miller wrote:
>
> > 2-pass encoding may still imply constant Q. The problem is to find
> > the right Q; it's not necessary to vary it over the whole file.
>
> I doubt it.
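The "dither between two sequential Q values" idea can be illustrated with a simple greedy pass: start every frame at the coarser quantizer that fits the budget, then move frames to the next finer level while the budget still allows. This is one possible approach under a made-up size model (bits = complexity // q), not a description of any real encoder; every name and number below is hypothetical.

```python
def frame_bits(complexity, q):
    """Toy size model (an assumption): a frame costs complexity // q bits."""
    return complexity // q

def dither_q(complexities, budget_bits, q_coarse):
    """Mix q_coarse and q_coarse - 1 so the total gets closer to the budget.

    Every frame starts at q_coarse; frames are visited in order and moved
    to the finer level whenever the running total stays within the budget.
    """
    if q_coarse < 2:
        raise ValueError("q_coarse must be at least 2 to have a finer level")
    q_fine = q_coarse - 1
    plan = [q_coarse] * len(complexities)
    used = sum(frame_bits(c, q_coarse) for c in complexities)
    for i, c in enumerate(complexities):
        extra = frame_bits(c, q_fine) - frame_bits(c, q_coarse)
        if used + extra <= budget_bits:
            plan[i] = q_fine
            used += extra
    return plan, used

# Four hypothetical frames, a 10000-bit budget, coarse level 10
complexities = [12000, 48000, 6000, 30000]
plan, used = dither_q(complexities, budget_bits=10000, q_coarse=10)
print(plan, used)
```

With these numbers the plan alternates between 9 and 10 and uses 9799 bits, closer to the budget than the 9600 bits of a pure q = 10 encode.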