Timothy B. Terriberry wrote:

> Chapter 4 of the Theora specification does a reasonable job of laying
> out all of the possible parameters for a Y'CbCr-style color space, which
> includes as a subset those needed for RGB. Much more detailed
> information is available from Charles Poynton's Color and Gamma FAQs:
> http://www.poynton.com/Poynton-color.html
> If you wish to do any serious video work, you should at a minimum
> understand these.

In terms of colorspaces, it seems to me that the only way to completely
describe a colorspace is to provide the transform matrices to or from
some reference colorspace. Is this a valid statement?

> For a lossless codec, the luxury of a "small number of useful formats"
> may not be advisable. I can't tell you how many times I've had some raw
> data and been completely unable to play it with e.g., mplayer, because
> mplayer did not have an appropriate fourcc. And mplayer has made up many
> of their own non-standard fourccs (which not even all of mplayer
> supports) to cover for the gaping holes left after counting illi's
> supposed "90% of cases on one hand". It is a common but deadly mistake
> to assume that what is important to you is what is important to everyone
> else. Creating a video format system around the fourcc model has always
> struck me as a very, very bad idea.

Perhaps the answer is a hybrid, then: come up with a structure containing
all the metadata necessary to identify an image's colorspace, sampling
parameters, and storage method, and use fourcc or some other enumeration
as a key into a table that contains default values for all of these
parameters. If no enumerated type is specified, the explicitly given
values would be used instead. Somewhere, someone is going to write down
all the values to fill in for the standard fourccs anyway; we might as
well make it centralized.
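To make the hybrid idea concrete, here is one possible sketch in C: a full parameter structure plus a fourcc-keyed table of defaults for well-known formats. All of the names and fields below are illustrative assumptions, not from any existing spec; a real structure would need many more fields (transfer function, chroma siting, pixel aspect, etc.).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch of the "hybrid" proposal: explicit parameters,
 * with a fourcc acting only as a key into a table of defaults. */
typedef struct {
    uint32_t fourcc;      /* 0 if the format is described explicitly */
    int      bit_depth;   /* bits per component */
    int      subsample_x; /* chroma decimation, e.g. 2 for 4:2:0 */
    int      subsample_y;
    int      is_planar;   /* 1 = planar, 0 = packed */
    int      colorspace;  /* index into a table of transform matrices */
} raw_video_params;

#define FOURCC(a,b,c,d) \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
     ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* Defaults table: a standard fourcc implies a complete parameter set;
 * an unknown fourcc means "read the parameters explicitly". */
static const raw_video_params defaults[] = {
    { FOURCC('I','4','2','0'), 8, 2, 2, 1, 0 }, /* planar 4:2:0 */
    { FOURCC('U','Y','V','Y'), 8, 2, 1, 0, 0 }, /* packed 4:2:2 */
};

/* Look up defaults for a fourcc; returns 1 on success, 0 if unknown. */
int lookup_defaults(uint32_t fourcc, raw_video_params *out)
{
    for (size_t i = 0; i < sizeof(defaults) / sizeof(defaults[0]); i++) {
        if (defaults[i].fourcc == fourcc) {
            *out = defaults[i];
            return 1;
        }
    }
    return 0;
}
```

The point of the table is exactly the "centralized" part of the proposal: the per-fourcc defaults get written down once, and anything the table cannot express is carried explicitly.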
It's also more pragmatic, since fourcc already describes a lot of the
data out there, and most of the more "obscure" metadata has been lost and
would have to be invented to fill out such a new structure completely.
Better to keep the invented data in common, IMHO.

> You'll also note the specification says nothing about packed formats.
> Packed vs. planar is completely orthogonal to the rest of the issues,
> and only arises when storing the raw pixel data directly. Supporting
> either is relatively easy in software with an xstride/ystride for each
> component, so there is hardly a reason not to (Theora doesn't because it
> greatly simplifies several inner loops to be able to assume xstride=1; a
> raw codec should not be as affected by such an issue). And there is
> definitely a reason _to_ support them in a raw format, since switching
> from one to the other is relatively difficult for hardware.

Agreed, though it's worth pointing out that it's possible to have images
where the xstride/ystride between components is not constant (endian
issues, UYVY packings, etc.). How to handle interlacing is another
problem if you're trying to make a super-generic format. A line has to be
drawn somewhere, and it's hard to say where that is.
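The xstride/ystride approach mentioned above can be sketched as follows. This is an illustrative assumption about how such an accessor might look, not code from Theora or any raw-format draft; the point is that with the right stride values the same loop walks either a planar plane (xstride = 1) or a packed layout (xstride > 1).

```c
#include <stddef.h>
#include <stdint.h>

/* One image component addressed by arbitrary strides. */
typedef struct {
    uint8_t  *base;     /* first sample of this component */
    ptrdiff_t xstride;  /* bytes between horizontally adjacent samples */
    ptrdiff_t ystride;  /* bytes between vertically adjacent rows */
} component;

static uint8_t sample_at(const component *c, int x, int y)
{
    return c->base[y * c->ystride + x * c->xstride];
}

/* Demo: the Y component of a 4-pixel packed UYVY row (U0 Y0 V0 Y1
 * U2 Y2 V2 Y3) is just base = row + 1, xstride = 2. Returns the sum
 * of the four Y samples. */
static int demo_uyvy_y_sum(void)
{
    uint8_t row[8] = { 0x80, 10, 0x80, 20, 0x80, 30, 0x80, 40 };
    component y = { row + 1, 2, 8 };
    return sample_at(&y, 0, 0) + sample_at(&y, 1, 0)
         + sample_at(&y, 2, 0) + sample_at(&y, 3, 0);
}
```

This also shows the non-constant-stride caveat: in UYVY the Y samples are 2 bytes apart while U and V are 4 bytes apart, so each component needs its own stride pair.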
On Tue, Nov 08, 2005 at 03:33:57PM -0500, John Koleszar wrote:

> In terms of colorspaces, it seems to me that the only way to completely
> describe a colorspace is to provide the transform matrices to or from
> some reference colorspace. Is this a valid statement?

Except there are not enough colorspaces in use to need to do this, as far
as I've read at least; a set of common ones should do, I think.

> Use fourcc or some other enumeration as a key to a table that contains
> default values for all these parameters.

I know by now I must sound like I'm beating this issue to death, but
people are still talking about fourcc as if it identified every video
parameter, as if by using fourcc we would be saved from this work. FourCC
is a codec identifier, nothing more. It's four letters which can be
referenced against a table to see which codec it is; it's a standard used
by RIFF (aka .wav and .avi) and older Macintosh formats, and in my
opinion it's obsolete. It does not tell us anything about the colorspace
other than which codec it came from (and thus we could say that "AXYZ"
usually uses 4:4:4), just as knowing that something has the "WAVE"
fourcc, i.e. is a .wav file, doesn't tell us what sample rate, bits per
sample, or number of channels is used.

In Ogg, we determine compatibility by page 0/packet 0: the entire first
packet, including all its configuration variables, versions, etc., is
used to determine compatibility. The fields we're talking about will go
into packet 0, so that a list of which plugins can accept it can be
easily generated.

> Agreed, though it's worth pointing out that it's possible to have
> images where the xstride/ystride between components is not constant
> (endian issues, UYVY packings, etc.). How to handle interlacing is
> another problem if you're trying to make a super-generic format.
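For readers unfamiliar with the mechanics: a fourcc really is nothing but four ASCII characters packed into 32 bits (little-endian, by RIFF convention). The sketch below is a generic illustration of that packing, which is why it can only ever name a format, never parameterize it.

```c
#include <stdint.h>

/* Pack four characters into a 32-bit fourcc, RIFF byte order. */
static uint32_t make_fourcc(char a, char b, char c, char d)
{
    return (uint32_t)(uint8_t)a
         | ((uint32_t)(uint8_t)b << 8)
         | ((uint32_t)(uint8_t)c << 16)
         | ((uint32_t)(uint8_t)d << 24);
}

/* Unpack a fourcc back into a printable 4-character string. */
static void fourcc_to_string(uint32_t fcc, char out[5])
{
    out[0] = (char)(fcc & 0xff);
    out[1] = (char)((fcc >> 8) & 0xff);
    out[2] = (char)((fcc >> 16) & 0xff);
    out[3] = (char)((fcc >> 24) & 0xff);
    out[4] = '\0';
}
```

Dimensions, frame rate, sample depth, and colorspace all have to travel in side-band structures (BITMAPINFOHEADER in AVI, packet 0 in Ogg), which is the whole argument above.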
> A line has to be drawn somewhere, and it's hard to say where that is.

Interlace certainly needs to be supported, since converting to or from it
is not a lossless operation. The interchange format should be lossless,
and that means supporting interlace and the chroma sampling it's likely
to be needed for.

This is also interesting work in that, once we're done, we'll have a
standard to use for a compressed lossless format: a FLAC for video, for
editing or archival purposes, similar to HuffYUV.

I'm drafting OggYUV and OggRGB on the wiki right now, so everyone can be
on the same page about what the result of this will look like. The
questions we need to answer are which colorspaces are needed and which
encoding methods are needed.

-- 
The recognition of individual possibility, to allow each to be what she
and he can be, rests inherently upon the availability of knowledge; the
perpetuation of ignorance is the beginning of slavery.

from "Die Gedanken Sind Frei": Free Software and the Struggle for Free
Thought, by Eben Moglen, General Counsel of the Free Software Foundation
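One clarifying point on why interlace must be carried natively: separating an interlaced frame into its two fields and weaving them back together is bit-exact, while deinterlacing (resampling two fields into one progressive frame) discards information. A minimal sketch of the reversible split/weave round trip, under the simplifying assumption of an 8-bit single-plane frame:

```c
#include <stdint.h>
#include <string.h>

/* Split an interlaced frame into top (even rows) and bottom (odd
 * rows) fields. Purely a rearrangement; no samples are altered. */
static void split_fields(const uint8_t *frame, int width, int height,
                         uint8_t *top, uint8_t *bottom)
{
    for (int y = 0; y < height; y++) {
        uint8_t *dst = (y & 1) ? bottom : top;
        memcpy(dst + (y / 2) * width, frame + y * width, (size_t)width);
    }
}

/* Weave two fields back into a full frame; exact inverse of split. */
static void weave_fields(const uint8_t *top, const uint8_t *bottom,
                         int width, int height, uint8_t *frame)
{
    for (int y = 0; y < height; y++) {
        const uint8_t *src = (y & 1) ? bottom : top;
        memcpy(frame + y * width, src + (y / 2) * width, (size_t)width);
    }
}
```

A lossless interchange format therefore only needs to flag field order and carry the fields as-is; any blending or interpolation belongs downstream.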
> FourCC is a codec identifier, nothing more. It's four letters which can
> be referenced against a table to see which codec it is; it's a standard
> used by RIFF (aka .wav and .avi) and older Macintosh formats, and in my
> opinion it's obsolete.

While it's true there are a bunch of FOURCCs that represent non-raw
formats like DIVX etc., the ones which represent raw YUV types are pretty
well defined. Yes, you certainly still need the width, height, aspect
ratio, and frame rate.

> This is also interesting work in that, once we're done, we'll have a
> standard to use for a compressed lossless format: a FLAC for video, for
> editing or archival purposes, similar to HuffYUV.

I think that's something completely different. Any specialised lossless
compression, as opposed to raw data, will have its own special setup
requirements.

> I'm drafting OggYUV and OggRGB on the wiki right now, so everyone can
> be on the same page about what the result of this will look like. The
> questions we need to answer are which colorspaces are needed and which
> encoding methods are needed.

At the end of the day, the reason such a codec doesn't exist is not
because no one has written a spec... it's because no one has written the
code yet. I know from my perspective, a far more useful format in
DirectShow would be a simple raw-FOURCC code for the YUV types, if we
insist on keeping RGB and YUV separate. As far as I see it, there's no
reason not to use such a simple fourcc-based format for the common cases,
and a more complete/complicated format for the strange cases where people
want to get fancy, since all the data being worked with in DirectShow is
either a fourcc YUV type or one of a handful of RGB types.

Zen.
On Tue, Nov 08, 2005 at 04:59:33PM -0800, Arc wrote:

> This is also interesting work in that, once we're done, we'll have a
> standard to use for a compressed lossless format: a FLAC for video, for
> editing or archival purposes, similar to HuffYUV.

Well, we're talking about an uncompressed format. OggMNG is already "FLAC
for video". However, it doesn't do YUV (well, not without abusing the ICC
chunk, anyway) or interlaced video.

BTW, there are already a number of general "raw" video formats in
professional use; it's not just AVI we have as prior art. :) I'd like to
see some discussion of the merits of adopting one of those if we want
something more sophisticated than yuv4mpeg.

And for the record, it's actually possible to write valid, uncompressed
data in both MNG and FLAC: both formats have a no-predictor mode, and
zlib itself has a verbatim encoding option to deal with "pathological"
input blocks. So in theory we already have an uncompressed format too.
Unfortunately, one often has to bit-shift the input to do this, so it's
not ideal, and of course the primary advantage of an uncompressed format
is the simplicity of implementation.

FWIW,
-r