John Koleszar wrote:

> I can't speak for what Theora can support today, but the VP3 source it
> derived from supported UYVY, YVYU, YUY2, and RGB24/32 source data as well.

Of course, it did this by providing its own conversion routines (which, particularly for the RGB spaces, make assumptions about the source color space that may NOT be true). Such routines can and should be independent of the actual codec itself, as you suggest for the decoding (or "extraction") side, but MS's VfW API model did not encourage this.

Chapter 4 of the Theora specification does a reasonable job of laying out all of the possible parameters for a Y'CbCr-style color space, which includes as a subset those needed for RGB. Much more detailed information is available from Charles Poynton's Color and Gamma FAQs:
http://www.poynton.com/Poynton-color.html
If you wish to do any serious video work, you should at a minimum understand these.

Note that Theora takes the "small number of useful formats" approach, and can get away with it because it is _lossy_. It is expected that the encoder will convert to the closest format Theora actually supports, and any information loss introduced in the process is usually trivial compared to the quantization that will occur later. JPEG, for example, takes a similar stance (though it supports many more formats than Theora does). I'll note here also that YUV4MPEG actually supports much more than 4:2:0. theora-exp's encoder_example has conversions from everything (non-interlaced) that mjpegtools supported back when I wrote it, including correcting the chroma field offsets. It's possible they've added more formats since.

For a lossless codec, the luxury of a "small number of useful formats" may not be advisable. I can't tell you how many times I've had some raw data and been completely unable to play it with, e.g., mplayer, because mplayer did not have an appropriate fourcc. And mplayer has made up many of their own non-standard fourccs (which not even all of mplayer supports) to cover for the gaping holes left after counting illi's supposed "90% of cases on one hand". It is a common but deadly mistake to assume that what is important to you is what is important to everyone else. Creating a video format system around the fourcc model has always struck me as a very, very bad idea.

Go take a look at the H.264 specification (one of the later draft versions should be sufficient, and still publicly available) for a much broader view of the possible pixel formats and color spaces available, and that's still just for a lossy format (though people are building lossless compressors on top of H.264 now). And IIRC, even this doesn't support some of the rarer 3:1:1, etc., pixel formats.

You'll also note the specification says nothing about packed formats. Packed vs. planar is completely orthogonal to the rest of the issues, and only arises when storing the raw pixel data directly. Supporting either is relatively easy in software with an xstride/ystride for each component, so there is hardly a reason not to (Theora doesn't, because being able to assume xstride=1 greatly simplifies several inner loops; a raw codec should not be as affected by such an issue). And there is definitely a reason _to_ support them in a raw format, since switching from one to the other is relatively difficult for hardware.
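To make the xstride/ystride point concrete, here's a minimal sketch in C (the type and function names are hypothetical, and 8-bit samples are assumed). The same accessor serves a planar I420 luma plane and the interleaved channels of packed YUY2; only the stride values change:

  #include <stddef.h>

  /* One image component (a plane, or one channel of an interleaved
     buffer), described by a base pointer and two strides. */
  typedef struct {
    unsigned char *data;    /* first sample of this component              */
    ptrdiff_t      xstride; /* bytes between horizontally adjacent samples */
    ptrdiff_t      ystride; /* bytes between vertically adjacent samples   */
  } comp_plane;

  /* Fetch the sample at (x, y); works for any layout the strides can
     describe. */
  static unsigned char get_sample(const comp_plane *c, int x, int y) {
    return c->data[y * c->ystride + x * c->xstride];
  }

  /* Planar I420 luma: contiguous samples, so xstride = 1:
       comp_plane y_planar = { buf,     1, width     };
     Packed YUY2 (Y0 U Y1 V ...): luma every 2 bytes, chroma every 4,
     all in the same row buffer:
       comp_plane y_packed = { buf,     2, width * 2 };
       comp_plane u_packed = { buf + 1, 4, width * 2 };
       comp_plane v_packed = { buf + 3, 4, width * 2 }; */

A codec written against such an accessor pays for the generality in its inner loops, which is exactly why Theora hard-codes xstride=1; a raw format that mostly copies data through has no such excuse.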
At the request of a few people, let's move this thread to just ogg-dev. If you're not currently subscribed to ogg-dev and wish to track this thread further, please join the list here:

http://lists.xiph.org/mailman/listinfo/ogg-dev

If you reply to a message on this thread, please only send it to ogg-dev. Thanks. :-)

-- 
The recognition of individual possibility, to allow each to be what she and he can be, rests inherently upon the availability of knowledge; the perpetuation of ignorance is the beginning of slavery.

from "Die Gedanken Sind Frei": Free Software and the Struggle for Free Thought, by Eben Moglen, General Counsel of the Free Software Foundation
Timothy B. Terriberry wrote:

> Chapter 4 of the Theora specification does a reasonable job of laying
> out all of the possible parameters for a Y'CbCr-style color space, which
> includes as a subset those needed for RGB. Much more detailed
> information is available from Charles Poynton's Color and Gamma FAQs:
> http://www.poynton.com/Poynton-color.html
> If you wish to do any serious video work, you should at a minimum
> understand these.

In terms of colorspaces, it seems to me that the only way to completely describe a colorspace is to provide the transform matrices to or from some reference colorspace. Is this a valid statement?

> For a lossless codec, the luxury of a "small number of useful formats"
> may not be advisable. I can't tell you how many times I've had some raw
> data and been completely unable to play it with, e.g., mplayer, because
> mplayer did not have an appropriate fourcc. And mplayer has made up many
> of their own non-standard fourccs (which not even all of mplayer
> supports) to cover for the gaping holes left after counting illi's
> supposed "90% of cases on one hand". It is a common but deadly mistake
> to assume that what is important to you is what is important to everyone
> else. Creating a video format system around the fourcc model has always
> struck me as a very, very bad idea.

Perhaps the answer is a hybrid, then: come up with a structure containing all the metadata necessary to identify an image's colorspace, sampling parameters, and storage method, and use fourcc or some other enumeration as a key into a table that contains default values for all of these parameters. If you don't specify an enumerated type, the explicitly given values would be used instead (a rough sketch follows at the end of this message). Somewhere, someone is going to write down all the values to fill in for the standard fourccs anyway; we might as well make it centralized. It's also much more pragmatic, since fourcc already describes a lot of the data out there, and most of the more "obscure" metadata has been lost and would have to be invented to fill out this new structure entirely. Better to keep invented data in common, IMHO.

> You'll also note the specification says nothing about packed formats.
> Packed vs. planar is completely orthogonal to the rest of the issues,
> and only arises when storing the raw pixel data directly. Supporting
> either is relatively easy in software with an xstride/ystride for each
> component, so there is hardly a reason not to (Theora doesn't because it
> greatly simplifies several inner loops to be able to assume xstride=1; a
> raw codec should not be as affected by such an issue). And there is
> definitely a reason _to_ support them in a raw format, since switching
> from one to the other is relatively difficult for hardware.

Agreed, though it's worth pointing out that it's possible to have images where the xstride/ystride between components is not constant (endian issues, UYVY packings, etc.). How to handle interlacing is another problem, if you're trying to make a super-generic format. A line has to be drawn somewhere, and it's hard to say where that is.
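Here is a minimal sketch in C of that hybrid (every name, the field selection, and the BT.601-style full-range matrix used for I420 are illustrative assumptions, not a proposed standard). The 3x3 matrix plus per-component offsets is also one concrete answer to the "transform matrices to a reference colorspace" question above:

  #include <stddef.h>
  #include <stdint.h>

  /* Hypothetical descriptor combining colorspace, sampling, and
     storage metadata. to_ref/offset map decoded components into a
     reference colorspace: ref = to_ref * (sample + offset). */
  typedef struct {
    uint32_t fourcc;          /* 0 for a fully custom description      */
    double   to_ref[3][3];    /* component -> reference space matrix   */
    double   offset[3];       /* e.g., -128 for Cb/Cr                  */
    int      chroma_shift_x;  /* log2 horizontal chroma subsampling    */
    int      chroma_shift_y;  /* log2 vertical chroma subsampling      */
    int      bits_per_sample;
    int      packed;          /* nonzero for interleaved storage       */
  } pix_fmt;

  #define MK_FOURCC(a,b,c,d) ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
                              ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

  /* Centralized defaults for well-known fourccs; the I420 entry uses
     a full-range BT.601 Y'CbCr -> RGB matrix purely as an example. */
  static const pix_fmt fourcc_defaults[] = {
    { MK_FOURCC('I','4','2','0'),
      { { 1.0,  0.0,       1.402     },
        { 1.0, -0.344136, -0.714136  },
        { 1.0,  1.772,     0.0       } },
      { 0.0, -128.0, -128.0 },
      1, 1, 8, 0 },
    /* ... more entries ... */
  };

  /* Returns the defaults for a fourcc, or NULL if unknown, in which
     case the stream must carry a fully specified pix_fmt itself. */
  static const pix_fmt *lookup_fourcc(uint32_t fourcc) {
    size_t i;
    for (i = 0; i < sizeof(fourcc_defaults)/sizeof(fourcc_defaults[0]); i++)
      if (fourcc_defaults[i].fourcc == fourcc) return &fourcc_defaults[i];
    return NULL;
  }

An unknown (or zero) fourcc then simply forces the fully specified path, which covers the "gaping holes" case without having to mint new non-standard fourccs.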
> For a lossless codec, the luxury of a "small number of useful formats"
> may not be advisable. I can't tell you how many times I've had some raw
> data and been completely unable to play it with, e.g., mplayer, because
> mplayer did not have an appropriate fourcc. And mplayer has made up many
> of their own non-standard fourccs (which not even all of mplayer
> supports) to cover for the gaping holes left after counting illi's
> supposed "90% of cases on one hand". It is a common but deadly mistake
> to assume that what is important to you is what is important to everyone
> else. Creating a video format system around the fourcc model has always
> struck me as a very, very bad idea.

Well, I guess it depends what it's being used for. My understanding is that the usefulness of such a raw format is firstly to get raw data from some hardware device, be it a video camera/webcam, etc., which (correct me if I'm wrong) all output raw data in a fourcc format or one of the common RGB formats; and secondly to be able to store that raw data, for the purposes of either inputting it to another codec or to a hardware display device. As for displays, again correct me if I'm wrong, but most don't support arbitrary formats of video buffers; they support some subset of the fourcc and RGB types.

To me, the idea of a raw format is a time-efficient way to store and display raw data and/or input it somewhere else. It seems to me that if you want the fully-specified-arbitrary model, there's a lot of extra processing work that has to be done in between. And just from an MS/DirectShow perspective, all the input devices, output devices, and codecs support a fourcc or common RGB model. It's all well and good to represent some other colourspace accurately, but considering that no codec or output device will display it exactly as-is, nor will one ever generate such data, I'm not sure I see the point in these implementations.

Zen.