hi folks!

once again i am trying to decode a yuv_buffer into a 24-bit RGB buffer. last time nobody seemed willing to tell me how to do this, so i am trying again. i'll try to make my questions simpler. what i need to know is:

- how many bytes are in each of the y, u and v arrays?
- what are these strides for?
- what exactly is a "plane" in a frame, and what does it do?

what i want to achieve is getting the Y, U and V data for each single pixel, so i can convert it to one R, G and B value. i am trying to apply one of the conversion formulas presented at http://www.fourcc.org/fccyvrgb.php, but so far i have never been successful, since the description of the buffer format doesn't help me get the idea...

i would really be very thankful if someone could help me with this :)

regards,
---david
----- Original Message -----
From: "David Kment" <davidkment@web.de>
To: <theora@xiph.org>
Sent: Wednesday, September 28, 2005 3:59 AM
Subject: [Theora] problems understanding yuv_buffer format

> hi folks!
>
> once again i am trying to decode a yuv_buffer to a 24 bit RGB buffer.
> last time nobody seemed willing to tell me how to do this, so i am
> trying again.

Type "convert yuv to rgb" into google and press i'm feeling lucky. Also, if you look through the code here, you'll find various conversions, primarily going RGB to YUV, but it also has diagrams of the memory layouts of most of the different yuv formats in the comments:

http://svn.xiph.org/trunk/oggdsf/src/lib/codecs/theora/filters/dsfTheoraEncoder/TheoraEncodeInputPin.cpp

> i try to make my questions more simple. what i need to know is:
>
> how many bytes are in each y, u, v array?

Depends on the type of YUV. In theora's yuv_buffer it is almost always stride*height bytes in y, and stride*height/4 in each of u and v. Look for info on YV12; it's the fourcc format most similar to what theora uses.

> what for are these strides?

A stride is the width of a contiguous block of memory; basically, the stride is the memory distance between the start of each line. For example, if the width were 7, it might be advantageous for various reasons to make the stride 8, so the extra byte is simply not used.

> what i want to achieve is getting the Y, U and V data for each single
> pixel, so i can convert it to one R, G and B value.

Well, because of the way it's sampled, there will be a Y sample for every pixel, but each U and V sample will be shared by four pixels.

Zen.
Hi David,

> what i want to achieve is getting the Y, U and V data for each single
> pixel, so i can convert it to one R, G and B value.
>
> i am trying to apply one of the conversion formulas presented at
> http://www.fourcc.org/fccyvrgb.php, but as of now was never successful
> since the description of the buffer format doesn't help me getting the
> idea....

If you are talking about designing an implementation, perhaps GStreamer (http://gstreamer.freedesktop.org/) could help. I don't really know enough about this, but I saw some good comments on:

http://sourceforge.net/mailarchive/forum.php?forum_id=5947&max_rows=25&style=nested&viewmonth=200402

Find the thread on 'colorspace conversion' and you'll have a start; if nothing else, some people to ask. They refer to some other projects which may be of use to you. The thread is a bit old, however, so some of the things they talk about may exist by now. Perhaps liboil may be of interest:

http://www.schleef.org/liboil/desktopcon-2005.pdf

Even http://lists.matroska.org/pipermail/media-api/2003-December/000265.html may be of some use.

As I say, I don't really know this area, but it seems more sensible to me to have a generic GST YUV -> RGB converter and just construct a pipeline from the GST-Theora plugin to the converter.

Hope that helps,
Aaron
Paul Foley wrote:

> How about using ImageMagick's "convert" utility to turn it into a PPM
> or PNG or whatever's most useful to you?

i need the conversion at runtime, so this is not an option.

> Assume you have a 640x480 frame. There are then 307200 (=640x480)
> bytes of Y data, followed by 76800 (=320x240) bytes of U data,
> followed by 76800 bytes of V data.
>
> A "plane" is just one set of data - the Y or the U or the V. It would
> be possible to have them interspersed: one pixel of Y followed by one
> pixel of U followed by one pixel of V, or one scan-line of Y followed
> by one scan-line of U, ..., so then there would be no planes.

ok, thanks. this information is really useful!

> > what i want to achieve is getting the Y, U and V data for each
> > single pixel, so i can convert it to one R, G and B value.
>
> If you just use the Y data, you get a nice grayscale image.

hmm, this sounds very interesting for a start. i tried it by simply copying the Y array to a file and viewing it in IrfanView. it worked somewhat, since the picture was recognizable, but it was garbled and repeated on itself, wrapping over. this would mean the pixels are not in a perfect row (as i would expect from e.g. a raw RGB buffer), so i need to find out how they are positioned. can i at least assume that a straight line of pixels (equaling a stride, so exactly the frame's width) is always uninterrupted, so i can take it as it is without modifying it?
> this would mean the pixels are not in a perfect row (like expected
> from e.g. a raw RGB buffer). so i need to find out how they are
> positioned. can i at least assume that a straight line of pixels
> (equaling a stride, so exactly the frame's width) is always
> non-interrupted, so i can take it the way it is without modifying?

Each row is indeed contiguous in memory, with buffer->y_width pixels in each row. Note that this (the frame width) is NOT equal to the stride; the stride gives you the offset between the start of each row. Additionally, note that the actual picture is, in the general case, a subset of this total size. You need to look at theora_info->offset_x and offset_y, and frame_width and frame_height (this is so that an arbitrary picture size can be specified, even though the format requires that the encoded dimensions be multiples of 16).

Mike