KarasevAS@aol.com
2004-Mar-14 09:03 UTC
[theora-dev] Higher quality video - supporting greater than 8 bit color depth
Hi,

I was wondering what this forum's collective thoughts are on the best way to support video color fidelity greater than what we have today.

I am not a video developer myself. I edit video. A type of problem I come across fairly often has to do with the limited color depth of the digital video medium. It often manifests itself as "cartoonish" areas of adjacent flat colors, observable in low-noise footage of a smooth-colored subject such as sky or a non-textured wall. And the trouble is, ANY footage becomes subject to color depth limitations whenever its gamma level needs adjusting or its histogram is affected in other ways.

While most digital video acquisition methods available today (save for film-to-video rank transfers) still yield the low bit rate file, the evolution clearly must be towards greater fidelity, similar to what happened in film/flatbed scanners (that are now 16 bits/channel color depth) and digital cameras (most prosumer models are 12-bit color depth).

Even if source AND finished product are 8 bits/channel, there's still a huge benefit in having the processing be in the higher bit depth domain, for cleaner support of complex filter trains including things like PAL/NTSC conversion, tone curve / gamma adjustment, video noise reduction, etc.

As a user, I would like to see the following things in the next high quality video format:

1. Ability to support greater than 8 bits per RGB or YPrPb channel color depth, either arbitrarily defined (preferred) or as a selection of "good" values or "green or luminance"/other color depth pairs, to have say 12 bit Y and 8 bit Pr, Pb. You be the judges what would be the good values to support; I'd sure love to see e.g. 8, 12, 16, 24 bits per channel.

2. Ability to specify the "authoring" gamma value and color temperature in the video header. Thus the video would "know" the settings at which it was authored. On the other hand, the playback device could know its own gamma and color temp. Thus when playing a given video file, the playback device can make adjustments (through either adjusting itself, or applying the proper filtration to the video), so the video looks the way it was intended.

3. Robust internal support for interleaved subtitles, so each word is tied to a section of a video. Header could include "recommended" font / color / text field size & position within frame / text alignment within field / modification (bold/italic/underline/outline/strikethrough) / transparency / merging type (direct/add/subtract/multiply). These should be midstream-adjustable so sections of text could appear with different settings. The player should be able to override some or all of these settings. When intercutting multiple files, it would be great if subtitles could be cleanly sliced along with the corresponding video.

4. Speaking of subtitles, it would be good to have multiple streams of ... EVERYTHING! I mean, multiple angles of video; multiple audio tracks; multiple subtitle tracks.

5. It would be very forward-looking, I think, to support the ability to qualify the video components in the video header (i.e. in RGB terms, "what kind of green is this, really"). This means defining the filter's median wavelength and breadth. Something even more elegant, to scale to scientific applications perhaps, would be to specify the filter curve and response curve for each channel ... and to support potentially more (or fewer) than three channels such as our plain (R G B or Y Pr Pb). There may be applications including IR or UV channels, or simply additional channels at specific other wavelengths. It would be up to the player to interpret these in a user-defined way, by ignoring or mixing them in as specific colors or applying threshold or logical operations.
Regards,
Alexander

--- >8 ----
List archives: http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to 'theora-dev-request@xiph.org' containing only the word 'unsubscribe' in the body. No subject is needed. Unsubscribe messages sent to the list will be ignored/filtered.
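The banding described in the message above (flat "cartoonish" areas appearing after a gamma or histogram adjustment) can be illustrated with a minimal Python sketch. It is not taken from any codec or editor: the function names, the power-law exponent of 0.8, and the full-range ramp are all assumptions chosen for illustration. The same smooth ramp is stored once at 8 bits and once at 16 bits, adjusted at its own depth, and then delivered at 8 bits.

```python
def gamma_adjust(value, gamma, max_value):
    """Apply a power-law curve to one integer code value and re-quantize it."""
    return round(max_value * (value / max_value) ** gamma)


def distinct_output_levels(working_bits, gamma=0.8):
    """Count distinct 8-bit output levels after adjusting a full-range ramp.

    The ramp is stored and processed at `working_bits` per channel, then
    reduced to 8 bits for delivery (hypothetical pipeline, for illustration).
    """
    max_value = (1 << working_bits) - 1
    shift = working_bits - 8  # bits dropped when delivering at 8 bits
    return len({gamma_adjust(v, gamma, max_value) >> shift
                for v in range(max_value + 1)})


if __name__ == "__main__":
    print("8-bit source and processing: ", distinct_output_levels(8))
    print("16-bit source and processing:", distinct_output_levels(16))
```

With these assumed settings the 8-bit path ends up with fewer than 256 distinct output levels, because neighbouring input codes collapse onto the same output code, while the 16-bit path still fills all 256 levels of the 8-bit result. The lost levels are what shows up as visible steps in a sky or a plain wall.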
Maik Merten
2004-Mar-14 10:31 UTC
[theora-dev] Higher quality video - supporting greater than 8 bit color depth
Disclaimer: I'm not a theora-developer. Some information in this mail may be plain wrong. You have been warned ;-)

KarasevAS@aol.com wrote:

> 1. Ability to support greater than 8 bits per RGB or YPrPb channel color
> depth, either arbitrarily defined (preferred) or as a selection of
> "good" values or "green or luminance"/other color depth pairs, to have
> say 12 bit Y and 8 bit Pr, Pb. You be the judges what would be the good
> values to support; I'd sure love to see e.g. 8, 12, 16, 24 bits per channel.

My limited understanding of _lossy_ video compression makes me think it is not really possible to think in terms of "8, 12, 16, 24 bits per channel". Theora uses YV12 colorspace. Every plane is transformed via Discrete Cosine Transform (DCT) and compressed using quantization. How much information is thrown away is determined via a psychovisual model. There is no fixed "color resolution" AFAIK.

For video editing a lossless video codec is a wiser choice anyway IMO.

> 2. Ability to specify the "authoring" gamma value and color temperature
> in the video header.

Perhaps a metadata stream could contain this information.

> 3. Robust internal support for interleaved subtitles, so each word is
> tied to a section of a video. Header could include "recommended" font /
> color / text field size & position within frame / text alignment within
> field / modification (bold/italic/underline/outline/strikethrough) /
> transparency / merging type (direct/add/subtract/multiply).

This belongs in a subtitle stream separate from the video stream. No need to change video coding for that. :-)

> 4. Speaking of subtitles, would be good to have multiple streams of ...
> EVERYTHING! I mean, multiple angles of video; multiple audio
> tracks; multiple subtitle tracks.

It's already possible to mux several logical audio and video streams into one physical Ogg stream.

> 5. Would be very forward-looking, I think, to support ability to qualify
> the video components in the video header (i.e. in RGB terms, "what kind
> of green is this, really"). This means defining filter's median wave
> length and breadth. Something even more elegant to scale to scientific
> applications perhaps would be, for each channel, specify the filter
> curve and response curve. ... and support potentially more (or less)
> than three channels such as our plain (R G B or Y Pr Pb). There may be
> applications including IR or UV channels, or simply additional channels
> at specific other wavelengths. It would be up to the player to interpret
> these in a user-defined way, by ignoring or mixing them in as specific
> colors or applying threshold or logical operations.

I'm not sure if _lossy_ video compression is suitable for scientific applications at all... at least the files will be very small, as the psychovisual model will make sure all IR and UV channels are completely thrown away ;-)

Maik
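The point about DCT plus quantization having no fixed "color resolution" can be sketched in a few lines of Python. This is not Theora's actual transform or quantizer: it is a textbook orthonormal 1-D DCT over 8 samples with a single uniform quantizer step, and the names `dct`, `idct`, `roundtrip` and the step values are assumptions made for illustration.

```python
import math

N = 8  # block length (a 1-D stand-in for an 8x8 block)


def _scale(k):
    """Normalisation factor that makes the transform orthonormal."""
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def dct(block):
    """Forward DCT-II of N samples."""
    return [_scale(k) * sum(x * math.cos(math.pi * (n + 0.5) * k / N)
                            for n, x in enumerate(block))
            for k in range(N)]


def idct(coeffs):
    """Inverse transform (DCT-III) matching dct() above."""
    return [sum(_scale(k) * c * math.cos(math.pi * (n + 0.5) * k / N)
                for k, c in enumerate(coeffs))
            for n in range(N)]


def roundtrip(block, step):
    """Quantize every coefficient with one uniform step, then reconstruct."""
    quantized = [round(c / step) for c in dct(block)]
    return [round(v) for v in idct([q * step for q in quantized])]


if __name__ == "__main__":
    ramp = [100, 101, 102, 103, 104, 105, 106, 107]  # gentle gradient
    print("fine quantizer:  ", roundtrip(ramp, 0.01))
    print("coarse quantizer:", roundtrip(ramp, 64))
```

With the fine step the gradient survives unchanged; with the coarse step every AC coefficient rounds to zero and the block collapses to one flat value. How much precision survives is therefore set by the quantizer, not by a declared number of bits per channel, which is also why heavily compressed footage shows flat patches regardless of the nominal depth.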
Ralph Giles
2004-Mar-14 11:20 UTC
[theora-dev] Higher quality video - supporting greater than 8 bit color depth
On Sun, Mar 14, 2004 at 12:03:41PM -0500, KarasevAS@aol.com wrote:

> I am not a video developer myself. I edit video. A type of problems I come
> across fairly often, have to do with the limited color depth of the digital
> video medium. They often manifest themselves as "cartoonish" areas of adjacent
> flat colors observable in a low-noise footage of a smooth-colored subject such as
> sky or a non-textured wall. And trouble is, ANY footage becomes subject to
> color depth limitations whenever it is needed to adjust its gamma level or
> affect its histogram in other ways.

Yes, absolutely. If you mean, 'it would be nice if theora could do better,' that's not really its mission. While it's a poor source format, 8 bits per channel with proper gamma is just barely enough to represent the final images. Likewise, theora is a lossy codec intended for digital distribution of a final edit, so 2 or 3 bytes per pixel is a good match.

Using low bitrate compressed video as an editing source is always going to have a 'lo-fi' effect regardless of what image depth it supports. Certainly one can make do, or embrace the effect artistically as a number of filmmakers have done with the DV format.

So, I definitely agree with you that source capture and editing systems should use deeper images, as has been the procedure in the film world for some time. The current crop of 'native dv' editors will eventually seem limited because of this; they're popular because it's the native format of widely available 'cheap' cameras, and because it reduces the data rate to something much more comfortable for current computers. Hopefully trickle down from film, the influence of us in the software world and plain old Moore's Law will convince video engineers to see the light. :)

Unfortunately, because of the file sizes, using deep, lossless source formats means that for most people it's still cheaper to send tapes/dvd-r/harddisks through the post than to do the kind of internet collaboration for editing that's now possible at the distribution level.

> As a user, I would like to see the following things in the next high quality
> video format:
>
> 1. Ability to support greater than 8 bits per RGB or YPrPb channel color
> depth, either arbitrarily defined (preferred) or as a selection of "good" values
> or "green or luminance"/other color depth pairs, to have say 12 bit Y and 8 bit
> Pr, Pb. You be the judges what would be the good values to support; I'd sure
> love to see e.g. 8, 12, 16, 24 bits per channel.

Ideally, a compression format uses the minimum space required to represent the information it's given, so there's no penalty to submitting 12 bit source as 16 bit (as long as you don't dither, anyway). I'd suggest the interesting channel formats are 8 bit integer, 16 bit integer, and 32 bit float. I've not heard of anyone outside a scientific context using 24 bits per channel.

> 2. Ability to specify the "authoring" gamma value and color temperature in
> the video header. Thus the video would "know" the settings at which it was
> authored. On the other hand, the playback device could know its own gamma and color
> temp. Thus when playing a given video file, the playback device can make
> adjustments (through either adjusting itself, or applying the proper filtration to
> the video), so the video looks the way it was intended.

We do include some proper colourspace markers in theora, so at least an attempt at reproducible colour can be made. The problem is much more complicated than just gamma and color temperature though. The print world is standardizing on ICC profiles, which a number of image formats already support; I'd suggest this is the way to go for any new editing-level formats.

> 3. Robust internal support for interleaved subtitles, so each word is tied to
> a section of a video. Header could include "recommended" font / color / text
> field size & position within frame / text alignment within field /
> modification (bold/italic/underline/outline/strikethrough) / transparency / merging type
> (direct/add/subtract/multiply). These should be midstream-adjustable so
> sections of text could appear with different settings. The player should be able to
> override some or all of these settings. When intercutting multiple files,
> would be great if subtitles could be cleanly sliced along with corresponding video.

This is a real can of worms. There's no clear stopping point between 'plain ascii text' and the features of a full-blown, timecoded, multilingual text and graphics layout system. There are *numerous* formats with overlapping feature sets. As you're probably aware, even in professional broadcast video there are a number of standards, each with more special characters and formatting flags than the last.

But, as has been mentioned already, you can interleave whatever subtitle format you want in the Ogg container format. Our traditional opinion has been that an efficient subdivision of subtitle features is into two extremes: very basic timecoded text with no formatting, and animated graphical overlays so the authoring system can be as fancy as it likes. The good news is that the profile for Ogg Theora will include support for one of each.

> 4. Speaking of subtitles, would be good to have multiple streams of ...
> EVERYTHING! I mean, multiple angles of video; multiple audio tracks; multiple
> subtitle tracks.

Likewise, you can already do this with ogg.

FWIW, I've for a long time advocated MNG+flac interleaved in an ogg bitstream as a high-quality source format. MNG supports lossless compression, 16 bits per channel, white point and gamma correction. Flac offers lossless audio compression. And they're both free from intellectual property restrictions on implementation, as xiph recommends for all formats.

Anyway, glad to hear from you. The input of professional users is very valuable to us on the development side.

Cheers,
-r
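The remark that there is "no penalty to submitting 12 bit source as 16 bit (as long as you don't dither)" can be checked with a small Python sketch. It uses `zlib` purely as a stand-in for "a compression format"; it says nothing about how Theora, MNG, or any video codec actually codes samples. The helper names and the synthetic periodic test signal are assumptions for illustration.

```python
import random
import struct
import zlib


def widen_12_to_16(samples):
    """Pad 12-bit samples to 16 bits by shifting; the low 4 bits stay zero."""
    return [s << 4 for s in samples]


def narrow_16_to_12(samples):
    """Drop the 4 padding bits again."""
    return [s >> 4 for s in samples]


def dither_low_bits(samples16, rng):
    """Fill the 4 padding bits with random noise (a crude form of dither)."""
    return [s | rng.randrange(16) for s in samples16]


def compressed_size(samples16):
    """Size in bytes after packing as little-endian 16-bit and deflating."""
    raw = struct.pack("<%dH" % len(samples16), *samples16)
    return len(zlib.compress(raw, 9))


if __name__ == "__main__":
    source = [(i % 64) * 64 for i in range(4096)]  # synthetic 12-bit signal
    padded = widen_12_to_16(source)
    dithered = dither_low_bits(padded, random.Random(0))
    print("lossless round trip:", narrow_16_to_12(padded) == source)
    print("padded, compressed bytes:  ", compressed_size(padded))
    print("dithered, compressed bytes:", compressed_size(dithered))
```

The padded data carries no more information than the 12-bit original, so a general-purpose compressor handles it cheaply, whereas random low-order bits are incompressible and inflate the result. Real codecs differ in the details, but the same reasoning applies.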