Chris Cooksey
2009-Nov-16 21:19 UTC
[theora-dev] Theora Fast Quality Reduction Transcoding
Hi,

I have been working on a tool whose goal is to reduce the bit rate of Theora video by decoding to DCT coefficients, reducing the entropy of the coefficients, and re-tokenizing the stream.

I have successfully used the decoder source to extract the DCT coefficients for each block, and I am able to capture any and all relevant information about where the block of coefficients falls in the image, the frag list, the MCU and so on. That is to say, the entire decoder state for a given block of coefficients is available to me in a callback.

I had thought at this point to use the tokenizer directly to construct a new token stream. However, the complexity of that module alone is daunting. It relies on state information retained, and perhaps also generated, by the encoder. I think the retained information, like the huff tables and maybe even lambda, can be constructed easily enough. But I am concerned about the possible use of generated data, since we are not doing a full encode, and that there may be traps waiting to be sprung on me.

I have also considered running a full encoder, but without any real image data being fed to it. The processing required to generate coefficients would be replaced with a callback to retrieve coefficients generated by the decoder. However, this approach is very likely to be unworkable as, at the very least, the inter-frame motion vectors, and quite probably other useful information, would be missing without further intervention from me.

So I am still inclined to use the tokenizer directly on the modified coefficient stream. I would be very grateful if anyone knowledgeable about the tokenizing process, or the encode process in general, could offer advice or warnings about either approach I described.

Thanks in advance,
Chris
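(The coefficient-entropy-reduction step described above could be sketched roughly as below. This is an illustrative fragment, not libtheora code: `requant_block`, `block_is_zero`, and the coarser `step` are hypothetical names standing in for whatever re-quantization rule the tool would actually apply to the decoded coefficients.)

```c
#include <stdint.h>

/* Hypothetical sketch: re-quantize a block of 64 decoded DCT
 * coefficients to a coarser step size, which reduces their entropy
 * before re-tokenizing. Not a libtheora API. */
static void requant_block(int16_t coeffs[64], int step)
{
    int i;
    for (i = 0; i < 64; i++) {
        /* round-to-nearest division, preserving sign */
        int c = coeffs[i];
        int q = (c >= 0 ? c + step / 2 : c - step / 2) / step;
        coeffs[i] = (int16_t)(q * step);
    }
}

/* Returns nonzero if every coefficient is zero, i.e. the block
 * could in principle drop out of the coded block set. */
static int block_is_zero(const int16_t coeffs[64])
{
    int i;
    for (i = 0; i < 64; i++)
        if (coeffs[i] != 0)
            return 0;
    return 1;
}
```
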
Timothy B. Terriberry
2009-Nov-17 14:18 UTC
[theora-dev] Theora Fast Quality Reduction Transcoding
Chris Cooksey wrote:

> I had thought at this point to use the tokenizer directly to construct a new
> token stream. However the complexity of that module alone is daunting. It

You may be interested to know about rehuff, an experimental tool for changing the codebooks in a Theora stream. The tool itself is here:

http://svn.xiph.org/trunk/theora-exp/

with examples/rehuff.c being the main driver and lib/recode.c doing most of the actual work.

This is probably too simplistic for you, since it cannot, for example, change a block from coded to uncoded if you quantize all of its coefficients to 0 (assuming it does not require a MV), and does not break the decoded coefficients out by the block they originally came from. But it may be a simpler starting point than the full encoder.

> I have also considered running a full encoder but without any real image
> data being fed to it. The processing required to generate coefficients would
> be replaced with a callback to retrieve coefficients generated by the

The problem here is that it would also have to choose the same MB modes, MVs, qi values, and coded block flags for the coefficients from the original stream to make any sense. These decisions are made with reference to the original input frame, which you do not have. I agree with your assessment that this approach is probably unworkable.
Chris Cooksey
2009-Nov-18 17:06 UTC
[theora-dev] Theora Fast Quality Reduction Transcoding
Hi Timothy,

Thanks for the information. It confirms most of what I have found.

> You may be interested to know about rehuff, an experimental tool for
> changing the codebooks in a Theora stream. The tool itself is here:

I do know about the rehuff tool. It was the foundation for what I have done so far, which includes reporting each coefficient block and the surrounding decoder state. I also updated it to Theora 1.1.

As for trying to encode without actual image data, yes, it looks like that is a non-starter.

I am still holding out hope for using the tokenizer directly on the modified coefficients. But you did identify the major area of concern I have: if the coefficients of an inter-frame block all go to zero, I will need to update the coded block list. For that matter, what if the entire inter frame becomes uncoded? This greatly complicates what would otherwise be a seemingly simple task.

The rehuff tool works by simply duplicating the entire frame header and then emitting a recoded huff stream. This won't work if I need to change the coded block list, which ideally I would like to do if possible. It is here that I am kind of stuck right now. My understanding of that part of the spec is limited. Any hints regarding that part of the process as it is implemented in libtheora would be helpful, i.e. what are the relevant structures, etc. I will be working through it myself to try to better understand it.

Thanks,
Chris
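(The coded-block-list problem raised above could be sketched as a post-pass over the fragments. This is purely illustrative: `frag_sketch`, `has_mv`, and `rebuild_coded_flags` are hypothetical names, not libtheora structures, and a real implementation would have to keep blocks whose motion vectors are still needed, as Timothy noted.)

```c
#include <stdint.h>

/* Hypothetical fragment record: its 64 coefficients after
 * re-quantization, plus a flag saying the block still carries a
 * motion vector and therefore must stay coded even if all-zero. */
typedef struct {
    int16_t coeffs[64];
    int     has_mv;
} frag_sketch;

/* Marks each fragment coded (1) or uncoded (0) in `coded`, and
 * returns how many remain coded. A return of 0 would mean the
 * whole inter frame has become uncoded, the case that complicates
 * reusing the original frame header. */
static int rebuild_coded_flags(const frag_sketch *frags, int nfrags,
                               unsigned char *coded)
{
    int i, j, ncoded = 0;
    for (i = 0; i < nfrags; i++) {
        int nonzero = frags[i].has_mv;
        for (j = 0; j < 64 && !nonzero; j++)
            if (frags[i].coeffs[j] != 0)
                nonzero = 1;
        coded[i] = (unsigned char)nonzero;
        ncoded += nonzero;
    }
    return ncoded;
}
```
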