I hope this list is small enough that I won't get banned for introducing myself. So...

Hi,

I just subscribed to the theora-lists. I'm from the XVID project, so I know a good deal about video, and MPEG-4 in particular, but nothing about Vorbis/Theora etc. That's why I'm here: I'm interested in patent-free alternatives.

I guess there is no documentation of the current status of Theora and its basic (technical) concepts, is there? Because reading sources is okay, but sometimes a few explanations make it much easier.

I'm particularly interested in how you manage to keep everything free of patents, in particular since I know that there are patents (claimed) on such trivial stuff like motion vector prediction and decision when to encode and when to skip a block.

Christoph

--
Christoph H. Lampert chl@math,uni-bonn,de
Beringstr. 6, Zi. 15, 53115 Bonn, Germany
Tel. (0228) 73-4708 Fax. +49 228 73-7916
This signature was generated by machine and is valid without a handwritten signature. AZ 27B/6

--- >8 ----
List archives: http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to 'theora-dev-request@xiph.org' containing only the word 'unsubscribe' in the body. No subject is needed. Unsubscribe messages sent to the list will be ignored/filtered.
On Wed, 2003-02-26 at 04:37, Christoph Lampert wrote:
> I'm particularly interested in how you manage to keep everything free of
> patents, in particular since I know that there are patents (claimed) on
> such trivial stuff like motion vector prediction and decision when to
> encode and when to skip a block.

There is a little technicality that I'd like to point out, even though it makes no difference as far as developing and using Theora is concerned. VP3, the codec that Theora is based on, is actually not strictly patent free. However, quoting from the "VP3 Legal Terms" section on http://www.theora.org/cvs.html:

"On2 represents and warrants that it shall not assert any rights relating to infringement of On2's registered patents, nor initiate any litigation asserting such rights, against any person who, or entity which utilizes the On2 VP3 Codec Software, including any use, distribution, and sale of said Software; which make changes, modifications, and improvements in said Software; and to use, distribute, and sell said changes as well as applications for other fields of use."

In short, On2 have effectively neutered the VP3 patents, so even though VP3 is technically patented, it is not subject to patent licensing.

As far as what methods and/or technologies these patents encompass, I have no idea. Maybe Dan can elaborate on this.

Best regards,

Carsten Haese
Ogg Traffic Editor, Xiph.Org Foundation
http://www.vorbis.com/ot
On Wed, 26 Feb 2003, Christoph Lampert wrote:
> I guess there is no documentation of the current status of Theora
> and its basic (technical) concepts, is there? Because reading sources is
> okay, but sometimes a few explanations make it much easier.

I'm working on a description for the VP3 decode system as I develop a fresh decoder implementation for the ffmpeg project (ffmpeg.sf.net). Here is what I can tell you about the decode process (from which you can make educated guesses about the encode process):

* decoding a VP3 frame:
  * decode frame header (keyframe, quantization level, version #)
  * unpack superblock/macroblock/fragment encoding data
  * unpack encoding mode information for encoded blocks
  * unpack motion vectors
  * unpack DC coefficients for all coded fragments
  * unpack 1st AC coefficients for all coded fragments
  * unpack 2nd AC coefficients for all coded fragments
  * ...
  * unpack 63rd AC coefficients for all coded fragments
  * reconstruct frame, which entails:
    * prediction for the DC coefficient
    * calculating relevant motion vector, if applicable
    * IDCT to obtain fragment or fragment diff
    * apply diff to motion block, if applicable

That's basically how decoding works. Note that I still haven't gotten heavy into the interframe coding process, so I might have some details confused on the motion prediction stuff.

> I'm particularly interested in how you manage to keep everything free of
> patents, in particular since I know that there are patents (claimed) on
> such trivial stuff like motion vector prediction and decision when to
> encode and when to skip a block.

I often wonder the same. So here are some observations on things that probably set this algorithm apart:

* You probably noticed that the algorithm encodes all the DC coeffs, then all the 1st AC coeffs, etc. MPEG certainly doesn't do that and I don't know of any other coding methods that do.

* Block coding: In addition to 8x8 blocks (fragments) and 16x16 macroblocks, VP3 also uses superblocks which are 32x32, encapsulating 4 macroblocks. The order of unpacking the fragments is also fairly unique.

* DC prediction: On2 used to be Duck and if you are familiar with their algorithms (Duck Truemotion variants and the DK ADPCM codecs), you know that they are huge DPCM fanatics. To that end, there seems to be a lot that goes on for DC prediction. I know that MPEG uses some DC prediction, but I don't think it's anywhere near this level. A fragment can predict DC using DC elements from the left, up-left, up, and up-right fragments, if present and coded in the current frame. Additionally, there is some more code that is effectively disabled that does a special search through previous fragments to the up and left of the current fragment looking for *some* block that has been coded in this frame from which DC can be predicted. Aggressive strategy.

* Golden frame: VP3 calls its keyframes "golden frames". The codec needs to maintain the last golden frame in addition to the previous decoded frame since motion vectors can be predicted from either.

If I have gotten anything wrong about anything stated above, please yell very loudly right now. But I hope this helps.

Thanks...
--
	-Mike Melanson
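As a rough illustration of the coefficient ordering and block hierarchy described in the message above, here is a minimal Python sketch. It assumes one already-decoded value per coded fragment per coefficient index, which ignores the real token and run-length coding in the VP3 bitstream. The names `regroup_coefficients` and `block_grid` are hypothetical and are not taken from any VP3 or Theora source.

```python
COEFFS_PER_FRAGMENT = 64  # one DC plus 63 AC coefficients per 8x8 fragment


def regroup_coefficients(stream, num_fragments):
    """Turn a plane-ordered coefficient list into per-fragment lists.

    Assumed layout (a simplification): all DC values for the coded
    fragments come first, then all 1st AC values, and so on up to the
    63rd AC values.
    """
    expected = COEFFS_PER_FRAGMENT * num_fragments
    if len(stream) != expected:
        raise ValueError(f"expected {expected} values, got {len(stream)}")
    fragments = [[0] * COEFFS_PER_FRAGMENT for _ in range(num_fragments)]
    for k in range(COEFFS_PER_FRAGMENT):
        for i in range(num_fragments):
            # Coefficient k of fragment i sits in the k-th plane of the stream.
            fragments[i][k] = stream[k * num_fragments + i]
    return fragments


def block_grid(width, height):
    """Count 8x8 fragments, 16x16 macroblocks and 32x32 superblocks.

    Returns (columns, rows) per block type. Partial blocks at the right
    and bottom edges are rounded up.
    """
    def ceil_div(a, b):
        return -(-a // b)

    sizes = (("fragments", 8), ("macroblocks", 16), ("superblocks", 32))
    return {name: (ceil_div(width, size), ceil_div(height, size))
            for name, size in sizes}
```

For a 320x240 frame, `block_grid(320, 240)` gives 40x30 fragments, 20x15 macroblocks and 10x8 superblocks; the bottom row of superblocks is only half filled, since 240 is not a multiple of 32.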
Excellent overview.

As to the original question (how do we *know* a codec is patent-free), the simple answer is, we never really do. Unfortunately, the situation is this: Xiph does its best to put out a codec that does not infringe on anyone's patent claims. However it is not feasible to check all relevant patents, and even if it were, there are patents not yet issued (and therefore not accessible under US law for the most part). Furthermore, experts can and do differ on interpreting claims and priorities of patents. In fact there is extensive litigation on patents all the time over conflicting interpretations of claims. The results of litigation (if it goes to trial), and even the results of settled actions, can have a huge effect on the perceived validity or lack thereof of specific patents. No one assumes that the patent examiner did a perfect job in finding all prior art, or interpreting it correctly (nor should they; examiners are underpaid and overworked).

The fact that On2 has some patents (issued and pending) covering VP3 is actually a good thing, because it lessens the field of potential claims by 3rd parties. Another point of reference is that On2/Duck has been developing its codec technology for 10 years, and has never, to my knowledge (I've been CTO since its inception, and CEO for two years during 1999-2000) received any correspondence indicating a challenge to its IP (Intellectual Property) from any 3rd party seeking patent royalties or attempting to interfere with the sale of On2's products in any way.

While there are indeed patents that seem to claim rather fundamental aspects of video coding, many experts in the field feel that most of these patents are over-reaching and would not stand up to a serious court challenge.
This fact is known to the patent-holders, who are very careful not to seek to enforce their patents unless they are reasonably sure they will win the fight, as if the patent is found lacking in court it becomes essentially worthless. For the most part, these patents are part of the pool of MPEG patents administered by Dolby and MPEG-LA (strength in numbers -- the whole issue of whether creating such a patent pool is even legal under antitrust law is another discussion).

I am not aware of any case where these patents were used to claim infringement or demand royalties _except_ to extract royalties from users of the MPEG specifications. In other words, for the most part the patent holders seem interested in going after MPEG users, not creators or users of non-MPEG codecs such as VP3 (this would seem to make MPEG-based technologies like DivX and Xvid more likely targets for action than Theora). This could of course change, but it appears to have been the strategy so far. After all, it doesn't make sense to sue someone unless you expect to make some money, and most of the users of open-source multimedia software are unwilling and/or unable to pay licensing fees -- they'll just move on to another technology (that's why they came here in the first place). They're just not an attractive target for this sort of litigation.

I hope this helps clear the air somewhat. I know that engineers and others would rather hear some sort of blanket assurance that there could never be a problem with patent-free software, but the fact is that the system is rigged so as to make such a claim impossible to make in good faith. In the US in particular (I'm not as familiar with the European or Asian systems, though I suspect things are similar), anyone can file a patent for relatively little money, get it issued with only rudimentary research into its validity, and sit around waiting for someone to make some money who they can claim is infringing their patent.
It happens all the time, in every industry (see for example http://www.stblaw.com/FSL5CS/memos/memos1218.asp).

-dbm
quick points --

on the coeff layout, both patent and datarate considerations exist. The original VP3 was intended as a low-resolution, low-datarate codec, so the performance issues were not paramount. On modern PCs, even a 320x240 image can mostly be kept in cache, so it's not too bad a hit. It helps datarate given that the entropy coder is not so sophisticated. And it definitely avoids some of the most difficult patents. Downside is playback performance on higher-res material is probably not as great as we would like. However machines are pretty fast these days, and other codecs have other deficiencies that slow them down, so it's probably a wash.

motion: VP3 uses only full-pel and 1/2 pel bilinear MVs. The weird case is when X and Y vectors are both 1/2 pel aligned; in this case we interpolate along the diagonal (look at the code to find out exactly how that works). It's very fast with SIMD.

VP3 only has I and P frames, no B frames. B frames traditionally don't help much on datarate per se; they are there mostly to avoid a 'pulsing' that can occur when keyframes come after a lot of interframes (theoretically they could help with dissolves, but that's very encoder-specific). However good rate controls and smart encoding help to avoid the pulsing problem. B-frames are a huge religious issue among video coder geeks. We (at On2) are in the "B-frames are stupid and unnecessary" camp. They certainly make encoders & decoders much more complicated. And they kill latency.

this just in: NUM_HUFF_TABLES = 80, MAX_ENTROPY_TOKENS = 32, tot = 2560

> -----Original Message-----
> From: Christoph Lampert [mailto:chl@math.uni-bonn.de]
> Sent: Wednesday, February 26, 2003 2:48 PM
> To: theora-dev@xiph.org
> Subject: Re: [theora-dev] [OT] Just saying hi!
>
> On Wed, 26 Feb 2003, Mike Melanson wrote:
> > > I myself am most involved in motion estimation, so as final questions
> > > before letting you get back to work: Is the general method for ME
> > > similar to MPEG?
> > > I-frames (golden... I like that... ;-) , P-frames,
> > > B-frames? Motion estimation is done blockwise? Is there support for
> > > global motion?
> >
> > I haven't gotten too much into the motion stuff (getting close to
> > decoding keyframes right now). And I'm still a little fuzzy on motion
> > estimation/compensation in general. Can you give a brief synopsis of how
> > ME/MC stuff operates in MPEG variants, particularly where half-pel and
> > quarter-pel prediction figure in and how SIMD averaging instructions help
> > with that?
>
> ;-) Hey, I thought _I_ was the one to be asking questions...
>
> I am not 100% sure about other MPEG variants (because MPEG-4 also contains
> H263 stuff), but in general it's like this:
>
> P-frames:
> For every 16x16 macroblock a motion vector is determined. As a special
> case it's also possible to save 4 vectors, one for each 8x8 luminance block
> (the chroma vector is averaged from those). Vectors are stored in the
> bitstream relative to a median prediction (left, top, top-right). Maximum
> range is controlled by an "fcode" parameter, between -16/+16 and -128/+128.
>
> The vector points to a position in the reference frame (previous, I or
> P) and the corresponding block there is used as basis for the block in the
> new frame. In the bitstream the (encoded, quantized, etc.) residue is
> coded and added to the reference image.
> The motion vector(s) can be in half- or quarterpel resolution.
>
> Halfpel: For positions that correspond to fullpel positions (both vector
> components are even), the block is simply copied from the reference
> frame.
> For positions that correspond to halfpel positions (odd components)
> a simple bilinear interpolation is used to determine the values of the
> block to be used.
>
> Quarterpel is more difficult: Fullpel is again just copied. For halfpel
> positions there is a complicated (and slow and ugly) process of filtering,
> and "real" quarterpel is calculated from the filtered halfpel positions
> by bilinear interpolation again.
>
> SIMD instructions are helpful for the bilinear parts, of course. In fact,
> for encoding we calculate 3 full images of horizontally, vertically and
> horizontally&vertically bilinear filtered images and decide based on
> even/odd values of vector components from which to simply copy the block
> (or from which to calculate SUM-of-absolute-errors).
> For Quarterpel this is more difficult, I don't know of any way to really
> use SIMD instructions (would of course be possible, but difficult and
> maybe not much speedup).
>
> B-frames are more difficult again.
> Every block can be encoded like in a P-frame, with a vector pointing
> backwards, or with a vector pointing forwards, or with two vectors, one to
> front, one to back, and the average of both blocks is used (SIMD!).
> The last and most efficient mode is "direct mode", there the motion
> vector is calculated from the vector of a co-located P-frame (scaled),
> and only a differential vector is saved. The compensation process is
> similar again, halfpel&quarterpel etc.
>
> > I can tell you that VP3 has I-frames that have to persist until
> > the next I-frame is decoded. Every successive P-frame can use information
> > from either the previous P-frame or from the most recent I-frame. I can
> > also tell you that the maximum range of a motion vector appears to be
> > +/-31, according to my old notes on the decoder:
> > http://www.pcisys.net/~melanson/codecs/vp3-notes.txt
> > Also, the decoder has code to support sub-pixel motion compensation, but I
> > have not gotten to investigate that too deeply yet (which is why I was
> > hoping you will be able to fill me in on the MPEG methods).
>
> In general, halfpel is very efficient and not very slow (for MMX CPUs).
> Efficiency of Quarterpel depends on the input material, and the MPEG-4
> rules for it really s*ck. Maybe bilinear interpolation would have been
> better, again...
>
> Christoph
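The MPEG-style median vector prediction and half-pel bilinear interpolation that Christoph describes in the quoted message can be sketched roughly as follows. This is a minimal illustration, not code from XVID, VP3 or any MPEG reference: the function names are hypothetical, vectors are assumed to be (x, y) tuples in half-pel units, the reference frame is assumed to be a list of rows of integers, and the round-to-nearest rule is an assumption (real codecs define their own rounding).

```python
def median_predict(left, top, top_right):
    """Component-wise median of the three neighbouring motion vectors."""
    return tuple(sorted(components)[1]
                 for components in zip(left, top, top_right))


def halfpel_sample(ref, x2, y2):
    """Fetch one reference sample at a position given in half-pel units.

    Even coordinates fall on full pixels and are copied unchanged; an odd
    coordinate averages the two neighbouring pixels on that axis.
    The caller must keep the position inside the reference frame.
    """
    x, y = x2 >> 1, y2 >> 1      # full-pel part
    fx, fy = x2 & 1, y2 & 1      # 1 if that axis is at a half-pel position
    a = ref[y][x]
    b = ref[y][x + fx]
    c = ref[y + fy][x]
    d = ref[y + fy][x + fx]
    # With fx == fy == 0 all four taps are the same pixel, so this
    # reduces to a plain copy; otherwise it is a rounded average.
    return (a + b + c + d + 2) >> 2


def vector_residual(actual, left, top, top_right):
    """Difference between a vector and its median prediction."""
    px, py = median_predict(left, top, top_right)
    return (actual[0] - px, actual[1] - py)
```

With `ref = [[10, 20], [30, 40]]`, the positions (0, 0), (1, 0), (0, 1) and (1, 1) give 10, 15, 20 and 25. Note that, per Dan's remark in the same message, VP3 treats the case where both components are half-pel differently (it interpolates along the diagonal), so the four-tap average here reflects the MPEG-style description only.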
> One of the target applications I have in mind for this codec is a
> Sega Dreamcast port. The DC has a 200 MHz Hitachi SH-4 CPU. I have no idea
> how VP3 will perform but I am very eager to find out.

I'm not sure if you are aware of this, but the DCDivX player is already capable of playing VP3:

http://www.moosegate.com/betaboy/dcdivx/

Source is available. I'm not sure about performance, but it seems to work okay.

-Henry Mason
Hi,

this thread died down long ago, but I just came up with a small idea I'd like to ask you about:

On Wed, 26 Feb 2003, Mike Melanson wrote:
> * Golden frame: VP3 calls its keyframes "golden frames". The codec needs
> to maintain the last golden frame in addition to the previous decoded
> frame since motion vectors can be predicted from either.

Why did you fix this to be the last keyframe? The image could/will be outdated after a few moving frames, and then never be useful again.

Instead it might be helpful to let the encoder decide which frame should be put into this extra buffer. A one-bit flag in the bitstream (or a different header ID) could tell that the current frame should replace the previous golden frame when it would normally be discarded. E.g. sometimes it might be helpful to always buffer the last 2 frames, or to have a frame from about 1 second ago available, to reconstruct background behind moving objects.

It's perfectly simple to make this backwards compatible (if the flag is never set, only the keyframes are buffered), it's no waste of bits, the decoder doesn't have to be changed much (essentially one extra memcpy), and if the encoder doesn't know about it, it doesn't matter, either: everything stays as it is now. But if the encoder is clever, it might really save some bitrate.

Later versions can extend it to n buffers (FIFO style), again without extra overhead, except for mildly higher memory consumption.

Christoph

--
Christoph H. Lampert chl@math,uni-bonn,de
Beringstr. 6, Zi. 15, 53115 Bonn, Germany
Tel. (0228) 73-4708 Fax. +49 228 73-7916
This signature was generated by machine and is valid without a handwritten signature. AZ 27B/6
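To make the proposal above concrete, here is a minimal Python sketch of the decoder-side buffer handling it implies. Everything in it is hypothetical: the class `ReferenceBuffers`, the `update_golden` flag and the use of `bytes` objects as stand-ins for decoded frames are illustrative assumptions, not part of the VP3 or Theora bitstream or code.

```python
class ReferenceBuffers:
    """Holds the two reference frames a decoder keeps around.

    `previous` is the last decoded frame; `golden` is the extra
    long-lived reference. In the existing scheme only keyframes
    replace `golden`; the proposed flag lets any frame do so.
    """

    def __init__(self):
        self.previous = None
        self.golden = None

    def on_frame_decoded(self, frame, is_keyframe, update_golden=False):
        # A keyframe always becomes the golden frame (current behaviour).
        # The hypothetical one-bit flag additionally lets the encoder
        # promote an ordinary frame; this copy is the "one extra memcpy".
        if is_keyframe or update_golden:
            self.golden = bytes(frame)
        self.previous = frame


refs = ReferenceBuffers()
refs.on_frame_decoded(b"K0", is_keyframe=True)
refs.on_frame_decoded(b"P1", is_keyframe=False)
refs.on_frame_decoded(b"P2", is_keyframe=False, update_golden=True)
```

If the flag is never set, the golden buffer only ever changes on keyframes, which matches the backwards-compatibility argument in the message.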
It's not a bad idea (and I think something like that was in previous versions of the code), but at this point I can't recommend any bitstream changes for version 1.0. Regardless of how simple it sounds, it would require reasonably extensive testing with a modified encoder to be considered stable.

Dan Miller
Founder, CTO, On2 Technologies