Hi Vorbis-Dev,

I'm investigating various WebM/Vorbis bugs in chromium. AFAIK muxing Vorbis inside of WebM does not have an official specification, so I'm using ffmpeg's implementation to try to answer 2 questions:

1. Under what circumstances is it valid to find WebM Blocks containing Vorbis data with zero duration? (This would mean the next Block in the Cluster has the exact same timecode.)

2. FFmpeg seems to use granulepos for the presentation timestamp - is this correct? See here:
   https://github.com/FFmpeg/FFmpeg/blob/master/libavcodec/libvorbisenc.c#L345l

To me it seems the breakdown of ffmpeg's libvorbis_encode_frame is:

- get an ogg_packet from libvorbis (I think this contains a *single* vorbis block, right?)
- store just the data from that packet (no header) in ffmpeg's own AVPacket struct
- eventually this AVPacket data will be inserted as the contents of a WebM (Matroska) block:
  https://github.com/FFmpeg/FFmpeg/blob/master/libavformat/matroskaenc.c#L1584

Starting with question 1, some important context comes from this excerpt of the vorbis spec:

*Data is not returned from the first frame; it must be used to 'prime' the decode engine. The encoder accounts for this priming when calculating PCM offsets; after the first frame, the proper PCM output offset is '0'
- http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-190001.3.1*

Am I right that "frame" and "block" are interchangeable in the vorbis spec? If so, I would then expect the granulepos of the first ogg_packet processed by ffmpeg to be 0. I'm using this definition of granulepos:

*This is the last sample, frame or other unit of information ('granule') that can be completely decoded from this packet
- https://xiph.org/ogg/doc/libogg/ogg_packet.html*

IIUC, we can derive duration from granulepos by simply dividing the count of samples by the number of samples per time unit.
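As a rough illustration of that arithmetic, here is a minimal Python sketch. It assumes granulepos counts PCM samples up to the end of each packet (per the libogg definition quoted above). The 48000 Hz sample rate, the granulepos values, and the helper names are all made up for illustration and are not taken from ffmpeg or libvorbis.

```python
# Hypothetical values for illustration only; nothing here comes from ffmpeg or libvorbis.
SAMPLE_RATE = 48000  # samples per second (assumed)


def granule_to_seconds(granulepos, sample_rate=SAMPLE_RATE):
    """Convert a sample count (granulepos or a difference of two) to seconds."""
    return granulepos / sample_rate


def packet_durations(granulepositions):
    """Per-packet duration in samples: how far granulepos advanced over the previous packet."""
    durations = []
    previous = 0
    for gp in granulepositions:
        durations.append(gp - previous)
        previous = gp
    return durations


# Assumed sequence: the first packet only primes the decoder, so its granulepos is 0.
granules = [0, 576, 1600, 2624]
durations = packet_durations(granules)
print(durations)  # [0, 576, 1024, 1024]
print([granule_to_seconds(d) for d in durations])
```

With these assumed values the first packet comes out with a duration of 0 samples, which is the case question 1 asks about.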
So my answer to question 1 would be: the first block could / should have 0 duration. And my answer to question 2 would be: they should not use granulepos for presentation time - they will always be off by the duration of a packet.

What do you guys think? What am I missing?

Thanks!
Chris
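One way to see the off-by-one-packet concern is to compare two hypothetical timestamping schemes side by side. This is only a sketch, assuming granulepos marks the end of each packet; the function names and granulepos values are invented for illustration and do not describe what ffmpeg actually does.

```python
def end_based_pts(granulepositions):
    """Timestamp each packet with its own granulepos (the end of its samples)."""
    return list(granulepositions)


def start_based_pts(granulepositions):
    """Timestamp each packet with the previous packet's granulepos (the start of its samples)."""
    if not granulepositions:
        return []
    return [0] + list(granulepositions[:-1])


# Assumed granulepos sequence, in samples; the first packet only primes the decoder.
granules = [0, 576, 1600, 2624]
end_pts = end_based_pts(granules)
start_pts = start_based_pts(granules)

# The gap between the two schemes is exactly each packet's duration in samples.
offsets = [e - s for e, s in zip(end_pts, start_pts)]
print(start_pts)  # [0, 0, 576, 1600]
print(offsets)    # [0, 576, 1024, 1024]
```

Under these assumed values the start-based list begins 0, 0: the first two packets share a timestamp, which corresponds to the zero-duration Block from question 1.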
Hey all, friendly ping :) Any help is much appreciated.

On Thu, Jun 18, 2015 at 6:56 PM, Chris Cunningham <chcunningham at chromium.org> wrote:
> Hi Vorbis-Dev,
> [...]
On Wed, 24 Jun 2015, Chris Cunningham wrote:
> Hey all, friendly ping :) Any help is much appreciated.

Presumably, vorbis is muxed however mkvmerge has traditionally handled it. I should think that ffmpeg follows that (possibly better than mkvmerge, given some examples I've seen lately). I wonder if a matroska list/forum wouldn't be a better place to ask?