Hi Vorbis-Dev,
I'm investigating various WebM/Vorbis bugs in chromium. AFAIK muxing Vorbis
inside of WebM does not have an official specification, so I'm using
ffmpeg's implementation to try to answer 2 questions:
   1. Under what circumstances is it valid to find WebM Blocks containing
   Vorbis data with zero duration? (This would mean the next Block in the
   Cluster has the exact same timecode).
   2. FFmpeg seems to use granulepos for presentation timestamp - is this
   correct? See here:
   https://github.com/FFmpeg/FFmpeg/blob/master/libavcodec/libvorbisenc.c#L345l
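A minimal sketch of the zero-duration condition described in question 1, assuming a Cluster's Blocks are represented simply as a list of timecodes in decode order. The function name and the sample timecodes are made up for illustration; nothing here comes from ffmpeg or a real WebM parser.

```python
def zero_duration_blocks(timecodes):
    """Return the indices of blocks whose successor has the exact same timecode.

    `timecodes` is a hypothetical list of Block timecodes from one Cluster,
    in the order the Blocks appear. The last Block is never reported because
    it has no successor to compare against.
    """
    return [
        i
        for i in range(len(timecodes) - 1)
        if timecodes[i + 1] == timecodes[i]
    ]


# Made-up timecodes: blocks 0 and 3 share a timecode with the block after them.
cluster_timecodes = [0, 0, 21, 43, 43, 64]
print(zero_duration_blocks(cluster_timecodes))
```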
To me it seems the breakdown of ffmpeg's libvorbis_encode_frame is:
   - get an ogg_packet from libvorbis (I think this contains a *single*
   vorbis block, right?)
   - store just the data from that packet (no header) in ffmpeg's own
   AVPacket struct
   - eventually this AVPacket data will be inserted as the contents of a
   WebM (Matroska) block:
   https://github.com/FFmpeg/FFmpeg/blob/master/libavformat/matroskaenc.c#L1584
Starting with question 1, some important context comes from this excerpt of
the vorbis spec:
*Data is not returned from the first frame; it must be used to 'prime' the
decode engine. The encoder accounts for this priming when calculating PCM
offsets; after the first frame, the proper PCM output offset is '0'
- http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-190001.3.1*
Am I right that "frame" and "block" are interchangeable in the vorbis spec?
If so, I would then expect the granulepos of the first ogg_packet processed
by ffmpeg to be 0. I'm using this definition of granulepos:
*This is the last sample, frame or other unit of information ('granule')
that can be completely decoded from this packet
- https://xiph.org/ogg/doc/libogg/ogg_packet.html*
IIUC, we can derive duration from granulepos by taking the difference between
consecutive granulepos values and dividing that sample count by the sample
rate. So my answer to question 1 would be: the first block could / should
have 0 duration. And my answer to question 2 would be: they should not use
granulepos for presentation time, since granulepos marks the end of a packet's
samples, so the timestamp will always be off by the duration of that packet.
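The reasoning above can be sketched as follows. This is only an illustration of the hypothesis in this message, not confirmed libvorbis or ffmpeg behaviour: it assumes granulepos is the end position in samples, that the stream starts at sample 0, and that the first packet reports granulepos 0. The function name, the granulepos values and the 48000 Hz sample rate are all made up.

```python
from fractions import Fraction


def packet_timing(granulepos_list, sample_rate):
    """Return (start, end, duration) in seconds for each packet.

    Assumes each granulepos is the position of the last sample that can be
    decoded once that packet is processed, so a packet's start time is the
    previous packet's granulepos and its end time is its own granulepos.
    """
    timings = []
    prev_end = 0  # assumed: the stream starts at sample 0
    for granulepos in granulepos_list:
        start = Fraction(prev_end, sample_rate)
        end = Fraction(granulepos, sample_rate)
        timings.append((start, end, end - start))
        prev_end = granulepos
    return timings


# Made-up values: a priming packet followed by three 1024-sample packets.
granulepos_values = [0, 1024, 2048, 3072]
timings = packet_timing(granulepos_values, 48000)
for start, end, duration in timings:
    print(float(start), float(end), float(duration))
```

Under these assumptions the first packet has zero duration, and for every later packet the granulepos-derived time (`end`) is later than the start time by exactly that packet's duration.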
What do you guys think? What am I missing?
Thanks!
Chris
Hey all, friendly ping :) Any help is much appreciated.

On Thu, Jun 18, 2015 at 6:56 PM, Chris Cunningham <chcunningham at chromium.org> wrote:
> Hi Vorbis-Dev,
> [...]
On Wed, 24 Jun 2015, Chris Cunningham wrote:
> Hey all, friendly ping :) Any help is much appreciated.

Presumably, vorbis is muxed however mkvmerge has traditionally handled it.
I should think that ffmpeg follows that (possibly better than mkvmerge,
given some examples I've seen lately). I wonder if a matroska list/forum
wouldn't be a better place to ask?