Nico Sabbi
2005-Apr-28 02:23 UTC
[Vorbis-dev] Vorbis bitstream definition / separation from ogg
Hi,

I have some questions:

1) In the sources of encoder_example (and of course in oggenc.c, too) I see a lot of dependence on ogg. I want to get rid of ogg entirely and use the vorbis bitstream alone. The second question stems from this:

2) Is there a definition of the vorbis bitstream somewhere? I don't need to know every single detail, just the information necessary to isolate frame boundaries, frame size, and frame duration in ms.

3) Does vorbis always use a variable number of samples per frame? If so, is there a way to know from every frame how many samples are used? Is there any disadvantage to using a constant number of samples per frame (as in mpeg audio, aac, musepack and so on)?

Thanks,
Nico
Michael Smith
2005-Apr-28 02:54 UTC
[Vorbis-dev] Re: Vorbis bitstream definition / separation from ogg
Hi Nico,

The vorbis specification has the complete definition of the bitstream format. It also explains how this bitstream gets embedded in ogg, but that's a separate section; you can ignore it if you want.

Note that from the raw vorbis bitstream, you CANNOT isolate all frame boundaries. You must explicitly provide that information in another layer (typically, ogg is used for this, but you could use anything else). The spec will give you the other information you want here.

Within a given stream, vorbis frames are always one of two sizes. Those two sizes are given in the primary vorbis header (though to figure out which one a particular block uses, you need info from the setup header as well - I guess that answers that part of your question).

There are many disadvantages to using only a single block size (number of samples per frame). You are incorrect in thinking that things like mp3, aac, etc. have constant frame sizes - like vorbis, they use two frame sizes (at least mp3 does; the others definitely use more than one, but I'm not certain that it's two).

Mike
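The two block sizes mentioned above can be read straight from the primary (identification) header without decoding anything else. Below is a minimal Python sketch, assuming the 30-byte field layout described in the Vorbis I specification; the function name `parse_identification_header` and the returned dictionary keys are illustrative only and are not part of libvorbis.

```python
import struct

def parse_identification_header(packet: bytes) -> dict:
    """Read the fixed-layout Vorbis identification header (hypothetical helper)."""
    # Byte 0 is the packet type (1 = identification), bytes 1-6 spell "vorbis".
    if len(packet) < 30 or packet[0] != 1 or packet[1:7] != b"vorbis":
        raise ValueError("not a Vorbis identification header")

    # All multi-byte fields are little-endian; the three bitrates are signed.
    (version, channels, sample_rate,
     bitrate_max, bitrate_nominal, bitrate_min,
     blocksizes, framing) = struct.unpack_from("<IBIiiiBB", packet, 7)

    if not framing & 1:
        raise ValueError("framing bit not set")

    # The block sizes are stored as exponents of two, 4 bits each:
    # low nibble is blocksize_0 (short), high nibble is blocksize_1 (long).
    blocksize_0 = 1 << (blocksizes & 0x0F)
    blocksize_1 = 1 << (blocksizes >> 4)
    if not 64 <= blocksize_0 <= blocksize_1 <= 8192:
        raise ValueError("invalid block sizes")

    return {
        "version": version,
        "channels": channels,
        "sample_rate": sample_rate,
        "bitrate_nominal": bitrate_nominal,
        "blocksize_0": blocksize_0,
        "blocksize_1": blocksize_1,
    }
```

Note that this only works once some outer layer has already delivered the header as a complete packet, which is exactly the framing job that ogg (or a replacement container) has to do.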
Ralph Giles
2005-Apr-28 10:42 UTC
[Vorbis-dev] Vorbis bitstream definition / separation from ogg
On Thu, Apr 28, 2005 at 11:21:37AM +0200, Nico Sabbi wrote:

> 1) in the sources of the encoder_example (and of course in oggenc.c,
> too) I see a lot of dependance on ogg. I want to get totally rid of
> ogg and use the vorbis bitstream alone.

Note also that the reference implementation uses libogg's bitpacking routines, so you need this dependency for the sake of the code even if you don't use the Ogg container. These are completely separate issues, though; the bitpacker is just abstracted into libogg for convenience.

> 3) does vorbis always use a variable number of samples per frame? If so,
> is there a way to know from every frame how many samples are used?
> Is there any disadvantage at using a constant number of samples per
> frame (as in mpeg audio, aac, musepack and so on) ?

Unfortunately it is some work to get the number of samples per frame (packet in vorbis terminology). The two possible lengths are given as a simple field in the info header, but which one a given data packet uses is looked up through a table from the setup header, using the mode number at the start of the packet as an index. So you have to do a fast parse of the setup header to be able to determine this without doing a full decode.

Hope that helps,
-r
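The mode lookup described above can be sketched in a few lines of Python. This is a rough illustration, not the libvorbis code path: it assumes the per-mode block flags have already been extracted from the setup header (the hard part, not shown here) and are passed in as a plain list named `mode_blockflags`, which is a hypothetical input. The sample count per packet follows the overlap rule in the Vorbis I specification: a quarter of the previous block plus a quarter of the current one.

```python
def ilog(x: int) -> int:
    """Bits needed to represent x, as defined in the Vorbis I spec (0 for x <= 0)."""
    return x.bit_length() if x > 0 else 0

def packet_blocksize(packet: bytes, mode_blockflags: list,
                     blocksize_0: int, blocksize_1: int) -> int:
    """Return the block size used by one audio packet without decoding it."""
    if not packet:
        raise ValueError("empty packet")
    first = packet[0]
    # The lowest bit is the packet type: 0 means audio, 1 means a header packet.
    if first & 1:
        raise ValueError("not an audio packet")
    # The mode number follows, LSB first, in ilog(mode_count - 1) bits.
    # At most 64 modes exist, so type bit plus mode number fit in one byte.
    mode_bits = ilog(len(mode_blockflags) - 1)
    mode = (first >> 1) & ((1 << mode_bits) - 1)
    if mode >= len(mode_blockflags):
        raise ValueError("mode number out of range")
    return blocksize_1 if mode_blockflags[mode] else blocksize_0

def packet_samples(prev_blocksize: int, cur_blocksize: int) -> int:
    """Samples produced by a packet, given the previous packet's block size."""
    return prev_blocksize // 4 + cur_blocksize // 4

def duration_ms(samples: int, sample_rate: int) -> float:
    """Convert a sample count to milliseconds."""
    return samples * 1000.0 / sample_rate
```

The first audio packet of a stream produces no output on its own, since it has no previous block to overlap with, so a duration calculation would start from the second packet.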