Hello learned ogg folks, and welcome to 2007. Sadly I am back at work already, and I'd like to seek your advice.

We need to store raw RTP packets on disk as they are received from the network. There will be multiple streams of media--at least one audio and one video--that all need to go in the same file. We have decided to use ogg because it is the simplest container format that meets our modest needs, and because the members of its dev mailing list are so darned good looking. The other sort-of contenders were pcap and mpeg4, btw.

Currently ours is the only application that will be reading these packets back off disk, so there are no interoperability concerns, and one valid answer to anything I ask now is "do whatever you damned well want and let us get back to our drinking." Nevertheless I will continue.

My original plan was to store one RTP packet per ogg packet, one packet per page, with the granule position of the page set to the arrival time of the RTP packet, and metadata in a custom format in the BOS packet.

[ Which reminds me of an early trap for young players I fell right into: why on earth is the granule position part of the packet in the API when it is actually part of a page? ]

But this naive design gnawed at my conscience. Other ogg parsers would make no sense of our files. I should use liboggz and a skeleton stream. But I like the simplicity of libogg and I would like to avoid depending on a second external library.

Does anyone else have experience to share about using Ogg for their own in-house applications? Or any advice they can give? Should I consider libogg2 for a commercial app due mid-year?

Thanks in advance, and happy hangovers.
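A minimal sketch of the one-packet-per-page layout described above, written without libogg so that the page structure is visible. It assumes each RTP packet fits in a single page (under 65025 bytes) and that the granule position is whatever integer the writer chooses; `build_page` and `ogg_crc` are hypothetical helper names, not part of any library.

```python
import struct

def _make_crc_table():
    # Ogg uses CRC-32 with polynomial 0x04C11DB7, no bit reflection,
    # initial value 0 and no final XOR.
    table = []
    for i in range(256):
        r = i << 24
        for _ in range(8):
            r = ((r << 1) ^ 0x04C11DB7) if (r & 0x80000000) else (r << 1)
            r &= 0xFFFFFFFF
        table.append(r)
    return table

_CRC_TABLE = _make_crc_table()

def ogg_crc(data):
    """Checksum of a whole page, computed with the CRC field set to zero."""
    crc = 0
    for byte in data:
        crc = ((crc << 8) & 0xFFFFFFFF) ^ _CRC_TABLE[((crc >> 24) & 0xFF) ^ byte]
    return crc

def build_page(payload, serialno, pageno, granulepos, bos=False, eos=False):
    """Wrap exactly one packet in exactly one Ogg page (hypothetical helper)."""
    if len(payload) >= 255 * 255:
        raise ValueError("packet too large for a single page")
    # Lacing values: full 255-byte segments, then one final segment < 255
    # (possibly 0) that marks the end of the packet.
    lacing = [255] * (len(payload) // 255) + [len(payload) % 255]
    flags = (0x02 if bos else 0) | (0x04 if eos else 0)
    header = struct.pack(
        "<4sBBqIIIB",
        b"OggS",       # capture pattern
        0,             # stream structure version
        flags,         # header type: 0x02 = BOS, 0x04 = EOS
        granulepos,    # granule position of the packet ending on this page
        serialno,      # logical stream serial number
        pageno,        # page sequence number within the logical stream
        0,             # CRC placeholder, filled in below
        len(lacing),   # number of lacing values
    ) + bytes(lacing)
    page = header + payload
    crc = ogg_crc(page)
    return page[:22] + struct.pack("<I", crc) + page[26:]
```

Each call produces a complete page, so the caller would keep one page counter per serial number and write the returned bytes straight to the file.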
On Wed, Jan 03, 2007 at 03:47:38PM +1300, Andrew Donkin wrote:

> My original plan was to store one RTP packet per ogg packet, one packet
> per page, with the granule position of the page set to the arrival time
> of the RTP packet, and metadata in a custom format in the BOS packet.

That's fine. SSRC as stream serialno?

Note that by using arrival time as the granulepos you're making seeking for playback harder on other implementors. But as you say, it's just your application so far.

> [ Which reminds me of an early trap for young players I fell right into:
> why on earth is the granule position part of the packet in the API
> when it is actually part of a page? ]

The idea is that you read and write packets and let libogg worry about the pages. Writers set a granulepos on each packet they create; readers don't panic if some packets have an unset (-1) granulepos. I guess we kind of assume you know how Ogg works.

> But this naive design gnawed at my conscience. Other ogg parsers would
> make no sense of our files. I should use liboggz and a skeleton stream.
> But I like the simplicity of libogg and I would like to avoid depending
> on a second external library.

You can use skeleton without using liboggz; just write your own. You can still use liboggz for compliance checks.

> Should I consider libogg2 for a commercial app due mid-year?

I wouldn't recommend it. It would be fine if you want to spend some time on QA and bug fixing, but it's not nearly as well tested as libogg. Not that it wouldn't be great to have an application developer pushing it along.

FWIW,
 -r
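To illustrate the "unset (-1) granulepos" remark above, here is a rough sketch of a reader that parses the fixed Ogg page header and reports an unset granule position as `None` rather than failing. It reads the on-disk format directly instead of going through libogg; `parse_page_header` is a hypothetical name, and CRC checking is deliberately left out.

```python
import struct

def parse_page_header(buf):
    """Parse one Ogg page header from the start of buf (hypothetical helper)."""
    if len(buf) < 27 or buf[:4] != b"OggS":
        raise ValueError("not an Ogg page")
    version, flags, granulepos, serialno, pageno, crc, nsegs = struct.unpack(
        "<BBqIIIB", buf[4:27]
    )
    lacing = buf[27:27 + nsegs]
    if len(lacing) != nsegs:
        raise ValueError("truncated segment table")
    return {
        "bos": bool(flags & 0x02),
        "eos": bool(flags & 0x04),
        # -1 means no packet finishes on this page; treat it as "unknown"
        # instead of as a timestamp.
        "granulepos": None if granulepos == -1 else granulepos,
        "serialno": serialno,
        "pageno": pageno,
        "header_len": 27 + nsegs,
        "body_len": sum(lacing),
    }
```

A reader built this way can fall back on the last known granule position, or on codec-specific timing, whenever `granulepos` comes back as `None`.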
Ralph Giles wrote:

> That's fine. SSRC as stream serialno?

Probably not - in the absence of a skeleton stream I'll abuse the serial number to identify the streams: audio, video, etc.

> Note that by using arrival time as the granulepos you're making seeking
> for playback harder on other implementors.

Do you mean because granulepos will not reference the previous key frame? I should have been more specific there: I was going to use the timestamp in the RTP packet for granulepos, rather than arrival time, which is a bit ropey. But now I'm having second thoughts.

> The idea is that you read and write packets and let libogg worry about
> the pages. Writers set a granulepos on each packet they create, readers
> don't panic if some packets have an unset (-1) granulepos.

I don't quite get that. If I'm reading it correctly, pageout() gobbles 4K of data or 255+ packets before spitting out a page. At speech rates, 4096 bytes is half a second and five(ish) packets, which is way too long between "granules" I would have thought.

How would a reader seek to within that half second? Would it find the packet with the granulepos it was after, being the last in the page with that granulepos, then use codec-specific timing inside the preceding packets to find the accurate starting point? This is a genuine question, by the way, for my information.

It seems (please correct me) that my options are:

- flush one packet per page, and use granulepos for accurate time stamping;
- let libogg handle the paging and, well, to hell with granulepos;
- or a combination: flush after each RTP packet with its marker bit set. Which is every audio packet, I believe.

Thanks for your reply, and thanks in advance for your next :-)
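The three options listed above can be sketched as a flush policy driven by the RTP fixed header. This is only an illustration: `parse_rtp_header`, `should_flush` and the policy names are hypothetical, and how often the marker bit is actually set depends on the RTP payload profile in use, so the "marker" policy may flush more or less often than expected.

```python
import struct

def parse_rtp_header(packet):
    """Extract the fields of the 12-byte RTP fixed header (hypothetical helper)."""
    if len(packet) < 12:
        raise ValueError("shorter than the RTP fixed header")
    b0, b1, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    if b0 >> 6 != 2:
        raise ValueError("not RTP version 2")
    return {
        "marker": bool(b1 & 0x80),   # top bit of the second byte
        "payload_type": b1 & 0x7F,
        "sequence": sequence,
        "timestamp": timestamp,      # candidate granulepos
        "ssrc": ssrc,
    }

def should_flush(packet, policy="marker"):
    """Decide whether to force a page boundary after this RTP packet.

    policy is an assumed name for one of the three options:
      "always" - one packet per page
      "never"  - leave paging to the container library
      "marker" - flush only after packets with the marker bit set
    """
    if policy == "always":
        return True
    if policy == "never":
        return False
    if policy == "marker":
        return parse_rtp_header(packet)["marker"]
    raise ValueError("unknown policy: %r" % (policy,))
```

With libogg, a true result would correspond to flushing the stream after submitting the packet rather than waiting for a page to fill.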