Silvia.Pfeiffer@csiro.au
2005-Nov-10 12:39 UTC
[ogg-dev] OggPCM version / header finalization
Hi John, all,

I still have at least 3 issues:

1) What are we trying to achieve with the "source-ID"?

   8 [uint] Source ID (Unique amongst all OggPCM streams in the physical stream)

Are we trying to separate the different channels that may be interleaved with each other inside the flat multi-channel sample stream?

Interpretation 1:
-----------------
So, would each channel be in a separate OggPCM logical bitstream and interleaving would happen at the ogg level? If that is the case, then this is unnecessary since every logical bitstream in Ogg framing already gets attributed a different serial number in the page header.

To illustrate (and just for the record: I don't think this is the way to go):

 logical bitstream with samples for 6 channels
 ----------------------------------------------------------------
 > |1|2|3|4|5|6|1|2|3|4|5|6|1|2|3|4|5|6|1|2|3|4|5|6|1|2|3|4|5|6|<
 ----------------------------------------------------------------
                     | reordering into 6 logical bitstreams
                     v
 logical bitstreams for 6 channels
 ---------------  ---------------  ---------------  ---------------
 > |1|1|1|1|1| <  > |2|2|2|2|2| <  > |3|3|3|3|3| <  > |4|4|4|4|4| <  etc.
 ---------------  ---------------  ---------------  ---------------
                     | segmentation for each logical bitstream
                     v
    packet_1 (1)                     packet_2 (1)
    ------------------------------   -------------------
 .. |seg_1|seg_2|seg_3|seg_4|s_5 |   |seg_1|seg_2|seg_3| ..
    ------------------------------   -------------------

    packet_1 (2)                     packet_2 (2)
    ------------------------------   -------------------
 .. |seg_1|seg_2|seg_3|seg_4|s_5 |   |seg_1|seg_2|seg_3| ..
    ------------------------------   -------------------
 etc.
                     | page encapsulation
                     v
 page_1 (pkt_1(1) data)    page_2 (pkt_1 data)  page_3 (pkt_2 data)
 ------------------------  ----------------     ------------------------
 |H|------------------- |  |H|----------- |     |H|------------------- |
 |D||seg_1|seg_2|seg_3| |  |D|seg_4|s_5 | |     |D||seg_1|seg_2|seg_3| | ...
 |R|------------------- |  |R|----------- |     |R|------------------- |
 ------------------------  ----------------     ------------------------

 page_1 (pkt_1(2) data)    page_2 (pkt_1 data)  page_3 (pkt_2 data)
 ------------------------  ----------------     ------------------------
 |H|------------------- |  |H|----------- |     |H|------------------- |
 |D||seg_1|seg_2|seg_3| |  |D|seg_4|s_5 | |     |D||seg_1|seg_2|seg_3| | ...
 |R|------------------- |  |R|----------- |     |R|------------------- |
 ------------------------  ----------------     ------------------------
 etc.
                     | multiplexing
                     v
 physical bitstream
 (1)                       (2)
 ------------------------  ------------------------
 |H|------------------- |  |H|------------------- |
 |D||seg_1|seg_2|seg_3| |  |D||seg_1|seg_2|seg_3| | ... etc.
 |R|------------------- |  |R|------------------- |
 ------------------------  ------------------------

Interpretation 2:
-----------------
Or are we trying to describe how the data of the different channels are interleaved with each other inside the one flat logical bitstream? Then I think this one field should *not* provide an ID but rather an interleaving description. Since usually channels are just consecutively ordered (if I'm not mistaken), this information may not be necessary at all.

To illustrate:

 logical bitstream with samples for 6 channels
 ----------------------------------------------------------------
 > |1|2|3|4|5|6|1|2|3|4|5|6|1|2|3|4|5|6|1|2|3|4|5|6|1|2|3|4|5|6|<
 ----------------------------------------------------------------
                     | packet definition
                     v
 e.g. take 2 complete samples as a packet (just to illustrate!)

     packet_1                    packet_2
     -------------------------   -------------------------
 ... |1|2|3|4|5|6|1|2|3|4|5|6|   |1|2|3|4|5|6|1|2|3|4|5|6| ...
     -------------------------   -------------------------
                     | segmentation
                     v
    packet_1                           packet_2
    --------------------------------   -------------------------------
 .. |seg_1|seg_2|seg_3|seg_4|seg_5 |   |seg_1|seg_2|seg_3|seg_4|seg_5| ..
    --------------------------------   -------------------------------
                     | page encapsulation
                     v
 page_1 (pkt_1 data)       page_2 (pkt_1 data)  page_3 (pkt_2 data)
 ------------------------  -----------------    ------------------------
 |H|------------------- |  |H|------------ |    |H|------------------- |
 |D||seg_1|seg_2|seg_3| |  |D|seg_4|seg_5| |    |D||seg_1|seg_2|seg_3| | ...
 |R|------------------- |  |R|------------ |    |R|------------------- |
 ------------------------  -----------------    ------------------------

2) I don't understand what the "channel block" is for. What is it trying to describe?

3) I also still don't understand why packets require an extra landmark (i.e. the data packet header). I don't buy into Arc's argument that this framing is necessary to keep space for potential future additional header fields. No other uncompressed audio format requires extra framing and header information for the samples (AFAIK), so I cannot see why future additional header fields would need to be added. It should be made clear in the bos page how many samples go into a packet, and thus packet framing is unnecessary. This field is just complicating decoding through an unnecessary extra parsing step IMHO.

Cheers,
Silvia.

-----Original Message-----
From: ogg-dev-bounces@xiph.org on behalf of John Koleszar
Sent: Fri 11/11/2005 6:43 AM
To: ogg-dev@xiph.org
Cc:
Subject: [ogg-dev] OggPCM version / header finalization

I have OggPCM (as currently defined) support implemented in mencoder and mplayer. I'd like to request that we settle on modifications to this header by the middle of next week or freeze the current header as the official major version 1.0, so I can get the patches cleaned up and released. We will be shipping a separate product based on this work in the near-term future, and compatibility with a community standard is a desired bullet.

Thanks..

John
_______________________________________________
ogg-dev mailing list
ogg-dev@xiph.org
http://lists.xiph.org/mailman/listinfo/ogg-dev

Silvia.Pfeiffer@csiro.au wrote:

>1) What are we trying to achieve with the "source-ID"?
>
>Interpretation 1:
>-----------------
>So, would each channel be in a separate OggPCM logical bitstream and interleaving would happen at the ogg level? If that is the case, then this is unnecessary since every logical bitstream in Ogg framing already gets attributed a different serial number in the page header.
>
>
>Interpretation 2:
>-----------------
>Or are we trying to describe how the data of the different channels are interleaved with each other inside the one flat logical bitstream? Then I think this one field should *not* provide an ID but rather an interleaving description. Since usually channels are just consecutively ordered (if I'm not mistaken), this information may not be necessary at all.
>
>

Basically I was trying to provide a method where a logical bitstream could contain only a subset of the total number of channels of the source. For instance, a 5.1 signal could be broken up into 3 logical bitstreams (a high fidelity stereo pair, three channels of CD quality audio, and an LFE channel, all with different sampling parameters). The source ID would be used to remux them.

This is similar to the page serial number, I know, but I don't think that's sufficient if data from the same source is spread across multiple logical streams, and you want to support multiple sources in the same overall stream. A source is considered to be a group of correlated channels, for instance a movie player app, a music player app, or a voice chat app. This field isn't needed if you require a stream to consist of data only from a single source, or you rely on cooperative applications to agree on a page serial number scheme. I'm not convinced it's needed, but the discussion should be had.

I know that splitting a source up across multiple logical streams is ugly, but I can't think of any clean way to provide multiple sampling parameters within a single stream. In most cases, all channels will have the same sample parameters, so they will all be in a single logical stream. I really think that requiring fixed sample parameters per logical stream is a smart constraint to make.

>2) I don't understand what the "channel block" is for?
>What is it trying to describe?
>
>

Mostly doing something useful with extra bits :) It was a way to have more than 16 channels per source, since I'm sure some crazy person thinks that's not enough :) I just proposed banking them, so you'd have 256 banks of 16. Think of it as the 8 most significant bits of a 12 bit channel id.

>3) I also still don't understand why packets require an extra landmark (i.e. the data packet header).
>I don't buy into Arc's argument that this framing is necessary to keep space for potential future additional header fields. No other uncompressed audio format requires extra framing and header information for the samples (AFAIK), so I cannot see why future additional header fields would need to be added. It should be made clear in the bos page how many samples go into a packet, and thus packet framing is unnecessary. This field is just complicating decoding through an unnecessary extra parsing step IMHO.
>
>

I do think it's important that there be some mechanism to put extra pages between the page 0 header packet and the first data packet. That could be done by adding a field indicating the number of header packets, or as is currently proposed. These potential header packets could contain things like the CDDB ID, ID3 tag like info, etc. It's good future proofing.

Note that if an application wants to use this data like an array in memory, it probably has to memcpy it to its own allocated buffer before accessing it more than a byte at a time. This is because, AFAICT, libogg makes no guarantee about the alignment of the data pointed to by the ogg_packet.packet member, and there are architecture-specific requirements about unaligned multibyte accesses. If you're going to have to copy the buffer somewhere anyway, I don't think it's a huge burden to advance the pointer 4 bytes. Either way is fine with me though.
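
For readers following the thread, two points from this reply can be sketched in C. This is only an illustration under stated assumptions, not code from the proposal or from libogg: `pcm_packet` is a hypothetical stand-in for the `packet` and `bytes` members of `ogg_packet`, the 4-byte data packet header length is taken from the "advance the pointer 4 bytes" remark, and the samples are assumed to be 16-bit values already in host byte order. The function names `channel_id` and `copy_samples` are made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the two ogg_packet members used here;
   real code would include <ogg/ogg.h> and use ogg_packet directly. */
typedef struct {
    unsigned char *packet;  /* payload pointer, alignment not guaranteed */
    long           bytes;   /* payload length in bytes */
} pcm_packet;

/* Assumption: the proposed data packet header is 4 bytes long. */
#define DATA_HEADER_BYTES 4

/* "256 banks of 16": the 8-bit channel block forms the upper 8 bits of a
   12-bit channel id, the channel's position in its bank the lower 4 bits. */
unsigned channel_id(unsigned block, unsigned index_in_bank)
{
    return ((block & 0xFFu) << 4) | (index_in_bank & 0xFu);
}

/* Copy the samples that follow the data packet header into a freshly
   allocated buffer. malloc() returns storage suitably aligned for any
   object type, so the result can be indexed as int16_t.
   Returns NULL if the packet holds no complete sample or if allocation
   fails; the caller frees the result. */
int16_t *copy_samples(const pcm_packet *op, size_t *out_count)
{
    assert(op != NULL && out_count != NULL);
    *out_count = 0;
    if (op->bytes <= DATA_HEADER_BYTES)
        return NULL;

    size_t payload = (size_t)op->bytes - DATA_HEADER_BYTES;
    size_t count = payload / sizeof(int16_t);  /* ignore a trailing odd byte */
    if (count == 0)
        return NULL;

    int16_t *samples = malloc(count * sizeof(int16_t));
    if (samples == NULL)
        return NULL;

    /* Skipping the header is just a pointer offset on the source side. */
    memcpy(samples, op->packet + DATA_HEADER_BYTES, count * sizeof(int16_t));
    *out_count = count;
    return samples;
}
```

In this sketch, skipping the header costs one pointer offset inside a copy that has to happen anyway, which is the trade-off described above.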