I made a few updates to OggPCM2 http://wiki.xiph.org/index.php/OggPCM2
reflecting the latest discussions. Could everyone have a look at it and
see whether they agree? Otherwise, what do you feel should be changed?

Does anyone want to speak in support of chunked PCM?

For all those who, like me, are just tired of this mess, please express
yourselves in the new spec I created: OggPCM3
http://wiki.xiph.org/index.php/OggPCM3

Jean-Marc

P.S. So far, I think we have OggPCM2 5, OggPCM 0. Please vote for
OggPCM3! :-)

On Tuesday, 15 November 2005 at 11:21 +0100, Michael Smith wrote:
> On 11/15/05, Erik de Castro Lopo <mle+xiph@mega-nerd.com> wrote:
> > Hi all,
> >
> > The remaining issue to be decided for the OggPCM2 spec is the support
> > of chunked vs interleaved data.
>
> I think interleaved is the obvious choice - it's what most audio
> applications are used to dealing with, and it's usually what we need to
> feed to audio hardware in the end.
>
> While I accept that there are many good uses for chunked data, I think
> the transformation is trivial, particularly given certain
> characteristics of the Ogg container. Remember: if you read an Ogg
> stream into memory, the data is _already_ likely to be non-contiguous,
> due to Ogg's structure. It is trivial, and adds insignificant overhead,
> to de-interleave as you read it into a packet buffer.
>
> So chunking would force additional implementation complexity onto all
> implementations, while the benefits aren't obviously significant.
>
> Oh, and if it's not already obvious, I support this spec rather than
> Arc's.
>
> Mike
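(To make the "trivial to de-interleave" claim concrete, here is a minimal
C sketch of splitting an interleaved packet into per-channel buffers. The
function name and buffer layout are illustrative assumptions, not anything
taken from the OggPCM2 spec.)

#include <stddef.h>
#include <stdint.h>

/* Copy an interleaved 16-bit PCM packet into per-channel buffers in a
 * single pass - roughly the same work a reader already does when copying
 * the packet out of the Ogg stream, so the extra overhead is small. */
static void deinterleave16(const int16_t *packet, size_t frames,
                           unsigned channels, int16_t **out)
{
    for (size_t f = 0; f < frames; f++)
        for (unsigned c = 0; c < channels; c++)
            out[c][f] = packet[f * channels + c];
}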
Jean-Marc Valin wrote:
> I made a few updates to OggPCM2 http://wiki.xiph.org/index.php/OggPCM2
> reflecting the latest discussions. Could everyone have a look at it and
> see whether they agree? Otherwise, what do you feel should be changed?

You guys are probably off on some IRC channel somewhere discussing these
things, but... why 64 bits for the codec identifier? Shouldn't 32
("PCM ") be fine?

Why store N-bit samples in the most significant bits and not the least?
Doesn't that mean an application would likely need to shift everything
down again?

Pedantic: the sentence "Format IDs below 0x80000000 are reserved for use
by Xiph and all the ones above are allowed for application-specific
formats" leaves the use of 0x80000000 itself unspecified.

Rene.
Rene Herman wrote:
> Why store N-bit samples in the most significant bits and not the least?
> Doesn't that mean an application would likely need to shift everything
> down again?

One advantage of storing in the MSBs is that the relative value remains
correct when the sample is processed at the larger word size. For
instance, a signed 12-bit integer would use 0x400 to represent +50%
amplitude. By packing this value into the MSBs of a 16-bit word, you get
0x4000, which still represents +50% amplitude. This way any software
that can work on 16-bit samples will "do the right thing" on samples
with lower resolution.

One thing that should probably be added to the wiki is that the extra
bits should be set in a round-towards-zero fashion - i.e. 0 for positive
numbers, 1 for negative numbers. This is probably worth discussing:
should we do it as I propose here, or is plain truncation (always-zero
padding bits) a better way to go?

> Pedantic: the sentence "Format IDs below 0x80000000 are reserved for
> use by Xiph and all the ones above are allowed for
> application-specific formats" leaves the use of 0x80000000 itself
> unspecified.

Agreed. Perhaps: "Format IDs with the most significant bit cleared are
reserved for use by Xiph. Other formats are considered to be
application-specific, and MUST have this bit set." Objections?

John
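(A small C sketch of the packing rule John describes; pack12_msb is a
made-up name, and the round-towards-zero padding is his proposal, not yet
settled spec text.)

#include <stdint.h>

/* Pack a signed 12-bit sample into the MSBs of a 16-bit word, filling
 * the four padding bits round-towards-zero: zeros for non-negative
 * samples, ones for negative ones. */
static int16_t pack12_msb(int16_t s12)
{
    int16_t packed = (int16_t)((uint16_t)s12 << 4); /* sample in bits 15..4 */
    if (s12 < 0)
        packed |= 0x000F;  /* negative: pad with ones, pulling toward zero */
    return packed;         /* non-negative: padding stays zero */
}

/* Example: 0x400 (+50% amplitude at 12 bits) packs to 0x4000, which is
 * still +50% amplitude when read as a 16-bit sample. */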
> You guys are probably off on some IRC channel somewhere discussing
> these things, but... why 64 bits for the codec identifier? Shouldn't 32
> ("PCM ") be fine?

The only reason for having 64 bits is that most other Xiph codecs tend to
have identifiers of about that length. I don't think it causes a real
problem anyway.

Jean-Marc
On 2005-11-16, Jean-Marc Valin wrote:
> Otherwise, what do you feel should be changed?

One obvious thing that seems to be lacking is the granulepos mapping. As
suggested in the Ogg documentation, for audio a simple sampling frame
number ought to suffice, but I think the convention should still be
spelled out.

Secondly, I'd like to see the channel map fleshed out in more detail.
(Beware of the pet peeve...) IMO the mapping should cover at least the
channel assignments possible in WAVE files, the most common Ambisonic
ones, and perhaps some added channel interpretations like "surround"
which are commonly used but lacking in most file formats. (For example,
THX does not treat surround as a directional source, so the correct
semantics cannot be captured e.g. by WAVE files. Surprisingly, neither
can the fact that some pair of channels is Dolby Surround encoded, as
opposed to some form of vanilla stereo.)

(As a further idea prompted by ambisonic compatibility encodings, I'd
also like to explore the possibility of multiple tagging. For example,
Dolby Surround, Circle Surround, Logic 7 and ambisonic BHJ are all
designed to be stereo compatible so that a legacy decoder can play them
as-is. But if they are tagged as something besides normal stereo, such a
decoder will probably just ignore them. So, there's a case to be made
for overlapping, preferential tags: one telling the decoder that the
data *can* be played as stereo, another telling it that the data
*should* be interpreted as, say, BHJ, and so on. Object-minded folks can
think of this as type inheritance of a kind. But of course this is more
food-for-thought than must-have feature, since nobody else is doing
anything of the sort at the moment.)

> Does anyone want to speak in support of chunked PCM?

Actually I'd like to add a general point against it.

The chunked vs. interleaved question is an instance of the more general
problem of efficiently linearizing a multidimensional structure. We want
to do this so that typical access patterns (and in particular locality
of access) translate gracefully and efficiently. Thus we group primarily
by time (interleaving) when locality is by time (accessing a sample with
a given sampling time increases the odds that a sample with a nearby
sampling time is accessed soon), and primarily by channel (chunking)
when locality is by channel (accessing a channel makes it probable that
the same channel is accessed again); we also try to preserve the rough
order of access.

Ogg is primarily a streaming delivery application, so we usually access
Ogg data by ascending time. Ogg does not support nonlinear space
allocation or in-place modification, so editors, probably the most
important application in need of independently accessible channels, will
not be using it as an intermediate format in any case. We're also
talking about multichannel audio delivery, where the different channels
are best thought of as parts of a single multidimensional signal, not a
library-in-a-file collection of independent signals, so it can be argued
that the individual channels do not really make sense in isolation. In
this case access won't merely be localised in time; in fact the natural
access pattern for recorders, transmitters, players and even some
filters is a dense, temporally ascending scan over some interleaved
channel ordering.

If we think of Ogg as a line format, all this translates into lower
packetization latency and memory requirements (one buffer per
multichannel stream vs. one buffer per channel) for interleaved data; if
we think of Ogg as a file format, it translates into fewer seeks and
less framing overhead while streaming from disk. In most cases a chunked
layout has no countervailing benefits. Even interfaces which go with
separate channels aren't such a good reason to offer a chunking option,
because they were probably designed with some other application (like
interactive gaming, or offloading processing load onto a peripheral) in
mind, or might simply be badly engineered (just about anything from MS).

Furthermore, if we really encounter an application which would benefit
from grouping by channel (say, language variants of the same
soundtrack), that can already be accomplished via multiple logical
streams. In fact the multiplexing machinery is there for this precise
purpose: the packet structure is a deliberate tradeoff between the
temporal order always present in streaming files and the conflicting
interest in limiting latency, error propagation and buffer consumption,
brought on by parallelism, correlations and indivisibilities over
dimensions other than time. If the channels are so independent of each
other, or so internally cohesive, that chunking is justified, then they
ought to be independent enough for standalone use and for placement in
separate logical streams, or even separate files. Whatever
interdependencies they might have ought to be exposed to the consumer
via OggSkeleton or external metadata in any case. Thus whatever we want
to accomplish by chunking is probably better accomplished by the broader
Ogg framework, or by some mechanism besides Ogg altogether.

The only valid reason I can think of to chunk the data is bitrate
peeling: chunking means that entire chunks/packets can be skipped to
drop channels. But this clearly isn't the best way to go about peeling
because, as I said, audio channels tend to be tightly coupled. We don't
go from stereo to mono by cleaving off the right or left channel, but by
summing, and if we simply drop a surround channel, we'll also break any
multichannel panning law. Thus if we want to enable peeling, we have to
use things akin to mid/side coding (like the UHJ hierarchy) or joint
progressive coding over the entire set of channels (e.g. Vorbis's
progressive vector quantization), and only then reorder and chunk the
data. As a result this sort of thing will always be encoding dependent,
and it shouldn't be specified at a higher level of generalization, where
the machinery could end up being used for the wrong sort of encoding
(e.g. vanilla 5.1) and would impose its overheads (e.g. latency)
indiscriminately. Not surprisingly, this is how it's already done in
Ogg: at least Vorbis specifies that peeling is to be carried out by a
codec-specific peeler operating within packets. The considerations which
yielded this decision apply directly to an intermediate-level
abstraction like OggPCM (below Ogg multiplexing, but above a specific
PCM coding like 16-bit big-endian B-format), so I think incorporating a
chunking option here would really be a case of reinventing the wheel,
square.

(Newbie intro: I'm a 27-year-old Finnish math/CS student and coder, with
a long-term personal interest in both audio processing and external
memory algorithms, yet without an open source implementation background.
I joined the list after OggPCM was mentioned on sursound, so it's also
safe to assume I'm an ambisonic bigot.)
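(To illustrate the kind of coupled coding Sampo means, here is a toy
mid/side sketch in C: keep only "mid" and you still have a proper mono
downmix, unlike simply cleaving off the left or right channel. This is a
generic illustration, not anything specified by OggPCM.)

#include <stddef.h>

static void ms_encode(const float *left, const float *right,
                      float *mid, float *side, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        mid[i]  = 0.5f * (left[i] + right[i]);  /* mono-compatible sum */
        side[i] = 0.5f * (left[i] - right[i]);  /* peelable detail */
    }
}

static void ms_decode(const float *mid, const float *side,
                      float *left, float *right, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        left[i]  = mid[i] + side[i];
        right[i] = mid[i] - side[i];
    }
}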
--
Sampo Syreeni, aka decoy - mailto:decoy@iki.fi, tel:+358-50-5756111
student/math+cs/helsinki university, http://www.iki.fi/~decoy/front
openpgp: 050985C2/025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2
> One obvious thing that seems to be lacking is the granulepos mapping.
> As suggested in the Ogg documentation, for audio a simple sampling
> frame number ought to suffice, but I think the convention should still
> be spelled out.

I was under the (maybe wrong) impression that the Ogg spec already
covered everything that's needed for granulepos. If that's not the case,
please suggest some text.

> Secondly, I'd like to see the channel map fleshed out in more detail.
> (Beware of the pet peeve...) IMO the mapping should cover at least the
> channel assignments possible in WAVE files, the most common Ambisonic
> ones, and perhaps some added channel interpretations like "surround"
> which are commonly used but lacking in most file formats. (For example,
> THX does not treat surround as a directional source, so the correct
> semantics cannot be captured e.g. by WAVE files. Surprisingly, neither
> can the fact that some pair of channels is Dolby Surround encoded, as
> opposed to some form of vanilla stereo.)

You mean describing the enums for the "Channel Mapping Header", just like
we have for the format? Yes, this definitely needs to be done. My comment
about OggPCM2 being nearly done obviously didn't apply to the extra
headers (which can still be defined afterwards anyway). Some default
mappings may be useful too (e.g. by default, 2 channels is stereo and the
left channel is encoded first).

> (As a further idea prompted by ambisonic compatibility encodings, I'd
> also like to explore the possibility of multiple tagging. For example,
> Dolby Surround, Circle Surround, Logic 7 and ambisonic BHJ are all
> designed to be stereo compatible so that a legacy decoder can play them
> as-is. But if they are tagged as something besides normal stereo, such
> a decoder will probably just ignore them. So, there's a case to be made
> for overlapping, preferential tags: one telling the decoder that the
> data *can* be played as stereo, another telling it that the data
> *should* be interpreted as, say, BHJ, and so on. Object-minded folks
> can think of this as type inheritance of a kind. But of course this is
> more food-for-thought than must-have feature, since nobody else is
> doing anything of the sort at the moment.)

I would say that this can probably be handled by the "Channel Conversion
Header", don't you think? I was also wondering whether it would be a good
idea to actually suggest (as in "implementers SHOULD") certain default
mappings, for example when downmixing from stereo to mono and so on.

> > Does anyone want to speak in support of chunked PCM?
>
> Actually I'd like to add a general point against it.

Good :-)

Jean-Marc
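(For reference, the convention Sampo asks to have spelled out would
presumably be the usual audio mapping: the granulepos of a page is the
index of the last PCM sample frame it completes. A minimal sketch under
that assumption; this mirrors the Vorbis convention but has not been
written into the OggPCM2 wiki yet.)

#include <stdint.h>

/* Assumed mapping: granulepos counts PCM sample frames, so converting
 * to a timestamp is a single division by the sampling rate. */
static double granulepos_to_seconds(int64_t granulepos, uint32_t rate)
{
    return (double)granulepos / (double)rate;
}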
Sampo Syreeni wrote:
> Secondly, I'd like to see the channel map fleshed out in more detail.

Sampo, I'm the one who came up with this channel mapping. Let me flesh it
out a bit more this evening.

> (As a further idea prompted by ambisonic compatibility encodings, I'd
> also like to explore the possibility of multiple tagging. For example,
> Dolby Surround, Circle Surround, Logic 7 and ambisonic BHJ are all
> designed to be stereo compatible so that a legacy decoder can play them
> as-is. But if they are tagged as something besides normal stereo, such
> a decoder will probably just ignore them. So, there's a case to be made
> for overlapping, preferential tags: one telling the decoder that the
> data *can* be played as stereo, another telling it that the data
> *should* be interpreted as, say, BHJ, and so on. Object-minded folks
> can think of this as type inheritance of a kind. But of course this is
> more food-for-thought than must-have feature, since nobody else is
> doing anything of the sort at the moment.)

Doesn't the Channel Conversion Header fulfil this need? Maybe it needs a
bit more explanation and an example.

> > Does anyone want to speak in support of chunked PCM?
>
> Actually I'd like to add a general point against it.

Thanks for speaking up. Opinion noted.

Erik
--
+-----------------------------------------------------------+
  Erik de Castro Lopo
+-----------------------------------------------------------+
A Microsoft Certified System Engineer is to computing what a
MacDonalds Certified Food Specialist is to fine cuisine.
Sampo Syreeni wrote:
> Secondly, I'd like to see the channel map fleshed out in more detail.

Sampo, I did flesh out the wiki a *little* more. Is the intent clearer
now?

> (Beware of the pet peeve...)

What is that pet peeve?

> IMO the mapping should cover at least the channel assignments possible
> in WAVE files, the most common Ambisonic ones, and perhaps some added
> channel interpretations like "surround" which are commonly used but
> lacking in most file formats.

I haven't enumerated them all, but we should be able to without too much
trouble.

> (For example, THX does not treat surround as a directional source, so
> the correct semantics cannot be captured e.g. by WAVE files.

Do you have any more info about THX? I've searched the web and found
little of any worth.

> (As a further idea prompted by ambisonic compatibility encodings, I'd
> also like to explore the possibility of multiple tagging. For example,
> Dolby Surround, Circle Surround, Logic 7 and ambisonic BHJ are all
> designed to be stereo compatible so that a legacy decoder can play them
> as-is.

Does the Channel Conversion Header cover this?

Cheers,
Erik
--
+-----------------------------------------------------------+
  Erik de Castro Lopo
+-----------------------------------------------------------+
"The lusers I know are so clueless, that if they were dipped in clue
musk and dropped in the middle of a pack of horny clues, on clue prom
night during clue happy hour, they still couldn't get a clue."
  -- Michael Girdwood, in the monastery
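(A hypothetical sketch of the kind of channel-type enums a fleshed-out
Channel Mapping Header might define; all names and values here are
invented for illustration and would be superseded by the actual wiki
table.)

/* Hypothetical channel interpretations covering WAVE-style speaker
 * positions, a non-directional "surround" per the THX point, and
 * first-order Ambisonic B-format components. */
typedef enum {
    OGGPCM_CH_MONO     = 0,
    OGGPCM_CH_LEFT     = 1,
    OGGPCM_CH_RIGHT    = 2,
    OGGPCM_CH_CENTER   = 3,
    OGGPCM_CH_LFE      = 4,
    OGGPCM_CH_SURROUND = 5,   /* diffuse, non-directional surround */
    OGGPCM_CH_AMBI_W   = 6,   /* Ambisonic B-format W (omni) */
    OGGPCM_CH_AMBI_X   = 7,   /* B-format X */
    OGGPCM_CH_AMBI_Y   = 8,   /* B-format Y */
    OGGPCM_CH_AMBI_Z   = 9    /* B-format Z */
} oggpcm_channel_type;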