thr3ads.net - ogg dev - [Vorbis-dev] Ambisonics in Ogg Vorbis [Apr 2007]

Martin Leese

2007-Apr-14 21:24 UTC

[Vorbis-dev] Ambisonics in Ogg Vorbis

On 2/28/07, Ivo Emanuel Gonçalves <justivo@gmail.com> wrote:
> On 2/28/07, Ralph Giles <giles@xiph.org> wrote:
> > Well, there are todo pages at wiki.xiph.org, but I meant more in the
> > community folklore sense. My point is a roadmap doesn't help much unless
> > there are people committed to making things happen. That's been the
> > problem with a lot of this stuff, and why it's been so nice to see the
> > ambisonics work happening.
>
> The situation on Ambisonics is tricky, because it depends on someone
> coding a whole API for the different Xiph projects AND Monty being
> available to apply whatever changes are needed in Vorbis.

I have been giving some thought to how to
include Ambisonics in Ogg Vorbis.  There is a
question at the end, so please plough on.

As I understand it, all that is needed is some
machine parseable metadata to identify the
audio data as being Ambisonics.  The channel
coupling won't be optimal and the phase may
get a bit munged (Ambisonics is big on
low-frequency phase), but it will work.  And the
missing bits can then be worked on in Ghost
at people's leisure.

Now, Vorbis comments aren't intended for
machine parseable metadata, so the metadata
will need to go in the Ogg container as a
separate (chained) stream.  This scheme will
not only work for Ogg Vorbis, but for Ogg
<anything>.  There currently isn't a standard
for a metadata stream to go into Ogg, but
there is a draft standard at:
http://wiki.xiph.org/index.php/Metadata

According to this draft standard, all I need to
do is to invent some XML which includes the
required information, and we are away.

Now for the question; how much did I get wrong?

Many thanks,
Martin
-- 
Martin J Leese
E-mail: martin.leese@stanfordalumni.org
Web: http://members.tripod.com/martin_leese/

Ian Malone

2007-Apr-15 05:47 UTC

[Vorbis-dev] Re: [ogg-dev] Ambisonics in Ogg Vorbis

Martin Leese wrote:
> On 2/28/07, Ivo Emanuel Gonçalves <justivo@gmail.com> wrote:
>
>> On 2/28/07, Ralph Giles <giles@xiph.org> wrote:
>> > Well, there are todo pages at wiki.xiph.org, but I meant more in the
>> > community folklore sense. My point is a roadmap doesn't help much unless
>> > there are people committed to making things happen. That's been the
>> > problem with a lot of this stuff, and why it's been so nice to see the
>> > ambisonics work happening.
>>
>> The situation on Ambisonics is tricky, because it depends on someone
>> coding a whole API for the different Xiph projects AND Monty being
>> available to apply whatever changes are needed in Vorbis.
> 
> I have been giving some thought to how to
> include Ambisonics in Ogg Vorbis.  There is a
> question at the end, so please plough on.
> 
> As I understand it, all that is needed is some
> machine parseable metadata to identify the
> audio data as being Ambisonics.  The channel
> coupling won't be optimal and the phase may
> get a bit munged (Ambisonics is big on
> low-frequency phase), but it will work.  And the
> missing bits can then be worked on in Ghost
> at peoples' leisure.
> 
> Now, Vorbis comments aren't intended for
> machine parseable metadata, so the metadata
> will need to go in the Ogg container as a
> separate (chained) stream.  This scheme will
> not only work for Ogg Vorbis, but for Ogg
> <anything>.  There currently isn't a standard
> for a metadata stream to go into Ogg, but
> there is a draft standard at:
> http://wiki.xiph.org/index.php/Metadata
> 
> According to this draft standard, all I need to
> do is to invent some XML which includes the
> required information, and we are away.
> 
> Now for the question; how much did I get wrong?
> 
It depends what your aim is.  The mapping type
in the vorbis setup header is meant for
this[1],[2].  Of course a nonzero mapping type will
cause a lot of players to give up, but so will
including the XML stream.  I believe this is how
it was intended that multi-channel would be handled.

As you say using a separate metadata stream would
allow all codecs to use the same scheme, but the
codec would need to communicate this with the muxer
if it wanted to use knowledge of the mapping.
Vorbis and OggPCM have their own mapping information,
which also means that they can be put in containers
other than Ogg without losing the mapping.  (I think
FLAC does too.)

If you want to go the separate metadata route there's
the choice of metadata stream.  Skeleton[3],[4] is
already implemented in some places and typically
contains metadata relevant to stream decoding.  This
is mainly temporal information, but also, "allows for
attachment of message header fields given as name-
value pairs that contain some sort of protocol
messages about the logical bitstream, e.g. the screen
size for a video bitstream or the number of channels
for an audio bitstream."[5]

The metadata split that seems to be emerging is that
decode-related stuff goes in Skeleton and other
metadata (e.g. indexing) goes into CMML or
currently-non-existent XML streams[6],[7].

Without knowing what you need the metadata to record
(I assume it can be fairly strictly defined?) I'd say
of the two metadata approaches going the Skeleton route
is the easier task here.  It avoids needing to parse XML
and Skeleton is more strictly defined as being in the
right place for decode setup.
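
To make the name-value idea concrete, here is a minimal Python sketch of reading such message header fields. It assumes the fields arrive as CRLF-separated "Name: value" lines. The field names Audio-Channels, Ambisonic-Order and Channel-Order are hypothetical, invented only for this illustration; nothing in Skeleton or any Xiph draft defines them.

```python
def parse_message_headers(raw: bytes) -> dict:
    """Parse CRLF-separated "Name: value" fields into a dict.

    Field names are lower-cased so lookups are case-insensitive.
    """
    fields = {}
    for line in raw.decode("utf-8").split("\r\n"):
        if not line:
            continue  # skip the empty string after the final CRLF
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"malformed header line: {line!r}")
        fields[name.strip().lower()] = value.strip()
    return fields


# Hypothetical header block; only the layout is the point here.
EXAMPLE = (
    b"Content-Type: audio/vorbis\r\n"
    b"Audio-Channels: 4\r\n"
    b"Ambisonic-Order: 1\r\n"
    b"Channel-Order: WXYZ\r\n"
)

headers = parse_message_headers(EXAMPLE)
print(headers["ambisonic-order"], headers["channel-order"])
```

A decoder could check for such fields before deciding how to route the channels, without having to parse any XML.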

[1]<http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#id2510452>
[2]<http://lists.xiph.org/pipermail/vorbis-dev/2007-February/018697.html>.
[3]<http://wiki.xiph.org/index.php/Ogg_Skeleton>
[4]<http://annodex.net/TR/draft-pfeiffer-annodex-02.html#anchor8>
[5]<http://annodex.net/TR/draft-pfeiffer-annodex-02.html#anchor6>
[6] And vorbiscomments for the basic TITLE,
     ARTIST, etc. stuff.
[7] This is probably because: a) work has been done on
     Skeleton, b) it's more obvious what decode related
     information is needed and how it should be used.

-- 
imalone

Martin Leese

2007-Apr-17 13:30 UTC

[Vorbis-dev] Ambisonics in Ogg Vorbis

Paul Martin <pm@nowster.zetnet.co.uk> wrote:
> Getting stereo from Ambisonics is a very simple job -- just a matter of
> which matrix you multiply with. The crudest way would be to use L=(W-X)/2
> and R=(W+X)/2 and ignore all other channels.
Looks like a typo crept in.  The X-channel points
forward, so I suspect you meant Y which points
left.

As I explain at:
http://wiki.xiph.org/index.php/Ambisonics#Default_channel_conversions_from_B-Format
   "Starting from B-Format, it is possible to
    synthesize any mic response pointing in any
    direction. Hence, it is possible to synthesize
    all coincident stereo mic techniques."

What is suggested (with Y instead of X) will work,
but is just one possible mix from an infinite set.
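
As a rough illustration of synthesizing a mic response pointing in any direction, the Python sketch below builds a horizontal virtual microphone from W, X and Y and uses two of them as a stereo pair. It assumes the usual first-order conventions (W carries a 1/sqrt(2) gain, X points forward, Y points left); the function names and the `pattern` parameter are made up for this example.

```python
import math


def virtual_mic(w, x, y, azimuth_deg, pattern=0.5):
    """Return one sample from a horizontal virtual microphone.

    pattern: 1.0 = omni, 0.5 = cardioid, 0.0 = figure-of-eight.
    Assumes W has a 1/sqrt(2) gain, X points forward, Y points left.
    """
    az = math.radians(azimuth_deg)
    omni = math.sqrt(2.0) * w                  # undo the W gain
    figure8 = math.cos(az) * x + math.sin(az) * y
    return pattern * omni + (1.0 - pattern) * figure8


def stereo_pair(w, x, y, angle_deg=90.0, pattern=0.5):
    """Two virtual mics at +angle (left) and -angle (right)."""
    left = virtual_mic(w, x, y, +angle_deg, pattern)
    right = virtual_mic(w, x, y, -angle_deg, pattern)
    return left, right


# A unit source hard left: W = 1/sqrt(2), X = 0, Y = 1.
left, right = stereo_pair(1.0 / math.sqrt(2.0), 0.0, 1.0)
print(round(left, 6), round(right, 6))
```

With back-to-back cardioids at +-90 degrees the left output reduces to a W-plus-Y combination and the right to W-minus-Y. Changing `angle_deg` and `pattern` gives other members of the infinite set of mixes.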

Regards,
Martin
-- 
Martin J Leese
E-mail: martin.leese@stanfordalumni.org
Web: http://members.tripod.com/martin_leese/

Richard Lee

2007-Apr-21 00:05 UTC

[Vorbis-dev] Ambisonics in Ogg Vorbis

Gregory Maxwell gmaxwell@gmail.com wrote:
> It would be nice if someone with a working ambisonic playback rig
> would give me some feedback on the decodes of coupled encodes. :)

Has anyone done this; tried coupled encode / decoder of ambisonic signals?

I get the impression from this forum, including Monty's post, that no one
has tried ANY coupling on a multi-channel file.  And only point and lossless are
in present use for stereo.

Monty said 8 phase & 4 phase coupling were once used in the dim &
distant past (for stereo) but not anymore.  Does anyone know how much
improvement in packing efficiency they give?

After Monty's post, lossy coupling on Ambi files is VERBOTEN!


-- 
No virus found in this outgoing message.
Checked by AVG Free Edition. 
Version: 7.5.463 / Virus Database: 269.5.6/770 - Release Date: 20/04/07 18:43

Richard Lee

2007-Apr-22 03:03 UTC

[Vorbis-dev] Ambisonics in Ogg Vorbis

Paul Martin wrote:
> If you haven't got a true Ambisonic setup, you can get a flavour by
> using the following script to convert a 4 channel .amb to a 4 channel
> "square" speaker arrangement. (The latest SoX has a bug in the mixer
> function, which stops it taking 16 numbers.)
>
> sox -V -S $1 -c 4 -3 -r 48000 $2 \
> mixer \
> 0.3536,0.3536,0.3536,0.3536,\
> 0.1768,0.1768,-0.1768,-0.1768,\
> 0.1768,-0.1768,-0.1768,0.1768,\
> 0,0,0,0 \
> rabbit stat
>
> This then gives you a file with the channels in the format Left Front,
> Right Front, Left Rear, Right Rear.

This is a "cardioid" decode; fairly non-descript.  Used mainly for
large area performance.  If your material is meant to be used at home, an
"Energy" decoder is better and can be obtained by replacing +-0.1768
with +- 0.25 in the above script.
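
The substitution can be sketched in Python as follows. This only rewrites the 16 coefficients quoted above; it assumes the four groups of numbers correspond to the W, X, Y and Z inputs in that order, and the helper names are invented for the example.

```python
import math

# The 16 mixer coefficients from the quoted script, one row per
# group of four (assumed to be the W, X, Y and Z inputs in turn).
CARDIOID = [
    [0.3536, 0.3536, 0.3536, 0.3536],
    [0.1768, 0.1768, -0.1768, -0.1768],
    [0.1768, -0.1768, -0.1768, 0.1768],
    [0.0, 0.0, 0.0, 0.0],
]


def to_energy(matrix, old=0.1768, new=0.25):
    """Replace every +-old coefficient with +-new, keeping its sign."""
    return [
        [math.copysign(new, g) if math.isclose(abs(g), old) else g
         for g in row]
        for row in matrix
    ]


def mixer_argument(matrix):
    """Flatten the matrix into the comma-separated list the script uses."""
    return ",".join(f"{g:g}" for row in matrix for g in row)


ENERGY = to_energy(CARDIOID)
print(mixer_argument(ENERGY))
```

Only the directional coefficients change; the first group (0.3536) and the all-zero last group are left alone.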

"SHELF FILTERS for Ambisonic Decoders" from

www.ambisonia.net\Members\ricardo

Gregory Maxwell wrote:
> Gains for lossless packing were pretty high. Higher than for typical
> stereo files.. at least on the ambisonic files I tested.

From the nature of Ambisonic signals, I expect at least 2:1 gain if not more
for lossless multichannel coupling of Ambisonic B-format.



Richard Lee

2007-Apr-23 15:15 UTC

[Vorbis-dev] Ambisonics in Ogg Vorbis

> > sox -V -S $1 -c 4 -3 -r 48000 $2 \
> > mixer \
> > 0.3536,0.3536,0.3536,0.3536,\
> > 0.1768,0.1768,-0.1768,-0.1768,\
> > 0.1768,-0.1768,-0.1768,0.1768,\
> > 0,0,0,0 \
> > rabbit stat
> >
> > This then gives you a file with the channels in the format Left Front,
> > Right Front, Left Rear, Right Rear.
>
> This is a "cardioid" decode; fairly non-descript.  Used mainly for
> large area performance.  If your material is meant to be used at
> home, an "Energy" decoder is better and can be obtained by
> replacing +-0.1768 with +- 0.25 in the above script.

> Are you sure? Amb files store the W channel at 3dB down.

Please read 

"SHELF FILTERS for Ambisonic Decoders" from

www.ambisonia.net\Members\ricardo for a "simple" explanation.

The full theoretical treatment is in "General Metatheory ... " which
is referred to in the above paper.  Bear in mind that "General
Metatheory... " doesn't use WXYZ in the same way as conventional
Ambisonics.




Tuomo Latto

2007-Apr-23 15:34 UTC

[Vorbis-dev] Ambisonics in Ogg Vorbis

Richard Lee wrote:
> Please read
> "SHELF FILTERS for Ambisonic Decoders" from
> www.ambisonia.net\Members\ricardo for a "simple" explanation.

http://www.ambisonia.com/Members/ricardo
                     ^^^
seems to work better...


-- 
Tuomo

... Why is the alphabet in that order?  Is it because of that song?

Richard Lee

2007-Apr-24 16:59 UTC

[Vorbis-dev] Ambisonics in Ogg Vorbis

> I have done. How does that relate to my pointing out the fact that
> the W channel is stored 3dB down in an .amb file? Well, it changes
> the decode, doesn't it?

In "SHELF FILTERS ... ", I use WXYZ in its strict Ambisonic sense as
defined in

http://www.york.ac.uk/inst/mustech/3d_audio/secondor.html

Please use these strict definitions when talking about Ambi decodes.  I
don't use anything else.  If I do, I point it out clearly.

> By the way, that document boils down to putting a low-pass filter on
> the W channel.

No.  What my document says is that if you make these very simple changes to the
decoding (and proper Shelf filters are simple only if you are a DSP guru) you
will get better localisation.  The experimental evidence is in

"Localization in Horizontal-Only Ambisonic Systems" - Benjamin, Lee
& Heller, AES, Oct 2006, San Francisco

which is available from

www.ai.sri.com/ajh/ambisonics

Your decoder is OK but an Energy decoder is better and an Energy decoder with
Shelf filters is better still.



Richard Lee

2007-Apr-25 15:47 UTC

[Vorbis-dev] Ambisonics in Ogg Vorbis

Paul Martin wrote:
>> In "SHELF FILTERS ... ", I use WXYZ in its strict Ambisonic
>> sense as defined in
>> http://www.york.ac.uk/inst/mustech/3d_audio/secondor.html
>
> You're dancing round the question. The specification of an AMB file says
> that the W channel will be stored at a level of -3dB.
> http://www.ambisonia.com/Members/etienne/Members/mleese/file-format-for-b-format
> "The W channel is attenuated by -3 dB (1/sqrt(2)) for all orders.
> That is to say, a source at 45 degrees azimuth (zero elevation) would
> produce equal gains in W, X, and Y."
> This means that my decode *is* the "Energy" one.

You've just described an ENCODING issue.

Your DECODER
> 0.3536,0.3536,0.3536,0.3536,\
> 0.1768,0.1768,-0.1768,-0.1768,\
> 0.1768,-0.1768,-0.1768,0.1768,\
is, as Sebastian Olter points out, a "Cardioid" decoder.  It's OK but an
"Energy" decoder is better and an "Energy" decoder with
Shelf filters is better still.





Richard Lee

2007-Apr-26 16:53 UTC

[Vorbis-dev] Ambisonics in Ogg Vorbis

> It's interesting that Richard Lee's document has the note:
>  IMPORTANT CORRECTION
>  The rectangular decoders in the early 5mar06 edition is (sic) wrong.  Please destroy any copies of that document.

This referred to a slightly inaccurate version of the RECTANGULAR decoder which
was superseded by Aaron's exact solution.

It has no bearing on what we are discussing on Square decoders.
__________________
> If the W channel is stored in the AMB file at a reduced level, and I'm
> decoding from an AMB file, do you expect me to ignore the effect of that
> reduced level of the W channel when decoding?

YES.  You have to assume the AMB file is correct.  W is NOT "stored in the
AMB file at a reduced level".  The signal in the AMB file IS W.

The correct Rationalised Energy Square decoder is

LB = W' - 0.7X' + 0.7Y' etc.    (Eqn 4.2, Rationalised Square Decoder)

from "SHELF FILTERS ..."   or a scaled version of this.  This gives
best results if you do not use Shelf filters.
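
A minimal Python sketch of such a square decoder follows. Only the LB line is given above; the other three speakers are filled in by symmetry, which is an assumption (X positive to the front, Y positive to the left). The primes are treated here as plain input samples, and the function name and the `k` and `scale` parameters are invented for the example.

```python
def square_energy_decode(w, x, y, k=0.7, scale=1.0):
    """Decode one W/X/Y sample to four speakers on a square.

    LB = W - k*X + k*Y as in Eqn 4.2; LF, RF and RB are assumed
    to follow by flipping the signs of X (front/back) and Y
    (left/right). `scale` gives "a scaled version of this".
    """
    signs = {
        "LF": (+1, +1),
        "RF": (+1, -1),
        "LB": (-1, +1),
        "RB": (-1, -1),
    }
    return {
        name: scale * (w + k * sx * x + k * sy * y)
        for name, (sx, sy) in signs.items()
    }


# Feeding X alone shows the directional coefficient of each speaker.
coeffs = square_energy_decode(0.0, 1.0, 0.0, k=0.7071, scale=0.3536)
print({name: round(g, 4) for name, g in coeffs.items()})
```

Scaling by 0.3536 with k = 0.7071 gives directional coefficients of about 0.25, which matches the +-0.25 replacement value suggested earlier in the thread.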

I have asked Martin Leese to correct his

http://www.ambisonia.com/Members/etienne/Members/mleese/file-format-for-b-format

page to clarify all this.




xiphmont@xiph.org

2007-Apr-27 13:31 UTC

[Vorbis-dev] Ambisonics in Ogg Vorbis

Not meaning to distract from the more constructive conversation....

I assume, somewhere, someone has a compendium of recommended hardware
for ambisonics?  Something to sanity check/inform equipment selections
of someone building My First Ambisonics Rig?

I can roll my own of course, but I like to avoid mistakes others have
already made.

Monty

Richard Lee

2007-Apr-27 23:55 UTC

[Vorbis-dev] Re: Ambisonics in Ogg Vorbis

Dear Martin,

> http://www.ambisonia.com/Members/etienne/Members/mleese/file-format-for-b-format
> "The W channel is attenuated by -3 dB (1/sqrt(2)) for all orders.
> That is to say, a source at 45 degrees azimuth (zero elevation) would
> produce equal gains in W, X, and Y."

> As others have said, the available documentation is either opaque or uses
> conflicting notation.

Could you change the above statement on your page to say

"The W channel is a perfect omnidirectional microphone whose response is
-3dB with respect to the on-axis response of the X, Y and Z signals.  A source
at 45 degrees azimuth with zero elevation would produce equal signals in W, X
and Y."
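
The 45 degree case can be checked numerically. The Python sketch below uses the conventional first-order B-format encoding equations, with W given a 1/sqrt(2) gain; the function name is made up for the example.

```python
import math


def encode_b_format(azimuth_deg, elevation_deg=0.0, signal=1.0):
    """Encode a point source into first-order B-format (W, X, Y, Z).

    Azimuth is measured anticlockwise from the front, so +90 is left.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = signal / math.sqrt(2.0)                  # omni, -3 dB
    x = signal * math.cos(az) * math.cos(el)     # front-back
    y = signal * math.sin(az) * math.cos(el)     # left-right
    z = signal * math.sin(el)                    # up-down
    return w, x, y, z


w, x, y, z = encode_b_format(45.0)
print(round(w, 4), round(x, 4), round(y, 4), round(z, 4))
```

At 45 degrees azimuth and zero elevation, cos and sin are both 1/sqrt(2), so W, X and Y come out equal and Z is zero.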

The original statement implies that there is an "original" W which is
then attenuated to something which is not proper W in B-format.

This is causing lots of confusion in the Vorbis decoder discussions both on the
forum and in private.

Thanks
Richard





Martin Leese

2007-Apr-28 12:36 UTC

[Vorbis-dev] Ambisonics in Ogg Vorbis

"Gregory Maxwell" <gmaxwell@gmail.com> wrote:
> On 4/27/07, xiphmont@xiph.org <xiphmont@xiph.org> wrote:
> > (eg, the FAQ doesn't answer a few simple questions like 'should I be
> > using monopole speakers, or are bipoles better like in other
> > systems?')
>
> Yech. Other than a few (possibly crazy?) people, no one recommends
> dipole speaker for ambisonic playback... they don't work, at least
> mathematically.  There has been some argument based on a few people's
> experience that in some rooms they might work okay, but it's not
> entirely clear why.
>
> Small, full range, flat monitors, are probably best. Small, because
> big is the enemy of being able to place them correctly.   Consistent
> phase response is important. (i.e. reversing phase on a driver breaks
> it completely). Because the goal is to reconstruct the soundfield,
> consistent and predictable speaker behavior is useful.
Gregory's advice is sound.  The work that
suggested using dipole speakers used all
dipoles, front as well as back.  One pole of
each speaker faced the centre of the room and
the other the wall, on which was stuff to absorb
or disperse the reflections.  This is entirely
different from the use of dipoles in conventional
surround sound.

So, in brief, use monopoles as Gregory
suggests.  The key is that all speakers
cooperate to localise a sound, so all speakers
are equally important.  They must be phase
matched; the easiest way of doing this is to use
identical units.

As Gregory suggests, take care with wiring; if
one speaker phase is reversed then all is lost.
For the same reason, if you use different
power-amps, make sure all are non-inverting
(or all inverting).  Placement in the room can
also matter.  Left-right symmetry of the room is
more important than front-back.

When it works you will know.  The soundfield
will "gel" and your ears will relax.

Regards,
Martin
-- 
Martin J Leese
E-mail: martin.leese@stanfordalumni.org
Web: http://members.tripod.com/martin_leese/

Martin Leese

2007-Apr-28 21:07 UTC

[Vorbis-dev] Ambisonics in Ogg Vorbis

xiphmont at xiph.org wrote:
> ...
> So most people use monopoles?

Yes.

Regards,
Martin
-- 
Martin J Leese
E-mail: martin.leese@stanfordalumni.org
Web: http://members.tripod.com/martin_leese/
