> Yesterday I finished writing the ambisonic pan filter for oggenc.

May I ask what this "pan filter" is?

I made some tentative suggestions for coupling Ambisonic B-format in a post "Vorbis Ambisonic coupling" on 4 Feb 07. I gather from the last monthly meeting that some of you, including Monty, had problems with the phase behaviour of B-format. Would anyone like a simpler explanation?

I'd like to make firmer recommendations for Ambisonic encoding in Vorbis. For this I need more info.

Is there a document which details the frequency banding strategy used by Vorbis?

Is it more usual to adopt eg "8 phase" at a lower frequency band before going to "4 phase"?

Do the terms "8 phase", "4 phase", "point" still make sense when we are coupling 4 channels?

Is there much advantage doing "4 phase" coupling above 18 kHz only, compared to "lossless" coupling of the 4 channels right up to 20 kHz?

Am I asking these questions in the right forum?

Richard Lee
> May I ask what this "pan filter" is?

It's an Ambisonic encoder which uses the Furse-Malham equations to convert WAVE-EX files (the format itself is a set of speaker feeds), or a .wav file produced by an A52 decoder (mplayer is preferable as it does not truncate the LFE channel), into a set of Ambisonic channels, up to 2nd order. It follows the .AMB file specification, so the channel mapping depends on the number of channels. It basically acts as a pan effect (assuming the Ambisonic stream is played back without decoding), which is why I call it that, so as not to confuse people who aren't involved in Ambisonics.

The patch applies to vorbis-tools-1.1.1. It includes an option parser, so the actual input setup can be programmed by the user and can be any multichannel source with up to 255 channels placed on a sphere. Some parts require discussion (such as the stereo downmix - currently it creates a stereo image that is completely flat, i.e. without any depth), so I finally decided to post it here. Patch and enjoy. If you run into trouble on Linux, try removing the #ifdefs around my reimplementation of strsep() for Windows(R).

[file "pan.c"]

/* Ambisonic encoder module for OggEnc
 * Copyright (C) 2007 Sebastian Olter
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License
 * as published by the Free Software Foundation; either version 2
 * of the License, or (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
 */

#include "pan.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
#ifdef WIN32
#include <windows.h>
#endif

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

/* Channel layouts from the .amb specification, indexed by channel count */
const char *ambichan[] = { NULL, "W", "WY", "WXY", "WXYZ", "WXYRS",
                           "WXYZRS", NULL, NULL, "WXYZRSTUV" };

#if defined(_WIN32)
/* Minimal strsep() replacement for Windows */
static char* strsep(char** haystack, const char *needle)
{
    const char *c;
    char *tmp, *p;

    if (haystack == NULL)
        return NULL;

    for (c = needle; *c != '\0'; c++) {
        p = *haystack;
        do
            p++;
        while ((*p != '\0') && (*p != *c));

        tmp = *haystack;
        if (*p != '\0') {
            *p = '\0';
            *haystack = p + 1;
        }
        else
            *haystack = NULL;
        return tmp;
    }
    return NULL;
}
#endif

typedef struct {
    int inch, outch, nbufs;
    float matrix[255][9]; /* inch x outch */
    audio_read_func real_reader;
    void *real_readdata;
    float **bufs;
} Pan;

/* Ambisonic panning gains (up to 2nd order) for a source at
 * azimuth A, elevation E (radians) */
static float furse_malham(float A, float E, char channel)
{
    switch (channel) {
    case 'W': return 1. / sqrt(2);
    case 'X': return cos(A) * cos(E);
    case 'Y': return sin(A) * cos(E);
    case 'Z': return sin(E);
    case 'R': return 1.5 * sin(E) * sin(E) - 0.5;
    case 'S': return cos(A) * sin(2 * E);
    case 'T': return sin(A) * sin(2 * E);
    case 'U': return cos(2 * A) * cos(E) * cos(E);
    case 'V': return sin(2 * A) * cos(E) * cos(E);
    default:  return 1.;
    }
}

/* Parse the input setup string ("angle[:elevation],...") and fill the
 * panning matrix */
static int setup_pan_coeffs(oe_enc_opt *opt, const char* channels_in_s)
{
    Pan* pan = opt->readdata;
    int inch = 0;
    float A, E;
    const char *outch;
    char *p, *channels_in = malloc(strlen(channels_in_s) + 1);

    strcpy(channels_in, channels_in_s);
    do {
        p = strsep(&channels_in, ",");
        if (strcmp(p, "lfe")) {
            A = strtod(p, NULL) * 2. * M_PI / 360.;
            strsep(&p, ":");
            E = (p) ? (strtod(p, NULL) * 2. * M_PI / 360.) : 0.;
            for (outch = ambichan[opt->channels];
                 outch < ambichan[opt->channels] + opt->channels; outch++)
                pan->matrix[inch][outch - ambichan[opt->channels]] =
                    furse_malham(A, E, *outch);
        }
        else {
            A = E = 0.;
            pan->matrix[inch][0] = 1.; /* LFE goes straight into W */
        }
        inch++;
    } while (channels_in);
    pan->inch = inch;
    pan->outch = opt->channels;
    return inch;
}

static long read_pan(void *data, float **buffer, int samples)
{
    Pan* pan = data;
    long in_samples = pan->real_reader(pan->real_readdata, pan->bufs, samples);
    int i, x, y;
    float W, Y; // for stereo downmix

    for (i = 0; i < in_samples; i++) {
        for (y = 0; y < pan->outch; y++) {
            buffer[y][i] = 0.;
            for (x = 0; x < pan->inch; x++)
                buffer[y][i] += pan->bufs[x][i] * pan->matrix[x][y] / pan->inch;
        }
        if (pan->outch == 2) {
            W = buffer[0][i];
            Y = buffer[1][i];
            buffer[0][i] = W + Y;
            buffer[1][i] = W - Y;
        }
    }
    return in_samples;
}

void setup_pan(oe_enc_opt *opt, const char* channels_in, int channels_out)
{
    char *chan;
    int i, inch = opt->channels;
    Pan *pan = calloc(1, sizeof(Pan));

    if (!ambichan[channels_out]) {
        fprintf(stderr, "No suitable Ambisonic configuration for %d channels. Panning disabled.\n", channels_out);
        return;
    }
    chan = malloc(strlen(ambichan[channels_out]) + 1);
    pan->real_reader = opt->read_samples;
    pan->real_readdata = opt->readdata;
    opt->read_samples = read_pan;
    opt->readdata = pan;

    if (!strcmp(channels_in, "itu5.1"))
        channels_in = "30,330,0,lfe,125,235";
    else if (!strcmp(channels_in, "5.1ac3"))
        channels_in = "30,330,125,235,0,lfe";

    opt->channels = channels_out;
    setup_pan_coeffs(opt, channels_in);

    pan->bufs = malloc(inch * sizeof(float*));
    for (i = 0; i < inch; i++)
        pan->bufs[i] = malloc(4096 * sizeof(float));
    pan->nbufs = inch;
    pan->inch = min(pan->inch, inch); /* Avoid reading from non-existing buffers */

    fprintf(stderr, "Enabling Ambisonic encoder %dch->", pan->inch);
    switch (channels_out) {
    case 1:  fprintf(stderr, "mono\n"); break;
    case 2:  fprintf(stderr, "stereo\n"); break;
    default: fprintf(stderr, "%s\n", ambichan[channels_out]); break;
    }
    free(chan);
}

void clear_pan(oe_enc_opt *opt)
{
    int i;
    Pan* pan = opt->readdata;

    opt->read_samples = pan->real_reader;
    opt->readdata = pan->real_readdata;
    for (i = 0; i < pan->nbufs; i++)
        free(pan->bufs[i]);
    free(pan->bufs);
    free(pan);
}
/*ends here*/

[file "pan.h"]

/* Ambisonic encoder module for OggEnc, header for pan.c */

#include "encode.h"

void setup_pan(oe_enc_opt *opt, const char* channels_in, int channels_out);
void clear_pan(oe_enc_opt *opt);
/*ends here*/

[PATCH for oggenc 1.0.2]

diff -b oggenc/audio.c downloads.xiph.org/vorbis-tools-1.1.1/oggenc/audio.c
389c389
< if(len!=16 && len!=18)
---
> if(len!=16 && len!=18 && len != 40) // 40 is wave-ex
415c415
< if(format.format == 1)
---
> if(format.format == 1 || format.format == -2) // -2 is wave-ex
diff -b oggenc/encode.c downloads.xiph.org/vorbis-tools-1.1.1/oggenc/encode.c
162,163d161
< struct ovectl_ratemanage2_arg ai;
< vorbis_encode_ctl(&vi, OV_ECTL_RATEMANAGE2_GET, &ai);
177a176,178
> struct ovectl_ratemanage2_arg ai;
> vorbis_encode_ctl(&vi, OV_ECTL_RATEMANAGE2_GET, &ai);
>
diff -b oggenc/encode.h downloads.xiph.org/vorbis-tools-1.1.1/oggenc/encode.h
87a88,91
> /* Ambisonics */
> char *channels_in;
> int channels_out;
>
diff -b oggenc/oggenc.c downloads.xiph.org/vorbis-tools-1.1.1/oggenc/oggenc.c
26c26
<
---
> #include "pan.h"
33a34,35
> {"sources",1,0,0},
> {"sinks",1,0,0},
82c84
< NULL, 0, -1,-1,-1,.3,-1,0, 0,0.f, 0};
---
> NULL, 0, -1,-1,-1,.3,-1,0, 0,0.f, NULL, 3, 0};
188a191
>
321a325,327
> if(opt.channels_in)
> setup_pan(&enc_opts, opt.channels_in, opt.channels_out);
>
340d345
<
357a363,364
> if(opt.channels_out > 0)
> clear_pan(&enc_opts);
359a367
>
431a440,456
> " Ambisonics:\n"
> " --sources=string Input setup:\n"
> "   <preset>\n"
> "   <angle1>[:elevation1][,angle2[:elevation2]]...\n"
> "   This option enables ambisonic encoding. Input channels are\n"
> "   mixed together to produce an ambisonic output, using\n"
> "   Furse-Malham equations. Input channels not listed here\n"
> "   are omitted. Currently two presets are available:\n"
> "   \"itu5.1\" and \"5.1ac3\". They mean WAVE-EX and AC3 channel\n"
> "   mapping, respectively. Use them with -q0 to encode\n"
> "   a ~140 kbps stream.\n"
> " --sinks=number Total number of Ambisonic channels to be produced. Number\n"
> "   of channels determines what setup we use, see .amb file\n"
> "   format specification. Currently 3, 4, 5, 6 and 9-channel\n"
> "   Ambisonic configurations are supported. 1 and 2 channels\n"
> "   enable downmix.\n"
> "\n"
592c617,631
< if(!strcmp(long_options[option_index].name, "managed")) {
---
> if(!strcmp(long_options[option_index].name, "sinks")) {
>   if(sscanf(optarg, "%d", &opt->channels_out) != 1) {
>     fprintf(stderr, _("WARNING: Bad number of output channels: \"%s\". Pan filter disabled.\n"), optarg);
>     opt->channels_out = 0;
>   }
>   else if(opt->channels_out < 1 || opt->channels_out == 7 || opt->channels_out == 8 || opt->channels_out > 9) {
>     fprintf(stderr, _("WARNING: no %s-channel Ambisonic setup. Assuming 2nd order, full sphere.\n"), optarg);
>     opt->channels_out = 9;
>   }
> }
> else if(!strcmp(long_options[option_index].name, "sources")) {
>   opt->channels_in = malloc(strlen(optarg) + 1);
>   strcpy(opt->channels_in, optarg);
> }
> else if(!strcmp(long_options[option_index].name, "managed")) {
Only in downloads.xiph.org/vorbis-tools-1.1.1/oggenc: pan.c
Only in downloads.xiph.org/vorbis-tools-1.1.1/oggenc: pan.h
/*ends here*/
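For reference, once the patch is applied, a typical invocation might look like this (file names are just examples):

  oggenc --sources=itu5.1 --sinks=4 -q0 movie-5.1.wav -o movie-ambi.ogg

This mixes the six WAVE-EX speaker feeds down to a 4-channel (WXYZ) first-order stream; --sinks=3 gives horizontal-only WXY, and --sources also accepts an explicit angle[:elevation] list, as described in the help text above.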
Martin:
> you haven't applied the Furse-Malham weightings.
> ...
> The Furse-Malham weightings were a recent addition to the ".amb"
> specification (January 24, 2007), so you may have missed it.

Yes, I definitely have to clear my Firefox cache ;) I simply didn't think about normalization - especially in the context of a lossy codec. Anyway, an updated version of the function (and patch) is attached. Could you look at these equations again?

I added 3rd order, just out of curiosity. 16 channels exceed mplayer's limits, so a good [GPL'd] Ambisonic decoder library would be useful. Does anybody know of (and better: support) such a library?

AFAIK, the correct way to mix B-Format down to stereo is UHJ. Again: is there a library to do this?

The patch is to be applied inside the oggenc/ source directory.

Attachments:
  oggenc.patch (3711 bytes): http://lists.xiph.org/pipermail/vorbis-dev/attachments/20070223/c5860ffa/oggenc.obj
  pan.h (196 bytes): http://lists.xiph.org/pipermail/vorbis-dev/attachments/20070223/c5860ffa/pan.obj
  pan.c (6478 bytes): http://lists.xiph.org/pipermail/vorbis-dev/attachments/20070223/c5860ffa/pan-0001.obj
> > May I ask what this "pan filter" is?
>
> It's an Ambisonic encoder which uses the Furse-Malham equations to convert
> WAVE-EX files (the format itself is a set of speaker feeds), or a .wav file
> produced by an A52 decoder (mplayer is preferable as it does not truncate
> the LFE channel), into a set of Ambisonic channels, up to 2nd order.

Wow! What you've got is a 2nd order Ambisonic panner, or B-format encoder, which translates any source (speaker) position into its proper Ambisonic representation.

What this means is you can represent 5.1, 7.1 ... zillion.1 with just 3 channels for horizontal sound, or 4 channels for full sphere sound, and play it back with as few or as many speakers as the wife will allow in the room. All without losing specificity.

The extra channels with 2nd order B-format make the resulting directions more accurate if you have more speakers. But you can play back with fewer speakers and still get the best possible results from your 4 speakers. All directions are still represented, just not as pin-point as with more speakers.

Many thanks Sebastian. But I don't see where you encode all this into Vorbis. Is this still to be decided?

And am I asking the right people about frequency banding and phase? If I'm in the wrong place, please tell me to go forth & multiply.
"Sebastian Olter" <qduaty@gmail.com> wrote:> Martin: > > you haven't applied the Furse-Malham > > weightings. > > ... > > The Furse-Malham weightings were a recent > > addition to the ".amb" specification (January 24, > > 2007), so you may have missed it. > > Yes, I have definitely to clear my Firefox's cache;) I simply didn't > think about normalization - especially in the context of a lossy > codec. Anyway, an updated version of the function (and patch) is in > attachment. Could you look at these equations again?Looks good.> I added 3rd order, just for curiosity. 16 channels exceed mplayer's > limits, so a good [GPL'd] Ambisonic decoder library will be useful. > Does anybody know (and better: support) such a library?Don't know of such a library.> AFAIK, the correct way to mix down B-Format into stereo is UHJ.Yes, two-channel UHJ is stereo-compatible (and mono). Two-channel UHJ is produced from first-order horizontal B-Format (W,X,Y) using the following equations (where j is a +90 degree phase shift): S = 0.9396926*W + 0.1855740*X D = j(-0.3420201*W + 0.5098604*X) + 0.6554516*Y Left = (S + D)/2.0 Right = (S - D)/2.0> Again: > is there a library to do this?Not as far as I know. Regards, Martin -- Martin J Leese E-mail: martin.leese@stanfordalumni.org Web: http://members.tripod.com/martin_leese/
> > And am I asking the right people about frequency banding and phase? If
> > I'm in the wrong place, please tell me to go forth & multiply.
>
> This list is definitely correct if we talk about people who can improve the
> Vorbis codec, but the question is whether they want to. Vorbis' surround
> application has been standing still for at least five years; there were no
> common tools to easily encode/decode Ambisonics till now.

Thanks Sebastian. Does this mean there is an existing B-format to Vorbis encoder? And a Vorbis to B-format decoder?

How do I find what design decisions were made in this? Particularly the frequency banding and "phase" model used.

My interest is that the time is now ripe for Ambisonics to adopt a good compressed format. Vorbis is the obvious candidate. I was hoping to become a Vorbis guru and extend Vorbis to do Ambisonics properly. It's not as complicated as I originally envisaged. But I need more info.

Richard
wannabe Vorbis guru
Please excuse me for being dense, but

> > Does this mean there is an existing B-format to Vorbis encoder?
>
> Yes, I posted an initial version of the encoder here several days ago.

What you posted was an Ambisonic panner which allows any direction (speaker) to be coded into B-format. Does this mean there is a logical and tested path for a *.AMB file (the Ambisonic studio B-format file) to be encoded into a Vorbis stream?

> > And a Vorbis to B-format decoder?
>
> Current Vorbis Ambisonic streams do not need to be decoded into B-Format.

What I was asking is whether there is a logical and tested path for a Vorbis Ambisonic stream to recover the original B-format signals.

B-format is the definition of the soundfield, ie what it should all sound like. It makes no reference to speakers. Once you have B-format, you derive however many speaker signals you need depending on how many speakers you have. This is a separate process: the Speaker Decoder. If you think about it, this is how you need to do surround sound properly if you are interested in more than sound coming out of a zillion speakers.

A big advantage of Ambisonics is that you only need 3 channels to define a complete horizontal stage. Hence you can encode 5.1, 7.1 ... zillion.1 into 3 channels. With 4 channels, you can define sound coming from any direction on a sphere, including up & down.

Of course you take full advantage of Ambisonics only by encoding directly into B-format. Your Ambi panner allows this. Starting with an original 5.1 file means you are stuck with the faults of 5.1, eg side images are very poor.

> Vorbis started without any stereo channel coupling, and it worked and was
> much better than other lossy audio codecs. I believe (well, actually I have
> the opportunity to know it) the same applies to Ambisonics.

If we don't need the efficiency gains from coupling, that's good. But I believe that, from the nature of the Ambisonic signals, we will gain a lot; maybe 2x or more with "lossless coupling". What I'm trying to get a handle on is whether "lossy coupling" would be acceptable at HF (me to answer) and whether there would be much gain in efficiency (Vorbis guru to answer). This is why I'm asking the questions:

Is it more usual to adopt eg "8 phase" at a lower frequency band before going to "4 phase"?

Do the terms "8 phase", "4 phase", "point" still make sense when we are coupling 4 channels?

Is there much advantage doing "4 phase" coupling above 18 kHz only, compared to "lossless" coupling of the 4 channels right up to 20 kHz?

> So maybe you could think about how to use Vorbis mechanisms to do UHJ;
> perhaps it will take much less work than a regular multichannel coupling.
> UHJ is proven to work and it may be valuable to do listening tests with it.

UHJ is a stereo signal, a matrix system like Dolby Surround but more sophisticated. But when B-format is encoded into UHJ, a lot of information is lost. You can get quite good results if you have a clever Speaker Decoder, but no one would use a UHJ chain if a modern compressed format like Vorbis could take advantage of lossless coupling in the B-format signal to transmit accurate B-format. I speak as someone who was involved in UHJ trials more than 20 yrs ago.

IMHO, the only reason for UHJ today is to get a good stereo signal from B-format. But there are much simpler ways of getting a stereo signal, with advantages & disadvantages compared to UHJ.

However, if you do use UHJ for whatever reason (even just for stereo), it MUST be "losslessly coupled" over the whole frequency range.
The Speaker Decoder is accurately described in two documents, "Ambisonic Surround Decoder" and "SHELF FILTERS for Ambisonic Decoders", both available from www.ambisonicbootlegs.net\Members\ricardo

There are a lot of faulty equations for Ambisonic (Speaker) decoders on the www which will give sub-optimal results. I've just corrected one on http://en.wikipedia.org/wiki/Ambisonic_decoding

You can find the listening tests validating the above decoders at http://www.ai.sri.com/ajh/ambisonics
On 2/23/07, Martin Leese <martin.leese@stanfordalumni.org> wrote:
> "Sebastian Olter" <qduaty@gmail.com> wrote:
...
> > AFAIK, the correct way to mix B-Format down to stereo is UHJ.
>
> Yes, two-channel UHJ is stereo-compatible (and mono).
...
> > Again: is there a library to do this?
>
> Not as far as I know.

There is some help with converting B-Format to two-channel UHJ (and vice versa) at:

http://pcfarina.eng.unipr.it/Aurora/B-Format_to_UHJ.htm
http://pcfarina.eng.unipr.it/Aurora/conversion_between_uhj_and_b.htm

Also, rather than start from scratch, you might want to investigate convolverVST and BruteFIR. (You may already know of these.) I have never used them, so cannot help further.

Regards,
Martin
--
Martin J Leese
E-mail: martin.leese@stanfordalumni.org
Web: http://members.tripod.com/martin_leese/
2007/2/26, Richard Lee <ricardo@justnet.com.au>:
> Does this mean there is a logical and tested path for a *.AMB file (the
> Ambisonic studio B-format file) to be encoded into a Vorbis stream?

The most logical way (i.e. involving the simplest logic :P) is to map the B-Format channels 1:1. Vorbis is a rare case of a codec that allows up to 255 discrete channels to be encoded, and its official encoder, oggenc, needed just 2 lines of code to be tweaked in order to handle this. My patch includes this simple modification as well.

> What I was asking is whether there is a logical and tested path for a
> Vorbis Ambisonic stream to recover the original B-format signals.

The official Vorbis-to-wav decoder is called oggdec and it can handle multichannel files. Wav can be converted into .amb format with mctools.
Regarding Sebastian Olter's panner patch: is there any reason why the panner shouldn't have a default configuration matching the standard ITU layout, and only expose the fully flexible layout as an advanced option?
With some homework, I now add pseudo Vorbis guru to my Ambisonic & DSP pseudo guru hats. Looking at old Vorbis posts ...

> What's missing is tuning of the reference vorbis encoder to produce better
> quality/bitrate for 5.1 surround mixes by taking advantage of redundancy
> between the channels. Without that we're not technically competitive with AC3.

Yes. Good multichannel "lossless coupling" is essential. It is this which gives DD a 2:1 advantage over DTS and allows MLP to meet the DVD-A bitrate spec.

> Monty has also cited the lack of (not lossily compressed) 5.1 sources to
> test with as an obstacle here.

There is a large library of impressive uncompressed high quality Ambisonic recordings at www.ambisonicbootlegs.net

> However, most of the ambisonic literature is based on the original
> quadraphonic implementations (which were also patented) and this isn't
> enough channels to accurately represent the planar source arrangement most
> content is mixed for.

No. Ambisonics is definitely NOT based on quad. That's why it works (even with 4.0) when quad doesn't. In fact Ambi 4.0 works better than 5.1.

By separating the definition of the soundfield (B-format, where 3 channels is sufficient for horizontal and 4 for a sphere) from the speaker feeds (which depend on your wife), Ambisonics optimises both. The Speaker Decoder is based on where your speakers are and on ALL existing theories of Auditory Localisation (except for the Pinnae & HF ITD models). And they sound good too. This was an important objective.

Higher Order Ambisonics allows even more pinpoint localisation but, as Sebastian Olter has pointed out, simple 3 or 4 channel Ambisonics is already a lot better than the present naive 5.1 one-speaker-one-channel systems. But the really useful trick is that we can translate 5.1 to 3 channel B-format and, if played back on eg an Ambi 4.0 system, it might even sound better than the original.

> But Ambisonics already works =] I have some 130-140 kb/s movie soundtracks
> "recorded" ambisonically and encoded with q=0 and they sound quite good
> (much better than faac 5.1 at 250-300 kb/s); sometimes it's hard to
> distinguish between them and the original soundtrack. The "recording" takes
> place in the time domain so it can be done even in oggenc.
> Is there any reason why the panner shouldn't have a default configuration
> matching the standard ITU layout and only expose the fully flexible layout
> as an advanced option?

There are 2 important reasons for not doing this:

1) No one has an ITU layout at home. The only one I know of is in a big research lab. I discuss this in "Ambisonic Surround Decoder" under Real World Systems, www.ambisonicbootlegs.net\Members\ricardo

2) It doesn't give as good results as a simple 4.0 square. This is predicted by Ambisonic theory. That's why 5.1 can't put images at the sides. This doesn't just apply to Ambisonics on 5.1 but to anything played on an ITU 5.1 layout. Benjamin, Lee & Heller are currently testing this very proposition in the sequel to "Localization in Horizontal-Only Ambisonic Systems", www.ai.sri.com/ajh/ambisonics

Richard Elen, Ambi guru and one-time editor of Studio Sound, recently coded some Ambisonic square 4.0 speaker feeds onto a DTS-CD and went round to his friends with surround systems. He reports that my Real World System is very representative and results were excellent. Also that listeners instinctively moved forward so they were in the centre of the square. This is of course where the best sound is for square 4.0.

So from theory & practice, a default surround Speaker Decode should be simple 4.0 as in the Wiki Ambisonic Decoding page. Better Speaker Decoders for ITU style layouts will come, but in every case a regular layout will give better results if we apply the same effort.

Today, with 21st century digits, you can get reasonable stereo without identical speakers the same distance from the listener. But you ALWAYS get better results with matched equidistant speakers.
> I do understand that ambisonic encoding is more efficient, but there can
> definitely be more information in a 5.1 mix than in a 4 channel ambisonic
> mix. In particular, I don't see how you can localize audio across the
> (film) screen as accurately without including some l=2 modes.

5.1 is better than Ambi 4.0 (which is only 3 channels) ONLY for the 5 speaker directions. It is poor everywhere else, even for positions near the speakers. If you want to hear 5 speakers, use 5.1. If you want to hear good sound from all directions, try Ambisonics, even in its most primitive 4.0 guise.

If you want to use more channels, 5 channels will give you full horizontal 2nd Order Ambisonics and 7 channels, 3rd Order. The beauty of higher order Ambisonics is that it doesn't just make the speaker positions better, but ALL directions. But some of us may be more interested in using plain 4 channel 1st order to get height. Anyone know Mr Spielberg or Mr Lucas?

> See, here's where you start sounding like an audiophile instead of an engineer. :)

I'm an engineer. Auditory Localisation, approached correctly, is an engineering problem. The Mk1 Human Head is a flattened ovoid with 2 sensors slightly low and back, surrounded by irregular flaps. Inside is some processing which can take advantage of rotation & movement. You ask yourself what this CAN do to determine directions of sound. And compare this with research which goes back to Lord Rayleigh. Do a few experiments yourself.

It turns out that the Mk1HH can, and probably does, use a number of mechanisms. Ambisonics attempts to get as many of these mechanisms right and, where this is not possible, the "wrong" answer should be natural, ie it could possibly have come from a natural source. This makes for "good sound". One reason why quad (and 5.1) don't work is they rely on one or two of these mechanisms but mess up the others. So although you might get some strong localisation cues, the effect is fatiguing and unnatural cos other cues are contradictory.

OK. I'd better come clean. The reason why reducing 5.1 to Ambi 4.0 "sounds better" than the original is probably to do with Speaker Emphasis. This is a BIG fault of naive 5.1 systems. Hearing sound coming from particular speakers spoils the illusion. A good example is the opening sequence from "The Lion King" on a good THX system. IMHO, this has some of the best musical 5.1 sound ever. You get good envelopment and natural sound ..... UNTIL she starts singing. Then a noisy box (speaker) suddenly appears in front of you and you're back in your living room.

5.1 to Ambi 4.0 'blurs' the speaker positions slightly to give better results where there are no speakers, so the effect is more seamless. You probably don't want to do 5.1 into 5 channel 2nd order Ambi, cos that will only give a more accurate impression of 5 boxes in the African veldt. So an immediate saving which _might_ actually improve the illusion is to code 5.1 into WXY and 7.1 into WXYRS.

Of course 5.1 into 4.0 won't work in EVERY case, eg if it was important to have precise CF dialogue coming from a front box ALL the time. But I think it would improve most 5.1 films, especially if the producers have tried for good sound. And if you encoded the sound directly into 5 channel 2nd order or 7 channel 3rd order Ambisonics WITHOUT an intervening 5.1 or 7.1 stage, you would get better, more accurate sound EVERYWHERE.
Excuse the evangelising.

> Question is, does the ambisonic encoding create a more effective way of
> exploiting the redundancy?

Yes.

> If the channels are efficiently coupled the number of channels will not
> determine the size ... Whether ambisonics results in a more compressible
> representation is an unanswered question.

ANSWER: Yes.

You can think of B-format encoding like this (a simplified explanation, mostly true). The relative values of XYZ give you the direction cosines of the source. W is the amplitude. Some distance info is in the phase, cos this is what happens in real life too. For a single distant source, WXYZ are just scaled versions of each other. This obeys superposition, so WXYZ accurately encodes all zillion spherical directions.

We know Ambisonics is VERY compressible with formats which use sophisticated "lossless coupling". The important ones tested are Dolby Digital (from Eric Benjamin, one of its inventors) and Meridian Lossless Packing (the lossless compression in DVD-A, part-invented by Peter Craven of the original Ambisonic team).
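To see the "scaled versions of each other" point in numbers, here is a tiny stand-alone sketch (not from the patch; the azimuth is just an example) that reuses the same first-order equations as furse_malham() in the posted pan.c:

#include <stdio.h>
#include <math.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

int main(void)
{
    double A = 30.0 * M_PI / 180.0;  /* azimuth 30 degrees (front left) */
    double E = 0.0;                  /* elevation 0: horizontal plane   */

    /* Same first-order gains as furse_malham() in pan.c */
    double gW = 1.0 / sqrt(2.0);
    double gX = cos(A) * cos(E);
    double gY = sin(A) * cos(E);
    double gZ = sin(E);

    printf("W=%.4f  X=%.4f  Y=%.4f  Z=%.4f\n", gW, gX, gY, gZ);

    /* Every sample s of the source appears as W=gW*s, X=gX*s, Y=gY*s,
     * Z=gZ*s: the four channels differ only by these constants, which
     * is exactly the inter-channel redundancy a coupled encoder can
     * exploit.  Real material is a superposition of many such sources
     * (plus reverberation), so the correlation is high but not total. */
    return 0;
}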
> You misunderstand my post.
>
> The patch that I'm referring to allows you to take a set of 5.1 --
> zillion.1 speaker feeds as input, specify a speaker layout, then it uses an
> ambisonic panner to produce an ambisonic rendition of these feeds for
> encoding.

Sorry Gregory. I hope the post was useful anyway. To answer your real question:

Dolby Digital EX and DTS ES are 7.1 systems. I'm not sure how many films have 7.1 soundtracks. Anyone know? The bitstream or file may carry a channel mask which, on Windows, is likely to be the same as the WAVE-EX dwChannelMask, so the encoder could tell what the input speaker assignments were. http://dolby.com/consumer/home_entertainment/roomlayout.html shows that 7.1 has the extra speakers at +-150.

> > Yes. Good multichannel "lossless coupling" is essential. It is this which
> > gives DD a 2:1 advantage over DTS and allows MLP to meet the DVD-A
> > bitrate spec.
>
> Well, a '2:1 advantage' over anything involves a lot of presuppositions
> that involve getting some additional ducks in a row for the lossless
> coupling.

My information is from Eric, who is a High Priest of Dolby Digital and also involved with the specification of DVD-A, but I believe these are facts.

a) Firstly, ALL the formal listening tests have been at very low bitrates. Basically you lower the bitrate until you can hear a difference from the original; then that bitrate is your measure of efficiency.

b) On these tests, which were conducted by the EBU though Dolby was heavily involved, Dolby Digital shows about a 2:1 advantage over DTS on most films.

c) Encoders evolve. So DD today is MUCH better than 5 yrs ago, and you can see the same trend in eg MP3 and Vorbis.

d) At the bitrates used for DVD-V sound, Eric challenges anyone to tell the original from either DD or DTS; both are more than good enough.

e) Films often have little correlation between the fronts & backs, and centre dialogue is often quite independent.

f) Ambisonics, which has large inter-channel correlations, performs even better, cos good multi-channel "lossless coupling" exploits those correlations.

> I'd like to see Ambisonic be _the_ way to do surround, and if you want 5.1,
> you get to map it to/from Ambisonic representation

The Ambisonic faithful grovel at your feet oh Guru Monty ... 8>D
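P.S. To make the --sources syntax concrete for a 7.1 feed like the above, something along these lines could be passed to the posted panner. The string is only an illustration: the channel order must match the input file, the front and side-surround angles are borrowed from the itu5.1 preset, and the +-150 back surrounds come from the Dolby page above.

  oggenc --sources=30,330,0,lfe,125,235,150,210 --sinks=3 input-7.1.wav -o output.ogg

--sinks=3 gives horizontal-only WXY, which is all a flat 7.1 layout can excite anyway; sources with non-zero elevation would need --sinks=4 or more.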