thr3ads.net - Vorbis dev - [Vorbis-dev] Decoding for ambisonic Ogg audiob [Feb 2007]


Gregory Maxwell

2007-Feb-26 20:25 UTC

[Vorbis-dev] Decoding for ambisonic Ogg audiob

The prospect of people actually putting B-format audio (via the panner
or directly input) into Ogg/Vorbis brings an interesting challenge:
What do we do with the audio after decoding it?

The following sane options exist:

A) Simply output the B-format audio
B) Produce a downmix
  1) Mono
  2) Stereo Blumlein crossed pairs
  3) Stereo UHJ
  4) Binaural
C) Produce speaker feeds
  1) Fully generalizable speaker feed decoder
      (such as http://www.kokkinizita.net/linuxaudio/adec-pict.html)
  2) G-format (fixed decode for the 5.1 layout)

(A) is pretty much a no-brainer, and minus some polish on marking up
the channel mapping we pretty much already do it today.

I think that some form of downmix support will be an essential feature
in the libraries. Most users will not have software or systems which
are equipped to play B-format Vorbis files, at least initially. Anyone
distributing such files will have a hard time if the files refuse to
play at all for people.

Mono and simulated Blumlein are the simplest downmixes and could be
added with a very minimal amount of code. They are also the least
satisfactory.  I think mono output would be especially surprising to
the user and it probably shouldn't be considered as an automated
fallback unless we have no other choice.
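
To make the "very minimal amount of code" concrete, here is a rough sketch of both downmixes for horizontal B-format. It assumes the usual Furse-Malham convention (W carried 3 dB down, X pointing forward, Y pointing left); the function names are made up for illustration and are not part of any Xiph library.

```python
import math

def bformat_to_mono(w):
    """Mono downmix: W is the omnidirectional component, so it is used alone."""
    # Assumption: FuMa-style W, stored scaled by 1/sqrt(2), so undo that gain.
    return [math.sqrt(2.0) * s for s in w]

def bformat_to_blumlein(x, y):
    """Simulated Blumlein pair: figure-of-eight mics aimed at +45 and -45 degrees."""
    g = 1.0 / math.sqrt(2.0)  # cos(45 deg) == sin(45 deg)
    left = [g * (xs + ys) for xs, ys in zip(x, y)]   # Y is positive to the left
    right = [g * (xs - ys) for xs, ys in zip(x, y)]
    return left, right
```

Note that the Blumlein pair ignores W and Z entirely, which is part of why it is cheap.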

UHJ, Binaural, and actual speaker feeds would be preferable, but all
require some degree of filtering (for binaural, a full FIR engine and
a stack of HRTFs are needed). This raises the question of whether this
functionality belongs in the core library.  I think both a decent
G-format decoder and a decent UHJ decoder can be implemented with a
fairly simple set of IIRs and some linear combinations.
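
As a sketch of the "linear combinations" half of a two-channel UHJ encode, the snippet below uses the commonly published UHJ coefficients. The 90-degree phase shift (the j term) is what would need the IIR all-pass network in a real implementation; here, purely as an illustrative assumption, each channel is a complex phasor for a single frequency so that multiplying by 1j performs the shift. The function names are hypothetical.

```python
import math

def uhj_stereo_phasor(w, x, y):
    """Two-channel UHJ encode of one frequency component given as phasors."""
    s = 0.9396926 * w + 0.1855740 * x                           # sum signal
    d = 1j * (-0.3420201 * w + 0.5098604 * x) + 0.6554516 * y   # difference signal
    return (s + d) / 2.0, (s - d) / 2.0                         # (left, right)

def encode_fuma(azimuth_rad):
    """Horizontal FuMa B-format (W, X, Y) for a unit source at this azimuth."""
    return 1.0 / math.sqrt(2.0), math.cos(azimuth_rad), math.sin(azimuth_rad)
```

For example, `uhj_stereo_phasor(*encode_fuma(0.0))` gives left and right phasors of equal magnitude for a source dead ahead, differing only in phase.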

A full speaker decoder as well as a binaural decoder will require a
user-interface and can't really be done automagically. So I think they
should be dropped from consideration as compatibility features for the
core libraries.

Ideally an application using the library should be able to register
its ability to receive B-format, and if it hasn't it should receive a
downmix and be otherwise unaware that the file is a surround file.
Since non-surround capable playback software will almost certainly be
2ch only, this diminishes the usefulness of a built-in G-format
downmix.
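
A minimal sketch of that registration idea, assuming a wrapper object around the decoder. The class and method names are invented for illustration and do not correspond to any existing libvorbis API; a Blumlein-style fold-down stands in for whichever stereo downmix ends up being the default.

```python
import math

class SurroundAwareDecoder:
    """Hypothetical wrapper: hands out B-format only to callers that asked for it."""

    def __init__(self):
        self.accepts_bformat = False  # legacy apps never touch this flag

    def register_bformat_support(self):
        self.accepts_bformat = True

    def read(self, w, x, y, z):
        """Return the list of channels for one decoded block."""
        if self.accepts_bformat:
            return [w, x, y, z]
        # Fallback: plain 2ch output, so the app never learns the file is surround.
        g = 1.0 / math.sqrt(2.0)
        left = [g * (a + b) for a, b in zip(x, y)]
        right = [g * (a - b) for a, b in zip(x, y)]
        return [left, right]
```

An unmodified stereo player would call `read()` and get two channels; a surround-capable one calls `register_bformat_support()` first and gets all four.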

So we're left with Blumlein or UHJ. As I mentioned above, I think UHJ
is probably a preferable default but it will take a little more code
to implement. It would probably be worthwhile to do some A/B tests
between the two to find out what listeners prefer with the
available recordings.

For multichannel-capable apps we will need something to do speaker
decodes. I think there is an opportunity here for an additional
library for this application. Perhaps Fons Adriaensen's decoder (I
linked to it above) might be available for conversion into a
lightweight library with a collection of speaker arrangement presets?

Anything involving fancy layouts and more than 8 speakers is probably
fine going with a jackified decoder, especially since such systems
will probably want to include things like room correction filters
(http://drc-fir.sourceforge.net/).

I have one other question on my mind: Should this be solved just
for Vorbis, or is there a clear place to put a more general solution
which will cover other Xiph codecs (FLAC, OggPCM)?

xiphmont@xiph.org

2007-Feb-27 07:24 UTC

[Vorbis-dev] Decoding for ambisonic Ogg audiob

On 2/26/07, Gregory Maxwell <gmaxwell@gmail.com> wrote:
> A) Simply output the B-format audio
> B) Produce a downmix
>   1) Mono.
>   2) Stereo  blumlein crossed pairs
>   3) Stereo UHJ
>   4) binaural
> C) Produce speaker feeds
>   1) Fully generalizable speaker feed decoder
>       (such as http://www.kokkinizita.net/linuxaudio/adec-pict.html)
>   2) G-format (fixed decode for the 5.1 layout)
>
> (A) is pretty much a no-brainer, and minus some polish on marking up
> the channel mapping we pretty much already do it today.
FWIW, vorbis 'Mapping 0' is what specifies the Dolby-style
multichannel output maps.  To say 'this doesn't support ambisonics
mappings' is only partially correct-- that's what I intended 'mapping
1' for.  There simply wasn't time to do any work on it before 1.0 as
the Vorbis project got discovered early and there was a push to
release the full 1.0 in a timely manner.
> I think that some form of downmix support will be an essential feature
> in the libraries. Most users will not have software or systems which
> are equipped to play B-format Vorbis files, at least initially. Anyone
> distributing such files will have a hard time if the files refuse to
> play at all for people.
There needs to be a distributed 'core' library for this, I agree.  It
shouldn't be part of libvorbis, but it would have to be part of the
distribution with libvorbis.
> A full speaker decoder as well as a binaural decoder will require a
> user-interface and can't really be done automagically. So I think they
> should be dropped from consideration as compatibility features for the
> core libraries.
I think the capability should be there, programmatically, even if a UI
for setting it up is not.

Monty

Paul Martin

2007-Feb-27 08:38 UTC

[Vorbis-dev] Decoding for ambisonic Ogg audiob

On Mon, Feb 26, 2007 at 11:24:59PM -0500, Gregory Maxwell wrote:
> C) Produce speaker feeds
>  1) Fully generalizable speaker feed decoder
>      (such as http://www.kokkinizita.net/linuxaudio/adec-pict.html)
Not yet released. :-(

-- 
Paul Martin <pm@zetnet.net> (work)
  <pm@nowster.zetnet.co.uk> (home)

Richard Lee

2007-Feb-27 17:32 UTC

[Vorbis-dev] Re: Decoding for ambisonic Ogg audiob

> The following sane options exist:
> A) Simply output the B-format audio
> B) Produce a downmix
>   1) Mono.
>   2) Stereo blumlein crossed pairs
>   3) Stereo UHJ
>   4) binaural
> C) Produce speaker feeds
>   1) Fully generalizable speaker feed decoder
>       (such as http://www.kokkinizita.net/linuxaudio/adec-pict.html)
>   2) G-format (fixed decode for the 5.1 layout)

C) Produce speaker feeds

As we are talking surround sound, this is the most important support that is
missing.

There are various levels of complexity in deriving speaker feeds from Ambisonic
B-format.

I've just updated the Wiki Ambisonic Decoding page to show the simplest
options which just need a panner.

The more sophisticated decodes are detailed in my "Ambisonic Surround
Decoder" and "SHELF FILTERS for Ambisonic Decoders"

I'm sorry these are so obfuscating, but as I say very plainly, these are
aimed at DSP gurus wanting to write software Ambisonic Decoders.

Just be thankful there are no glaring mistakes which will result in sub-optimal
decoders.

Generating a UHJ signal these days is only sensible to get a stereo output. But
there are other simpler good stereo decodes.

My recommendation is to implement the simple 4.0, 5.0 & 7.0 decoders
I've described on Wiki in the 'core' library now. I recommend the
4.0 with a very low level CF over a Pentagon for general release. But these are
all good for playing Ambisonic material on most surround systems.
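
To give a feel for how little such a decode needs, here is a generic first-order projection decode to a horizontal ring of speakers. This is only a sketch under stated assumptions: FuMa-style W, positive azimuth to the left, and a plain cardioid-like projection. It is not the specific set of coefficients from the Wiki page, and the names are made up for illustration.

```python
import math

def decode_to_ring(w, x, y, speaker_azimuths_deg):
    """Project one horizontal B-format sample onto each speaker direction."""
    n = len(speaker_azimuths_deg)
    feeds = []
    for az in speaker_azimuths_deg:
        a = math.radians(az)
        # sqrt(2) undoes the assumed 3 dB attenuation on W.
        feeds.append((math.sqrt(2.0) * w + x * math.cos(a) + y * math.sin(a)) / n)
    return feeds

# Square 4.0 layout: front-left, rear-left, rear-right, front-right.
SQUARE = [45.0, 135.0, -135.0, -45.0]
```

A source panned to front-left then comes out loudest in the front-left speaker and is silent in the diagonally opposite one.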

When we have Ambisonic gurus who are Ogg gurus, they can add the enhancements in
"SHELF FILTERS ..." and other more sophisticated speaker decoders.

Fons's decoder is the only software full Classic Ambisonic Decoder apart from
the secret Meridian stuff. It is an excellent starting point for a fully
programmable Ambisonic Speaker Decoder.

> Because these mapping 1 multichannel files will not play back on things
> like hardware players, is there any aspect of the Vorbis subset
> specification which you believe we should explore breaking, in effect
> defining a new Vorbis subset for ambisonic audio?

This is similar to what a stereo only player should do if faced with a
multi-channel file. If this was designed before multi-channel files were
invented, it probably wouldn't have ANY strategy to deal with it.

Present players like Windoz Media Player tend to mix anything they don't
understand into stereo with good and bad results.

Are you saying that existing Vorbis players don't have a similar strategy if
faced with "Mapping <>0"?

If they do have a sensible strategy, we can adopt

A) Simply output the B-format audio

and leave the Ambisonic Speaker Decode to the new "core" Ambisonic
library.

After all, if we intend to use Ambi technology to help compress surround sound
in films, it is not unreasonable to expect a player to install a new decoder to
take advantage of this.

The new decoder could output whatever Dolby speaker assignments apply to the
punter's surround system including Zillion.1

