The prospect of people actually putting B-format audio (via the panner or directly input) into Ogg/Vorbis brings an interesting challenge: what do we do with the audio after decoding it?

The following sane options exist:

A) Simply output the B-format audio
B) Produce a downmix
   1) Mono
   2) Stereo Blumlein crossed pairs
   3) Stereo UHJ
   4) Binaural
C) Produce speaker feeds
   1) Fully generalizable speaker feed decoder
      (such as http://www.kokkinizita.net/linuxaudio/adec-pict.html)
   2) G-format (fixed decode for the 5.1 layout)

(A) is pretty much a no-brainer, and minus some polish on marking up the channel mapping we pretty much already do it today.

I think that some form of downmix support will be an essential feature in the libraries. Most users will not have software or systems which are equipped to play B-format Vorbis files, at least initially. Anyone distributing such files will have a hard time if the files refuse to play at all for people.

Mono and simulated Blumlein are the simplest downmixes and could be added with a very minimal amount of code. They are also the least satisfactory. I think mono output would be especially surprising to the user, and it probably shouldn't be considered as an automated fallback unless we have no other choice.

UHJ, binaural, and actual speaker feeds would be preferable, but all require some degree of filtering (for binaural, a full FIR engine and a stack of HRTFs are needed). This raises the question of whether this functionality belongs in the core library.

I think both a decent G-format decoder and a decent UHJ decoder can be implemented with a fairly simple set of IIRs and some linear combinations.

A full speaker decoder as well as a binaural decoder will require a user interface and can't really be done automagically. So I think they should be dropped from consideration as compatibility features for the core libraries.
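For reference, the two simplest downmixes mentioned above (mono and simulated Blumlein) can be sketched in a few lines. This is only an illustration, not library code: it assumes traditional first-order B-format with W carried 3 dB down, X pointing forward and Y pointing left, and azimuth measured anticlockwise from front. The function names are made up for the example.

```python
import math

SQRT2 = math.sqrt(2.0)

def encode_bformat(sample, azimuth_deg):
    """Pan a mono sample into horizontal B-format (W, X, Y, Z).

    Assumed convention: W is scaled by 1/sqrt(2), X points front,
    Y points left, azimuth is anticlockwise from front.
    """
    az = math.radians(azimuth_deg)
    return (sample / SQRT2, sample * math.cos(az), sample * math.sin(az), 0.0)

def downmix_mono(w, x, y, z):
    """Mono downmix: the omnidirectional W channel, with its -3 dB undone."""
    return SQRT2 * w

def downmix_blumlein(w, x, y, z):
    """Simulated Blumlein pair: two figure-of-eight pickups at +/-45 degrees.

    A figure-of-eight aimed at azimuth a is X*cos(a) + Y*sin(a), so the
    pair reduces to sum and difference of X and Y. W and Z are unused.
    """
    left = (x + y) / SQRT2
    right = (x - y) / SQRT2
    return left, right
```

With these conventions a source panned to 45 degrees left lands entirely in the left channel of the Blumlein pair, while a source directly behind comes out with inverted polarity in both channels, which is one reason this downmix is among the least satisfactory options.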
Ideally an application using the library should be able to register its ability to receive B-format, and if it hasn't, it should receive a downmix and be otherwise unaware that the file is a surround file.

Since non-surround-capable playback software will almost certainly be 2ch only, this diminishes the usefulness of a built-in G-format downmix. So we're left with Blumlein or UHJ. As I mentioned above, I think UHJ is probably a preferable default, but it will take a little more code to implement. It would probably be worthwhile to do some A/B tests between the two to find out what listeners prefer with the available recordings.

For multichannel-capable apps we will need something to do speaker decodes. I think there is an opportunity here for an additional library for this application. Perhaps Fons Adriaensen's decoder (I linked to it above) might be available for conversion into a light version library with a collection of speaker arrangement presets?

Anything involving fancy layouts and more than 8 speakers is probably fine going with a jackified decoder, especially since such systems will probably want to include things like room correction filters (http://drc-fir.sourceforge.net/).

I have one other question on my mind: should this be solved just for Vorbis, or is there a clear place to put a more general solution which will cover other Xiph codecs (FLAC, OggPCM)?
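The "register or receive a downmix" behaviour described at the start of this message could look roughly like the sketch below. Everything in it is hypothetical: `PlaybackSession`, `register_bformat_support` and `deliver` are invented for illustration and do not correspond to anything in libvorbis or any other Xiph library. The fallback downmix shown is the simple Blumlein one, standing in for whichever default is eventually chosen.

```python
import math

def blumlein(frame):
    """Fallback stereo downmix of one (W, X, Y, Z) frame (assumed FuMa-style axes)."""
    w, x, y, z = frame
    return ((x + y) / math.sqrt(2.0), (x - y) / math.sqrt(2.0))

class PlaybackSession:
    """Hypothetical decoder-side wrapper; not a real library API."""

    def __init__(self, downmix=blumlein):
        self.accepts_bformat = False  # legacy apps never touch this
        self.downmix = downmix

    def register_bformat_support(self):
        """Called only by applications that know how to handle B-format."""
        self.accepts_bformat = True

    def deliver(self, frames):
        """Return decoded frames in the form the application can handle."""
        if self.accepts_bformat:
            return [tuple(f) for f in frames]      # raw 4-channel B-format
        return [self.downmix(f) for f in frames]   # plain 2-channel audio
```

An application that never calls `register_bformat_support` simply sees two-channel frames and stays unaware that the file was a surround file.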
On 2/26/07, Gregory Maxwell <gmaxwell@gmail.com> wrote:
> A) Simply output the B-format audio
> B) Produce a downmix
>    1) Mono.
>    2) Stereo blumlein crossed pairs
>    3) Stereo UHJ
>    4) binaural
> C) Produce speaker feeds
>    1) Fully generalizable speaker feed decoder
>       (such as http://www.kokkinizita.net/linuxaudio/adec-pict.html)
>    2) G-format (fixed decode for the 5.1 layout)
>
> (A) is pretty much a no-brainer, and minus some polish on marking up
> the channel mapping we pretty much already do it today.

FWIW, Vorbis 'mapping 0' is what specifies the Dolby-style multichannel output maps. To say 'this doesn't support ambisonics mappings' is only partially correct-- that's what I intended 'mapping 1' for. There simply wasn't time to do any work on it before 1.0, as the Vorbis project got discovered early and there was a push to release the full 1.0 in a timely manner.

> I think that some form of downmix support will be an essential feature
> in the libraries. Most users will not have software or systems which
> are equipped to play B-format Vorbis files, at least initially. Anyone
> distributing such files will have a hard time if the files refuse to
> play at all for people.

There needs to be a distributed 'core' library for this, I agree. It shouldn't be part of libvorbis, but it would have to be part of the distribution with libvorbis.

> A full speaker decoder as well as a binaural decoder will require a
> user-interface and can't really be done automagically. So I think they
> should be dropped from consideration as compatibility features for the
> core libraries.

I think the capability should be there, programmatically, even if a UI for setting it up is not.

Monty
On Mon, Feb 26, 2007 at 11:24:59PM -0500, Gregory Maxwell wrote:
> C) Produce speaker feeds
>    1) Fully generalizable speaker feed decoder
>       (such as http://www.kokkinizita.net/linuxaudio/adec-pict.html)

Not yet released. :-(

-- 
Paul Martin <pm@zetnet.net> (work) <pm@nowster.zetnet.co.uk> (home)
> The following sane options exist:
> A) Simply output the B-format audio
> B) Produce a downmix
>    1) Mono.
>    2) Stereo blumlein crossed pairs
>    3) Stereo UHJ
>    4) binaural
> C) Produce speaker feeds
>    1) Fully generalizable speaker feed decoder
>       (such as http://www.kokkinizita.net/linuxaudio/adec-pict.html)
>    2) G-format (fixed decode for the 5.1 layout)

C) Produce speaker feeds

As we are talking surround sound, this is the most important support that is missing. There are various levels of complexity in deriving speaker feeds from Ambisonic B-format. I've just updated the Wiki Ambisonic Decoding page to show the simplest options, which just need a panner.

The more sophisticated decodes are detailed in my "Ambisonic Surround Decoder" and "SHELF FILTERS for Ambisonic Decoders". I'm sorry these are so obfuscating, but as I say very plainly, these are aimed at DSP gurus wanting to write software Ambisonic Decoders. Just be thankful there are no glaring mistakes which will result in sub-optimal decoders.

Generating a UHJ signal these days is only sensible to get a stereo output. But there are other, simpler good stereo decodes.

My recommendation is to implement the simple 4.0, 5.0 & 7.0 decoders I've described on Wiki in the 'core' library now. I recommend the 4.0 with a very low level CF over a Pentagon for general release. But these are all good for playing Ambisonic material on most surround systems.

When we have Ambisonic gurus who are Ogg gurus, they can add the enhancements in "SHELF FILTERS ..." and other more sophisticated speaker decoders.

Fons's decoder is the only software full Classic Ambisonic Decoder apart from the secret Meridian stuff. It is an excellent starting point for a fully programmable Ambisonic Speaker Decoder.

> Because these mapping 1 multichannel files will not play back on things
> like hardware players, is there any aspect of the Vorbis subset
> specification which you believe we should explore breaking, in effect
> defining a new Vorbis subset for ambisonic audio?
This is similar to what a stereo-only player should do if faced with a multi-channel file. If it was designed before multi-channel files were invented, it probably wouldn't have ANY strategy to deal with them. Present players like Windoz Media Player tend to mix anything they don't understand into stereo, with good and bad results.

Are you saying that existing Vorbis players don't have a similar strategy if faced with "Mapping <>0"?

If they do have a sensible strategy, we can adopt "A) Simply output the B-format audio" and leave the Ambisonic Speaker Decode to the new "core" Ambisonic library.

After all, if we intend to use Ambi technology to help compress surround sound in films, it is not unreasonable to expect a player to install a new decoder to take advantage of this. The new decoder could output whatever Dolby speaker assignments apply to the punter's surround system, including Zillion.1
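To give a feel for how little code a simple panner-style 4.0 decode needs, here is a rough sketch. It does not reproduce the coefficients from the Wiki page or the shelf-filter documents mentioned earlier in this message; it substitutes the most basic approach of pointing a virtual cardioid microphone at each speaker of a square layout. The speaker names, the layout angles, and the B-format convention (W carried 3 dB down, X front, Y left, azimuth anticlockwise) are assumptions made for the example.

```python
import math

# Assumed square layout: speaker azimuths in degrees, anticlockwise from front.
SQUARE_LAYOUT = {"LF": 45.0, "RF": -45.0, "LB": 135.0, "RB": -135.0}

def decode_virtual_cardioids(w, x, y, layout=SQUARE_LAYOUT):
    """Derive one feed per speaker from horizontal B-format (W, X, Y).

    Each feed is a virtual cardioid aimed at the speaker:
        0.5 * (sqrt(2)*W + X*cos(az) + Y*sin(az))
    No shelf filters or layout-specific optimisation are applied.
    """
    feeds = {}
    for name, az_deg in layout.items():
        az = math.radians(az_deg)
        feeds[name] = 0.5 * (math.sqrt(2.0) * w
                             + x * math.cos(az)
                             + y * math.sin(az))
    return feeds
```

A source encoded exactly at the left-front speaker position comes out at full level in that speaker, at half level in the two adjacent speakers, and not at all in the diagonally opposite one. A serious decoder would refine the gains and add the frequency-dependent shelf filtering described in the documents cited above.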