Martin Leese
2015-Nov-30 20:57 UTC
[Vorbis] Proposal for Ambisonics format in vorbis comment.
"Gabriel I." wrote:
> Greetings,
>
> I apologize if I posted this in the wrong list, I wasn't sure where to post
> it, but seeing as the tags are called "vorbis comments" I thought vorbis,
> rather than ogg-dev, would be the right choice. (actually, I'm not even a
> developer anyway)

Hi Gabriel,

I doubt whether the Xiph community would promote a file format for
Ambisonics without first seeing whether it had the support of the
Ambisonic community *and* seeing it used in the wild. The Ambisonic
crowd all hang out on the sursound list,(1) so you should post your
proposal over there. (Links have been collected together at the end.)

However, there was a heated discussion in August/September 2008 on that
list about a new file format. Despite several hundred posts, no
consensus emerged. My guess is that nobody over there has the stomach
for another round (I know I don't). This also might explain why you
received no replies when you posted your proposal on the Ambisonics
list.(2) (So many lists.)

I have interspersed some comments below. If you do not understand
anything, feel free to e-mail me off list.

> What I'd like to propose is a simple way to encode ambisonic files in vorbis
> comments as simple tags. By this I don't mean a single change to the format
> itself or the codec, but a simple "official" tag so that hopefully, in the
> future, we'll have decoders complying with it. Nobody ever wants to take
> ambisonic storage off the ground in an *universal* fashion because there's
> no standard in encoding the *channel orderings*, what *channels are
> present*, and the *normalization*, and people don't agree on one thing for
> some reason. (perhaps being stubborn)

I was a little surprised that you did not discuss the ".amb" format as
this is the official file format for Ambisonic B-Format.(3) This has
also been used in the wild for many years, particularly at
Ambisonia.com.(4) It uses Furse-Malham component ordering and MaxN
normalization.
(It also only works up to third-order.)

Note that Vorbis is lossy. Ambisonics is picky about low-frequency
phase and, as far as I know, nobody has checked the extent to which
this is preserved in Vorbis. This may or may not be a problem. (It is
obviously not a problem with lossless FLAC.)

Your proposal can utilize all 255 Vorbis channels, which is good; not
all proposals for Ambisonics in Vorbis have allowed this.

> My proposal is different because it solves all issues: it allows only
> Pantophonic (or planar/2D) signals if you wish, as probably most music and
> people will not even have a 3D system which includes height... at the same
> time you can specify a full 3D sphere encoding, or somewhere in between. The
> former is especially important because it needs far less number of channels
> and thus consumes far less amount of space and bandwidth, so instead the
> order of ambisonic or quality of the audio itself can be increased.

The "*.amb" format has the same property, so your proposal is not
*that* different.

> Note that this proposal is *infinitely* extensible to an arbitrary
> "ambisonic order", *and* it can specify the normalization. I haven't decided
> on the default normalization scheme, I'd like it to be N3D (why? well, just
> because? none is objectively superior but we have to agree on *something*
> for a standard) but it doesn't really matter as it can be specified.

The ".amb" format is limited to third-order and uses MaxN normalization
(with the exception of a -3dB correction factor for W).(5)
Unfortunately, the coefficients for the latter cannot be specified
algebraically above third-order (but they can be calculated
numerically). Dumping this normalization is therefore probably a good
thing.

> Basically, it uses the ACN channel ordering described here:
> http://ambisonics.ch/standards/channels/ (it is mathematically defined by
> the relationship l*(l+1) + m; where l is the mathematical degree, and m is
> the mathematical order).
> (note that in ambisonics jargon, the 'order' of
> ambisonics actually refers to the mathematical degree)

The ".amb" format uses Furse-Malham channel ordering.(6) This is
complicated and counter-intuitive, and was used only for compatibility
with (then) current practice. Dumping it is therefore probably a good
thing.

> However the filetypes are described here:
> http://ambisonics.ch/standards/filetypes/
>
> (Please note I have no affiliation with that site, I just found it and it is
> the best way to describe ambisonics material)
>
> This allows us to *uniquely* identify the channels used without wasting
> space on empty channels at all. Because you specify both the "degree" of the
> Pantophony and the "degree" of the height individually. The value (3,0)
> would thus mean "third order ambisonics pantophony" having channels
> 0,1,3,4,8,9,15 present with no height component at all because it is degree
> 0 for height, which means a 2D/Planar signal requiring *only* 7 channels
> instead of 16! Of course if you wanted a full-sphere 3D field, then you'd
> use (3,3) and get all 16 channels in the file. Lowering the second degree
> simply lowers the "order" or "resolution" of the height component.
>
> The important thing to remember is that by just these two values, the
> decoder knows *exactly* which channels are present and in what order,
> because they are defined precisely from it. No empty channels that waste
> space and bandwidth. Plus, the decoder is not confused as it knows exactly
> how and which channels and in what order they are present (there are only 7
> in the 2D case).

Your proposal for mixed order is the same as the ".amb" format. It has
the disadvantage that as a source leaves the horizontal, its sharpness
degrades rapidly to that of the height-order. An alternative scheme,
which does not have this problem, is "Complete mixed-order sets".(7)
However, I don't know of anybody who has experience with decoding such
sets.
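
The channel lists quoted above follow directly from the ACN relationship
l*(l+1) + m. The sketch below is only an illustration, not part of the
proposal or of any existing decoder: the function names are hypothetical,
and it assumes the (horizontal, height) pair is interpreted as in the
quoted examples, i.e. every component up to the height order is kept, and
above that only the two horizontal components m = -l and m = +l of each
degree.

```python
def acn(l, m):
    """ACN index for degree l and order m (with -l <= m <= l)."""
    return l * (l + 1) + m


def mixed_order_channels(horizontal, height):
    """Return the ACN indices present for a (horizontal, height) pair.

    Assumption: degrees up to `height` are complete; higher degrees up
    to `horizontal` contribute only the components m = -l and m = +l.
    """
    if not 0 <= height <= horizontal:
        raise ValueError("need 0 <= height <= horizontal")
    channels = []
    for l in range(horizontal + 1):
        if l <= height:
            orders = range(-l, l + 1)  # full set of 2l+1 components
        else:
            orders = (-l, l)           # horizontal-only components
        channels.extend(acn(l, m) for m in orders)
    return channels


if __name__ == "__main__":
    print(mixed_order_channels(3, 0))  # the planar example, 7 channels
    print(mixed_order_channels(3, 3))  # the full-sphere example, 16 channels
```

Under these assumptions (3,0) gives 0,1,3,4,8,9,15 and (3,3) gives 0
through 15, matching the channel lists in the quoted proposal.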
...

> The last thing to add is the normalization which I think can simply be added
> after a colon. Thus finally, my proposal would be to add tag like this as a
> vorbiscomment:
>
> AMBISONIC=(3,0):N3D
>
> The above tag defines a 2D planar file with "third order ambisonics" and no
> height at all, using the N3D normalization scheme. Thus, when a decoder sees
> this, it knows this file has 7 channels and they are ACN 0,1,3,4,8,9,15. The
> following tag:
>
> AMBISONIC=(3,3):SN3D
>
> defines a full-sphere 3D field using the SN3D normalization scheme. When the
> decoder sees it, it knows the file has 16 channels, them being ACN 0...15.
> (of course the decoder can refuse to decode if it cannot! that's beside the
> point!)
>
> Would acknowledging such a tag as official format be much trouble and to be
> added to the spec?

Adding new VorbisComment tags to the Vorbis spec does not happen
lightly; Xiph has an official policy of neglect with respect to tags.
Back in July 2009 Xiph *asked* me to survey what tags were being used
in the wild, and to propose additions to the Vorbis spec.(8) Even these
were not added. (Me, bitter and twisted? Never!)

> I simply want an *official* way to send this very simple information
> requiring no more than just two values and the normalization scheme and
> store it in a file. I already use this tag format on my things right now
> (unreleased because I need to know it is the best way) because I really want
> to take Ambisonics off the ground (even for music which is what I do). I
> want it officially because then decoders will hopefully be made to comply
> with it. Alone, I have no power to influence that, sadly, so I turned to
> you.
>
> I need your help here. This can work in FLAC too with vorbiscomments. Maybe
> other formats will follow if they see this take off. And if possible it
> should work on any other format that can specify tags, like Opus, I just
> need the official recognition.
> There is zero change in the codec itself or
> the format, it's just an officially recognized tag in a way declared in the
> spec, so decoders can know how to comply. Please if you do take this to
> heart, and decide to implement it, feel free to describe it in much better
> detail or technical terms as needed. I just wanted to explain it in an easy
> to understand manner.

Note that Native FLAC is limited to 8 channels. Obviously you could
include two or more FLAC streams in an Ogg container (Ogg FLAC), and so
have an unlimited number of channels. However, in this case, the
metadata should not be in the FLAC stream(s) but in the Ogg container.
There is no simple way to do this, but possibilities include name-value
pairs in an Ogg Skeleton stream and an XMLEmbedding stream.(9)

Finally, your proposal only considers Ambisonic B-Format. UHJ and
G-Format are also part of Ambisonics. There is an official file format
for UHJ,(10) and a proposal for G-Format.(11) UHJ could be accommodated
into your proposal quite simply using the VorbisComment
"AMBISONIC:UHJ". G-Format is a lot more complicated, and should
probably be ignored.

> If you have an alternative way to do this officially or a superior method
> (but this one proposed has *zero* shortcomings as far as storing ambisonic
> material is concerned that I'm aware of), please tell me so I will use it
> instead! Even if rejected, I will continue to use it just because I want to
> see it off the ground. I truly hope you'll consider it as an official tag
> format (I will encourage its use if so).
>
> Thank you for your time and once again I am sorry if I mailed this in the
> wrong section, as this isn't necessarily about the codec, but I did not know
> where to put it (because I'm not a developer).

Regards,
Martin

(1) https://mail.music.vt.edu/mailman/listinfo/sursound
(2) http://ambisonics.ch/mailman/listinfo/ambisonics
(3) http://members.tripod.com/martin_leese/Ambisonic/B-Format_file_format.html
(4) http://www.ambisonia.com/
(5) https://en.wikipedia.org/wiki/Ambisonic_data_exchange_formats#maxN
(6) https://en.wikipedia.org/wiki/Ambisonic_data_exchange_formats#Furse-Malham
(7) https://en.wikipedia.org/wiki/Mixed-order_Ambisonics#Complete_mixed-order_sets_.28.23H.23V.29
(8) https://wiki.xiph.org/Field_names
(9) https://wiki.xiph.org/Metadata#Ogg_Skeleton
(10) http://members.tripod.com/martin_leese/Ambisonic/UHJ_file_format.html
(11) http://members.tripod.com/martin_leese/Ambisonic/G-Format_chunk.html

--
Martin J Leese
E-mail: martin.leese stanfordalumni.org
Web: http://members.tripod.com/martin_leese/
Ian Malone
2015-Dec-01 00:26 UTC
[Vorbis] Proposal for Ambisonics format in vorbis comment.
On 30 November 2015 at 20:57, Martin Leese <martin.leese at stanfordalumni.org> wrote:
> "Gabriel I." wrote:
>
>> Greetings,
>>

Olá. Hope everyone is well, thought I'd interject.

>> I apologize if I posted this in the wrong list, I wasn't sure where to post
>> it, but seeing as the tags are called "vorbis comments" I thought vorbis,
>> rather than ogg-dev, would be the right choice. (actually, I'm not even a
>> developer anyway)
>
> Hi Gabriel,
>
> I doubt whether the Xiph community would
> promote a file format for Ambisonics without
> first seeing whether it had the support of the
> Ambisonic community *and* seeing it used in
> the wild. The Ambisonic crowd all hang out on
> the sursound list,(1) so you should post your
> proposal over there. (Links have been
> collected together at the end.) However, there
> was a heated discussion in August/September
> 2008 on that list about a new file format.
> Despite several hundred posts, no consensus
> emerged. My guess is that nobody over there
> has the stomach for another round (I know I
> don't). This also might explain why you
> received no replies when you posted your
> proposal on the Ambisonics list.(2) (So many
> lists.)
>
>>
>> Would acknowledging such a tag as official format be much trouble and to be
>> added to the spec?
>
> Adding new VorbisComment tags to the Vorbis
> spec does not happen lightly; Xiph has an
> official policy of neglect with respect to tags.
> Back in July 2009 Xiph *asked* me to survey
> what tags were being used in the wild, and to
> propose additions to the Vorbis spec.(8) Even
> these were not added. (Me, bitter and twisted?
> Never!)
>

While that list wasn't officially added, it does see use. One thing
that did get acceptance (to some degree) was the METADATA_BLOCK_PICTURE
(and I've seen others of the proposed list in use); this was probably
because there was interest from a developer in actually using it. I
think that's key.
Apple or Microsoft (and Google to some extent) can just put a new
feature in the next version of a product, and automatically most of
their user base is using it. For open formats it's a bit different. I
think the important thing for someone wanting to make this happen (not
knowing the ambisonics world) would be to find a developer behind a
commonly used system and get them interested.

As to the details, an Ogg Skeleton and an embedded metadata stream
(maybe XML, maybe binary) would be the purest way of doing it. But you
might find it's easier to sell a comment-based one. The overwhelming
explosion of bad unstructured metadata (like early MP3), which I think
probably led Xiph to be cautious about comment contents early on, never
really made it to Ogg and seems to have tailed off now that more media
comes through 'official' channels.

Either way, good test samples are very useful; being able to prepare
those is important.

I guess ambisonics is probably a relatively small community? Do you
need full hardware solutions or do people mainly drive hardware from
open software?

--
imalone
http://ibmalone.blogspot.co.uk
Martin Leese
2015-Dec-02 20:36 UTC
[Vorbis] Proposal for Ambisonics format in vorbis comment.
On 11/30/15, Martin Leese <martin.leese at stanfordalumni.org> wrote:
...
> UHJ could be accommodated
> into your proposal quite simply using the
> VorbisComment "AMBISONIC:UHJ".

I see a typo crept in. That should read "AMBISONIC=UHJ".

Regards,
Martin

--
Martin J Leese
E-mail: martin.leese stanfordalumni.org
Web: http://members.tripod.com/martin_leese/
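
For readers who want to experiment with the tag forms discussed in this
thread, here is a minimal parsing sketch. It is not an official or agreed
format: the AmbisonicTag type and parse_ambisonic_comment function are
hypothetical names, and the sketch assumes that the normalization suffix
is always present, that the height value may not exceed the horizontal
value, and that "AMBISONIC=UHJ" is the only non-B-Format value. Field
names are compared case-insensitively, as Vorbis comment field names are.

```python
import re
from typing import NamedTuple, Optional


class AmbisonicTag(NamedTuple):
    kind: str                         # "B-Format" or "UHJ"
    horizontal_order: Optional[int]   # None for UHJ
    height_order: Optional[int]       # None for UHJ
    normalization: Optional[str]      # e.g. "N3D" or "SN3D"; None for UHJ


# Matches values such as "(3,0):N3D" (assumed syntax, from the thread).
_VALUE = re.compile(r"\((\d+),\s*(\d+)\):(\w+)")


def parse_ambisonic_comment(comment):
    """Parse one comment string such as 'AMBISONIC=(3,0):N3D'."""
    name, sep, value = comment.partition("=")
    if not sep or name.strip().upper() != "AMBISONIC":
        raise ValueError("not an AMBISONIC comment: %r" % comment)
    value = value.strip()
    if value.upper() == "UHJ":
        return AmbisonicTag("UHJ", None, None, None)
    match = _VALUE.fullmatch(value)
    if match is None:
        raise ValueError("unrecognized AMBISONIC value: %r" % value)
    horizontal, height = int(match.group(1)), int(match.group(2))
    if height > horizontal:
        raise ValueError("height order exceeds horizontal order")
    return AmbisonicTag("B-Format", horizontal, height,
                        match.group(3).upper())


if __name__ == "__main__":
    print(parse_ambisonic_comment("AMBISONIC=(3,0):N3D"))
    print(parse_ambisonic_comment("AMBISONIC=UHJ"))
```

Because a Vorbis comment needs "=" between the field name and its value,
the uncorrected form "AMBISONIC:UHJ" is rejected by this sketch.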