Hello Tim and others,
Thanks for your help explaining this process on IRC. I wrote out a
first draft in the RFC xml format. I have attached the xml (labeled as
xml.txt so it will appear inline) and the rendered txt files. Please
let me know where I can make improvements. I will upload this draft to
the IETF datatracker and send it out to codec@ after addressing your
comments.
-------------- next part --------------
<?xml version="1.0" encoding="utf-8"?>
<!--
   Copyright (c) 2012-2016 Xiph.Org Foundation and contributors
   Redistribution and use in source and binary forms, with or without
   modification, are permitted provided that the following conditions
   are met:
   - Redistributions of source code must retain the above copyright
   notice, this list of conditions and the following disclaimer.
   - Redistributions in binary form must reproduce the above copyright
   notice, this list of conditions and the following disclaimer in the
   documentation and/or other materials provided with the distribution.
   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
   ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER
   OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
   EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
   PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
   PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
   LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
   NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
   SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
   Special permission is granted to remove the above copyright notice, list of
   conditions, and disclaimer when submitting this document, with or without
   modification, to the IETF.
-->
<!DOCTYPE rfc SYSTEM 'rfc2629.dtd' [
<!ENTITY rfc2119 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml'>
<!ENTITY rfc6716 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.6716.xml'>
<!ENTITY rfc7845 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.7845.xml'>
]>
<?rfc toc="yes" symrefs="yes" ?>
<rfc ipr="trust200902" category="std"
docName="draft-graczyk-opus-ambisonics"
 updates="7845">
<front>
<title abbrev="Opus Ambisonics">Ambisonics in an Ogg Opus
Container</title>
<author initials="M.G." surname="Graczyk"
fullname="Michael Graczyk">
<organization>Google Inc.</organization>
<address>
<postal>
<street>1600 Amphitheatre Parkway</street>
<city>Mountain View</city>
<region>CA</region>
<code>94043</code>
<country>USA</country>
</postal>
<email>mgraczyk at google.com</email>
</address>
</author>
<date day="24" month="May" year="2016"/>
<area>RAI</area>
<workgroup>codec</workgroup>
<abstract>
<t>
This document defines an extension to the Ogg format to encapsulate
 ambisonics coded using the Opus audio codec.
</t>
</abstract>
</front>
<middle>
<section anchor="intro" title="Introduction">
<t>
Ambisonics is a representation format for three dimensional sound fields which
 can be used for surround sound and immersive virtual reality playback.
See <xref target="gerzon75"/> and <xref
target="daniel04"/> for technical
 details on the ambisonics format.
For the purposes of this document, ambisonics can be considered a
 multichannel audio stream.
The Ogg format is a container which transmission and storage of audio coded
using the Opus codec.
See <xref target="RFC6716"/> and <xref
target="RFC7845"/> for technical details
 on the Opus codec and its encapsulation in the Ogg container respectively.
</t>
<t>
This document extends the Ogg format by defining a new channel mapping family
for
encoding ambisonics. The Ogg Opus format is extended indirectly by adding an
item with value 254 to the IANA "Opus Channel Mapping Families"
registry. When
254 is used as the Channel Mapping Family Number in an Ogg stream, the semantic
meaning of the channels in the multichannel Opus stream is the ambisonics layout
defined in this document.
</t>
</section>
<section anchor="terminology" title="Terminology">
<t>
The key words "MUST", "MUST NOT", "REQUIRED",
"SHALL", "SHALL NOT", "SHOULD",
 "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED",
"MAY", and "OPTIONAL" in this
 document are to be interpreted as described in <xref
target="RFC2119"/>.
</t>
</section>
<section anchor="ogg_extension" title="Ambisonics With Ogg
Opus">
<t>
Ambisonics MAY be encapsulated in the Ogg format by encoding with the Opus codec
and setting the Channel Mapping Family value to 254 in the Ogg Identification
Header. A demuxer implementation encountering Channel Mapping Family 254 SHOULD
interpret the Opus stream as containing ambisonics with the format described
in <xref target="channel_mapping"/>. 
</t>
<section anchor="channel_mapping" title="Channel Mapping
Family 254">
<t>
Allowed numbers of channels: (1 + l)^2 for l = 0...14. 
Explicitly 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225.
Ambisonics from zeroth to fourteenth order.
</t>
<t>
This channel mapping uses the same channel mapping table format used by channel
mapping families 1 and 255. Each output channel is assigned to an ambisonic
component in Ambisonic Channel Number (ACN) order. The ambisonic component with
degree n and ambisonic index m corresponds to channel (n * (n + 1) + m).
Channels are normalized with Schmidt Semi-Normalization (SN3D) with no
Condon-Shortley phase factor. In SN3D, the spherical harmonic of degree n and
index m is normalized according to
</t>
<figure align="center">
<artwork align="center"><![CDATA[
sqrt((2 - delta(m)) * ((n - |m|)! / (n + |m|)!)),
]]></artwork>
</figure>
<t>
where delta(0) = 1 and delta(m) = 0 otherwise.
</t>
<t>
The interpretation of the ambisonics signal as well as the channel order and
 normalization are described in <xref target="ambix"/>.
</t>
</section>
<section anchor="downmixing" title="Downmixing">
<t>
Implementations MAY use the matrix in Figure
 <xref target="stereo_downmix_matrix"
format="counter"/> to implement
 downmixing from multichannel files using Channel Mapping Family 254 (Section
 TODO), which is known to give acceptable results for stereo.
</t>
<figure anchor="stereo_downmix_matrix" title="Stereo
Downmixing Matrix" align="center">
<artwork align="center"><![CDATA[
/   \   /                  \ /  W  \
| L |   | 0.5  0.5 0.0 ... | |  Y  |
| R | = | 0.5 -0.5 0.0 ... | | ... |
\   /   \                  / \ ... /
]]></artwork>
</figure>
</section>
</section>
<section anchor="security" title="Security
Considerations">
<t>
Implementations of the Ogg container need to take appropriate security
 considerations into account, as outlined in Section 10 of <xref
target="RFC7845"/>.
The extension defined in this document requires that semantic meaning be
 assigned to more channels than the existing Ogg format requires.
Since more allocations will be required to encode and decode these semantically
 meaningful channels, care should be taken in any new allocation paths.
Implementations MUST NOT overrun their allocated memory nor read from
 uninitialized memory when managing the ambisonic channel mapping.
</t>
</section>
<section anchor="iana" title="IANA Considerations">
<t>
This document updates the IANA Media Types registry "Opus Channel Mapping
Families" to add a new assignment.
</t>
<texttable>
<ttcol>Value</ttcol><ttcol>Reference</ttcol>
<c>254</c><c>This Document <xref
target="channel_mapping"/></c>
</texttable>
</section>
<section anchor="Acknowledgments"
title="Acknowledgments">
<t>
Thanks to Timothy Terriberry and Marcin Gorzel for their guidance and
 valuable contributions to this document.
</t>
</section>
</middle>
<back>
<references title="Normative References">
 &rfc2119;
 &rfc6716;
 &rfc7845;
<reference anchor="ambix"
 target="http://iem.kug.ac.at/fileadmin/media/iem/projects/2011/ambisonics11_nachbar_zotter_sontacchi_deleflie.pdf">
  <front>
    <title>AMBIX - A SUGGESTED AMBISONICS FORMAT</title>
    <author initials="C." surname="Nachbar"
fullname="Christian Nachbar"/>
    <author initials="F." surname="Zotter"
fullname="Franz Zotter"/>
    <author initials="E." surname="Deleflie"
fullname="Etienne Deleflie"/>
    <author initials="A." surname="Sontacchi"
fullname="Alois Sontacchi"/>
    <date month="June" year="2011"/>
  </front>
</reference>
</references>
<references title="Informative References">
<reference anchor="gerzon75"
 target="http://www.michaelgerzonphotos.org.uk/articles/Ambisonics%201.pdf">
  <front>
    <title>Ambisonics. Part one: General system description</title>
    <author initials="M." surname="Gerzon"
fullname="Michael Gerzon"/>
    <date month="August" year="1975"/>
  </front>
</reference>
<reference anchor="daniel04"
 target="http://pcfarina.eng.unipr.it/Public/phd-thesis/aes116%20high-passed%20hoa.pdf">
  <front>
    <title>Further Study of Sound Field Coding with Higher Order
Ambisonics</title>
    <author initials="J." surname="Daniel"
fullname="Jérôme Daniel"/>
    <author initials="S." surname="Moreau"
fullname="Sébastien Moreau"/>
    <date month="May" year="2004"/>
  </front>
</reference>
</references>
</back>
</rfc>
-------------- next part --------------
codec                                                         M. Graczyk
Internet-Draft                                               Google Inc.
Updates: 7845 (if approved)                                 May 24, 2016
Intended status: Standards Track
Expires: November 25, 2016
                  Ambisonics in an Ogg Opus Container
                     draft-graczyk-opus-ambisonics
Abstract
   This document defines an extension to the Ogg format to encapsulate
   ambisonics coded using the Opus audio codec.
Status of This Memo
   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.
   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.
   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."
   This Internet-Draft will expire on November 25, 2016.
Graczyk                 Expires November 25, 2016               [Page 1]
Internet-Draft               Opus Ambisonics                    May 2016
Table of Contents
   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
   2.  Terminology . . . . . . . . . . . . . . . . . . . . . . . . .   2
   3.  Ambisonics With Ogg Opus  . . . . . . . . . . . . . . . . . .   2
     3.1.  Channel Mapping Family 254  . . . . . . . . . . . . . . .   3
     3.2.  Downmixing  . . . . . . . . . . . . . . . . . . . . . . .   3
   4.  Security Considerations . . . . . . . . . . . . . . . . . . .   3
   5.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .   4
   6.  Acknowledgments . . . . . . . . . . . . . . . . . . . . . . .   4
   7.  References  . . . . . . . . . . . . . . . . . . . . . . . . .   4
     7.1.  Normative References  . . . . . . . . . . . . . . . . . .   4
     7.2.  Informative References  . . . . . . . . . . . . . . . . .   4
   Author's Address  . . . . . . . . . . . . . . . . . . . . . . . .   5
1.  Introduction
   Ambisonics is a representation format for three dimensional sound
   fields which can be used for surround sound and immersive virtual
   reality playback.  See [gerzon75] and [daniel04] for technical
   details on the ambisonics format.  For the purposes of this
   document, ambisonics can be considered a multichannel audio stream.
   The Ogg format is a container which transmission and storage of audio
   coded using the Opus codec.  See [RFC6716] and [RFC7845] for
   technical details on the Opus codec and its encapsulation in the Ogg
   container respectively.
   This document extends the Ogg format by defining a new channel
   mapping family for encoding ambisonics.  The Ogg Opus format is
   extended indirectly by adding an item with value 254 to the IANA
   "Opus Channel Mapping Families" registry.  When 254 is used as the
   Channel Mapping Family Number in an Ogg stream, the semantic meaning
   of the channels in the multichannel Opus stream is the ambisonics
   layout defined in this document.
2.  Terminology
   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   [RFC2119].
3.  Ambisonics With Ogg Opus
   Ambisonics MAY be encapsulated in the Ogg format by encoding with the
   Opus codec and setting the Channel Mapping Family value to 254 in the
   Ogg Identification Header.  A demuxer implementation encountering
   Channel Mapping Family 254 SHOULD interpret the Opus stream as
   containing ambisonics with the format described in Section 3.1.
3.1.  Channel Mapping Family 254
   Allowed numbers of channels: (1 + l)^2 for l = 0...14.  Explicitly 1,
   4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225.
   Ambisonics from zeroth to fourteenth order.
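A minimal sketch of how a demuxer might check a channel count against the rule above and recover the ambisonic order. The names `MAX_ORDER` and `ambisonic_order` are illustrative assumptions, not part of this draft or of any Opus API. The sketch takes the formula (1 + l)^2 for l = 0...14 at face value, so a single-channel, zeroth-order stream is treated as valid.

```python
import math

MAX_ORDER = 14  # highest ambisonic order permitted by the channel-count rule


def ambisonic_order(channel_count):
    """Return the ambisonic order for a valid channel count, else None.

    A count is valid when it is a perfect square (order + 1)^2 with
    0 <= order <= MAX_ORDER.
    """
    if channel_count < 1:
        return None
    root = math.isqrt(channel_count)  # integer square root, no float rounding
    if root * root != channel_count:
        return None
    order = root - 1
    return order if order <= MAX_ORDER else None
```

For example, `ambisonic_order(16)` yields order 3, while counts such as 5 or 256 are rejected.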
   This channel mapping uses the same channel mapping table format used
   by channel mapping families 1 and 255.  Each output channel is
   assigned to an ambisonic component in Ambisonic Channel Number (ACN)
   order.  The ambisonic component with degree n and ambisonic index m
   corresponds to channel (n * (n + 1) + m).  Channels are normalized
   with Schmidt Semi-Normalization (SN3D) with no Condon-Shortley phase
   factor.  In SN3D, the spherical harmonic of degree n and index m is
   normalized according to
               sqrt((2 - delta(m)) * ((n - |m|)! / (n + |m|)!)),
   where delta(0) = 1 and delta(m) = 0 otherwise.
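The normalization factor can be evaluated directly, as in the sketch below. The function name `sn3d_factor` is hypothetical. The sketch assumes that the factorials are taken over the absolute value of m, as in the AmbiX paper, so that negative m gives a well-defined result; n and m are the two spherical harmonic indices named in the text above.

```python
import math


def sn3d_factor(n, m):
    """SN3D normalization factor for indices n and m, with |m| <= n."""
    if n < 0 or abs(m) > n:
        raise ValueError("require n >= 0 and |m| <= n")
    delta = 1 if m == 0 else 0  # delta(0) = 1, delta(m) = 0 otherwise
    a = abs(m)                  # assumption: factorials use |m|
    return math.sqrt((2 - delta) * math.factorial(n - a) / math.factorial(n + a))
```

Under this definition every component with n <= 1 has a factor of exactly 1, and the factor shrinks for higher |m| at a fixed n.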
   The interpretation of the ambisonics signal as well as the channel
   order and normalization are described in [ambix].
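The ACN channel ordering used in this section can be sketched in a few lines. The forward mapping is the formula n * (n + 1) + m from the text. The inverse is not stated in this section and is derived here from that formula, using the fact that the channels with first index n occupy indices n^2 through (n + 1)^2 - 1. Both function names are illustrative assumptions.

```python
import math


def acn_index(n, m):
    """Channel index for the component with indices n and m (|m| <= n)."""
    if n < 0 or abs(m) > n:
        raise ValueError("require n >= 0 and |m| <= n")
    return n * (n + 1) + m


def acn_to_n_m(k):
    """Recover (n, m) from channel index k (derived inverse of acn_index)."""
    if k < 0:
        raise ValueError("channel index must be non-negative")
    n = math.isqrt(k)       # floor(sqrt(k)), computed in integer arithmetic
    m = k - n * (n + 1)
    return n, m
```

A quick round trip such as `acn_to_n_m(acn_index(2, -2))` returns `(2, -2)`; index 4 is the first channel of the second-order group.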
3.2.  Downmixing
   Implementations MAY use the matrix in Figure 1 to implement
   downmixing from multichannel files using Channel Mapping Family 254
   (Section TODO), which is known to give acceptable results for stereo.
                   /   \   /                  \ /  W  \
                   | L |   | 0.5  0.5 0.0 ... | |  Y  |
                   | R | = | 0.5 -0.5 0.0 ... | | ... |
                   \   /   \                  / \ ... /
                    Figure 1: Stereo Downmixing Matrix
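Applying the matrix in Figure 1 amounts to a sum and a difference of the first two channels. The sketch below assumes that each frame is a sequence of samples in ACN order, so that W is channel 0 and Y is channel 1, and that a missing Y (a single-channel stream) counts as zero. The function name `stereo_downmix` is hypothetical.

```python
def stereo_downmix(frames):
    """Downmix ACN-ordered ambisonic frames to (left, right) pairs.

    Implements L = 0.5 * W + 0.5 * Y and R = 0.5 * W - 0.5 * Y; every
    other channel has a zero coefficient in the matrix and is ignored.
    """
    stereo = []
    for frame in frames:
        w = frame[0]
        y = frame[1] if len(frame) > 1 else 0.0  # assumption: no Y means 0
        stereo.append((0.5 * w + 0.5 * y, 0.5 * w - 0.5 * y))
    return stereo
```

A source panned fully to the left (Y equal to W) lands entirely in the left channel, and a centered source (Y equal to zero) is split evenly.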
4.  Security Considerations
   Implementations of the Ogg container need to take appropriate security
   considerations into account, as outlined in Section 10 of [RFC7845].
   The extension defined in this document requires that semantic
   meaning be assigned to more channels than the existing Ogg format
   requires.  Since more allocations will be required to encode and
   decode these semantically meaningful channels, care should be taken
   in any new allocation paths.  Implementations MUST NOT overrun their
   allocated memory nor read from uninitialized memory when managing the
   ambisonic channel mapping.
5.  IANA Considerations
   This document updates the IANA Media Types registry "Opus Channel
   Mapping Families" to add a new assignment.
                   +-------+---------------------------+
                   | Value | Reference                 |
                   +-------+---------------------------+
                   | 254   | This Document Section 3.1 |
                   +-------+---------------------------+
6.  Acknowledgments
   Thanks to Timothy Terriberry and Marcin Gorzel for their guidance and
   valuable contributions to this document.
7.  References
7.1.  Normative References
   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <http://www.rfc-editor.org/info/rfc2119>.
   [RFC6716]  Valin, JM., Vos, K., and T. Terriberry, "Definition of the
              Opus Audio Codec", RFC 6716, DOI 10.17487/RFC6716,
              September 2012, <http://www.rfc-editor.org/info/rfc6716>.
   [RFC7845]  Terriberry, T., Lee, R., and R. Giles, "Ogg Encapsulation
              for the Opus Audio Codec", RFC 7845, DOI 10.17487/RFC7845,
              April 2016, <http://www.rfc-editor.org/info/rfc7845>.
   [ambix]    Nachbar, C., Zotter, F., Deleflie, E., and A. Sontacchi,
              "AMBIX - A SUGGESTED AMBISONICS FORMAT", June 2011,
              <http://iem.kug.ac.at/fileadmin/media/iem/projects/2011/
              ambisonics11_nachbar_zotter_sontacchi_deleflie.pdf>.
7.2.  Informative References
   [gerzon75]
              Gerzon, M., "Ambisonics. Part one: General system
              description", August 1975,
              <http://www.michaelgerzonphotos.org.uk/articles/
              Ambisonics%201.pdf>.
   [daniel04]
              Daniel, J. and S. Moreau, "Further Study of Sound Field
              Coding with Higher Order Ambisonics", May 2004,
              <http://pcfarina.eng.unipr.it/Public/phd-thesis/
              aes116%20high-passed%20hoa.pdf>.
Author's Address
   Michael Graczyk
   Google Inc.
   1600 Amphitheatre Parkway
   Mountain View, CA  94043
   USA
   Email: mgraczyk at google.com
I'll need to have a closer look, but two quick comments I can already make:
1) The IETF draft should ask for ambisonics to be assigned mapping family 2,
not 254. The value of 254 is only in the source code right now because the
IETF has not given us 2 yet.
2) The name of your draft for the initial submission should be:
draft-graczyk-codec-ambisonics-00.txt
Only once it becomes a working group draft does it get the draft-ietf-
prefix. The -00 is the version.
I'll comment more once I've read it thoroughly.
Cheers,
Jean-Marc
On 05/26/2016 05:56 PM, Michael Graczyk wrote:
> Hello Tim and others,
>
> Thanks for your help explaining this process on IRC. I wrote out a
> first draft in the RFC xml format. I have attached the xml (labeled as
> xml.txt so it will appear inline) and the rendered txt files. Please
> let me know where I can make improvements. I will upload this draft to
> the IETF datatracker and send it out to codec@ after addressing your
> comments.
Hi Michael,
Here's some more minor comments below. As long as you address the two
comments from my previous email (254 -> 2 and the draft name), the draft
is good for submitting as initial version on the IETF website (even if
you don't address all the minor comments from this email). FYI, this is
the address for submitting a new draft:
https://datatracker.ietf.org/submit/
Introduction: "The Ogg format is a container which transmission and
storage of audio coded using the Opus codec."
I think I forgot a word in this sentence. Also, Ogg is a general
container and isn't just for Opus.
Section 3: "A demuxer implmentation encountering Channel Mapping Family
254 SHOULD interpret the Opus stream as containing ambisonics with the
format described in Section 3.1."
Aside from changing 254 to 2, the "SHOULD" should be a "MUST".
Section 3.1: "Allowed numbers of channels: (1 + l)^2 for l = 0...14"
Minor nit: unless the use of "l" is standard for talking about
ambisonics, I would suggest using "k" instead, since "l" is easy to
confuse with "1".
Section 3.1: If it's not too complicated, can you explain how the "m"
index is derived?
Cheers,
Jean-Marc
Hello Jean-Marc,
Thanks for the quick reply and comments.
On Thu, May 26, 2016 at 5:41 PM, Jean-Marc Valin <jmvalin at jmvalin.ca> wrote:
> Hi Michael,
>
> Here's some more minor comments below. As long as you address the two
> comments from my previous email (254 -> 2 and the draft name), the draft
> is good for submitting as initial version on the IETF website (even if
> you don't address all the minor comments from this email). FYI, this is
> the address for submitting a new draft:
> https://datatracker.ietf.org/submit/
Thanks, I have fixed these and will submit.
> Introduction: "The Ogg format is a container which transmission and
> storage of audio coded using the Opus codec."
I changed "which" to "for".
> I think I forgot a word in this sentence. Also, Ogg is a general
> container and isn't just for Opus.
Rephrased to "... of audio. It can be used to encapsulate streams
coded using the Opus codec."
> Section 3: "A demuxer implmentation encountering Channel Mapping Family
> 254 SHOULD interpret the Opus stream as containing ambisonics with the
> format described in Section 3.1."
>
> Aside from changing 254 to 2, the "SHOULD" should be a "MUST".
Changed to "MUST". I used SHOULD because that is what is in RFC7845
5.1.1.4. I reasoned that if an implementation were allowed to treat
mapping family 2 its own way (based on the language in 5.1.1.4), then I
shouldn't change that behavior. It makes sense to make this strict
though.
> Section 3.1: "Allowed numbers of channels: (1 + l)^2 for l = 0...14"
>
> Minor nit: unless the use of "l" is standard for talking about
> ambisonics, I would suggest using "k" instead, since "l" is easy to
> confuse with "1".
Actually "n" is common and is used in the [ambix] reference, so I
switched to that and changed the language here to be consistent with
[ambix].
> Section 3.1: If it's not too complicated, can you explain how the "m"
> index is derived?
I added equations to derive n and m from channel index.
I've attached updated versions of the document.
-------------- next part --------------
<?xml version="1.0" encoding="utf-8"?>
<!--
   Copyright (c) 2012-2016 Xiph.Org Foundation and contributors
   Redistribution and use in source and binary forms, with or without
   modification, are permitted provided that the following conditions
   are met:
   - Redistributions of source code must retain the above copyright
   notice, this list of conditions and the following disclaimer.
   - Redistributions in binary form must reproduce the above copyright
   notice, this list of conditions and the following disclaimer in the
   documentation and/or other materials provided with the distribution.
   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
   ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER
   OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
   EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
   PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
   PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
   LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
   NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
   SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
   Special permission is granted to remove the above copyright notice, list of
   conditions, and disclaimer when submitting this document, with or without
   modification, to the IETF.
-->
<!DOCTYPE rfc SYSTEM 'rfc2629.dtd' [
<!ENTITY rfc2119 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml'>
<!ENTITY rfc6716 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.6716.xml'>
<!ENTITY rfc7845 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.7845.xml'>
]>
<?rfc toc="yes" symrefs="yes" ?>
<rfc ipr="trust200902" category="std"
docName="draft-graczyk-opus-ambisonics"
 updates="7845">
<front>
<title abbrev="Opus Ambisonics">Ambisonics in an Ogg Opus
Container</title>
<author initials="M.G." surname="Graczyk"
fullname="Michael Graczyk">
<organization>Google Inc.</organization>
<address>
<postal>
<street>1600 Amphitheatre Parkway</street>
<city>Mountain View</city>
<region>CA</region>
<code>94043</code>
<country>USA</country>
</postal>
<email>mgraczyk at google.com</email>
</address>
</author>
<date day="24" month="May" year="2016"/>
<area>RAI</area>
<workgroup>codec</workgroup>
<abstract>
<t>
This document defines an extension to the Ogg format to encapsulate
 ambisonics coded using the Opus audio codec.
</t>
</abstract>
</front>
<middle>
<section anchor="intro" title="Introduction">
<t>
Ambisonics is a representation format for three dimensional sound fields which
 can be used for surround sound and immersive virtual reality playback.
See <xref target="gerzon75"/> and <xref
target="daniel04"/> for technical
 details on the ambisonics format.
For the purposes of this document, ambisonics can be considered a
 multichannel audio stream.
The Ogg format is a container for transmission and storage of audio.
It can be used to encapsulate streams coded using the Opus codec.
See <xref target="RFC6716"/> and <xref
target="RFC7845"/> for technical details
 on the Opus codec and its encapsulation in the Ogg container respectively.
</t>
<t>
This document extends the Ogg format by defining a new channel mapping family
for encoding ambisonics.
The Ogg Opus format is extended indirectly by adding an
item with value 2 to the IANA "Opus Channel Mapping Families"
registry. When
2 is used as the Channel Mapping Family Number in an Ogg stream, the semantic
meaning of the channels in the multichannel Opus stream is the ambisonics layout
defined in this document.
</t>
</section>
<section anchor="terminology" title="Terminology">
<t>
The key words "MUST", "MUST NOT", "REQUIRED",
"SHALL", "SHALL NOT", "SHOULD",
 "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED",
"MAY", and "OPTIONAL" in this
 document are to be interpreted as described in <xref
target="RFC2119"/>.
</t>
</section>
<section anchor="ogg_extension" title="Ambisonics With Ogg
Opus">
<t>
Ambisonics MAY be encapsulated in the Ogg format by encoding with the Opus codec
and setting the Channel Mapping Family value to 2 in the Ogg Identification
Header. A demuxer implementation encountering Channel Mapping Family 2 MUST
interpret the Opus stream as containing ambisonics with the format described
in <xref target="channel_mapping"/>.
</t>
<section anchor="channel_mapping" title="Channel Mapping
Family 2">
<t>
Allowed numbers of channels: (1 + n)^2 for n = 0...14.
Explicitly 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225.
Ambisonics from zeroth to fourteenth order.
</t>
<t>
This channel mapping uses the same channel mapping table format used by channel
mapping families 1 and 255. Each output channel is assigned to an ambisonic
component in Ambisonic Channel Number (ACN) order. The ambisonic component with
order n and degree m corresponds to channel (n * (n + 1) + m). The reverse
correspondence can also be computed for a channel with index k.
</t>
<figure align="center">
<artwork align="center"><![CDATA[
order  n = floor(sqrt(k)),
degree m = k - n * (n + 1).
]]></artwork>
</figure>
<t>
Channels are normalized with Schmidt Semi-Normalization (SN3D) with no
Condon-Shortley phase factor.
In SN3D, the spherical harmonic of order n and degree m is normalized
according to
</t>
<figure align="center">
<artwork align="center"><![CDATA[
sqrt((2 - delta(m)) * ((n - |m|)! / (n + |m|)!)),
]]></artwork>
</figure>
<t>
where delta(0) = 1 and delta(m) = 0 otherwise.
</t>
<t>
The interpretation of the ambisonics signal as well as the channel order and
 normalization are described in <xref target="ambix"/>.
</t>
</section>
<section anchor="downmixing" title="Downmixing">
<t>
Implementations MAY use the matrix in Figure
 <xref target="stereo_downmix_matrix"
format="counter"/> to implement
 downmixing from multichannel files using Channel Mapping Family 2
 <xref target="channel_mapping"/>, which is known to give acceptable results
 for stereo.
</t>
<figure anchor="stereo_downmix_matrix" title="Stereo
Downmixing Matrix" align="center">
<artwork align="center"><![CDATA[
/   \   /                  \ /  W  \
| L |   | 0.5  0.5 0.0 ... | |  Y  |
| R | = | 0.5 -0.5 0.0 ... | | ... |
\   /   \                  / \ ... /
]]></artwork>
</figure>
</section>
</section>
<section anchor="security" title="Security
Considerations">
<t>
Implementations of the Ogg container need to take appropriate security
 considerations into account, as outlined in Section 10 of <xref
target="RFC7845"/>.
The extension defined in this document requires that semantic meaning be
 assigned to more channels than the existing Ogg format requires.
Since more allocations will be required to encode and decode these semantically
 meaningful channels, care should be taken in any new allocation paths.
Implementations MUST NOT overrun their allocated memory nor read from
 uninitialized memory when managing the ambisonic channel mapping.
</t>
</section>
<section anchor="iana" title="IANA Considerations">
<t>
This document updates the IANA Media Types registry "Opus Channel Mapping
Families" to add a new assignment.
</t>
<texttable>
<ttcol>Value</ttcol><ttcol>Reference</ttcol>
<c>2</c><c>This Document <xref
target="channel_mapping"/></c>
</texttable>
</section>
<section anchor="Acknowledgments"
title="Acknowledgments">
<t>
Thanks to Timothy Terriberry and Marcin Gorzel for their guidance and
 valuable contributions to this document.
</t>
</section>
</middle>
<back>
<references title="Normative References">
 &rfc2119;
 &rfc6716;
 &rfc7845;
<reference anchor="ambix"
 target="http://iem.kug.ac.at/fileadmin/media/iem/projects/2011/ambisonics11_nachbar_zotter_sontacchi_deleflie.pdf">
  <front>
    <title>AMBIX - A SUGGESTED AMBISONICS FORMAT</title>
    <author initials="C." surname="Nachbar"
fullname="Christian Nachbar"/>
    <author initials="F." surname="Zotter"
fullname="Franz Zotter"/>
    <author initials="E." surname="Deleflie"
fullname="Etienne Deleflie"/>
    <author initials="A." surname="Sontacchi"
fullname="Alois Sontacchi"/>
    <date month="June" year="2011"/>
  </front>
</reference>
</references>
<references title="Informative References">
<reference anchor="gerzon75"
 target="http://www.michaelgerzonphotos.org.uk/articles/Ambisonics%201.pdf">
  <front>
    <title>Ambisonics. Part one: General system description</title>
    <author initials="M." surname="Gerzon"
fullname="Michael Gerzon"/>
    <date month="August" year="1975"/>
  </front>
</reference>
<reference anchor="daniel04"
 target="http://pcfarina.eng.unipr.it/Public/phd-thesis/aes116%20high-passed%20hoa.pdf">
  <front>
    <title>Further Study of Sound Field Coding with Higher Order
Ambisonics</title>
    <author initials="J." surname="Daniel"
fullname="Jérôme Daniel"/>
    <author initials="S." surname="Moreau"
fullname="Sébastien Moreau"/>
    <date month="May" year="2004"/>
  </front>
</reference>
</references>
</back>
</rfc>
-------------- next part --------------
codec                                                         M. Graczyk
Internet-Draft                                               Google Inc.
Updates: 7845 (if approved)                                 May 24, 2016
Intended status: Standards Track
Expires: November 25, 2016

                  Ambisonics in an Ogg Opus Container
                     draft-graczyk-opus-ambisonics

Abstract

   This document defines an extension to the Ogg format to encapsulate
   ambisonics coded using the Opus audio codec.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on November 25, 2016.

Graczyk                 Expires November 25, 2016               [Page 1]

Internet-Draft               Opus Ambisonics                    May 2016

Table of Contents

   1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
   2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 2
   3. Ambisonics With Ogg Opus . . . . . . . . . . . . . . . . . . 2
     3.1. Channel Mapping Family 2 . . . . . . . . . . . . . . . . 3
     3.2. Downmixing . . . . . . . . . . . . . . . . . . . . . . . 3
   4. Security Considerations . . . . . . . . . . . . . . . . . . . 3
   5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 4
   6. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 4
   7. References . . . . . . . . . . . . . . . . . . . . . . . . . 4
     7.1. Normative References . . . . . . . . . . . . . . . . . . 4
     7.2. Informative References . . . . . . . . . . . . . . . . . 5
   Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 5

1. 
Introduction

   Ambisonics is a representation format for three-dimensional sound
   fields which can be used for surround sound and immersive virtual
   reality playback.  See [gerzon75] and [daniel04] for technical
   details on the ambisonics format.  For the purposes of this
   document, ambisonics can be considered a multichannel audio stream.

   The Ogg format is a container for transmission and storage of audio.
   It can be used to encapsulate streams coded using the Opus codec.
   See [RFC6716] and [RFC7845] for technical details on the Opus codec
   and its encapsulation in the Ogg container respectively.

   This document extends the Ogg format by defining a new channel
   mapping family for encoding ambisonics.  The Ogg Opus format is
   extended indirectly by adding an item with value 2 to the IANA "Opus
   Channel Mapping Families" registry.  When 2 is used as the Channel
   Mapping Family Number in an Ogg stream, the semantic meaning of the
   channels in the multichannel Opus stream is the ambisonics layout
   defined in this document.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   [RFC2119].

3.  Ambisonics With Ogg Opus

   Ambisonics MAY be encapsulated in the Ogg format by encoding with the
   Opus codec and setting the Channel Mapping Family value to 2 in the
   Ogg Identification Header.  A demuxer implementation encountering
   Channel Mapping Family 2 MUST interpret the Opus stream as containing
   ambisonics with the format described in Section 3.1.

3.1.  Channel Mapping Family 2

   Allowed numbers of channels: (1 + n)^2 for n = 0...14.  Explicitly
   1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225.
   Ambisonics from zeroth to fourteenth order.

   This channel mapping uses the same channel mapping table format used
   by channel mapping families 1 and 255.
   Each output channel is assigned to an ambisonic component in
   Ambisonic Channel Number (ACN) order.  The ambisonic component with
   order n and degree m, where -n <= m <= n, corresponds to channel
   (n * (n + 1) + m).  The reverse correspondence can also be computed
   for a channel with index k:

      order   n = floor(sqrt(k)),
      degree  m = k - n * (n + 1).

   Channels are normalized with Schmidt Semi-Normalization (SN3D) with
   no Condon-Shortley phase factor.  In SN3D, the spherical harmonic of
   order n and degree m is normalized according to

      sqrt((2 - delta(m)) * ((n - |m|)! / (n + |m|)!)),

   where delta(0) = 1 and delta(m) = 0 otherwise.

   The interpretation of the ambisonics signal as well as the channel
   order and normalization are described in [ambix].

3.2.  Downmixing

   Implementations MAY use the matrix in Figure 1 to implement
   downmixing from multichannel files using Channel Mapping Family 2
   (Section 3.1), which is known to give acceptable results for stereo.

             /   \   /                  \ /  W  \
             | L |   | 0.5  0.5 0.0 ... | |  Y  |
             | R | = | 0.5 -0.5 0.0 ... | | ... |
             \   /   \                  / \ ... /

                  Figure 1: Stereo Downmixing Matrix

4.  Security Considerations

   Implementations of the Ogg container need to take appropriate
   security considerations into account, as outlined in Section 10 of
   [RFC7845].  The extension defined in this document requires that
   semantic meaning be assigned to more channels than the existing Ogg
   format requires.  Since more allocations will be required to encode
   and decode these semantically meaningful channels, care should be
   taken in any new allocation paths.  Implementations MUST NOT overrun
   their allocated memory nor read from uninitialized memory when
   managing the ambisonic channel mapping.

5.  IANA Considerations

   This document updates the IANA Media Types registry "Opus Channel
   Mapping Families" to add a new assignment.
                 +-------+---------------------------+
                 | Value | Reference                 |
                 +-------+---------------------------+
                 | 2     | This Document Section 3.1 |
                 +-------+---------------------------+

6.  Acknowledgments

   Thanks to Timothy Terriberry and Marcin Gorzel for their guidance and
   valuable contributions to this document.

7.  References

7.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <http://www.rfc-editor.org/info/rfc2119>.

   [RFC6716]  Valin, JM., Vos, K., and T. Terriberry, "Definition of the
              Opus Audio Codec", RFC 6716, DOI 10.17487/RFC6716,
              September 2012,
              <http://www.rfc-editor.org/info/rfc6716>.

   [RFC7845]  Terriberry, T., Lee, R., and R. Giles, "Ogg Encapsulation
              for the Opus Audio Codec", RFC 7845,
              DOI 10.17487/RFC7845, April 2016,
              <http://www.rfc-editor.org/info/rfc7845>.

   [ambix]    Nachbar, C., Zotter, F., Deleflie, E., and A. Sontacchi,
              "AMBIX - A SUGGESTED AMBISONICS FORMAT", June 2011,
              <http://iem.kug.ac.at/fileadmin/media/iem/projects/2011/ambisonics11_nachbar_zotter_sontacchi_deleflie.pdf>.

7.2.  Informative References

   [gerzon75] Gerzon, M., "Ambisonics. Part one: General system
              description", August 1975,
              <http://www.michaelgerzonphotos.org.uk/articles/Ambisonics%201.pdf>.

   [daniel04] Daniel, J. and S. Moreau, "Further Study of Sound Field
              Coding with Higher Order Ambisonics", May 2004,
              <http://pcfarina.eng.unipr.it/Public/phd-thesis/aes116%20high-passed%20hoa.pdf>.

Author's Address

   Michael Graczyk
   Google Inc.
   1600 Amphitheatre Parkway
   Mountain View, CA  94043
   USA

   Email: mgraczyk at google.com
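
The channel layout of Section 3.1 and the stereo downmix of Section 3.2 can be illustrated with a minimal Python sketch. It is not part of the draft and not taken from any Opus implementation; every function name below is hypothetical. Two assumptions are worth calling out: the inverse mapping uses floor(sqrt(k)), which is the form that round-trips with n * (n + 1) + m for every channel index, and the SN3D factor is written with |m| so that it is also defined for negative degrees.

```python
import math

# All names here are hypothetical helpers for illustration only.

def acn_index(n, m):
    """ACN channel index for order n and degree m (-n <= m <= n)."""
    if n < 0 or abs(m) > n:
        raise ValueError("require n >= 0 and -n <= m <= n")
    return n * (n + 1) + m

def acn_order_degree(k):
    """Inverse of acn_index: (order, degree) for channel index k."""
    if k < 0:
        raise ValueError("channel index must be non-negative")
    n = math.isqrt(k)  # integer floor(sqrt(k)), no floating-point rounding
    return n, k - n * (n + 1)

def sn3d_factor(n, m):
    """SN3D normalization factor (assumes |m| in the factorials)."""
    delta = 1 if m == 0 else 0
    ratio = math.factorial(n - abs(m)) / math.factorial(n + abs(m))
    return math.sqrt((2 - delta) * ratio)

def is_allowed_channel_count(count):
    """True if count == (1 + n)^2 for some n in 0..14."""
    if count < 1:
        return False
    n = math.isqrt(count) - 1
    return n <= 14 and (n + 1) ** 2 == count

def stereo_downmix(frame):
    """Apply the Figure 1 matrix to one frame of ACN-ordered samples."""
    w, y = frame[0], frame[1]  # ACN 0 is W, ACN 1 is Y; the rest weigh 0.0
    return 0.5 * w + 0.5 * y, 0.5 * w - 0.5 * y

# Example: round-trip one component and downmix a first-order frame.
print(acn_order_degree(acn_index(2, -1)))      # (2, -1)
print(stereo_downmix([1.0, 0.5, 0.0, 0.0]))    # (0.75, 0.25)
```

Only the W and Y components contribute to the downmix, so the same function applies unchanged to any order of at least one; a zeroth-order stream carries W alone and would need separate handling.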