I've run into some issues using Opus with source files in channel layouts other than the default 8. For instance, 2.1 isn't supported, so I have to either downconvert to 2.0 or upconvert to 5.1 (which usually involves adding empty channels, which prevents the playback device from upconverting to the native layout).

To address this, I've put together an initial draft of an I-D I'd like to run by this list for feedback. The RFCXML and the resulting TXT are attached.

This is primarily based on the Ambisonics draft, as it's generally making additions to the same areas.

A few areas I'd expect could be improved:

- Currently I'm using Microsoft's WAVEFORMATEXTENSIBLE structure's dwChannelMask field to define channel positions. These values are reasonably standard, but I'm not sure if citing Microsoft's documentation directly is the best way to handle this. We might want to copy the channel names into the spec, or cite some other document from a standards body naming and numbering speaker positions (does anyone know anything applicable?). Should this list be made into its own registry, for later extensibility? ffmpeg additionally defines "stereo left/right" (for an embedded downmix), "wide left/right", "surround direct left/right", and "low frequency #2".
- The downmixing algorithm is largely a description of ffmpeg's libswresample's behavior. Documenting some downmixing behavior was suggested by mark4o in the #opus channel on Freenode; not sure if the current formatting is optimal.
- I'm referring to (and using terminology from) the struct defined in section 5.1.1 of RFC7845. Should I copy the relevant portions into this document as well?
- I'm redefining the mapping table for this mapping family. This was an easy place to specify channel positions, but it means that the "multiple channels copied from the same stream" feature isn't available. I don't particularly expect this to be an issue, but if people have problems with it, we can instead keep the existing table and add a simple bitmask before or after it.

I've looked into the relevant changes in libavcodec and libopus. It's reasonably simple to add on the decode side in libavcodec's native decoder (largely changes to ff_opus_parse_extradata), but it's a bit trickier in libopus, because there's no support in the API for channel layouts (and thus stereo stream counts) other than those in the standard. I'd expect to need to take one of these routes:

- Add new opus_multistream_surround_encoder_create and opus_multistream_surround_encoder_init variants that take a uint32_t (or uint64_t for future expansion) bitmask
- Have opus_multistream_surround_encoder_create/opus_multistream_surround_encoder_init read from streams and coupled_streams when mapping_family == 4 (and require the consumer to set them, and adjust the resulting mapping)
- Have opus_multistream_surround_encoder_create/opus_multistream_surround_encoder_init read from mapping when mapping_family == 4 (and require the consumer to set it)

Anyone have preferences on this?

Additionally, I'd expect that once this draft is finalized, I'll be writing up a similar extension for FLAC.

Thanks in advance for the feedback!
--rcombs

-------------- next part --------------
A non-text attachment was scrubbed...
Name: draft-ietf-codec-extended-layouts-00.xml
Type: application/xml
Size: 11679 bytes
Desc: not available
URL: <http://lists.xiph.org/pipermail/opus/attachments/20181024/0f7c125a/attachment-0001.xml>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: draft-ietf-codec-extended-layouts-00-2.txt
URL: <http://lists.xiph.org/pipermail/opus/attachments/20181024/0f7c125a/attachment-0001.txt>
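To make the bitmask discussion in the message above concrete, here is a minimal sketch in C of how a consumer might derive a channel count and the RFC 7845 stream / coupled-stream counts from a dwChannelMask-style value. The bit values mirror Microsoft's documented speaker positions; the type `LayoutCounts`, the function `layout_counts_from_mask`, and the rule "pair a left/right position into one coupled stream whenever both are present" are assumptions made for illustration. None of them come from the draft or from libopus.

```c
#include <stddef.h>
#include <stdint.h>

/* Speaker-position bits as documented for WAVEFORMATEXTENSIBLE's dwChannelMask. */
#define SPEAKER_FRONT_LEFT            0x00000001u
#define SPEAKER_FRONT_RIGHT           0x00000002u
#define SPEAKER_FRONT_CENTER          0x00000004u
#define SPEAKER_LOW_FREQUENCY         0x00000008u
#define SPEAKER_BACK_LEFT             0x00000010u
#define SPEAKER_BACK_RIGHT            0x00000020u
#define SPEAKER_FRONT_LEFT_OF_CENTER  0x00000040u
#define SPEAKER_FRONT_RIGHT_OF_CENTER 0x00000080u
#define SPEAKER_BACK_CENTER           0x00000100u
#define SPEAKER_SIDE_LEFT             0x00000200u
#define SPEAKER_SIDE_RIGHT            0x00000400u

/* Hypothetical result type: not part of libopus or the draft. */
typedef struct {
    int channels;        /* number of set bits in the mask */
    int streams;         /* total Opus streams (mono + stereo) */
    int coupled_streams; /* streams carrying a left/right pair */
} LayoutCounts;

/* Count set bits by clearing the lowest one each iteration. */
static int popcount32(uint32_t v)
{
    int n = 0;
    while (v) {
        v &= v - 1;
        n++;
    }
    return n;
}

/* Hypothetical helper. Assumption: a left/right pair is coded as one
 * coupled (stereo) stream when both positions are present; every other
 * channel gets its own mono stream. */
LayoutCounts layout_counts_from_mask(uint32_t mask)
{
    static const uint32_t pairs[][2] = {
        { SPEAKER_FRONT_LEFT,           SPEAKER_FRONT_RIGHT },
        { SPEAKER_BACK_LEFT,            SPEAKER_BACK_RIGHT },
        { SPEAKER_FRONT_LEFT_OF_CENTER, SPEAKER_FRONT_RIGHT_OF_CENTER },
        { SPEAKER_SIDE_LEFT,            SPEAKER_SIDE_RIGHT },
    };
    LayoutCounts out;
    size_t i;

    out.channels = popcount32(mask);
    out.coupled_streams = 0;
    for (i = 0; i < sizeof(pairs) / sizeof(pairs[0]); i++) {
        if ((mask & pairs[i][0]) && (mask & pairs[i][1]))
            out.coupled_streams++;
    }
    /* Each coupled stream carries two channels, so it saves one stream. */
    out.streams = out.channels - out.coupled_streams;
    return out;
}
```

Under these assumptions, a 2.1 mask (front left, front right, LFE) yields 3 channels in 2 streams with 1 coupled, and the usual 5.1 mask yields 6 channels in 4 streams with 2 coupled, which matches the counts RFC 7845 lists for 5.1 in mapping family 1. Values like these are what the second libopus route above would require the consumer to supply in streams and coupled_streams.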
Ulrich Windl
2018-Oct-25 06:15 UTC
[opus] Antw: Proposal - Extended Channel Layouts in Opus
>>> Rodger Combs <rodger.combs at gmail.com> wrote on 25.10.2018 at 03:30 in message <06F11512-FD4F-4C99-8E87-7D1B5D7CAC3A at gmail.com>:
> I've run into some issues using Opus with source files in channel layouts
> other than the default 8. For instance, 2.1 isn't supported, so I have to
> either downconvert to 2.0 or upconvert to 5.1 (which usually involves adding
> empty channels, which prevents the playback device from upconverting to the
> native layout).

Hi!

Speaking of 2.1: is there any standard for recording MS (mid-side, 2 channels) with Opus (or any other Ogg codec)? In MS, one channel is the sum of both stereo channels, while the other channel is the difference between left and right (or the other way round). Usually this isn't computed from stereo (LR) recordings, but made with special microphones...

Regards,
Ulrich

> To address this, I've put together an initial draft of an I-D I'd like to run
> by this list for feedback.
> Here's the RFCXML:
> And the resulting TXT:
>
> This is primarily based on the Ambisonics draft, as it's generally making
> additions to the same areas.
>
> A few areas I'd expect could be improved:
> Currently I'm using Microsoft's WAVEFORMATEXTENSIBLE structure's
> dwChannelMask field to define channel positions. These values are reasonably
> standard, but I'm not sure if citing Microsoft's documentation directly is
> the best way to handle this. We might want to copy the channel names into the
> spec, or cite some other document from a standards body naming and numbering
> speaker positions (does anyone know anything applicable?).
> Should this list be made into its own registry, for later extensibility?
> ffmpeg additionally defines "stereo left/right" (for an embedded downmix),
> "wide left/right", "surround direct left/right", and "low frequency #2".
> The downmixing algorithm is largely a description of ffmpeg's
> libswresample's behavior. Documenting some downmixing behavior was suggested
> by mark4o in the #opus channel on Freenode; not sure if the current
> formatting is optimal.
> I'm referring to (and using terminology from) the struct defined in section
> 5.1.1 of RFC7845. Should I copy the relevant portions into this document as
> well?
> I'm redefining the mapping table for this mapping family. This was an easy
> place to specify channel positions, but it means that the "multiple channels
> copied from the same stream" feature isn't available. I don't particularly
> expect this to be an issue, but if people have problems with it, we can
> instead keep the existing table and add a simple bitmask before or after it.
>
> I've looked into the relevant changes in libavcodec and libopus. It's
> reasonably simple to add on the decode side in libavcodec's native decoder
> (largely changes to ff_opus_parse_extradata), but it's a bit trickier in
> libopus, because there's no support in the API for channel layouts (and thus
> stereo stream counts) other than those in the standard. I'd expect to need to
> take one of these routes:
> Add new opus_multistream_surround_encoder_create and
> opus_multistream_surround_encoder_init variants that take a uint32_t (or
> uint64_t for future expansion) bitmask
> Have
> opus_multistream_surround_encoder_create/opus_multistream_surround_encoder_init
> read from streams and coupled_streams when mapping_family == 4 (and
> require the consumer to set them, and adjust the resulting mapping)
> Have
> opus_multistream_surround_encoder_create/opus_multistream_surround_encoder_init
> read from mapping when mapping_family == 4 (and require the consumer to
> set it)
> Anyone have preferences on this?
>
> Additionally, I'd expect that once this draft is finalized, I'll be writing
> up a similar extension for FLAC.
>
> Thanks in advance for the feedback!
> --rcombs
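For readers unfamiliar with the mid-side (MS) representation raised in the reply above, the sum/difference relationship can be sketched in C as follows. The function names `lr_to_ms` and `ms_to_lr` are hypothetical, and the 0.5 scaling on the forward transform is an assumption chosen so that the round trip restores the original samples; the reply itself only describes a plain sum and difference. This sketch says nothing about whether any Ogg codec defines a standard signalling for MS recordings.

```c
#include <stddef.h>

/* Hypothetical helper: derive mid/side from left/right.
 * Assumption: both outputs are halved so that ms_to_lr() inverts this. */
void lr_to_ms(const float *left, const float *right,
              float *mid, float *side, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        mid[i]  = 0.5f * (left[i] + right[i]); /* sum of both channels */
        side[i] = 0.5f * (left[i] - right[i]); /* left minus right */
    }
}

/* Hypothetical helper: recover left/right from mid/side. */
void ms_to_lr(const float *mid, const float *side,
              float *left, float *right, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        left[i]  = mid[i] + side[i];
        right[i] = mid[i] - side[i];
    }
}
```

A recording made directly with an MS microphone pair would skip the first function and only need the second one at playback time.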