thr3ads.net - opus - [opus] Gapless concatenation of Opus frames [Nov 2017]

Andreas Stöckel

2017-Nov-08 08:43 UTC

[opus] Gapless concatenation of Opus frames

Hi!

Short version of my question: How can I produce Opus frames that can be
safely concatenated, and how do I embed them in a WebM file?

Long version:

I'm currently implementing a web-based audio player which streams
audio as Opus/WebM using the HTML5 Media Source Extensions (MSE). Currently,
the server decodes a set of input files to a fixed RAW audio format
(stereo, 48 kHz) and encodes the resulting continuous RAW stream as
Opus/WebM. Having a single, uninterrupted RAW stream allows for
perfect gapless playback on the client (which only sees a single live
WebM stream), i.e. there are no interruptions whatsoever when
transitioning between continuous tracks from the same music album.

An early tech-demo of the technique can be found here [1], the source
file http_audio_server/encoder.cpp implements the relevant
opus-encoding and webm-encapsulation (but see also [2] for a condensed
version).


Now, for performance reasons I'd like to split my RAW audio into
independent blocks (for example, 50 frames or 1 s each), encode
these as raw Opus frames and cache them on disk ahead of time. For
each block I'd like to reset the encoder to ensure independence
between the first frame of each block and the last frames of the
previous block, e.g., using

opus_encoder_ctl(enc_ctx, OPUS_RESET_STATE)
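
To make the block size in the example concrete, here is a minimal sketch of the arithmetic. It assumes 20 ms frames (which is what "50 frames or 1s" implies at 48 kHz); the constant and function names are hypothetical and not taken from the linked repositories.

```c
#include <assert.h>
#include <stdint.h>

enum {
    SAMPLE_RATE      = 48000, /* Hz, as in the setup described above */
    FRAME_MS         = 20,    /* assumed Opus frame duration */
    FRAMES_PER_BLOCK = 50     /* example value from the text */
};

/* Samples per channel in one Opus frame. */
static int64_t frame_samples(void)
{
    return (int64_t)SAMPLE_RATE * FRAME_MS / 1000;
}

/* Samples per channel in one independently encoded block. */
static int64_t block_samples(void)
{
    return frame_samples() * FRAMES_PER_BLOCK;
}

/* Number of blocks needed to cover n_samples samples per channel.
 * The last block may be only partially filled and would need padding. */
static int64_t block_count(int64_t n_samples)
{
    assert(n_samples >= 0);
    return (n_samples + block_samples() - 1) / block_samples();
}
```

With these values a frame holds 960 samples per channel and a block holds 48000, so a track that is even one sample longer than a second already needs a second block.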

When the client requests a certain sequence of blocks (which may
originate from various input files and, let's pretend, arrive in any
order), my goal is to encapsulate the pre-encoded frames as WebM on
demand and send them to the client.

However, in early experiments [2], resetting the encoder state at the
beginning of each block and then concatenating the frames in the WebM
container leads to clearly audible gaps in the decoded WebM stream
whenever the opus encoder has been reset.

Interestingly, such artifacts are far less pronounced (if they exist
at all), if I don't explicitly reset the encoder. However, in my real
application the encoder will at least be reset implicitly (e.g. by
starting the encoding process in multiple threads for two files which
may be played consecutively).

See [2] for a minimal working example (MWE) of what I've tried to describe above.

So to rephrase my question: if it is possible at all, how can I
independently pre-encode blocks of Opus audio frames, such that I can
concatenate them during WebM muxing without audible glitches?


In advance, thank you for your help. Please let me know if anything I
wrote is unclear, or if you need more information to answer my question.


Andreas


[1] https://github.com/astoeckel/http_audio_server/
[2] https://github.com/astoeckel/opus_gapless_webm/

Jean-Marc Valin

2017-Nov-13 20:42 UTC

Hi Andreas,

So if I understand your question correctly, what you want is really
short "files" that are independent, yet produce a glitchless stream
when concatenated, right? For Ogg, this can be implemented with
libopusenc and chaining. It works pretty well (even for really tiny
files). For WebM, I'm not sure how to handle the details at the
container level, but for the transition details (reset and
all), I suggest you have a look at the libopusenc code. In general, the
idea is to disable the prediction at the point of the transition between
two files and to include the transition frames in both files.

Cheers,

	Jean-Marc


Michael Bradshaw

2017-Nov-13 21:14 UTC

For WebM, you can use Block elements with DiscardPadding set to skip the
trailing/leading samples at the seam points. Not all video players respect
the DiscardPadding element, though (they should, but not all do).
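
As a rough illustration of what goes into that element: Matroska expresses DiscardPadding (and CodecDelay) as a duration in nanoseconds rather than in samples, so a muxer has to convert. The sketch below shows only that conversion; the function names are hypothetical, and how the value is written into the container depends on the muxing library in use.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a sample count at the given rate to nanoseconds (truncating).
 * The multiplication fits in int64_t for any realistic chunk length. */
static int64_t samples_to_ns(int64_t samples, int32_t sample_rate)
{
    assert(samples >= 0 && sample_rate > 0);
    return samples * INT64_C(1000000000) / sample_rate;
}

/* Duration to discard at the end of the last block of a chunk:
 * the encoded frame carries encoded_samples, of which only the
 * first wanted_samples should actually be played. */
static int64_t discard_padding_ns(int64_t encoded_samples,
                                  int64_t wanted_samples,
                                  int32_t sample_rate)
{
    assert(encoded_samples >= wanted_samples);
    return samples_to_ns(encoded_samples - wanted_samples, sample_rate);
}
```

For example, a 6.5 ms codec delay at 48 kHz (312 samples) corresponds to 6,500,000 ns, and trimming half of a 960-sample frame corresponds to 10,000,000 ns.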


Andreas Stöckel

2017-Nov-13 21:16 UTC

Hi Jean-Marc,

thank you for your answer!

Yes, you understood my question correctly. I was just about to compose
a reply to my original question describing how I solved my
problem. As you've already suggested, I've switched to Ogg/Opus, which
is better supported, but does not work with the Media Source Extensions.

I'll check whether disabling prediction helps with the
transition, but given the way I'm implementing it right now,
it probably won't.

So here is what I was going to write originally:

When I wrote the question, I wasn't really aware of the pre-skip
(CodecDelay in WebM) and DiscardPadding [1]. However, these properties
can only be set on a per-stream basis, and not on independent
sequences of WebM packets. Because I was unaware of the pre-skip, I also
didn't append an additional frame to the audio, so the 6.5 ms lost to
the pre-skip could not be recovered when decoding. As an additional
complication with WebM, there is no way to indicate in a WebM stream
that the decoder should reset. So if anything, we can only concatenate
entire files/streams, not individual packets.
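
The need for the additional frame can be checked with a little arithmetic. A decoder discards the first pre-skip samples of its output, so the encoded frames have to cover the payload plus the pre-skip. The sketch below assumes 960-sample (20 ms) frames and uses 312 samples for the 6.5 ms mentioned above; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of frames that must be encoded so that n_samples remain
 * after the decoder has dropped the first pre_skip samples. */
static int64_t frames_needed(int64_t n_samples, int64_t pre_skip,
                             int64_t frame_size)
{
    assert(n_samples >= 0 && pre_skip >= 0 && frame_size > 0);
    return (n_samples + pre_skip + frame_size - 1) / frame_size;
}

/* Samples at the end of the last frame that lie beyond the payload
 * and should be trimmed by the container (end trimming). */
static int64_t end_trim(int64_t n_samples, int64_t pre_skip,
                        int64_t frame_size)
{
    return frames_needed(n_samples, pre_skip, frame_size) * frame_size
           - pre_skip - n_samples;
}
```

For a 1 s block (48000 samples) this gives 51 frames instead of 50, with 648 samples to trim from the end of the last frame.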

However, playing back individual WebM streams with CodecDelay and
DiscardPadding set (and an additional lead-out frame) did not work,
since CodecDelay/DiscardPadding were not fully honored by
Chromium/Firefox, or even by ffmpeg. There is a method for gapless
concatenation of entire files using MSE, described here [2], but it
didn't work in Firefox and still produced audible artifacts in Chrome.

Well, the way I'm solving the problem now is the following:

First, I've switched to Ogg/Opus. Second, I'm appending a reversed
copy of the first/last 20 ms to the beginning/end of the audio chunk
I'm encoding. This reduces ringing artifacts from the transient at the
beginning/end of the chunk. I then set the pre-skip and the granule
position of the last packet in the generated Ogg stream so that only the
relevant audio is "cut out". In contrast to WebM, browsers
(and ffmpeg) actually interpret this metadata correctly in an
Ogg container. However, browsers do not support Ogg in conjunction
with the Media Source Extensions. Thus, I've ditched MSE and am now
decoding the individual chunks with the WebAudio API and scheduling
gapless playback of the chunks (which is not optimal, since WebAudio
is rather finicky).
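
The padding and trimming steps described above might look roughly like the following. This is a sketch under stated assumptions, not the code from [3]: it handles a single channel (interleaved stereo would need the reversal done per sample pair), the names are hypothetical, and the encoder lookahead is passed in as a parameter (with libopus it can be queried via OPUS_GET_LOOKAHEAD). In Ogg Opus, the playable sample position is the granule position minus the pre-skip, which is what the second function relies on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writes pad + n + pad samples to out: a reversed copy of the first
 * pad input samples, the n input samples, then a reversed copy of the
 * last pad input samples. out must have room for n + 2 * pad samples. */
static void pad_with_reversed_edges(const int16_t *in, size_t n,
                                    size_t pad, int16_t *out)
{
    assert(in != NULL && out != NULL && pad <= n);
    for (size_t i = 0; i < pad; i++) {
        out[i] = in[pad - 1 - i];           /* mirrored lead-in  */
        out[pad + n + i] = in[n - 1 - i];   /* mirrored lead-out */
    }
    for (size_t i = 0; i < n; i++)
        out[pad + i] = in[i];
}

typedef struct {
    int64_t pre_skip;       /* samples the decoder drops at the start */
    int64_t final_granule;  /* granule position of the last page      */
} ogg_trim;

/* Pre-skip and final granule position that cut the n payload samples
 * out of the padded, encoded chunk. */
static ogg_trim ogg_trim_for(int64_t n, int64_t pad, int64_t lookahead)
{
    ogg_trim t;
    assert(n >= 0 && pad >= 0 && lookahead >= 0);
    t.pre_skip = lookahead + pad;
    assert(t.pre_skip <= 65535);  /* pre-skip is a 16-bit header field */
    t.final_granule = t.pre_skip + n;
    return t;
}
```

With 20 ms of padding (960 samples) and a lookahead of 312 samples, a 1 s chunk would get a pre-skip of 1272 and a final granule position of 49272.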

The working implementation can be found here [3]. Since Ogg is so much
simpler than WebM I also wrote my own minimal C++ Ogg/Opus muxer,
which shaves off another dependency of my application.

Thank you for your help,
Andreas

[1] https://wiki.xiph.org/MatroskaOpus

[2] https://developers.google.com/web/fundamentals/media/mse/seamless-playback

[3] https://github.com/astoeckel/opus_gapless


Ulrich Windl

2017-Nov-14 08:03 UTC

[opus] Antw: Gapless concatenation of Opus frames

Hi!

I'm not really an expert on this, but would it be sufficient for your
requirements if a new header was added for each block of audio? I mean the
_same_ header block.
I suspect starting a new encoding for a new block will change the parameters
of the encoding (it's adaptive), and thus you may hear artefacts.
Maybe the experts can comment on that (whether the chance of artefacts
is reduced, and whether that (the same header) is possible at all).

Regards,
Ulrich
