Hi Jean-Marc (and everyone else who replied),

> Considering you're switching to Ogg, I think you should give libopusenc
> a try. It does a really good job at getting rid of *all* discontinuities
> -- to the point where you can chop a song into files less than one
> millisecond each and it still sounds good. It's also pretty simple to
> use. You just feed it audio and tell it where the file boundaries are.

Thank you for pointing me at libopusenc. I had a look at the source code and liked the idea of using Linear Predictive Coding (LPC) to generate the lead-in/lead-out frame. This avoids some of the high-frequency content that my mirroring technique produced. I C++ified the corresponding 1994 LPC code and implanted it into my program [1]. Works like a charm.

Since my program seemed to work fairly well, I ran some extended tests and found one particular case where it still produces audible artifacts.

Unfortunately, libopusenc with ope_encoder_continue_new_file (see [2] for my code) produces similar (though not identical) audible artifacts. The affected audio file has very low-frequency content (produced by a taiko drum).

In my program the low-frequency content seems to be phase-shifted, producing a discontinuity at the transition between Ogg files [3].

Libopusenc seems to introduce ringing artifacts [4], resulting in a similar, though less pronounced, clicking noise. (Maybe the ringing stems from no "lead-in" frame being used -- in my program I do a reverse LPC at the beginning of the first audio chunk to create an artificial frame that leads up to the first frame [7].)
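
To make the discussion a bit more concrete, the gist of the lead-out generation as I understand it is plain LPC extrapolation: fit a short-term predictor to the tail of the chunk and let it run past the end. The following is only an illustrative, self-contained sketch (not the code from [1] and not the routine libopusenc ships; the function names, the order and the small regularization factor are arbitrary choices):

/* Sketch: fit an order-p predictor to the tail of a chunk and extrapolate
 * one extra "lead-out" frame.  Run per channel on deinterleaved audio. */
#include <stddef.h>

#define LPC_ORDER 24

/* Levinson-Durbin: from autocorrelation r[0..p] compute a[0..p-1] such
 * that x[n] is predicted as sum_j a[j] * x[n-1-j]. */
static void lpc_coeffs(const double *r, double *a, int p)
{
   double err = r[0];
   for (int i = 0; i < p; i++) a[i] = 0.0;
   for (int i = 0; i < p && err > 0.0; i++) {
      double k = r[i + 1];
      for (int j = 0; j < i; j++) k -= a[j] * r[i - j];
      k /= err;
      for (int j = 0; j < i / 2; j++) {
         double tmp = a[j];
         a[j]         -= k * a[i - 1 - j];
         a[i - 1 - j] -= k * tmp;
      }
      if (i & 1) a[i / 2] -= k * a[i / 2];
      a[i] = k;
      err *= 1.0 - k * k;
   }
}

/* Write m extrapolated samples that continue x[0..n-1] into ext[0..m-1]. */
static void lpc_extrapolate(const float *x, size_t n, float *ext, size_t m)
{
   double r[LPC_ORDER + 1], a[LPC_ORDER], hist[LPC_ORDER];

   if (n <= LPC_ORDER) {                     /* too short to fit a model */
      for (size_t i = 0; i < m; i++) ext[i] = 0.0f;
      return;
   }
   for (int lag = 0; lag <= LPC_ORDER; lag++) {   /* autocorrelation */
      double acc = 0.0;
      for (size_t i = lag; i < n; i++) acc += (double)x[i] * x[i - lag];
      r[lag] = acc;
   }
   r[0] *= 1.0001;                           /* tiny regularization */
   lpc_coeffs(r, a, LPC_ORDER);

   for (int j = 0; j < LPC_ORDER; j++) hist[j] = x[n - 1 - j];
   for (size_t i = 0; i < m; i++) {          /* run the predictor freely */
      double pred = 0.0;
      for (int j = 0; j < LPC_ORDER; j++) pred += a[j] * hist[j];
      for (int j = LPC_ORDER - 1; j > 0; j--) hist[j] = hist[j - 1];
      hist[0] = pred;
      ext[i] = (float)pred;                  /* fading ext[] out also helps */
   }
}

The lead-in frame is the same procedure applied to the time-reversed start of the chunk, with the extrapolated samples reversed again.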

You can reproduce the libopusenc problem by compiling my adapted opusenc_example.c [2] and feeding in a segment of the affected RAW audio, as indicated at the beginning of my source code. The RAW audio can be downloaded here [5] (48000 Hz, stereo, 16-bit signed, little endian; the complete song can be downloaded here [6]).
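
In essence, the adapted example just feeds raw PCM to libopusenc and asks for a new file in the chain every so often -- roughly like the sketch below (not the exact code in [2]; the output paths, the one-second block length and the missing error handling are placeholders):

/* Rough sketch of the chaining loop: read s16le stereo PCM from stdin and
 * start a new file in the chain every CHUNK_SAMPLES samples.  libopusenc
 * is supposed to take care of the transition (lead-in/lead-out, pre-skip,
 * granule positions) at each ope_encoder_continue_new_file() call. */
#include <stdio.h>
#include <opusenc.h>   /* libopusenc; build with pkg-config libopusenc */

#define RATE          48000
#define CHANNELS      2
#define FRAME_SAMPLES 960            /* 20 ms at 48 kHz */
#define CHUNK_SAMPLES 48000          /* start a new file every second */

int main(void)
{
   opus_int16 pcm[FRAME_SAMPLES * CHANNELS];
   OggOpusComments *comments = ope_comments_create();
   char name[64];
   int error = 0, chunk = 0;
   long written = 0;

   snprintf(name, sizeof(name), "blocks/%04d.ogg", chunk);
   OggOpusEnc *enc = ope_encoder_create_file(name, comments, RATE, CHANNELS,
                                             0 /* mapping family */, &error);
   if (!enc) {
      fprintf(stderr, "cannot create encoder: %s\n", ope_strerror(error));
      return 1;
   }

   size_t n;
   while ((n = fread(pcm, sizeof(opus_int16) * CHANNELS, FRAME_SAMPLES, stdin)) > 0) {
      ope_encoder_write(enc, pcm, (int)n);       /* n = samples per channel */
      written += (long)n;
      if (written >= CHUNK_SAMPLES) {            /* continue the chain */
         written = 0;
         snprintf(name, sizeof(name), "blocks/%04d.ogg", ++chunk);
         ope_encoder_continue_new_file(enc, name, comments);
      }
   }
   ope_encoder_drain(enc);
   ope_encoder_destroy(enc);
   ope_comments_destroy(comments);
   return 0;
}

One second per chunk matches the block size from my original question; libopusenc accepts arbitrary write sizes, so the 20 ms frame above only matters for the read loop.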

Any idea how either of the two issues (in libopusenc or in my program) might be solved?

Again, thank you for your help!

Cheers,
Andreas

[1] https://github.com/astoeckel/opus_gapless/blob/master/lpc.cpp
[2] https://gist.github.com/astoeckel/6731bc846a2f70dd7f5e155e75683fae
[3] https://somweyr.de/opus/click_opus_gapless.png
[4] https://somweyr.de/opus/click_libopusenc.png
[5] https://somweyr.de/opus/test_libopusenc_ope_encoder_continue_new_file.raw.bz2
[6] https://www.youtube.com/watch?v=z64HCi2rQkE
[7] https://github.com/astoeckel/opus_gapless/blob/master/opus_gapless.cpp#L82

> Cheers,
>
> Jean-Marc
>
> On 11/13/2017 04:16 PM, Andreas Stöckel wrote:
>> Hi Jean-Marc,
>>
>> thank you for your answer!
>>
>> Yes, you understood my question correctly. I was just about to compose a reply to my original question, where I described how I solved my problem. As you've already suggested, I've switched to Ogg/Opus, which is better supported, but does not work with the Media Source Extensions.
>>
>> I'll have a look at whether disabling prediction helps with the transition phase, but the way I'm implementing it right now, I think it probably won't.
>>
>> So here is what I was going to write originally:
>>
>> When I wrote the question, I wasn't really aware of the pre-skip (CodecDelay in WebM) and DiscardPadding [1]. However, these properties can only be set on a per-stream basis, and not on independent sequences of WebM packets. As a consequence of my ignorance regarding pre-skip, I also didn't append an additional frame to the audio, so the 6.5 ms lost due to the pre-skip couldn't be recovered when decoding. As an additional complication with WebM, there is also no way to indicate in a WebM stream that the decoder should reset. So if anything, we can only concatenate entire files/streams, and not on a per-packet basis.
>>
>> However, playing back individual WebM streams with CodecDelay and DiscardPadding set (and an additional lead-out frame) did not work, since CodecDelay/DiscardPadding were only insufficiently interpreted by Chromium/Firefox and even ffmpeg. There is a method for gapless concatenation of entire files using MSE, described here [2], but this didn't work for Firefox and still produced audible artifacts in Chrome.
>>
>> Well, the way I'm solving the problem now is the following:
>>
>> First, I've switched to Ogg/Opus. Second, I'm appending a reversed version of the first/last 20 ms to the beginning/end of the audio chunk I'm encoding. This reduces ringing artifacts from the transient at the beginning/end of the chunk. I then set the pre-skip and the granule position of the last packet in the generated Ogg stream in such a way that the relevant audio information is "cut out". In contrast to WebM, browsers (and ffmpeg) actually interpret this meta-information in an Ogg container correctly. However, browsers do not support Ogg in conjunction with the Media Source Extensions. Thus, I've ditched MSE and I am now decoding the individual chunks with the WebAudio API and scheduling gapless playback of the chunks (which is not optimal, since WebAudio is rather finicky).
>>
>> The working implementation can be found here [3]. Since Ogg is so much simpler than WebM I also wrote my own minimal C++ Ogg/Opus muxer, which shaves off another dependency of my application.
>>
>> Thank you for your help,
>> Andreas
>>
>> [1] https://wiki.xiph.org/MatroskaOpus
>> [2] https://developers.google.com/web/fundamentals/media/mse/seamless-playback
>> [3] https://github.com/astoeckel/opus_gapless
>>
>> On 2017-11-13 03:42 PM, Jean-Marc Valin wrote:
>>> Hi Andreas,
>>>
>>> So if I understand your question correctly, what you want is really short "files" that are independent, yet create a glitchless stream when concatenated, right? For Ogg, this can be implemented with libopusenc and chaining. It works pretty well (even for really tiny files). For WebM, I'm not sure how to handle the details at the container level, but for how to handle the transition details (reset and all), I suggest you have a look at the libopusenc code. In general, the idea is to disable the prediction at the point of the transition between two files and to include the transition frames in both files.
>>>
>>> Cheers,
>>>
>>> Jean-Marc
>>>
>>> On 11/08/2017 03:43 AM, Andreas Stöckel wrote:
>>>> Hi!
>>>>
>>>> Short version of my question: How can I produce Opus frames which can be safely concatenated, and how do I embed them in a WebM file?
>>>>
>>>> Long version:
>>>>
>>>> I'm currently implementing a web-based audio player which streams audio as Opus/WebM using the HTML5 Media Source Extensions. Currently, the server decodes a set of input files to a fixed RAW audio format (stereo, 48000 Hz) and encodes the resulting continuous RAW stream as Opus/WebM. Having a single, uninterrupted RAW stream allows for perfect gapless playback on the client (which only sees a single live WebM stream), i.e. there are no interruptions whatsoever when transitioning between continuous tracks from the same music album.
>>>>
>>>> An early tech demo of the technique can be found here [1]; the source file http_audio_server/encoder.cpp implements the relevant Opus encoding and WebM encapsulation (but see also [2] for a condensed version).
>>>>
>>>> Now, for performance reasons I'd like to split my RAW audio into independent blocks (say, as an example, 50 frames or 1 s each), encode these as raw Opus frames and cache them on disc ahead of time. For each block I'd like to reset the encoder to ensure independence between the first frame of each block and the last frames of the previous block, e.g. using
>>>>
>>>>     opus_encoder_ctl(enc_ctx, OPUS_RESET_STATE)
>>>>
>>>> When the client requests a certain sequence of blocks (which may originate from various input files in, let's pretend, any order), my goal is to encapsulate the pre-encoded frames as WebM on demand and send them to the client.
>>>>
>>>> However, in early experiments [2], resetting the encoder state at the beginning of each block and then concatenating the frames in the WebM container leads to clearly audible gaps in the decoded WebM stream wherever the Opus encoder has been reset.
>>>>
>>>> Interestingly, such artifacts are far less pronounced (if they exist at all) if I don't explicitly reset the encoder. However, in my real application the encoder will at least be reset implicitly (e.g. by starting the encoding process in multiple threads for two files which may be played consecutively).
>>>>
>>>> See [2] for an MWE which expresses what I've tried to describe above.
>>>>
>>>> So to rephrase my question: if it is possible at all, how can I independently pre-encode blocks of Opus audio frames such that I can concatenate them during WebM muxing without audible glitches?
>>>>
>>>> In advance, thank you for your help. Please let me know if anything I wrote is unclear, or if you need more information to answer my question.
>>>>
>>>> Andreas
>>>>
>>>> [1] https://github.com/astoeckel/http_audio_server/
>>>> [2] https://github.com/astoeckel/opus_gapless_webm/
>>>> _______________________________________________
>>>> opus mailing list
>>>> opus at xiph.org
>>>> http://lists.xiph.org/mailman/listinfo/opus

Hi Andreas,

So I encoded your file in chunks with a slightly modified version of opusenc_example and I can't hear anything wrong. Maybe there's a problem in the tools you used? I uploaded the files at:

  https://jmvalin.ca/misc_stuff/continuous.opus (one file)
  https://jmvalin.ca/misc_stuff/continuous.wav  (one file, decoded)
  https://jmvalin.ca/misc_stuff/chained.opus    (many small files)
  https://jmvalin.ca/misc_stuff/chained.wav     (many small files, decoded)

Can you hear any of the glitches you mentioned in continuous.wav? If there's indeed an issue, it can likely be fixed by just adding a small amount of redundancy to libopusenc. There's no fundamental reason it can't be perfectly glitchless.

Cheers,

Jean-Marc

Hi Jean-Marc,

thank you for looking into this. I'm using the current release version of libopus (1.2.1) and version 0.1.10 of opus-tools, both built from source. Libopusenc is the current git master.

Indeed, I cannot hear any glitches in continuous.opus, continuous.wav, and chained.wav. chained.opus decodes just fine with opusdec as well, but doesn't play properly in any standard audio player.

In my adapted version of opusenc_example [1] the clicks are pretty audible once the drum kicks in:

  https://somweyr.de/opus/chained.wav

This file was produced by first decoding the blocks and then concatenating the decoded audio:

  ( for i in `find blocks -name '*.ogg' | sort -n`; do opusdec $i -; done ) | aplay -f dat

but, oddly, I get the same result if I concatenate all the opus blocks I've generated into a single file, i.e. doing

  cat `find blocks -name '*.ogg' | sort -n` | opusdec - - | aplay -f dat

Maybe I'm just doing something wrong in my adapted libopusenc example? Can you send me your version?

Again, thank you,
Andreas

[1] https://gist.github.com/astoeckel/6731bc846a2f70dd7f5e155e75683fae
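
PS: One way to sanity-check the chained file itself, independent of any particular player, is to decode it with opusfile, which supports chained streams. Something along these lines (just a sketch; the file name is a placeholder, link with -lopusfile):

#include <stdio.h>
#include <opusfile.h>

int main(void)
{
   int err = 0;
   OggOpusFile *of = op_open_file("chained.opus", &err);
   if (!of) {
      fprintf(stderr, "op_open_file failed: %d\n", err);
      return 1;
   }
   printf("links in chain: %d\n", op_link_count(of));

   float pcm[5760 * 2];                    /* up to 120 ms of stereo at 48 kHz */
   opus_int64 total = 0;
   int li = -1;
   for (;;) {
      int n = op_read_float(of, pcm, sizeof(pcm) / sizeof(pcm[0]), &li);
      if (n < 0) {
         fprintf(stderr, "decode error %d in link %d\n", n, li);
         break;
      }
      if (n == 0) break;                   /* end of the (chained) stream */
      total += n;                          /* n = samples per channel */
   }
   printf("decoded %lld samples per channel\n", (long long)total);
   op_free(of);
   return 0;
}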