Martijn van Beurden
2024-Sep-03 06:25 UTC
[flac-dev] Feedback on implementation of decoding of chained streams
On Mon, Sep 2, 2024 at 22:40, Timothy B. Terriberry <tterribe at xiph.org> wrote:
>
> Martijn van Beurden wrote:
> >> Since chained streams can have different sample rates, how would one go
> >> about seeking to a specific _time_?
> >
> > I assume one would first use the sample rate of the first link to
> > guess the sample number, seek to that point, correct if it turns out
> > one of the links that passed has a different sample rate, seek again,
> > etc.
>
> The use case I am thinking of here is a standard media player with a
> scrubber that shows the current playback position and can be dragged by
> the user to seek. I don't know how many would use libFLAC for this
> directly (maybe VLC?) instead of some other media framework, but it at
> least seems like a common use case.
>
> >> Am I reading the seeking implementation correctly that the only way to
> >> seek to a future link is to scan forward through all of the file data?
> >
> > Yes, that is correct.
>
> So, to implement the above scrubber, you would have to read the entire
> file before being able to begin playback, plus maintain a bunch of
> custom code to enumerate and store the list of link durations and sample
> rates to do the conversion between sample number and time. I do not
> think a lot of people would be willing to enable chained stream support
> if that is the cost.

As far as I know (please correct me if I am wrong), neither libopusfile
nor libvorbisfile provides this functionality either. In neither could I
find a function that returns the total number of samples over all links.
Opus has a fixed sample rate so it cannot change, but libvorbisfile
doesn't provide a way to query any data about the links it might have
stored.

I could expand the provided functionality in a number of ways:

- Provide a function that does the scrubbing (I call it indexing in the
  code) and returns the total number of samples and link details
- and/or provide a way to skip over the current link (instead of
  decoding it)
- and/or provide a way to seek to a certain link by its serial number
  (instead of decoding all links before it)
- and/or provide a way to seek to a certain link by its "number" (i.e.,
  go to the fourth link)
- and/or provide a way to seek to a certain timestamp instead of a
  certain sample (see the sketch in the PS below)
- and/or provide a way to seek to a certain sample within a certain link

I'm not sure which of these would be most useful, and I am reluctant to
implement them all.

Please let me know what you think.

Kind regards,

Martijn van Beurden
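
PS: To make the timestamp option a bit more concrete, here is a rough
sketch of the bookkeeping it needs once the per-link details are known.
The struct and function are made up purely for illustration; nothing
like this exists in libFLAC or in the current patch.

#include <stdint.h>

/* Hypothetical per-link bookkeeping: the sample rate and length of each
 * link, as an indexing pass could collect them. */
typedef struct {
    uint32_t sample_rate;   /* in Hz */
    uint64_t total_samples; /* samples in this link */
} link_info;

/* Convert a target time in seconds to an absolute sample number counted
 * over all links, clamping to the end of the stream. */
static uint64_t
time_to_absolute_sample(const link_info *links, uint32_t num_links,
                        double target_seconds)
{
    uint64_t sample_offset = 0;
    double seconds_passed = 0.0;
    uint32_t i;
    for (i = 0; i < num_links; i++) {
        double link_duration =
            (double)links[i].total_samples / links[i].sample_rate;
        if (target_seconds < seconds_passed + link_duration)
            return sample_offset + (uint64_t)
                ((target_seconds - seconds_passed) * links[i].sample_rate);
        seconds_passed += link_duration;
        sample_offset += links[i].total_samples;
    }
    return sample_offset;
}

The absolute sample number could then be handed to a sample-accurate
seek, which is how I imagine the indexing and timestamp options fitting
together.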
Timothy B. Terriberry
2024-Sep-03 12:52 UTC
[flac-dev] Feedback on implementation of decoding of chained streams
Martijn van Beurden wrote:
> As far as I know (please correct me if I am wrong), neither libopusfile
> nor libvorbisfile provides this functionality either. In neither could I
> find a function that returns the total number of samples over all links.
> Opus has a fixed sample rate so it cannot change, but libvorbisfile
> doesn't provide a way to query any data about the links it might have
> stored.

For opusfile:
https://opus-codec.org/docs/opusfile_api-0.12/group__stream__info.html#ga8c228c3d95f2c903ad6cfd2b78d8dad6

ogg_int64_t op_pcm_total(const OggOpusFile *_of, int _li)

_li: The index of the link whose PCM length should be computed. Use a
negative number to get the PCM length of the entire stream.

"The entire stream" here includes all links in a chained stream.

https://xiph.org/vorbis/doc/vorbisfile/ov_pcm_total.html does the same
thing for vorbisfile (the opusfile API is pretty much directly adapted
from vorbisfile's).

vorbisfile also includes an ov_time_total():
https://xiph.org/vorbis/doc/vorbisfile/ov_time_total.html

As well as an ov_time_seek():
https://xiph.org/vorbis/doc/vorbisfile/ov_time_seek.html

These were redundant and thus not implemented separately in opusfile
because, as you point out, the opusfile API decodes at a fixed sample
rate, so samples are the same thing as time with a particular choice of
unit. However, I think they are frequently used by vorbisfile users.

https://xiph.org/vorbis/doc/vorbisfile/ov_info.html also lets you query
the data from the info header of each link, with
https://xiph.org/vorbis/doc/vorbisfile/ov_streams.html to tell you how
many there are. This includes the sample rate, if you really wanted to
do your own time calculations, but also includes important information
like the channel count. It is impossible to interpret the decoded audio
correctly without these.

op_head()
<https://opus-codec.org/docs/opusfile_api-0.12/group__stream__info.html#gabae95dfa8a278a305213332e295443bb>
and op_link_count()
<https://opus-codec.org/docs/opusfile_api-0.12/group__stream__info.html#gaaf6ff40725a8bc7e73c9d396ab91837d>
do the same for opusfile.

> I'm not sure which of these would be most useful, and I am reluctant to
> implement them all.
>
> Please let me know what you think.

It is a bit of a challenge because of the need to support unseekable
streams (for, e.g., internet radio, your original use case), but also
not to make chained streams _too_ much more of a burden to support than
a stream with a single link in the seekable case. The vorbisfile API at
least has a long history of use in a lot of applications, so it should
not be too terrible a guide.

I do not think seeking to a link by serial number is that useful (how
does the application know the serial number in advance?). Seeking to a
link by "number" is probably not that common either, and it still needs
a way to know the total number of links.

Being able to seek to specific samples and to specific times in the
entire stream are both useful, and both need a way to know the total
number of samples or the total duration.

Being able to enumerate the information about all of the links is also
quite useful. For example, an application may wish to know whether the
sample rate and channel count change across the links of a chained file,
for the purpose of generating a RIFF header or similar.
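
Roughly, an application using the vorbisfile API for all of the above
would look something like the following. This is an untested sketch to
show the shape of the calls, not code from any real player:

#include <stdio.h>
#include <vorbis/vorbisfile.h>

int main(int argc, char **argv)
{
    OggVorbis_File vf;
    long nlinks, li;
    if (argc < 2 || ov_fopen(argv[1], &vf) < 0) return 1;
    /* Total length over all links (a negative link index means
     * "the entire stream"). */
    printf("total: %.3f s (%lld samples)\n",
           ov_time_total(&vf, -1), (long long)ov_pcm_total(&vf, -1));
    /* Enumerate the parameters of each link. */
    nlinks = ov_streams(&vf);
    for (li = 0; li < nlinks; li++) {
        vorbis_info *vi = ov_info(&vf, (int)li);
        printf("link %ld: %ld Hz, %d channels, %.3f s\n",
               li, vi->rate, vi->channels, ov_time_total(&vf, (int)li));
    }
    /* Seek the whole chained stream to a time in seconds. */
    ov_time_seek(&vf, 42.5);
    ov_clear(&vf);
    return 0;
}

Something equivalent to this in a chained-FLAC API would, I think, cover
the scrubber case discussed above.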
Probably the most difficult piece to implement is doing link enumeration
in an efficient way. You want to be doing some sort of bisection search
to locate link boundaries, re-using previous results for subsequent
links if possible.

Some thought went into doing this efficiently in opusfile. See
op_bisect_forward_serialno()
<https://gitlab.xiph.org/xiph/opusfile/-/blob/master/src/opusfile.c?ref_type=heads#L1104>
for details of the current approach (or maybe even start at
<https://gitlab.xiph.org/xiph/opusfile/-/blob/master/src/opusfile.c?ref_type=heads#L1391>
to see how the call to that function is set up).

For testing during development, I often used a file gmaxwell created
that is 2.6 GB, with 30 links containing over 26 hours of audio,
accessed over https. It can still take several seconds to open the file
under those conditions, but it is many orders of magnitude faster than
trying to scan through the whole thing, and I imagine that is only more
true with the higher bitrate of FLAC files.

Output from opusfile's seeking_example:

Opened file containing 30 links with 194 seeks (6.467 per link).
Loaded (240.592 kbps average).
Testing raw seeking to random places in 2831820229 bytes...
Total seek operations: 1000 (1.000 per raw seek, 1 maximum).
Testing exact PCM seeking to random places in 4519756545 samples (1d02h09m21.595s)...
Total seek operations: 1873 (1.873 per exact seek, 4 maximum).
OK.

A more typical example (~4 hours captured from an actual internet radio
stream) looks like:

Opened file containing 75 links with 425 seeks (5.667 per link).
Loaded (98.551 kbps average).
Testing raw seeking to random places in 169127944 bytes...
Total seek operations: 1000 (1.000 per raw seek, 1 maximum).
Testing exact PCM seeking to random places in 658998312 samples (3h48m49.132s)...
Total seek operations: 1001 (1.001 per exact seek, 2 maximum).
OK.
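
Stripped of all the caching and page-capture details, the core of such a
boundary search is just a bisection over byte offsets. The sketch below
is purely illustrative; the probe callback and the names are made up,
and this is not how opusfile is actually structured:

#include <stdint.h>

/* Hypothetical probe: examine the stream at or just after 'offset' and
 * report the index of the link the data there belongs to (for Ogg this
 * would mean capturing a page and matching its serial number). */
typedef int (*probe_link_fn)(void *stream, int64_t offset);

/* Find the byte offset at which the link after 'target' begins.
 * Preconditions: probe(stream, lo) <= target and probe(stream, hi) >
 * target (or hi is the end of the file). Each iteration halves the
 * interval, so the number of probes is logarithmic in the file size
 * rather than linear, and boundaries found earlier can seed the
 * search for later links. */
static int64_t
find_link_boundary(void *stream, probe_link_fn probe,
                   int target, int64_t lo, int64_t hi)
{
    while (hi - lo > 1) {
        int64_t mid = lo + (hi - lo) / 2;
        if (probe(stream, mid) <= target)
            lo = mid;   /* still inside link 'target' (or earlier) */
        else
            hi = mid;   /* already past the boundary */
    }
    return hi;
}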