Stefan Oltmanns
2024-Oct-13 22:09 UTC
[flac-dev] C API: How to get a seektable for very long files?
I think there is another major issue for me: in METADATA_BLOCK_STREAMINFO
the field for the length is only 36 bits. That is not even half an hour
at a 40 MHz sample rate, with the result that the encoder sets it to 0
for longer captures. In the seekpoint the sample number is 64 bits, which
is more than enough.
But how does the decoder handle the seektable when the total number of
samples is unknown? Or does the seektable override the info from
METADATA_BLOCK_STREAMINFO?

I have now used these functions to add seekpoints, but all of them remain
placeholders according to metaflac:

FLAC__metadata_object_new
FLAC__metadata_object_seektable_template_append_placeholders
FLAC__stream_encoder_set_metadata
(encoder init & loop)
FLAC__metadata_object_seektable_template_sort

Best regards
Stefan

On 13.10.24 at 22:33, Stefan Oltmanns wrote:
> Hi Martijn,
>
> On 13.10.24 at 21:00, Martijn van Beurden wrote:
>>
>> There's actually quite a lot of documentation for this.
>>
>> Please review
>> https://xiph.org/flac/api/group__flac__stream__encoder.html#ga80d57f9069e354cbf1a15a3e3ad9ca78
>>
>> I quote:
>>> SEEKTABLE blocks are handled specially. Since you will not know the
>>> values for the seek point stream offsets, you should pass in a
>>> SEEKTABLE 'template', that is, a SEEKTABLE object with the required
>>> sample numbers (or placeholder points), with 0 for the frame_samples
>>> and stream_offset fields for each point. If the client has specified
>>> that it supports seeking by providing a seek callback to
>>> FLAC__stream_encoder_init_stream() or both seek AND read callback to
>>> FLAC__stream_encoder_init_ogg_stream() (or by using
>>> FLAC__stream_encoder_init*_file() or
>>> FLAC__stream_encoder_init*_FILE()), then while it is encoding the
>>> encoder will fill the stream offsets in for you and when encoding is
>>> finished, it will seek back and write the real values into the
>>> SEEKTABLE block in the stream. There are helper routines for
>>> manipulating seektable template blocks; see metadata.h:
>>> FLAC__metadata_object_seektable_template_*(). If the client does not
>>> support seeking, the SEEKTABLE will have inaccurate offsets which will
>>> slow down or remove the ability to seek in the FLAC stream.
>>
>>
>> Also, take a look at this:
>> https://xiph.org/flac/api/group__flac__metadata__object.html#gab91c8b020a1da37d7524051ae82328cb
>>
>> Hope that helps.
>
> Thanks, I was looking in the wrong place (the encoder documentation,
> not the metadata documentation).
>
> Is the seektable written at the beginning of the file in the metadata
> block or can there also be a second metadata block at the end?
>
> If it's written at the end, could I just call
> FLAC__metadata_object_seektable_template_append_point() in the encoding
> loop? Should the sample already exist at that point or should the
> seekpoint be appended before that data is passed to the encoder?
>
> If it's at the beginning, would it be possible to reserve space for N
> seek points and during encoding remember a seek point after X samples,
> resulting in M seek points when encoding is finished? If M <= N, all
> seek points are written, otherwise only every 2nd, 3rd etc.
> Is it possible to do that? The functions all expect a total_samples
> argument, which is not known at the beginning.
>
>>
>> Also, as I'm always extra curious when FLAC is used for non-audio
>> purposes: could you perhaps say a little bit about what kind of
>> signals you're compressing?
>
> The signal is the FM-modulated video signal of video tapes (like VHS).
> The idea is to capture the signal directly from the video head amplifier
> in the VCR and later demodulate/decode it in software, providing higher
> quality than traditional capture of analog video. See this project:
> https://github.com/oyvindln/vhs-decode/
> I started to design a capture device, as there is no 40 MHz continuous
> sampling hardware available at consumer prices:
> https://github.com/Stefan-Olt/MISRC
>
> Best regards
> Stefan Oltmanns
>
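
As a rough check of the 36-bit limit described at the top of this message, here is a minimal sketch in C. The function and macro names are hypothetical and not part of libFLAC; the code only does the arithmetic for a 36-bit total-samples field and assumes the rate passed in is the real capture rate.

```c
#include <assert.h>
#include <stdint.h>

/* STREAMINFO stores the total sample count in a 36-bit field. */
#define STREAMINFO_TOTAL_SAMPLES_BITS 36

/* Largest sample count the 36-bit field can represent (hypothetical helper). */
static uint64_t streaminfo_max_total_samples(void)
{
    return (UINT64_C(1) << STREAMINFO_TOTAL_SAMPLES_BITS) - 1;
}

/* Whole seconds of signal that fit in the field at the given capture rate. */
static uint64_t streaminfo_max_seconds(uint32_t sample_rate_hz)
{
    assert(sample_rate_hz > 0);
    return streaminfo_max_total_samples() / sample_rate_hz;
}
```

Calling streaminfo_max_seconds(40000000u) yields 1717 whole seconds, roughly 28.6 minutes, which is consistent with "not even half an hour".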
Martijn van Beurden
2024-Oct-14 07:11 UTC
[flac-dev] C API: How to get a seektable for very long files?
On Sun, 13 Oct 2024 at 22:33, Stefan Oltmanns <stefan-oltmanns at gmx.net> wrote:
>
> Is the seektable written at the beginning of the file in the metadata
> block or can there also be a second metadata block at the end?
>

Only at the start of the file.

>
> If it's at the beginning, would it be possible to reserve space for N
> seek points and during encoding remember a seek point after X samples,
> resulting in M seek points when encoding is finished? If M <= N, all
> seek points are written, otherwise only every 2nd, 3rd etc.
> Is it possible to do that? The functions all expect a total_samples
> argument, which is not known at the beginning.
>

No, you can only provide a seek table template with specific sample
numbers. There is no way to ask the encoder to add a seek point every so
many samples.

>
> The signal is the FM-modulated video signal of video tapes (like VHS).
> The idea is to capture the signal directly from the video head amplifier
> in the VCR and later demodulate/decode it in software, providing higher
> quality than traditional capture of analog video. See this project:
> https://github.com/oyvindln/vhs-decode/
> I started to design a capture device, as there is no 40 MHz continuous
> sampling hardware available at consumer prices:
> https://github.com/Stefan-Olt/MISRC
>

I've seen similar uses before. Maybe this one can serve as some
inspiration: https://www.youtube.com/watch?v=ZrEFU22C8l8
According to that guy, he used cheap hardware.

On Mon, 14 Oct 2024 at 00:09, Stefan Oltmanns <stefan-oltmanns at gmx.net> wrote:
>
> I think there is another major issue for me: in
> METADATA_BLOCK_STREAMINFO the field for the length is only 36 bits.
> That is not even half an hour at a 40 MHz sample rate, with the result
> that the encoder sets it to 0 for longer captures. In the seekpoint the
> sample number is 64 bits, which is more than enough.
> But how does the decoder handle the seektable when the total number of
> samples is unknown? Or does the seektable override the info from
> METADATA_BLOCK_STREAMINFO?

When a suitable seektable is found, it overrides the information from
streaminfo, yes.

>
> I have now used these functions to add seekpoints, but all of them
> remain placeholders according to metaflac:
>
> FLAC__metadata_object_new
> FLAC__metadata_object_seektable_template_append_placeholders
> FLAC__stream_encoder_set_metadata
> (encoder init & loop)
> FLAC__metadata_object_seektable_template_sort
>

Yes, that is correct, because you asked for placeholder points. You
should ask for spaced points.

I just tested what happens if you make a seek table template with a
total_samples that is bigger than the total_samples eventually encoded,
and found a bug in the encoder. It works, but the resulting seek table
isn't valid. You could try that approach anyway in the meantime; perhaps
it works just fine?

I'd say, have your implementation prepare a seek table template for 5
hours of recording (I assume that is above the upper bound of such
captures?). The stream encoder will fill in those seek points when it
reaches them and leave the unused ones unfilled. I will work on a fix,
so that in the future the stream encoder converts those unused points to
placeholder points.

The stream encoder works quite simply: you give it metadata to add to the
start of the stream, and it adds that metadata verbatim. After encoding is
finished, it will update streaminfo and seektable. There is no metadata
at the end of the stream, and the metadata blocks should not be changed
during encoding.

I see the difficulty here now, by the way: metaflac also refuses to write
a seektable when the streaminfo metadata block specifies 0 total samples,
which is unavoidable in your case.
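
To give a feel for what a 5-hour seek table template would involve, here is a minimal sketch in C that only computes the sample numbers and the space the template would reserve. It does not call libFLAC; the helper names and the 10-second spacing are assumptions for illustration. In a real encoder the computed sample numbers would be handed to the FLAC__metadata_object_seektable_template_*() helpers from metadata.h.

```c
#include <assert.h>
#include <stdint.h>

/* Each seek point occupies 18 bytes in a SEEKTABLE block:
   64-bit sample number, 64-bit stream offset, 16-bit frame sample count. */
#define SEEKPOINT_BYTES 18

/* Number of points at sample 0, spacing, 2*spacing, ... below total_samples. */
static uint64_t template_point_count(uint64_t total_samples, uint64_t spacing)
{
    assert(spacing > 0);
    return (total_samples + spacing - 1) / spacing;
}

/* Sample number the i-th template point would request (i starts at 0). */
static uint64_t template_point_sample(uint64_t i, uint64_t spacing)
{
    return i * spacing;
}

/* Bytes of seek points the template reserves at the start of the file. */
static uint64_t template_bytes(uint64_t point_count)
{
    return point_count * SEEKPOINT_BYTES;
}
```

For 5 hours at 40 MHz (720,000,000,000 samples) and an assumed spacing of 10 seconds (400,000,000 samples), this gives 1800 points and 32,400 bytes of seek points, so reserving a generous upper bound costs very little space.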
brianw
2024-Oct-14 17:14 UTC
[flac-dev] C API: How to get a seektable for very long files?
Many hardware and software developers allow the user to set a maximum
file size. When a live recording reaches this limit, a new file is
created to continue the recording without losing a single sample. Note
that many software programs allow multiple files to be concatenated
without any gap.

I have been able to use these techniques reliably for years to make
extended recordings that last several hours.

Typical maximum sizes are 4 GB, 2 GB, or even 1 GB. The 4 GB and 2 GB
limits come from old file seek APIs that have those limits. The 1 GB
option is probably just to make files more manageable, or to allow a
compressed file to be expanded and still fit within 2 GB.

Although this standard practice does not directly solve your problems, I
think it would help if you avoided creating large files.

You mention that the 36-bit field limits you to 30 minutes, but what
file size would that correspond to? Just limit your file to that size,
use the full 36-bit field, and then continue in the next file with
gapless, sample-accurate recording.

Brian Willoughby

On Oct 13, 2024, at 3:09 PM, Stefan Oltmanns wrote:
> I think there is another major issue for me: in
> METADATA_BLOCK_STREAMINFO the field for the length is only 36 bits.
> That is not even half an hour at a 40 MHz sample rate, with the result
> that the encoder sets it to 0 for longer captures. In the seekpoint the
> sample number is 64 bits, which is more than enough.
> But how does the decoder handle the seektable when the total number of
> samples is unknown? Or does the seektable override the info from
> METADATA_BLOCK_STREAMINFO?
>
> I have now used these functions to add seekpoints, but all of them
> remain placeholders according to metaflac:
>
> FLAC__metadata_object_new
> FLAC__metadata_object_seektable_template_append_placeholders
> FLAC__stream_encoder_set_metadata
> (encoder init & loop)
> FLAC__metadata_object_seektable_template_sort
>
> Best regards
> Stefan
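
The file-splitting approach suggested above can be sketched in C as follows. This is only an illustration of the bookkeeping, under the assumption that each file holds a fixed number of samples no larger than the 36-bit limit; the type and function names are hypothetical and nothing here calls libFLAC.

```c
#include <assert.h>
#include <stdint.h>

/* Largest total-sample count that fits the 36-bit STREAMINFO field. */
#define MAX_SAMPLES_PER_FILE ((UINT64_C(1) << 36) - 1)

typedef struct {
    uint64_t file_index;   /* which file in the sequence, starting at 0 */
    uint64_t local_sample; /* sample position inside that file */
} segment_pos;

/* Map a sample position in the whole capture to a file and an offset. */
static segment_pos locate_sample(uint64_t global_sample, uint64_t samples_per_file)
{
    segment_pos pos;
    assert(samples_per_file > 0 && samples_per_file <= MAX_SAMPLES_PER_FILE);
    pos.file_index = global_sample / samples_per_file;
    pos.local_sample = global_sample % samples_per_file;
    return pos;
}

/* Number of files needed to hold total_samples without exceeding the limit. */
static uint64_t files_needed(uint64_t total_samples, uint64_t samples_per_file)
{
    assert(samples_per_file > 0 && samples_per_file <= MAX_SAMPLES_PER_FILE);
    return (total_samples + samples_per_file - 1) / samples_per_file;
}
```

With an assumed 25 minutes per file at 40 MHz (60,000,000,000 samples, which is below the 36-bit limit), a 5-hour capture needs 12 files, and every file can carry a valid total-samples value in its streaminfo block.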