Josh Coalson wrote:
> yeah, flac doesn't have a 'gzip' fallback method
> so any non-audio data will probably get stored
> verbatim. I'm kind of reluctant to add a generic
> compressor. If you want, you could come up with a
> FLAC metadata block to store a gzip'ed chunk and I
> could add that to the format.

I had the same thought when I was looking over the FLAC format and saw the metadata stuff. Is there anything that needs to be added in order to take advantage of this? Why not just a binary metadata block that has some sort of identifier (a string maybe)? That would allow arbitrary data to be stored.

My other option (which I was originally thinking of) would be to use FLAC on the audio stuff and zlib for the non-audio data, then just write my own custom file format, which would basically just have a simple header with the two compressed blocks appended onto it.

I'm not sure what would make the most sense. Should FLAC try to incorporate compression of sound fonts? This might not be that hard actually. If it had the capability of concatenating gzip/bzip compressed blocks with FLAC compressed blocks, it would not really need to know too much of the internals of a sound font (nothing on the decoding side), just where the sample data begins and ends, and I guess perhaps on a per-sample basis when variable block length compression is available.

IIRC FLAC treats WAV files specially and doesn't store some types of WAV information blocks. If arbitrary gzip/bzip compressed blocks could be inserted into the stream, then any portion of a WAV file that isn't audio could be retained. Does this fit into FLAC's purpose?

> the best thing would be to try and set the blocksize
> to match the length of the individual 'sample'. if
> each sample is much shorter than the blocksize then
> the encoder may not be able to generate an efficient
> model of the signal. if the samples within the
> soundfont vary greatly in length that also makes
> it harder because right now flac only supports a
> fixed blocksize (even though the format allows for a
> varying blocksize).

I noticed that the max block size is 65535. Sample sizes in sound fonts can be of fairly arbitrary length (32-bit length), and many would exceed 64k, although many would be much smaller too. So breaking them into smaller blocks would be necessary for many samples. I guess varying blocksize compression is really what would be best. Any idea when this might be available?

Compression is good as things stand now, although not optimal. I compressed a 118MB sound font to 64MB with FLAC, which was only a 3MB difference from sfArk (only 170k of this is non-audio data, which would be more significant with smaller sound fonts).

Lates..
Josh Green
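For anyone following along, here is a rough sketch of the "custom format" option mentioned above: a simple header followed by the FLAC-compressed audio and the zlib-compressed non-audio data. The magic string `SFPK`, the header layout, and the function names are all made up for illustration and are not part of FLAC or any existing tool. The FLAC stream is treated as opaque bytes that some encoder has already produced.

```python
import struct
import zlib
from typing import Tuple

MAGIC = b"SFPK"  # hypothetical 4-byte identifier, not a real file format


def pack_container(flac_audio: bytes, non_audio: bytes) -> bytes:
    """Build header + FLAC block + zlib block.

    flac_audio is assumed to be an already-encoded FLAC stream; this
    sketch does not encode audio itself.
    """
    packed_meta = zlib.compress(non_audio, 9)
    # Header: magic, then the byte length of each block (big-endian 32-bit).
    header = MAGIC + struct.pack(">II", len(flac_audio), len(packed_meta))
    return header + flac_audio + packed_meta


def unpack_container(blob: bytes) -> Tuple[bytes, bytes]:
    """Return (flac_audio, non_audio) from a blob made by pack_container."""
    if blob[:4] != MAGIC:
        raise ValueError("not a container produced by pack_container")
    flac_len, meta_len = struct.unpack(">II", blob[4:12])
    flac_audio = blob[12:12 + flac_len]
    packed_meta = blob[12 + flac_len:12 + flac_len + meta_len]
    return flac_audio, zlib.decompress(packed_meta)
```

The decoding side only needs the two lengths from the header. All knowledge of where sample data starts and ends inside the sound font stays in whatever program builds the container.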
Josh Green
2004-Sep-10 16:45 UTC
[Flac-dev] FREEFORM metadata (was: Compressing sound fonts with FLAC)
Josh Coalson wrote:
> I've been thinking about this, and here's what I
> came up with. This kind of dovetails into the
> discussion Mike Wren started about the etree
> header.
>
> I was thinking about defining a FREEFORM metadata
> block which may be of arbitrary size. The only
> mandatory field would be a (say, 32-bit) id of
> the owner. In your case, you would request an
> id, then I would register you as the owner, then
> you could write as many FREEFORM blocks with your
> id as you wanted and the contents would be up to
> you. The same could be done for the etree header.
>
> Then I would have a registration section on the
> FLAC site with optional links to registered owners
> (for instance it could link to your page where you
> define the format of your block) and a registration
> form. Each owner would maintain their own block
> type and applications could support any they knew
> about and ignore the rest.
>
> What do you guys think?

Sounds fine to me. I guess this would rely on an external program to extract the special header data from the FLAC file?

I didn't quite get an answer to my question about whether the FLAC standard should directly support inserting arbitrary data blocks, which would be re-assembled with the standard audio blocks on extraction, or whether FLAC should leave this up to other programs.

Adding zlib (and bzlib) support to FLAC, and allowing traditionally compressed blocks to be re-assembled when the file is extracted, would allow direct support of probably a lot of file formats that contain audio but also contain other fluff. Only the compressor would need to know something about those special formats, and the files would stay under the domain of FLAC.

Lates..
Josh Green
Josh Coalson
2004-Sep-10 16:45 UTC
[Flac-dev] FREEFORM metadata (was: Compressing sound fonts with FLAC)
> > yeah, flac doesn't have a 'gzip' fallback method
> > so any non-audio data will probably get stored
> > verbatim. I'm kind of reluctant to add a generic
> > compressor. If you want, you could come up with a
> > FLAC metadata block to store a gzip'ed chunk and I
> > could add that to the format.
>
> I had the same thought when I was looking over the
> FLAC format and saw the metadata stuff. Is there
> anything that needs to be added in order to take
> advantage of this? Why not just a binary metadata
> block that has some sort of identifier (a string
> maybe)? That would allow arbitrary data to be stored.

I've been thinking about this, and here's what I came up with. This kind of dovetails into the discussion Mike Wren started about the etree header.

I was thinking about defining a FREEFORM metadata block which may be of arbitrary size. The only mandatory field would be a (say, 32-bit) id of the owner. In your case, you would request an id, then I would register you as the owner, then you could write as many FREEFORM blocks with your id as you wanted and the contents would be up to you. The same could be done for the etree header.

Then I would have a registration section on the FLAC site with optional links to registered owners (for instance it could link to your page where you define the format of your block) and a registration form. Each owner would maintain their own block type and applications could support any they knew about and ignore the rest.

What do you guys think?

Josh
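To make the FREEFORM idea concrete, here is a minimal sketch of how such a block could be written and read, assuming it sits behind FLAC's usual metadata block header (a last-block flag, a 7-bit block type, and a 24-bit length). The block type number `100` and the owner ids are placeholders invented for the example, since no numbers have been assigned in this thread. The 24-bit length field also puts a practical cap on the "arbitrary size".

```python
import struct

FREEFORM_TYPE = 100  # hypothetical block type; the proposal assigns no number


def pack_freeform(owner_id: int, payload: bytes, is_last: bool = False) -> bytes:
    """Metadata block header + 32-bit owner id + owner-defined payload."""
    body = struct.pack(">I", owner_id) + payload
    if len(body) >= 1 << 24:
        raise ValueError("body exceeds the 24-bit length field")
    first = (0x80 if is_last else 0x00) | FREEFORM_TYPE
    return bytes([first]) + len(body).to_bytes(3, "big") + body


def read_blocks(data: bytes):
    """Yield (block_type, body) for each metadata block in data."""
    pos = 0
    while pos + 4 <= len(data):
        first = data[pos]
        length = int.from_bytes(data[pos + 1:pos + 4], "big")
        yield first & 0x7F, data[pos + 4:pos + 4 + length]
        pos += 4 + length
        if first & 0x80:  # last-metadata-block flag
            break


def payloads_for(data: bytes, owner_id: int) -> list:
    """Collect payloads of FREEFORM blocks owned by owner_id; skip the rest."""
    found = []
    for block_type, body in read_blocks(data):
        if block_type != FREEFORM_TYPE or len(body) < 4:
            continue  # unknown or malformed block: ignore it
        if struct.unpack(">I", body[:4])[0] == owner_id:
            found.append(body[4:])
    return found
```

An application that only knows its own id would call `payloads_for(stream, MY_ID)` and never has to understand anyone else's block contents, which is the "support any they knew about and ignore the rest" behaviour.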
> > the best thing would be to try and set the blocksize
> > to match the length of the individual 'sample'. if
> > each sample is much shorter than the blocksize then
> > the encoder may not be able to generate an efficient
> > model of the signal. if the samples within the
> > soundfont vary greatly in length that also makes
> > it harder because right now flac only supports a
> > fixed blocksize (even though the format allows for a
> > varying blocksize).
>
> I noticed that the max block size is 65535. Sample
> sizes in sound fonts can be of fairly arbitrary length
> (32-bit length), and many would exceed 64k, although
> many would be much smaller too. So breaking them into
> smaller blocks would be necessary for many samples.
> I guess varying blocksize compression is really what
> would be best. Any idea when this might be available?
> Compression is good as things stand now, although not
> optimal. I compressed a 118MB sound font to 64MB with
> FLAC, which was only a 3MB difference from sfArk (only
> 170k of this is non-audio data, which would be more
> significant with smaller sound fonts).

ok, in the case where each 'sample' is long (like >64k samples (sorry to mix terminology here)) the blocksize is probably not going to matter too much, since the optimal blocksize for the way FLAC models at CD audio rates is around 1k-4k samples.

unless sfArk has a much better compressor (a la Monkey's Audio), I can only think of one thing that would give them much higher rates, and that is if they take advantage of inter-sample correlation (by 'sample' I mean sample in the soundfont terminology). for example, if your soundfont was storing 88 'samples' from a piano (one for each key), an inter-sample decorrelator might model all the octaves together since it knows the structure of the font, which FLAC cannot.

Josh
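As a small illustration of the blocksize point, here is a sketch of how one soundfont 'sample' could be cut into blocks when variable blocksize is available: long samples are chunked at a blocksize in the 1k-4k range, and a sample shorter than that gets a single block of its own length. This is only index arithmetic under those assumptions. The function name and the default of 4096 are illustrative, and nothing here is taken from the flac encoder.

```python
MAX_BLOCKSIZE = 65535  # largest blocksize mentioned in the thread


def split_into_blocks(num_samples: int, blocksize: int = 4096) -> list:
    """Return (start, length) pairs covering one soundfont 'sample'.

    Every block has `blocksize` audio samples except possibly the last,
    so a short 'sample' ends up in one block matching its own length.
    """
    if not 1 <= blocksize <= MAX_BLOCKSIZE:
        raise ValueError("blocksize out of range")
    return [
        (start, min(blocksize, num_samples - start))
        for start in range(0, num_samples, blocksize)
    ]
```

Because the blocks are computed per 'sample', no block ever straddles two unrelated instruments, which is what a single fixed blocksize over the whole font cannot guarantee.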
Josh Coalson wrote:
> ok, in the case where each 'sample' is long (like
> >64k samples (sorry to mix terminology here)) the
> blocksize is probably not going to matter too much,
> since the optimal blocksize for the way FLAC models
> at CD audio rates is around 1k-4k samples.
>
> unless sfArk has a much better compressor (a la
> Monkey's Audio), I can only think of one thing
> that would give them much higher rates, and that
> is if they take advantage of inter-sample
> correlation (by 'sample' I mean sample in the
> soundfont terminology). for example, if your
> soundfont was storing 88 'samples' from a piano
> (one for each key), an inter-sample decorrelator
> might model all the octaves together since it knows
> the structure of the font, which FLAC cannot.

I was thinking about that. I wasn't thinking of correlations between samples of the same instrument, but rather of the 2 samples used for stereo purposes. I wonder how much one could compress using correlations between samples of the same instrument. Interesting.

I can see stereo correlation being kind of hard to take advantage of with FLAC though. There aren't any interleaved stereo samples in a sound font, only mono samples with panning specified at the instrument level, which is what allows for stereo. These 2 samples can be anywhere in a sound font. Supporting something like that in FLAC would mean allowing 2 different data areas to be checked for stereo correlation and then extracted back to those same two separate points in the output. Weird.

Lates..
Josh Green
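For reference, this is roughly what stereo decorrelation would look like once the two mono halves of a stereo pair have been located. It is a sketch of the usual lossless mid/side transform, the same idea FLAC applies to interleaved stereo. Finding the pair inside the sound font and putting each half back at its own offset is assumed to be the caller's job, and both halves are assumed to have the same length.

```python
def to_mid_side(left: list, right: list):
    """Turn two equal-length mono samples into mid and side signals."""
    if len(left) != len(right):
        raise ValueError("this sketch assumes both halves have equal length")
    mid = [(l + r) >> 1 for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side


def from_mid_side(mid: list, side: list):
    """Exactly invert to_mid_side."""
    left, right = [], []
    for m, s in zip(mid, side):
        # (l + r) and (l - r) share the same low bit, so the bit that the
        # shift dropped from mid can be restored from side.
        total = (m << 1) | (s & 1)
        left.append((total + s) >> 1)
        right.append((total - s) >> 1)
    return left, right
```

When the two halves are similar, the side signal stays close to zero and should be cheaper to encode than either half on its own. The transform itself is the easy part. The bookkeeping for two separate data areas is the awkward bit.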