thr3ads.net - flac dev - [Flac-dev] Compressing sound fonts with FLAC [Sep 2004]

Josh Green

2004-Sep-10 16:45 UTC

[Flac-dev] Compressing sound fonts with FLAC

Josh Coalson wrote:
> yeah, flac doesn't have a 'gzip' fallback method
> so any non-audio data will probably get stored
> verbatim.  I'm kind of reluctant to add a generic
> compressor.  If you want, you could come up with a
> FLAC metadata block to store a gzip'ed chunk and I
> could add that to the format.
> 
I had the same thought when I was looking over the FLAC format and saw
the META data stuff. Is there anything that needs to be added in order
to take advantage of this? Why not just a binary META data block that
has some sort of identifier (a string maybe). That would allow arbitrary
data to be stored.
My other option (which I was originally thinking of) would be to use
FLAC on the audio stuff, and zlib for the non-audio data, then just
write my own custom format file which would basically just have a simple
header and the two compressed blocks appended onto it.
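
A minimal sketch of that custom-container idea, assuming the audio has already been encoded to FLAC by some other tool and arrives as opaque bytes. The magic value `SFC0`, the header layout, and the function names are hypothetical and chosen only for illustration; nothing here is part of FLAC or of any existing sound font tool.

```python
import struct
import zlib

# Hypothetical header: 4-byte magic, then two unsigned 32-bit lengths
# (FLAC-compressed audio, zlib-compressed non-audio data), little-endian.
MAGIC = b"SFC0"
HEADER = struct.Struct("<4sII")


def pack_container(flac_audio, non_audio):
    """Return header + FLAC block + zlib block as one byte string."""
    packed_meta = zlib.compress(non_audio, 9)
    header = HEADER.pack(MAGIC, len(flac_audio), len(packed_meta))
    return header + flac_audio + packed_meta


def unpack_container(blob):
    """Return (flac_audio, non_audio) recovered from a packed container."""
    magic, audio_len, meta_len = HEADER.unpack_from(blob, 0)
    if magic != MAGIC:
        raise ValueError("not a recognised container")
    start = HEADER.size
    flac_audio = blob[start:start + audio_len]
    packed_meta = blob[start + audio_len:start + audio_len + meta_len]
    return flac_audio, zlib.decompress(packed_meta)
```

The FLAC bytes are stored untouched, so a decoder for such a file would still need a FLAC library to turn the first block back into sample data.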
I'm not sure what would make the most sense. Should FLAC try to
incorporate compression of sound fonts? This might not be that hard
actually. If it could concatenate gzip/bzip compressed blocks with FLAC
compressed blocks, it would not need to know much about the internals
of a sound font (and nothing on the decoding side), just where the
sample data begins and ends, and perhaps where each individual sample
begins and ends once variable block length compression is available.
iirc FLAC treats WAV files specially, and doesn't store some types of
WAV information blocks. If arbitrary gzip/bzip compressed blocks could
be inserted into the stream then any portion of a WAV file that isn't
audio could be retained. Does this fit into FLAC's purpose?
> >
> the best thing would be to try and set the blocksize
> to match the length of the individual 'sample'.  if
> each sample is much shorter than the blocksize then
> the encoder may not be able to generate an efficient
> model of the signal.  if the samples within the
> soundfont vary greatly in length that also makes
> it harder because right now flac only supports a
> fixed blocksize (even though the format allows for a
> varying blocksize).
> 
I noticed that the max block size is 65535. Sample sizes in sound fonts
can be of fairly arbitrary length (32 bit length), and many would exceed
64k, although many would be much smaller too. So breaking them into
smaller blocks would be necessary for many samples. I guess varying
blocksize compression is really what would be the best. Any ideas of
when this might be available? Compression is good as things stand now,
although not optimal. I compressed a 118MB sound font to 64MB with FLAC
which was only 3MB difference to sfArk (only 170k of this is non-audio
data which would be more significant with smaller sound fonts). Lates..
	Josh Green

Josh Green

2004-Sep-10 16:45 UTC

[Flac-dev] FREEFORM metadata (was: Compressing sound fonts with FLAC)

Josh Coalson wrote:
> I've been thinking about this, and here's what I
> came up with.  This kind of dovetails into the
> discussion Mike Wren started about the etree
> header.
> 
> I was thinking about defining a FREEFORM metadata
> block which may be of arbitrary size.  The only
> mandatory field would be a (say, 32-bit) id of
> the owner.  In your case, you would request an
> id, then I would register you as the owner, then
> you could write as many FREEFORM blocks with your
> id as you wanted and the contents would be up to
> you.  The same could be done for the etree header.
> 
> Then I would have a registration section on the
> FLAC site with optional links to registered owners
> (for instance it could link to your page where you
> define the format of your block) and a registration
> form.  Each owner would maintain their own block
> type and applications could support any they knew
> about and ignore the rest.
> 
> What do you guys think?
> 
Sounds fine to me. I guess this would rely on an external program to
extract the special header data from the FLAC file?
I didn't quite get the answer to my question concerning whether the
design of the FLAC standard should directly support inserting arbitrary
data blocks which would be re-assembled with standard audio blocks on
extract, or should FLAC leave this up to other programs?
Adding zlib (bzlib too) support to FLAC and allowing for traditionally
compressed blocks to be re-assembled when the file is extracted, would
allow direct support of probably a lot of file formats that contain
audio, but also contain other fluff. Only the compressor would need to
know something about those special formats, and the files would stay
under the domain of FLAC. Lates..
	Josh Green

Josh Coalson

2004-Sep-10 16:45 UTC

[Flac-dev] FREEFORM metadata (was: Compressing sound fonts with FLAC)

> > yeah, flac doesn't have a 'gzip' fallback method
> > so any non-audio data will probably get stored
> > verbatim.  I'm kind of reluctant to add a generic
> > compressor.  If you want, you could come up with a
> > FLAC metadata block to store a gzip'ed chunk and I
> > could add that to the format.
> >
>
> I had the same thought when I was looking over the FLAC format and saw
> the META data stuff. Is there anything that needs to be added in order
> to take advantage of this? Why not just a binary META data block that
> has some sort of identifier (a string maybe). That would allow arbitrary
> data to be stored.

I've been thinking about this, and here's what I
came up with.  This kind of dovetails into the
discussion Mike Wren started about the etree
header.

I was thinking about defining a FREEFORM metadata
block which may be of arbitrary size.  The only
mandatory field would be a (say, 32-bit) id of
the owner.  In your case, you would request an
id, then I would register you as the owner, then
you could write as many FREEFORM blocks with your
id as you wanted and the contents would be up to
you.  The same could be done for the etree header.

Then I would have a registration section on the
FLAC site with optional links to registered owners
(for instance it could link to your page where you
define the format of your block) and a registration
form.  Each owner would maintain their own block
type and applications could support any they knew
about and ignore the rest.

What do you guys think?

Josh
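
A rough sketch of what reading and writing the body of such a FREEFORM block could look like, assuming the only fixed field is the 32-bit owner id followed by owner-defined data. The big-endian byte order, the function names, and the example id are assumptions made for illustration; the proposal above does not specify them.

```python
import struct


def pack_freeform(owner_id, payload):
    """Build a block body: 32-bit owner id followed by opaque payload bytes."""
    if not 0 <= owner_id <= 0xFFFFFFFF:
        raise ValueError("owner id must fit in 32 bits")
    return struct.pack(">I", owner_id) + payload


def unpack_freeform(block, known_owners):
    """Return (owner_id, payload), or None when the owner is not recognised."""
    if len(block) < 4:
        raise ValueError("block too short to hold an owner id")
    (owner_id,) = struct.unpack_from(">I", block, 0)
    if owner_id not in known_owners:
        return None  # applications ignore block types they do not know about
    return owner_id, block[4:]
```

An application would pass in the set of ids it supports, for example `unpack_freeform(block, {0x53464E54})` with a made-up id, and skip any block for which the result is `None`.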


Josh Coalson

2004-Sep-10 16:45 UTC

[Flac-dev] Compressing sound fonts with FLAC

> > the best thing would be to try and set the blocksize
> > to match the length of the individual 'sample'.  if
> > each sample is much shorter than the blocksize then
> > the encoder may not be able to generate an efficient
> > model of the signal.  if the samples within the
> > soundfont vary greatly in length that also makes
> > it harder because right now flac only supports a
> > fixed blocksize (even though the format allows for a
> > varying blocksize).
> >
>
> I noticed that the max block size is 65535. Sample sizes in sound fonts
> can be of fairly arbitrary length (32 bit length), and many would exceed
> 64k, although many would be much smaller too. So breaking them into
> smaller blocks would be necessary for many samples. I guess varying
> blocksize compression is really what would be the best. Any ideas of
> when this might be available? Compression is good as things stand now,
> although not optimal. I compressed a 118MB sound font to 64MB with FLAC
> which was only 3MB difference to sfArk (only 170k of this is non-audio
> data which would be more significant with smaller sound fonts).

ok, in the case where each 'sample' is long (like >64k
samples (sorry to mix terminology here)) the blocksize
is probably not going to matter too much, since the
optimal blocksize for the way FLAC models at CD audio
rates is around 1k-4k samples.
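
To make the arithmetic concrete, here is a small sketch of cutting one long sound font 'sample' into blocks of a chosen size, with a shorter final block for the remainder. The default target of 4096 is just one value from the 1k-4k range mentioned above, and the function name is hypothetical; this only computes block lengths and does not call any FLAC encoder.

```python
MAX_BLOCKSIZE = 65535  # maximum block size quoted earlier in the thread


def split_into_blocks(sample_length, target=4096):
    """Return the list of block lengths covering sample_length audio samples."""
    if sample_length < 0:
        raise ValueError("sample_length must be non-negative")
    if not 1 <= target <= MAX_BLOCKSIZE:
        raise ValueError("target blocksize out of range")
    full_blocks, remainder = divmod(sample_length, target)
    blocks = [target] * full_blocks
    if remainder:
        blocks.append(remainder)  # shorter final block ends at the sample boundary
    return blocks
```

Ending the last block exactly at the sample boundary is what would need variable blocksize support in the encoder; with a fixed blocksize the boundary between two sound font samples generally falls in the middle of a block.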

unless sfArk has a much better compressor (ala
Monkey's audio), I can only think of one thing
that would give them much higher rates, and that
is if they take advantage of inter-sample
correlation (by 'sample' I mean sample in the
soundfont terminology).  for example, if your
soundfont was storing 88 'samples' from a piano
(one for each key), an inter-sample decorrelator
might model all the octaves together since it knows
the structure of the font, which FLAC cannot.

Josh


Josh Green

2004-Sep-10 16:45 UTC

[Flac-dev] Compressing sound fonts with FLAC

Josh Coalson wrote:
>
> ok, in the case where each 'sample' is long (like >64k
> samples (sorry to mix terminology here)) the blocksize
> is probably not going to matter too much, since the
> optimal blocksize for the way FLAC models at CD audio
> rates is around 1k-4k samples.
> 
> unless sfArk has a much better compressor (ala
> Monkey's audio), I can only think of one thing
> that would give them much higher rates, and that
> is if they take advantage of inter-sample
> correlation (by 'sample' I mean sample in the
> soundfont terminology).  for example, if your
> soundfont was storing 88 'samples' from a piano
> (one for each key), an inter-sample decorrelator
> might model all the octaves together since it knows
> the structure of the font, which FLAC cannot.
> 
I was thinking about that. I wasn't thinking of correlations between
samples of the same instrument, but rather between the 2 samples used
for stereo purposes. I wonder how much one could compress using
correlations between samples of the same instrument. Interesting.

I can see stereo correlation being kind of hard to take advantage of
with FLAC though. There aren't any interleaved stereo samples in a sound
font, only mono samples with panning specified at the instrument level
to allow for stereo. These 2 samples can be anywhere in a sound font.

Supporting something like that in FLAC would mean allowing 2 different
data areas to be checked for stereo correlation and then extracted to
2 different points in the output. Weird. Lates..
	Josh Green
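
For illustration, this is roughly what a lossless mid/side transform over two separate mono samples could look like, assuming a tool had already located the left and right halves of a stereo pair in the sound font and that both have the same length. It is a generic integer mid/side sketch with hypothetical function names, not FLAC's encoder code.

```python
def to_mid_side(left, right):
    """Convert two equal-length integer sample lists into mid and side lists."""
    if len(left) != len(right):
        raise ValueError("left and right must have the same length")
    mid = [(l + r) >> 1 for l, r in zip(left, right)]  # floor of the average
    side = [l - r for l, r in zip(left, right)]
    return mid, side


def from_mid_side(mid, side):
    """Exactly invert to_mid_side, recovering the original left and right."""
    left, right = [], []
    for m, s in zip(mid, side):
        # The bit dropped by the shift equals the low bit of the side value,
        # because l + r and l - r always have the same parity.
        total = (m << 1) | (s & 1)
        left.append((total + s) >> 1)
        right.append((total - s) >> 1)
    return left, right
```

When the two channels are similar, the side values stay small and compress well. The decoder would then have to write the two recovered samples back to their separate positions in the sound font, which is the awkward part described above.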
