Hello. I thought I'd introduce myself. I'm the guy who wrote kexis way back when. For those of you who don't know, kexis was written before flac, and pretty much does what flac does, although it's not nearly as nice or robust. I actually wrote kexis because I was interested in playing around with lossless compression, and shorten was about the only tool that was Linux friendly, but it had a horrible license. So kexis was hacked together for my own personal use.

Anyway, I've got that lossless coding itch again, and I pulled out some of my old notes on things I was going to play around with adding to kexis. The version of kexis that's on SourceForge was totally rewritten later for speed, but I never released it. At this point it's kind of silly to release it, since flac is just as fast now, plus it's much more robust (by the way, nice job on it).

The reason for this post is that I would like to know whether, if I start coding again and releasing patches against flac in various areas, they will be accepted. I'm not sure if these additions fit the "focus" of what flac is meant to be, so I thought I'd discuss them first. Keep in mind that I just subscribed to this list, so if these have already been gone over, just slap me with a tuna :)

I believe the reason flac is behind some of the closed-source encoders is twofold: blocking and arithmetic coding.

Blocking: If the data set is local (not streaming) and we have the opportunity to make a couple of passes over the data, it's possible to determine non-fixed block sizes that fit the orders of prediction closely. I'm sure this is no surprise to any of you. Actually coming up with an algorithm to do this is not as easy as it sounds. However, I believe a few more percent can be squeezed out this way just by fitting LPs. Then again, maybe not; it depends on whether there are sharp shifts in the data set. If there are, finding these "edges" could help, because you won't have a block that encodes well and then goes all to hell because the predictor is missing by large numbers and takes a while to readjust. The downside is not very large: scanning a data set and building a table of block offsets would not add much overhead to the encoding process, since it's a lightweight operation (assuming I/O is moderately fast). A rough sketch of the kind of search I have in mind is at the end of this message.

Arithmetic coding: Rice's granularity for encoding is 1 bit (obviously). However, there are techniques out there to encode fractions of bits. Well, you can only emit whole bits, but by doing a bit of magic you get the net effect of being able to represent fractions of bits. Over a large data set, this can save hundreds of thousands of bits. I think this is where some of the closed-source encoders make their large gains. The downside is that arithmetic coding takes more horsepower. Looking at some of the other encoders, any of them that take a lot of time to encode are almost certainly employing this technique. It might not be suitable for streaming, but it would be a very nice option for archival purposes.

That's my 2 cents.

Wayde Milas
Rarcoa
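Here's a rough sketch of the blocking idea (hypothetical, untested code, not from kexis or flac; the candidate block sizes, the order-2 fixed predictor, and all the names are made up for illustration, and it assumes 16-bit samples so the residuals fit in 32 bits). A first pass tries a few block sizes at each offset, estimates the Rice-coded cost of a cheap fixed predictor's residual for each, and keeps the split with the fewest bits per sample:

#include <stdint.h>
#include <stdlib.h>

/* Estimate the bits needed to Rice-code the residuals with the best
   parameter k in 0..14 (unary quotient + stop bit + k low bits). */
static uint64_t rice_cost(const int32_t *res, size_t n)
{
    uint64_t best = (uint64_t)-1;
    for (unsigned k = 0; k <= 14; k++) {
        uint64_t bits = 0;
        for (size_t i = 0; i < n; i++) {
            /* zigzag-map the signed residual to an unsigned value */
            uint32_t u = res[i] >= 0 ? (uint32_t)res[i] * 2
                                     : (uint32_t)(-(int64_t)res[i]) * 2 - 1;
            bits += (u >> k) + 1 + k;
        }
        if (bits < best) best = bits;
    }
    return best;
}

/* Cost of one block under an order-2 fixed predictor:
   e[i] = x[i] - (2*x[i-1] - x[i-2]). */
static uint64_t block_cost(const int32_t *x, size_t n)
{
    int32_t *res = malloc(n * sizeof *res);
    uint64_t bits;
    if (res == NULL) return (uint64_t)-1;
    for (size_t i = 0; i < n; i++) {
        int32_t pred = (i >= 2) ? 2 * x[i-1] - x[i-2] : 0;
        res[i] = x[i] - pred;
    }
    bits = rice_cost(res, n);
    free(res);
    return bits;
}

/* Greedy first pass: at each offset, try a few candidate block sizes and
   keep whichever gives the fewest bits per sample.  Fills 'offsets' with
   the start of each chosen block and returns the number of blocks. */
size_t choose_blocks(const int32_t *x, size_t total,
                     size_t *offsets, size_t max_blocks)
{
    static const size_t candidates[] = { 1152, 2304, 4608 };
    size_t pos = 0, nblocks = 0;

    while (pos < total && nblocks < max_blocks) {
        size_t best_len = total - pos;
        double best_rate = 1e300;
        for (size_t c = 0; c < sizeof candidates / sizeof candidates[0]; c++) {
            size_t len = candidates[c];
            if (len > total - pos) len = total - pos;
            if (len == 0) continue;
            double rate = (double)block_cost(x + pos, len) / (double)len;
            if (rate < best_rate) { best_rate = rate; best_len = len; }
        }
        offsets[nblocks++] = pos;
        pos += best_len;
    }
    return nblocks;
}

A real version would want smarter candidate sizes and would have to charge for per-block header overhead, but it shows why the scan is cheap: one predictor pass plus a Rice parameter estimate per candidate.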
--- Wayde Milas <wmilas@rarcoa.com> wrote:

> The reason for this post is that I would like to know whether, if I
> start coding again and releasing patches against flac in various
> areas, they will be accepted. I'm not sure if these additions fit the
> "focus" of what flac is meant to be, so I thought I'd discuss them
> first. Keep in mind that I just subscribed to this list, so if these
> have already been gone over, just slap me with a tuna :)

hi Wayde. sure, if they fit the goals, patches will make it in, in some form or another.

> Blocking: If the data set is local (not streaming) and we have the
> opportunity to make a couple of passes over the data, it's possible
> to determine non-fixed block sizes that fit the orders of prediction
> closely. I'm sure this is no surprise to any of you. Actually coming
> up with an algorithm to do this is not as easy as it sounds. However,
> I believe a few more percent can be squeezed out this way just by
> fitting LPs. Then again, maybe not; it depends on whether there are
> sharp shifts in the data set. If there are, finding these "edges"
> could help, because you won't have a block that encodes well and then
> goes all to hell because the predictor is missing by large numbers
> and takes a while to readjust. The downside is not very large:
> scanning a data set and building a table of block offsets would not
> add much overhead to the encoding process, since it's a lightweight
> operation (assuming I/O is moderately fast).

Miroslav did some experiments with searching for optimum blocksize. from what I remember it made at best a couple percent difference. there was a thread about it here a while back.

> Arithmetic coding: Rice's granularity for encoding is 1 bit
> (obviously). However, there are techniques out there to encode
> fractions of bits. Well, you can only emit whole bits, but by doing a
> bit of magic you get the net effect of being able to represent
> fractions of bits. Over a large data set, this can save hundreds of
> thousands of bits. I think this is where some of the closed-source
> encoders make their large gains. The downside is that arithmetic
> coding takes more horsepower. Looking at some of the other encoders,
> any of them that take a lot of time to encode are almost certainly
> employing this technique. It might not be suitable for streaming, but
> it would be a very nice option for archival purposes.

you can get very close to arithmetic coding with range coding, which is much faster and believed to be patent free. monkey's audio uses range coding as part of its entropy coder. I did some experiments and couldn't get much of a compression improvement, but the area is far from exhausted.

Josh
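P.S. if you want to play with it, the core of a carry-less range encoder (in the style of Subbotin's public-domain coder; this is just a sketch from memory, not monkey's audio's or anyone else's actual code) is only a few lines. the freq/tot model is what buys the fractional bits: a symbol with probability freq/tot costs about -log2(freq/tot) bits, which need not be a whole number.

#include <stdint.h>
#include <stdio.h>

#define RC_TOP (1u << 24)
#define RC_BOT (1u << 16)

typedef struct {
    uint32_t low;    /* low end of the current interval */
    uint32_t range;  /* width of the current interval */
    FILE *out;
} rc_encoder;

static void rc_init(rc_encoder *rc, FILE *out)
{
    rc->low = 0;
    rc->range = 0xFFFFFFFFu;
    rc->out = out;
}

/* Encode a symbol whose model says it occupies counts [cum, cum+freq)
   out of a total of tot. */
static void rc_encode(rc_encoder *rc, uint32_t cum, uint32_t freq, uint32_t tot)
{
    rc->range /= tot;
    rc->low   += cum * rc->range;
    rc->range *= freq;

    /* renormalize: emit the top byte once it can no longer change, or
       force the range open again if it has gotten too small */
    while ((rc->low ^ (rc->low + rc->range)) < RC_TOP ||
           (rc->range < RC_BOT &&
            ((rc->range = (0u - rc->low) & (RC_BOT - 1)), 1))) {
        fputc((int)(rc->low >> 24), rc->out);
        rc->low   <<= 8;
        rc->range <<= 8;
    }
}

/* flush the last four bytes so the decoder can finish */
static void rc_flush(rc_encoder *rc)
{
    for (int i = 0; i < 4; i++) {
        fputc((int)(rc->low >> 24), rc->out);
        rc->low <<= 8;
    }
}

the decoder mirrors the same arithmetic, and an adaptive model over the residuals supplies cum/freq/tot. emitting whole bytes in the renormalization loop is what keeps it so much faster than a bit-at-a-time arithmetic coder.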
On Tue, 2004-01-20 at 23:44, Josh Coalson wrote:
> Miroslav did some experiments with searching for optimum blocksize.
> from what I remember it made at best a couple percent difference.
> there was a thread about it here a while back.

Did his changes make it in? I can think of a couple of ways to approach this, and I'd like to hear about what he tried. A couple of percent doesn't seem like much, but considering that flac is lossless, a couple of percent here and there adds up to a big savings.

> you can get very close to arithmetic coding with range coding, which
> is much faster and believed to be patent free. monkey's audio uses
> range coding as part of its entropy coder. I did some experiments and
> couldn't get much of a compression improvement, but the area is far
> from exhausted.

I was not aware that arithmetic coding was patented anywhere. The texts I have on the subject don't mention a patent, although that doesn't surprise me. I had considered range coding, but arithmetic coding is superior in every way except for speed. I thought it would be ideal to have a nice reference arithmetic encoder to go back to when comparing ordered Rice or range coding against it.

Speaking of which, I'd be curious to know what texts you recommend on lossless compression. If you are interested, when I get home later today I can tell you what I've read through and what I found worthwhile. Lossless compression has always been a hobby for me and I have no formal training on the subject, so anything I can find to read is a great help.

I heard that Monkey's Audio released its source a while back. I'll browse around and take a look at his stuff when I get a chance.

Wayde