--- Brady Patterson <brady@spaceship.com> wrote:
> This is in response to Miroslav's idea about variable block sizes. I may be a
> bit out of my league here as I'm just starting to look at how the actual
> encoding gets done. But it seems to me that you could make a decent guess
> about when something "new" happens based on the second derivative of the
> signal (where the first derivative is the difference between a given sample
> and the previous, and the second is you-get-the-idea).
>
> Here's my rationale: high-amplitude, high-frequency sections are the hard
> ones to encode, or at least will work best in their own frame. Those
> characteristics imply a high first derivative. You want to put such sections
> in their own block, and boundaries of such blocks will be where the second
> derivative is relatively high.
>
> Okay, that's not quite right, since the first derivative will be negative
> about half the time, and large negative has the same effect as large
> positive. So I think what you really want is the first derivative of the
> absolute value of the first derivative.
>
> Then there's the question of where to put the boundaries. Some
> trial-and-error is probably the best approach here. For files on which the
> above formula is consistently high, it will probably be desirable to set the
> limit high to avoid too much frame overhead.

I had done some experiments a while back with varying the blocksize. My
initial approach was to do an exhaustive search on some clips just to see
where the upper limit of improvement was, and it turned out not to be that
great. If I remember right it was something like <1% compression improvement.
In retrospect I probably would have designed the format with more restrictions
on the blocksize to make the decoder simpler.

So I guess I would say before trying really complicated algos, brute force it
on a couple of samples to see if what you end up with will be worth it. My
conclusion at the time was that varying the blocksize would probably only make
sense for things like sound fonts.

Josh
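For anyone wanting to repeat that kind of experiment, here is a rough sketch
of a brute-force search in the spirit Josh describes. This is not his original
test code: the boundary grid, the block-length cap, the per-frame overhead
constant, and the order-2 residual cost are all invented stand-ins for what
the real encoder actually spends, so the result is only a rough upper bound.

/* Sketch only: pick block boundaries on a fixed grid by exhaustive (dynamic
 * programming) search, scoring each block by the sum of absolute residuals of
 * an order-2 fixed predictor plus a made-up per-frame overhead.  This is a
 * crude stand-in for real FLAC coding cost. */
#include <stdlib.h>

#define GRID      576      /* boundary granularity in samples (arbitrary)   */
#define MAXMULT   8        /* cap blocks at 8*GRID = 4608 samples           */
#define OVERHEAD  200.0    /* pretend per-frame header cost, in cost units  */

/* crude cost of coding samples [start, end) with an order-2 fixed predictor */
static double block_cost(const int *x, long start, long end)
{
    double cost = OVERHEAD;
    long i;
    for (i = start; i < end; i++) {
        long pred = (i >= 2) ? 2L * x[i-1] - x[i-2] : (i >= 1 ? x[i-1] : 0);
        long r = x[i] - pred;
        cost += (r < 0) ? -r : r;
    }
    return cost;
}

/* cheapest way to cover x[0 .. (n/GRID)*GRID) when boundaries may fall on
 * any multiple of GRID and blocks are at most MAXMULT*GRID samples long */
static double best_variable_cost(const int *x, long n)
{
    long nb = n / GRID, i, j;
    double *best = malloc((nb + 1) * sizeof *best);
    double result;

    best[0] = 0.0;
    for (j = 1; j <= nb; j++) {
        best[j] = -1.0;
        for (i = (j > MAXMULT ? j - MAXMULT : 0); i < j; i++) {
            double c = best[i] + block_cost(x, i * GRID, j * GRID);
            if (best[j] < 0.0 || c < best[j])
                best[j] = c;
        }
    }
    result = best[nb];
    free(best);
    return result;
}

Summing block_cost() over consecutive fixed 4608-sample blocks on the same
clip and comparing against best_variable_cost() gives a quick feel for the
ceiling before investing in a cleverer boundary heuristic.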
On Thu, Oct 17, 2002 at 09:51:02AM -0500, Brady Patterson wrote:
>
> Okay, I deleted most of this thread, so I was waiting for another message to
> respond to, so unfortunately this will be out of place in the thread.
>
> This is in response to Miroslav's idea about variable block sizes. I may be a
> bit out of my league here as I'm just starting to look at how the actual
> encoding gets done. But it seems to me that you could make a decent guess
> about when something "new" happens based on the second derivative of the
> signal (where the first derivative is the difference between a given sample
> and the previous, and the second is you-get-the-idea).
>
> Here's my rationale: high-amplitude, high-frequency sections are the hard
> ones to encode, or at least will work best in their own frame. Those
> characteristics imply a high first derivative. You want to put such sections
> in their own block, and boundaries of such blocks will be where the second
> derivative is relatively high.
>
> Okay, that's not quite right, since the first derivative will be negative
> about half the time, and large negative has the same effect as large
> positive. So I think what you really want is the first derivative of the
> absolute value of the first derivative.
>
> Then there's the question of where to put the boundaries. Some
> trial-and-error is probably the best approach here. For files on which the
> above formula is consistently high, it will probably be desirable to set the
> limit high to avoid too much frame overhead.
>
> Hope this was interesting and/or useful :) .

Well, I took 10 CDs and tested my primitive implementations of these algos.
Here are my results:

                  size   size/(0)  size/(1)  time/(1)
(0)   6401778544  1.0000
(1)   4193699407  0.6551   1.0000    1.00
(2)   4180011683  0.6529   0.9967    1.18
(3)   4186509853  0.6540   0.9983    1.15

"best" CD:
(0)    503448568  1.0000
(1)    349525363  0.6942   1.0000
(2)    347167639  0.6896   0.9933
(3)    347864119  0.6910   0.9952

"best" track:
(0)     44111804  1.0000
(1)     28091683  0.6368   1.0000
(2)     27769870  0.6295   0.9885
(3)     27864205  0.6317   0.9919

where:
(0) wav files
(1) flac files, fixed blocksize 4608
(2) flac files, variable blocksize, "lpc idea"
(3) flac files, variable blocksize, watching the average of absolute
    values of the first and second derivative

-- 
Miroslav Lichvar
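Miroslav doesn't post his code, so the following is only a guess at what a
detector like method (3) might look like: keep a long-term running average of
the absolute first and second differences over the current block plus a
short-term average, and cut a new block when the short-term activity pulls
well away from the long-term level. The window length, jump ratio, and block
size limits are all invented for the sketch and are not what he actually used.

#define MIN_BLOCK   1152
#define MAX_BLOCK   4608
#define SHORT_WIN   64       /* samples in the short-term average            */
#define JUMP_RATIO  2.0      /* cut when short-term exceeds long-term by this */

/* Writes chosen block-end positions (sample indices) into bounds[] and
 * returns how many were written; bounds[] must hold n/MIN_BLOCK + 1 entries. */
static long choose_boundaries(const int *x, long n, long *bounds)
{
    long nb = 0, start = 0, count = 0, i;
    double long_avg = 0.0, short_avg = 0.0;

    for (i = 2; i < n; i++) {
        double d1 = (double)x[i] - x[i-1];
        double d2 = ((double)x[i] - x[i-1]) - ((double)x[i-1] - x[i-2]);
        double a  = (d1 < 0 ? -d1 : d1) + (d2 < 0 ? -d2 : d2);
        long len;

        count++;
        long_avg  += (a - long_avg) / count;   /* mean since block start */
        short_avg += (a - short_avg) / (count < SHORT_WIN ? count : SHORT_WIN);

        len = i - start;
        if ((len >= MIN_BLOCK && short_avg > JUMP_RATIO * long_avg)
            || len >= MAX_BLOCK) {
            bounds[nb++] = i;                  /* current block is [start, i) */
            start = i;
            long_avg = short_avg = 0.0;
            count = 0;
        }
    }
    bounds[nb++] = n;                          /* last block runs to the end */
    return nb;
}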
--- Miroslav Lichvar <lichvarm@phoenix.inf.upol.cz> wrote:
> On Thu, Oct 17, 2002 at 09:51:02AM -0500, Brady Patterson wrote:
> > ... But it seems to me that you could make a decent guess
> > about when something "new" happens based on the second derivative of the
> > signal (where the first derivative is the difference between a given sample
> > and the previous, and the second is you-get-the-idea).
> >
> > Here's my rationale: high-amplitude, high-frequency sections are the hard
> > ones to encode, or at least will work best in their own frame. Those
> > characteristics imply a high first derivative. You want to put such
> > sections in their own block, and boundaries of such blocks will be where
> > the second derivative is relatively high.
> >
> > Okay, that's not quite right, since the first derivative will be negative
> > about half the time, and large negative has the same effect as large
> > positive. So I think what you really want is the first derivative of the
> > absolute value of the first derivative.
> >
> > Then there's the question of where to put the boundaries. Some
> > trial-and-error is probably the best approach here. For files on which the
> > above formula is consistently high, it will probably be desirable to set
> > the limit high to avoid too much frame overhead.
>
> Well, I took 10 CDs and tested my primitive implementations of these algos.
> Here are my results:
>
>                   size   size/(0)  size/(1)  time/(1)
> (0)   6401778544  1.0000
> (1)   4193699407  0.6551   1.0000    1.00
> (2)   4180011683  0.6529   0.9967    1.18
> (3)   4186509853  0.6540   0.9983    1.15
>
> "best" CD:
> (0)    503448568  1.0000
> (1)    349525363  0.6942   1.0000
> (2)    347167639  0.6896   0.9933
> (3)    347864119  0.6910   0.9952
>
> "best" track:
> (0)     44111804  1.0000
> (1)     28091683  0.6368   1.0000
> (2)     27769870  0.6295   0.9885
> (3)     27864205  0.6317   0.9919
>
> where:
> (0) wav files
> (1) flac files, fixed blocksize 4608
> (2) flac files, variable blocksize, "lpc idea"
> (3) flac files, variable blocksize, watching the average of absolute
>     values of the first and second derivative

Interesting, it looks like the best case is a ~0.75% increase in compression
for an 18% increase in encode time. The compression increase is similar to my
old brute-force test but much faster. The question is, is it worth it from the
user's point of view?

Josh
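For reference, the ~0.75% figure reads off the "best" track rows above: the
ratio drops from 0.6368 with fixed 4608-sample blocks to 0.6295 with method
(2), about 0.73 percentage points, and the 1.18 in the time column of the
full run is the 18%. A throwaway check of that arithmetic:

#include <stdio.h>

int main(void)
{
    /* "best" track from the table: wav, fixed-4608 flac, "lpc idea" flac */
    const double wav = 44111804.0, fixed4608 = 28091683.0, lpc = 27769870.0;

    printf("ratio, fixed 4608: %.4f\n", fixed4608 / wav);          /* ~0.6368 */
    printf("ratio, lpc idea:   %.4f\n", lpc / wav);                /* ~0.6295 */
    printf("gain: %.2f percentage points\n",
           100.0 * (fixed4608 - lpc) / wav);                       /* ~0.73   */
    printf("extra encode time: %.0f%%\n", 100.0 * (1.18 - 1.00));  /* 18      */
    return 0;
}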
Okay, I deleted most of this thread, so I was waiting for another message to
respond to, so unfortunately this will be out of place in the thread.

This is in response to Miroslav's idea about variable block sizes. I may be a
bit out of my league here as I'm just starting to look at how the actual
encoding gets done. But it seems to me that you could make a decent guess
about when something "new" happens based on the second derivative of the
signal (where the first derivative is the difference between a given sample
and the previous, and the second is you-get-the-idea).

Here's my rationale: high-amplitude, high-frequency sections are the hard ones
to encode, or at least will work best in their own frame. Those
characteristics imply a high first derivative. You want to put such sections
in their own block, and boundaries of such blocks will be where the second
derivative is relatively high.

Okay, that's not quite right, since the first derivative will be negative
about half the time, and large negative has the same effect as large positive.
So I think what you really want is the first derivative of the absolute value
of the first derivative.

Then there's the question of where to put the boundaries. Some trial-and-error
is probably the best approach here. For files on which the above formula is
consistently high, it will probably be desirable to set the limit high to
avoid too much frame overhead.

Hope this was interesting and/or useful :) .

-- 
Brady Patterson (brady@spaceship.com)
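In code, the quantity Brady describes might look something like the sketch
below: the measure is the change in the absolute first difference, and a
boundary is placed where it spikes. The threshold and the minimum block
length are invented numbers; his mail only sketches the rule, so everything
beyond the measure itself is guesswork.

/* Brady's boundary measure: d1[i] = x[i] - x[i-1] is the first difference,
 * and the thing to watch is the change in |d1|, i.e.
 *   m[i] = |x[i] - x[i-1]| - |x[i-1] - x[i-2]|.
 * A boundary is placed where m[i] is large, subject to a minimum block length
 * so frame overhead doesn't eat the gain.  THRESHOLD and MIN_BLOCK are
 * illustration-only values. */

#define THRESHOLD  512     /* "relatively high" is left to trial and error */
#define MIN_BLOCK  1152

static long iabs(long v) { return v < 0 ? -v : v; }

/* Returns the index at which to end the block that starts at 'start',
 * or 'n' if no boundary is triggered before the end of the signal. */
static long next_boundary(const int *x, long n, long start)
{
    long i;
    for (i = start + 2; i < n; i++) {
        long m = iabs(x[i] - x[i-1]) - iabs(x[i-1] - x[i-2]);
        if (i - start >= MIN_BLOCK && m > THRESHOLD)
            return i;
    }
    return n;
}

As Brady says, the right THRESHOLD will vary from file to file; material where
the measure is consistently high wants a higher limit (or a guard like
MIN_BLOCK above) so the frames don't get too small.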
A good first run. I wonder, though, what the distribution of block sizes looks
like, and what the magnitude of the residual is as a function of block index.
I'd honestly expect a more significant improvement from either algorithm.

On Sat, 2002-10-19 at 11:13, Miroslav Lichvar wrote:
> On Thu, Oct 17, 2002 at 09:51:02AM -0500, Brady Patterson wrote:
> >
> > Okay, I deleted most of this thread, so I was waiting for another message
> > to respond to, so unfortunately this will be out of place in the thread.
> >
> > This is in response to Miroslav's idea about variable block sizes. I may
> > be a bit out of my league here as I'm just starting to look at how the
> > actual encoding gets done. But it seems to me that you could make a decent
> > guess about when something "new" happens based on the second derivative of
> > the signal (where the first derivative is the difference between a given
> > sample and the previous, and the second is you-get-the-idea).
> >
> > Here's my rationale: high-amplitude, high-frequency sections are the hard
> > ones to encode, or at least will work best in their own frame. Those
> > characteristics imply a high first derivative. You want to put such
> > sections in their own block, and boundaries of such blocks will be where
> > the second derivative is relatively high.
> >
> > Okay, that's not quite right, since the first derivative will be negative
> > about half the time, and large negative has the same effect as large
> > positive. So I think what you really want is the first derivative of the
> > absolute value of the first derivative.
> >
> > Then there's the question of where to put the boundaries. Some
> > trial-and-error is probably the best approach here. For files on which the
> > above formula is consistently high, it will probably be desirable to set
> > the limit high to avoid too much frame overhead.
> >
> > Hope this was interesting and/or useful :) .
>
> Well, I took 10 CDs and tested my primitive implementations of these algos.
> Here are my results:
>
>                   size   size/(0)  size/(1)  time/(1)
> (0)   6401778544  1.0000
> (1)   4193699407  0.6551   1.0000    1.00
> (2)   4180011683  0.6529   0.9967    1.18
> (3)   4186509853  0.6540   0.9983    1.15
>
> "best" CD:
> (0)    503448568  1.0000
> (1)    349525363  0.6942   1.0000
> (2)    347167639  0.6896   0.9933
> (3)    347864119  0.6910   0.9952
>
> "best" track:
> (0)     44111804  1.0000
> (1)     28091683  0.6368   1.0000
> (2)     27769870  0.6295   0.9885
> (3)     27864205  0.6317   0.9919
>
> where:
> (0) wav files
> (1) flac files, fixed blocksize 4608
> (2) flac files, variable blocksize, "lpc idea"
> (3) flac files, variable blocksize, watching the average of absolute
>     values of the first and second derivative
>
> -- 
> Miroslav Lichvar
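Both diagnostics are easy to bolt onto a prototype: given the block boundaries
the detector chose and the prediction residual, print each block's size and
mean absolute residual in order. A sketch, assuming the boundary and residual
arrays (the names here are hypothetical) already come out of the prototype
encoder:

#include <stdio.h>

/* Print block size and mean |residual| per block, in block order.
 * bounds[0..nb-1] are block-end sample positions (bounds[nb-1] == signal
 * length), res[] is the per-sample prediction residual. */
static void report_blocks(const long *bounds, long nb, const int *res)
{
    long start = 0, b, i;
    for (b = 0; b < nb; b++) {
        long len = bounds[b] - start;
        double sum = 0.0;
        for (i = start; i < bounds[b]; i++)
            sum += (res[i] < 0) ? -res[i] : res[i];
        printf("block %4ld: size %6ld  mean |residual| %8.1f\n",
               b, len, len ? sum / len : 0.0);
        start = bounds[b];
    }
}

Sorting the output on the size column and counting duplicates gives the
block-size distribution; the per-block residual column answers the second
question directly.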