thr3ads.net - flac dev - [Flac-dev] alternate compression [Aug 2009]


Brian Willoughby

2009-Aug-09 19:33 UTC

[Flac-dev] alternate compression

On Aug 8, 2009, at 23:11, Didier Dambrin wrote:
> Electronic music quite often doesn't leave a computer these days.
> And it mainly consists of drums, synths & vocals/effects. Drums are
> often samples sequenced at sample (not sub-sample) accuracy, thus
> repeated (of course if the song was post-resampled, there will be
> sub-sample times).
Good point.  I have certainly seen songs which were at a fixed tempo,  
say 128 BPM, and were so precise that you could cut and paste pieces  
of the song without glitches.  Every measure lined up closely enough  
with the others that you could separate instruments from each other  
by subtracting out the repeated patterns.

> Synths are a problem, as the riffs will have more variations, and  
> also free-running
> oscillators will give troubles.
Not only that, but some synths are oversampled, thus you have the  
"analog" problem of subsampled waveforms.

> Anyway, right now I get what I wanted somewhat working, well enough
> considering I've only spent a couple of hours.
Excellent!

> ..But sadly none of FLAC, WavPack or OptimFrog could compress the
> pre-processed song better, or hardly. And considering you'd also  
> have to add
> the pool of frames, it would end up worse.
This surprises me.  Have you tried aligning your frames to the  
standard FLAC frame size?

As for the pool, it seems like the first occurrence of a repetition  
would compress like usual, and the subsequent ones would compress  
more than usual.
> The problem is the discontinuities I think. Say you work with little,
> non-tempo-synced frames, and you find a matching frame, which you  
> subtract
> from the song at the places it matches. You'll have a discontinuity  
> around
> it. If the frames around this one also match, it doesn't matter as  
> they will
> be subtracted as well. But if they don't (enough), the  
> discontinuity will
> stay.
You may be right about the discontinuities.  Have you tried making  
your transitions only at zero-crossings?
> I also tried windowing the frame before subtracting it, no more
> discontinuity but with small frames it's not very useful anymore.
Windowing may seem like a good idea, but remember that your decoder  
will have to recreate every step that your encoder uses so that it  
can be undone.  Thus, windowing may make it difficult to be lossless.
> But if I run the pre-processing on something perfectly repeated  
> several
> times, it really finds the frames, and it doesn't require knowing  
> the tempo.
> If you don't know the tempo, the only problem will be misalignment,  
> which
> will leave little bits of audio that were too short to find  
> matching frames,
> but most of the processed waveform will still be silence.
Seems like knowing the tempo would allow the encoding phase to take
far less time.  It makes sense that you don't absolutely "need" it,
but you did say it takes a really long time to find matches.

> Btw, are all lossless compression methods working in the time domain?
I would guess that most lossless audio compression methods are time  
domain.  However, LJPG (lossless JPEG) uses a very efficient lossy  
compression followed by lossless compression of the difference.  I  
wouldn't be surprised if there is an audio codec which combines lossy  
frequency domain compression with lossless compression of the  
difference between the lossy version and the original.  If there  
isn't then I'll just patent that...

Brian Willoughby
Sound Consulting

Didier Dambrin

2009-Aug-10 04:18 UTC


>> ..But sadly none of FLAC, WavPack or OptimFrog could compress the
>> pre-processed song better, or hardly. And considering you'd also
>> have to add
>> the pool of frames, it would end up worse.
>
> This surprises me.  Have you tried aligning your frames to the
> standard FLAC frame size?
>
Not at all, because I have no idea how it works internally; I've only been
using the standalone binary for now. What's the standard frame size?
(I work in Delphi, so playing with C++ APIs requires painfully
translating/adapting them, so I wanted to stay away from that for now)

> As for the pool, it seems like the first occurrence of a repetition
> would compress like usual, and the subsequent ones would compress
> more than usual.
>
>> The problem is the discontinuities I think. Say you work with little,
>> non-tempo-synced frames, and you find a matching frame, which you
>> subtract
>> from the song at the places it matches. You'll have a discontinuity
>> around
>> it. If the frames around this one also match, it doesn't matter as
>> they will
>> be subtracted as well. But if they don't (enough), the
>> discontinuity will
>> stay.
>
> You may be right about the discontinuities.  Have you tried making
> your transitions only at zero-crossings?
>
No, but that could be worth a try.
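
To make the zero-crossing suggestion concrete, here is a minimal sketch of moving a frame boundary to the nearest sign change. It is only an illustration: the function names are hypothetical, a sample value of 0 is treated as non-negative, and nothing here comes from FLAC or from the pre-processor discussed in this thread.

```python
def zero_crossings(samples):
    """Return every index i where the sign changes between samples[i-1] and samples[i].

    Assumption: a sample equal to 0 counts as non-negative.
    """
    crossings = []
    for i in range(1, len(samples)):
        if (samples[i - 1] < 0) != (samples[i] < 0):
            crossings.append(i)
    return crossings


def snap_to_crossing(boundary, crossings):
    """Move a proposed frame boundary to the closest zero crossing.

    If there are no crossings at all, the boundary is returned unchanged.
    On a tie, the earlier crossing wins.
    """
    if not crossings:
        return boundary
    return min(crossings, key=lambda c: abs(c - boundary))
```

A frame cut at a snapped boundary starts and ends near zero amplitude, so subtracting it should leave a smaller step at its edges than a cut at an arbitrary sample.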

>> I also tried windowing the frame before subtracting it, no more
>> discontinuity but with small frames it's not very useful anymore.
>
> Windowing may seem like a good idea, but remember that your decoder
> will have to recreate every step that your encoder uses so that it
> can be undone.  Thus, windowing may make it difficult to be lossless.
>
It could still be undone: as long as it's the original frame that you store,
you can compute the adapted frame from it, and do anything you want.
For the normalization, you'd store the frame gain along with the times where
it repeats.
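
A rough sketch of that idea, under stated assumptions: samples are integers, the window is a Hann window, and each repeat is stored as an (offset, gain) pair. All names are hypothetical. The round trip is exact only if the decoder reproduces the encoder's floating-point results bit for bit; a fixed-point window would be a safer choice across platforms.

```python
import math


def hann(n):
    """Hann window of length n, as plain floats."""
    if n == 1:
        return [1.0]
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def adapted_frame(frame, gain):
    """Recompute the windowed, gain-scaled integer frame from the stored original."""
    return [round(s * gain * w) for s, w in zip(frame, hann(len(frame)))]


def encode(song, frame, placements):
    """Subtract the adapted frame at each (offset, gain) and return the residual."""
    residual = list(song)
    for offset, gain in placements:
        for i, v in enumerate(adapted_frame(frame, gain)):
            residual[offset + i] -= v
    return residual


def decode(residual, frame, placements):
    """Add back exactly what encode() subtracted, restoring the original song."""
    song = list(residual)
    for offset, gain in placements:
        for i, v in enumerate(adapted_frame(frame, gain)):
            song[offset + i] += v
    return song
```

The decoder never needs the windowed frame itself, only the original frame, the window definition, and the list of placements.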

>> But if I run the pre-processing on something perfectly repeated
>> several
>> times, it really finds the frames, and it doesn't require knowing
>> the tempo.
>> If you don't know the tempo, the only problem will be misalignment,
>> which
>> will leave little bits of audio that were too short to find
>> matching frames,
>> but most of the processed waveform will still be silence.
>
> Seems like knowing the tempo would allow the encoding phase to take
> far less time.  It makes sense that you don't absolutely "need" it,
> but you did say it takes a really long time to find matches.
>
>
>> Btw, are all lossless compression methods working in the time domain?
>
> I would guess that most lossless audio compression methods are time
> domain.  However, LJPG (lossless JPEG) uses a very efficient lossy
> compression followed by lossless compression of the difference.  I
> wouldn't be surprised if there is an audio codec which combines lossy
> frequency domain compression with lossless compression of the
> difference between the lossy version and the original.  If there
> isn't then I'll just patent that...
>
I've tried that already, and was very surprised by the results. Wikipedia 
told me about those lossy+correction methods, and that there's supposedly a 
version of AAC that can do this (& WavPack & others too, but in the time
domain I'd assume).
..so I started OGGing a song at various bitrates, subtracted it from the 
original, and tried encoding the residual using the lossless packers I 
mentioned.
To my surprise, the size of the OGG+the packed residual always roughly 
matched the size of the packed original. Tried with 32k, 128k & 450k oggs.. 
always the same! Not exactly the same of course, but I was expecting much 
bigger results (not really smaller, assuming someone had tried the same 
before me).
(haven't tried with MP3, but it's probably worse)

The residual from the OGG seemed to be very stable in gain, with a bit depth 
decreasing along with the increasing OGG bitrate. I wasn't expecting that, 
knowing it works in the freq domain.
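
For anyone who wants to repeat the experiment, the lossy+correction arithmetic can be sketched as below. This is not the Ogg Vorbis codec: coarse rounding to a multiple of `step` stands in for the lossy encoder (an assumption made purely for illustration), and all names are hypothetical. It only shows that original = lossy + residual holds exactly, and that a more accurate lossy stage leaves a residual needing fewer bits.

```python
def lossy_approximation(samples, step):
    """Stand-in for a lossy codec: round each sample to the nearest multiple of step."""
    return [step * round(s / step) for s in samples]


def residual(original, lossy):
    """Correction signal: what must be added to the lossy version to get the original."""
    return [o - l for o, l in zip(original, lossy)]


def reconstruct(lossy, res):
    """Lossless reconstruction from the lossy version plus the residual."""
    return [l + r for l, r in zip(lossy, res)]


def bits_needed(values):
    """Bits for a signed integer representation of the largest magnitude present."""
    peak = max(abs(v) for v in values)
    return peak.bit_length() + 1
```

With a smaller `step` (playing the role of a higher bitrate), `bits_needed(residual(...))` drops, which mirrors the decreasing bit depth described above.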

Brian Willoughby

2009-Aug-10 11:20 UTC


On Aug 9, 2009, at 21:18, Didier Dambrin wrote:
>>> ..But sadly none of FLAC, WavPack or OptimFrog could compress the
>>> pre-processed song better, or hardly. And considering you'd also
>>> have to add
>>> the pool of frames, it would end up worse.
>>
>> This surprises me.  Have you tried aligning your frames to the
>> standard FLAC frame size?
>
> Not at all, because I have no idea how it works internally, I've  
> only be
> using the standalone binary for now. What's the standard frame size?
> (I work in Delphi, so playing with C++ API's require painfully
> translating/adapting them, so I wanted to stay away from that for now)
I was going to suggest that you could use the FLAC library via the C
API, instead of the C++ API, but some quick research on Delphi
doesn't seem to show support for C.  I use Objective C for
object-oriented development, and it is very easy to incorporate a C
API.  I can't really use Delphi since it is Windows only, so I can't
really help you there.

>> I would guess that most lossless audio compression methods are time
>> domain.  However, LJPG (lossless JPEG) uses a very efficient lossy
>> compression followed by lossless compression of the difference.  I
>> wouldn't be surprised if there is an audio codec which combines lossy
>> frequency domain compression with lossless compression of the
>> difference between the lossy version and the original.  If there
>> isn't then I'll just patent that...
>
> I've tried that already, and was very surprised by the results.  
> Wikipedia
> told me about those lossy+correction methods, and that there's  
> supposedly a
> version of AAC that can do this (& WavPack & others too, but in the
> time
> domain I'd assume).
> ..so I started OGGing a song at various bitrates, subtracted it  
> from the
> original, and tried encoding the residual using the lossless packers I
> mentioned.
> To my surprise, the size of the OGG+the packed residual always roughly
> matched the size of the packed original. Tried with 32k, 128k &  
> 450k oggs..
> always the same! Not exactly the same of course, but I was  
> expecting much
> bigger results (not really smaller, assuming someone had tried the  
> same
> before me).
> (haven't tried with MP3, but it's probably worse)
One thing to keep in mind is that FLAC isn't necessarily very
efficient at compressing silence.  While amplitude does correlate  
with size to some extent, it does not continue to improve below a  
certain amplitude.  Perhaps this is due to the overhead of the format  
itself.  One solution might be a custom format which embeds FLAC- 
compressed packets along with the lossy packets, thus sharing the  
overhead instead of having two completely independent files.

After noticing that quieter tracks are compressed smaller, I tried  
compressing silence, and I seem to recall that it didn't do quite as  
well as I expected.  Even if I am recalling this benchmark correctly,  
I suppose it isn't really important since very little music is that  
quiet.

> The residual from the OGG seemed to be very stable in gain, with a  
> bit depth
> decreasing along with the increasing OGG bitrate. I wasn't  
> expecting that,
> knowing it works in the freq domain.
In some respects, it should not really matter whether the compression  
is time domain or frequency domain, because the end result of lossy  
compression is added "noise."  Whether this noise comes about from  
time domain errors or frequency domain errors should be irrelevant.   
In either case, the amplitude of the error should be quite small, and  
an algorithm like FLAC can compress low-amplitude signals quite well.

Brian Willoughby
Sound Consulting
