Sounds like a good start. However, think what it would mean if we could
get rid of any residual. In the best case, the output would be a series of
function coefficients describing a wave, plus a length parameter.

In the case of music, you can reasonably expect an unforeseeable attack and
a consistent decay for each sound component. That means that if you can
totally describe the first wave to hit (say, a guitar string), you can use
those function coefficients to help determine what the combined wave will
look like when, say, the bassist starts up. The key here is the ability to
come up with that exact-match equation. I expect it to take some serious
crunching to accomplish, but I really don't know the magnitude of the
problem.

It would appear that either the full-prediction or the limited-residual
target would need to be implemented in much the same way. If I'm getting
you right, you're suggesting an "estimation" pass to determine block size
before doing the final encode, and I'm trying to do it in one shot. I'm
probably the one being too cycle-hungry here.

On Mon, 2002-10-07 at 16:03, Miroslav Lichvar wrote:
> On Sun, Oct 06, 2002 at 04:41:02PM -0400, Hod McWuff wrote:
> > OK, then how about a speculative approach?
> >
> > I'm going to go on these assumptions:
> > * linear predictive coding
> > * exhaustive search option
> > * LPC coding is capable of producing zero residual
> > * doing so is practical with a tiny block size
> >
> > Start with, say, 64 samples (arbitrary), and compute a zero-residual
> > LPC coding. Then use that coding to try to "predict" ahead into the
> > un-encoded input stream. Compare against the actual input, and end the
> > block where residual starts to show up.
>
> Hmm, this looks interesting to me. But I would rather use ordinary LPC
> with an order of 8 or so (in your example, I think residual will show up
> by the 64th sample in most cases) and watch the residual size for
> extrapolated samples. What about something like this:
>
> * compute LPC for a small block (as you say, 64 samples for example)
> * watch the residual size for samples extrapolated by the block; at the
>   sample where significant change starts, mark the beginning of a new
>   block and estimate the frame size of the completed block
> * start again from the mark, and process a few thousand samples ahead
>   this way
> * join small blocks according to frame header overhead
> * encode the marked blocks
>
> --
> Miroslav Lichvar
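For illustration, the "one shot" idea above could look roughly like the
following sketch in plain C (not code from the FLAC sources): given
predictor coefficients that exactly fit an initial small block, keep
extending the block while the predictor keeps matching the incoming
samples, and cut the block at the first sample where a residual appears.
The coefficient array, the error tolerance `eps`, and the assumption that
an exact-fit solve happens elsewhere are all stand-ins for illustration.

#include <stddef.h>

/* Predict sample x[n] from the previous `order` samples using
 * coefficients a[0..order-1]. */
static double lpc_predict(const double *x, size_t n,
                          const double *a, int order)
{
    double p = 0.0;
    for (int k = 1; k <= order; k++)
        p += a[k - 1] * x[n - k];
    return p;
}

/* Return the length of the block that starts at x[0], grown from
 * `start_len` (e.g. 64) until the prediction error exceeds `eps`.
 * Requires order <= start_len so the predictor never reads before x[0]. */
size_t grow_block_while_exact(const double *x, size_t total,
                              const double *a, int order,
                              size_t start_len, double eps)
{
    size_t n = start_len;
    while (n < total) {
        double e = x[n] - lpc_predict(x, n, a, order);
        if (e > eps || e < -eps)
            break;              /* residual shows up: end the block here */
        n++;
    }
    return n;
}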
On Sun, Oct 06, 2002 at 04:41:02PM -0400, Hod McWuff wrote:
> OK, then how about a speculative approach?
>
> I'm going to go on these assumptions:
> * linear predictive coding
> * exhaustive search option
> * LPC coding is capable of producing zero residual
> * doing so is practical with a tiny block size
>
> Start with, say, 64 samples (arbitrary), and compute a zero-residual LPC
> coding. Then use that coding to try to "predict" ahead into the
> un-encoded input stream. Compare against the actual input, and end the
> block where residual starts to show up.

Hmm, this looks interesting to me. But I would rather use ordinary LPC
with an order of 8 or so (in your example, I think residual will show up
by the 64th sample in most cases) and watch the residual size for
extrapolated samples. What about something like this:

* compute LPC for a small block (as you say, 64 samples for example)
* watch the residual size for samples extrapolated by the block; at the
  sample where significant change starts, mark the beginning of a new
  block and estimate the frame size of the completed block
* start again from the mark, and process a few thousand samples ahead
  this way
* join small blocks according to frame header overhead
* encode the marked blocks

--
Miroslav Lichvar
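The joining step in the second-to-last bullet could be a simple greedy
pass over the candidate boundaries: drop a boundary whenever one joined
frame is estimated to cost no more bits than two separate frames plus the
extra header. A minimal sketch in plain C, assuming a caller-supplied
per-frame bit estimator and a made-up HEADER_BITS figure; none of this is
FLAC API, it only illustrates the idea.

#include <stddef.h>

#define HEADER_BITS 150  /* stand-in for per-frame header overhead, in bits */

/* est(start, len): estimated bits needed to encode `len` samples starting
 * at `start` as one frame; supplied by the caller, e.g. from a quick LPC
 * fit plus the size of the coded residual. */
typedef size_t (*frame_bits_fn)(size_t start, size_t len);

/* Greedy pass over the candidate boundaries found by the marking step.
 * `bound[0..nblocks]` holds nblocks+1 boundaries; a boundary is dropped
 * whenever one joined frame is estimated to cost no more than two
 * separate frames plus the extra header. The array is compacted in place
 * and the new number of blocks is returned. */
size_t join_small_blocks(size_t *bound, size_t nblocks, frame_bits_fn est)
{
    size_t out = 1;                          /* bound[0] is always kept */
    for (size_t i = 1; i < nblocks; i++) {
        size_t a0 = bound[out - 1], a1 = bound[i], a2 = bound[i + 1];
        size_t split  = est(a0, a1 - a0) + est(a1, a2 - a1) + 2 * HEADER_BITS;
        size_t joined = est(a0, a2 - a0) + HEADER_BITS;
        if (joined <= split)
            continue;                        /* drop boundary a1 (join)   */
        bound[out++] = a1;                   /* keep boundary a1          */
    }
    bound[out] = bound[nblocks];             /* closing boundary          */
    return out;
}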
Okay, I deleted most of this thread and was waiting for another message to
respond to, so unfortunately this will be out of place in the thread. This
is in response to Miroslav's idea about variable block sizes. I may be a
bit out of my league here, as I'm just starting to look at how the actual
encoding gets done.

But it seems to me that you could make a decent guess about when something
"new" happens based on the second derivative of the signal (where the
first derivative is the difference between a given sample and the previous
one, and the second is you-get-the-idea). Here's my rationale:
high-amplitude, high-frequency sections are the hard ones to encode, or at
least will work best in their own frame. Those characteristics imply a
high first derivative. You want to put such sections in their own block,
and the boundaries of such blocks will be where the second derivative is
relatively high.

Okay, that's not quite right, since the first derivative will be negative
about half the time, and large negative has the same effect as large
positive. So I think what you really want is the first derivative of the
absolute value of the first derivative.

Then there's the question of where to put the boundaries. Some
trial-and-error is probably the best approach here. For files on which the
above formula is consistently high, it will probably be desirable to set
the limit high to avoid too much frame overhead.

Hope this was interesting and/or useful :) .

--
Brady Patterson (brady@spaceship.com)
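A minimal sketch of that heuristic in plain C: take the first difference
of the signal, then the first difference of its absolute value, and mark a
candidate block boundary wherever that quantity jumps above a threshold.
The samples-as-doubles interface and the `threshold` parameter are
assumptions for illustration, not anything from the FLAC tree.

#include <math.h>
#include <stddef.h>

/* Scan the signal, computing the first difference of the absolute value
 * of the first difference, and record a candidate block boundary wherever
 * that quantity exceeds `threshold`. Returns the number of marks written
 * into `marks` (capacity `max_marks`). */
size_t mark_boundaries(const double *x, size_t n, double threshold,
                       size_t *marks, size_t max_marks)
{
    size_t count = 0;
    for (size_t i = 2; i < n && count < max_marks; i++) {
        double d1 = fabs(x[i]     - x[i - 1]); /* |first derivative| at i   */
        double d0 = fabs(x[i - 1] - x[i - 2]); /* |first derivative| at i-1 */
        if (d1 - d0 > threshold)               /* its derivative is "high"  */
            marks[count++] = i;
    }
    return count;
}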