I've been reading up on wavelet transforms for the past week, and I plan to start on some video compression stuff for Tarkin next week, if it turns out to be any good (small chance :)). So far I think I know what's happening, but there's one small thing I don't quite understand yet.

If I understand correctly, you can do a 2D wavelet transform (I'm assuming a Haar transform here for simplicity) by running a wavelet filter on all rows of the image first. That yields a new image which has the wavelet coefficients of the first run in the right half of the pixels, and a lower-res (lowpass filtered) representation in the left half. You then run the same wavelet filter on the columns of the left half, etc. So far so good, but if you view the image as a 1D array, then at the end of the transform the coefficients are somewhat jumbled. Some ASCII art:

  image:    after 1st filter run:    after 2nd run:
  xxxx      xx11                     xx11
  xxxx      xx11                     xx11
  xxxx      xx11                     2211
  xxxx      xx11                     2211

  after final run:
  x311
  4311
  2211
  2211

So after the final run the 1D representation of the wavelet transform is x311431122112211. Is this correct, or should this be x433222211111111? I think it is correct, but I'd rather be sure (I already have a Haar wavelet implementation that builds a binary tree and then saves the coefficients walking the tree breadth-first, which yields the last format).

What I hope to do is a wavelet transform of each frame. If I understand correctly this should yield a rather "spiky" "graph", due to the characteristics of the data. Now I want to encode the difference in location and amplitude of these spikes in a smooth curve. Of course this only works well if the changes happen to lie on a curve, so I'll have some staring at graphs to do. Anyway, if the second permutation of the coefficients is used I think this will work better. (The first one, with a 2D curved surface to represent change instead of a 1D curve, may work too though.)

So, which one is the mathematically correct one?
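For reference, the row/column procedure described above can be sketched in Python. This is only a minimal sketch under stated assumptions: the image dimensions are powers of two, the Haar step uses averages and half-differences (one of several possible normalisations), and the names `haar_step`, `haar_2d`, `row_major` and `coarse_first` are made up for illustration. It tracks which pass wrote each cell, so the two 1D orderings from the question can be compared.

```python
def haar_step(values):
    """One Haar level: pairwise averages, then pairwise half-differences."""
    pairs = list(zip(values[0::2], values[1::2]))
    low = [(a + b) / 2 for a, b in pairs]
    high = [(a - b) / 2 for a, b in pairs]
    return low + high


def haar_2d(image):
    """Alternate row and column passes over the shrinking low-pass region.

    Returns the coefficient grid plus a grid of labels recording which pass
    wrote each cell ('x' marks the final low-pass value).
    Assumes power-of-two width and height.
    """
    data = [list(row) for row in image]
    labels = [["x"] * len(row) for row in image]
    height, width = len(data), len(data[0])
    pass_no = 0
    while height > 1 or width > 1:
        if width > 1:  # filter each row of the active region
            pass_no += 1
            for r in range(height):
                data[r][:width] = haar_step(data[r][:width])
                for c in range(width // 2, width):
                    labels[r][c] = str(pass_no)
            width //= 2
        if height > 1:  # filter each column of the active region
            pass_no += 1
            for c in range(width):
                column = haar_step([data[r][c] for r in range(height)])
                for r in range(height):
                    data[r][c] = column[r]
                for r in range(height // 2, height):
                    labels[r][c] = str(pass_no)
            height //= 2
    return data, labels


def row_major(labels):
    """Read the label grid row by row (the first ordering in the question)."""
    return "".join("".join(row) for row in labels)


def coarse_first(labels):
    """'x' first, then the latest (coarsest) pass down to pass 1."""
    flat = [cell for row in labels for cell in row]
    rank = lambda cell: float("inf") if cell == "x" else int(cell)
    return "".join(sorted(flat, key=rank, reverse=True))


coeffs, labels = haar_2d([[1, 2, 3, 4]] * 4)
print(row_major(labels))     # x311431122112211
print(coarse_first(labels))  # x433222211111111
```

Both strings contain the same sixteen coefficients; only the order in which they are written out differs.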
Oh, and another question I just remembered: audio signals are signed, because they represent the movement of the microphone. Video data does not have that, but the wavelet transform does expect signed data (or doesn't it? Haar works on unsigned data, but what about Daubechies and other wavelets?). So how do I convert? Simply subtracting 128 from each sample (i.e. casting to signed int) seems wrong.

Hope I didn't steal too much time from your efforts of getting it to work on Solaris :)

Cheers,
Lourens

--- >8 ----
List archives: http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to 'vorbis-dev-request@xiph.org' containing only the word 'unsubscribe' in the body. No subject is needed. Unsubscribe messages sent to the list will be ignored/filtered.

> So, which one is the mathematically correct one?

It doesn't matter, as long as your inverse transform expects the same ordering. Use the one that compresses better (I think that would be the second one).

> Oh, and another question I just remembered: audio signals
> are signed, because they're representations of the movement
> of the microphone. Video data does not have that, but the
> wavelet transform does expect signed data (or doesn't it?

You can view unsigned data as signed data that "just happens" to be all positive :-)

> Haar works on unsigned data but what about Daubechies and
> other wavelets?). So how do I convert? Simply subtracting
> 128 from each sample (i.e. casting to signed int) seems wrong.

I think you'll get better results if you transform the pixel luminances to the log domain first (take a dB or something).

> Hope I didn't steal too much time from your efforts of
> getting it to work on Solaris :)

...or to work _at all_, in my case ;-)
(Just found out that a 10^48 big codebook is a no-no).

Dagdag,
Segher
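The point that the ordering is free as long as the inverse expects it can be illustrated with a small Python sketch. It treats the reordering as a plain permutation of the sixteen coefficients from the 4x4 example and undoes it again. The helper names (`coarse_first_order`, `reorder`, `restore`) are hypothetical, and the pass labels stand in for real coefficient values.

```python
# Pass labels of the 4x4 example in row-major order (taken from the ASCII art).
LAYOUT = "x311431122112211"


def coarse_first_order(layout):
    """Indices into `layout`, sorted so 'x' comes first, then pass 4, 3, 2, 1."""
    def coarseness(i):
        label = layout[i]
        return float("inf") if label == "x" else int(label)
    # sorted() is stable, so cells of the same pass keep their row-major order.
    return sorted(range(len(layout)), key=coarseness, reverse=True)


def reorder(coeffs, order):
    """Encoder side: write the coefficients out in the agreed order."""
    return [coeffs[i] for i in order]


def restore(reordered, order):
    """Decoder side: put each coefficient back where the inverse expects it."""
    coeffs = [None] * len(order)
    for position, source in enumerate(order):
        coeffs[source] = reordered[position]
    return coeffs


order = coarse_first_order(LAYOUT)
shuffled = reorder(list(LAYOUT), order)
print("".join(shuffled))                  # x433222211111111
print("".join(restore(shuffled, order)))  # x311431122112211
```

Since `order` depends only on the image dimensions, encoder and decoder can both derive it without transmitting anything extra.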

On Sat, Dec 30, 2000 at 10:15:18PM +0330, jsr@dds.nl wrote:
[snip]
> Oh, and another question I just remembered: audio signals
> are signed, because they're representations of the movement
> of the microphone. Video data does not have that, but the
> wavelet transform does expect signed data (or doesn't it?
> Haar works on unsigned data but what about Daubechies and
> other wavelets?). So how do I convert? Simply subtracting
> 128 from each sample (ie casting to signed int) seems wrong.

Your input does not need to be signed.

I would scale my input to be 0.0 .. 1.0 and degamma it in the process. (So that .5 is half bright to the viewer).
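A rough Python sketch of the scale-and-degamma suggestion, under assumptions the mail does not state: 8-bit samples and a pure power-law gamma of 2.2 (real video and sRGB transfer curves differ in detail). The function names `degamma` and `regamma` are made up for illustration.

```python
def degamma(sample, gamma=2.2, max_value=255):
    """Map an encoded sample in 0..max_value to a float in 0.0..1.0.

    Assumes a pure power-law encoding; gamma=2.2 is an illustrative default.
    """
    return (sample / max_value) ** gamma


def regamma(value, gamma=2.2, max_value=255):
    """Inverse mapping, for use after the inverse wavelet transform."""
    return round(max_value * value ** (1.0 / gamma))


row = [0, 64, 128, 255]
scaled = [degamma(s) for s in row]        # unsigned floats in 0.0..1.0
print(scaled)
print([regamma(v) for v in scaled])       # [0, 64, 128, 255]
```

The scaled values stay non-negative; only the wavelet detail coefficients computed from them take either sign.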

Re-reading that mail, it has some weird grammar errors. Also, casting to signed int of course does not equal subtracting 128. The idea should still be clear though, and I'd still appreciate an answer. :)

Lourens

> Your input does not need to be signed.
>
> I would scale my input to be 0.0 .. 1.0 and degamma it in the process. (So
> that .5 is half bright to the viewer).

Okay, thanks. My signal processing knowledge really needs work. Then again, given that I completed my first Linear Algebra course only a month ago (silly stuff, knew most of it anyway), I suppose I've come relatively far. I should try to find a book on signal processing in the uni library; I don't think it's in the first-year CS curriculum here. In fact it may not be in the curriculum at all. Annoying.

Why am I bothering you with this?

> It doesn't matter, as long as your inverse transform expects
> the same ordering. Use the one that compresses better
> (I think that would be the second one).

Yeah, I see it now. The second one is best. The layout looks illogical in 1D because it's a 2D transform of a 2D signal, so the result is 2D as well. It does look logical if you view it as a 2D result.

> You can view unsigned data as signed data, that "just happens"
> to be all positive :-)

That won't give trouble? Okay, I wasn't sure...

> I think you'll get better results if you transform the pixel
> luminances to log domain first (take a dB or something).

Hmm, good idea. I haven't thought about preprocessing (conversion to YUV etc.) that much yet. I should.

> ...or to work _at all_, in my case ;-)
> (Just found out that a 10^48 big codebook is a no-no).

Heh. Want to borrow some disk space? :)

Lourens