Well, here you are. 24k; sorry if I'm not supposed to put things this
size in your mailbox, didn't know where else to put it. And you all
are subscribed to vorbis-dev, after all.

I'm not that good at breaking patches apart, so it's one big patch.
Sorry.

Overview:

configure.in
    make profiling easier & more useful

decoder-example.c
    (#if 0'ed) dither output; sounds a lot more "open", but
    should be coloured dither. Someone else please step in. (The
    dither could be applied before the mdct, but we have to know
    the amplification. No problem with decoder_example, but it is
    in the general case).

codebook.[ch]
bitwise.[ch]
bookinternal.h
sharedbook.c
    decode the first N bits of a huffman word in one step via table
    look up.

    The huffman decoding could be much more efficient (i.e., no
    decode tree necessary -> saves a lot of cache, and decoding
    time O(log log N) instead of O(log N), where N is alphabet
    size), if the huffman tree would be left (or right; I prefer
    left) aligned. Ask me for pointers if you need convincing.

    The code I submit now could be more clean, I know. Seems to
    work though. Weird.

    unrolled && rerolled the decodevs loops.

envelope.c
    if (fabs(...) < min) creates horrible assembler (gcc 2.95, x86),
    so changed to if (... < min && ... > -min). muchos faster.

lsp.c
    put the fromdB() before linearmap

scales.h
    todB_nn() for non-negative values. fabs() is horror.

Added some prefixes to pack(), inverse(), et al. (i.e., time0_pack()
etc.) Think I still forgot some.

I don't think you'll want to apply all of this, oh, especially not
the debug output :-).

Have some fun,

Segher

-------------- next part --------------
A non-text attachment was scrubbed...
Name: v
Type: application/octet-stream
Size: 24344 bytes
Desc: not available
Url : http://lists.xiph.org/pipermail/vorbis-dev/attachments/20000828/91cdec84/v-0001.obj
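
For readers who have not seen the technique, the "decode the first N bits in one step via table look up" idea can be sketched roughly as follows. This is not the code from the patch: the names LOOKUP_BITS, lookup_entry, build_lookup and decode_fast are hypothetical, the codewords are assumed to be stored MSB-first, and a real decoder would fall back to a tree walk (or a second table) whenever the lookup reports a code longer than N bits.

```c
#include <assert.h>
#include <stdint.h>

#define LOOKUP_BITS 4                     /* hypothetical N */
#define LOOKUP_SIZE (1 << LOOKUP_BITS)

typedef struct {
    int16_t symbol;  /* decoded symbol, or -1 if the code is longer than N bits */
    uint8_t length;  /* number of bits the matched codeword consumes */
} lookup_entry;

/* Fill the table from parallel arrays of MSB-first codewords and lengths.
 * A codeword of length len <= N owns every N-bit pattern that starts
 * with it, i.e. 2^(N - len) consecutive table slots. */
void build_lookup(lookup_entry table[LOOKUP_SIZE],
                  const uint32_t *codes, const uint8_t *lengths, int n)
{
    for (int i = 0; i < LOOKUP_SIZE; i++) {
        table[i].symbol = -1;
        table[i].length = 0;
    }
    for (int s = 0; s < n; s++) {
        int len = lengths[s];
        assert(len > 0);
        if (len > LOOKUP_BITS)
            continue;                     /* long codes take the slow path */
        assert(codes[s] < (1u << len));
        int shift = LOOKUP_BITS - len;
        uint32_t first = codes[s] << shift;
        for (uint32_t pad = 0; pad < (1u << shift); pad++) {
            table[first + pad].symbol = (int16_t)s;
            table[first + pad].length = (uint8_t)len;
        }
    }
}

/* Decode one symbol from the next N bits of the stream, passed in the
 * low N bits of `window`. Returns the symbol, or -1 if the codeword is
 * longer than N bits; *used receives the number of bits to consume. */
int decode_fast(const lookup_entry table[LOOKUP_SIZE],
                uint32_t window, int *used)
{
    const lookup_entry *e = &table[window & (LOOKUP_SIZE - 1)];
    *used = e->length;
    return e->symbol;
}
```

With the prefix code {0, 10, 110, 111} and N = 4, the window 1011 resolves to symbol 1 with two bits consumed in a single table read. The cost is the table itself, which grows as 2^N.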
> Well, here you are. 24k; sorry if I'm not supposed to put things this
> size in your mailbox, didn't know where else to put it.

No, patches, even reasonably big ones, are OK. 24k is not anywhere
near large.

> I'm not that good at breaking patches apart, so it's one big patch.

I hand apply these sorts of things anyway, if only so I know exactly
the changes being made for future reference.

> decoder-example.c
>     (#if 0'ed) dither output; sounds a lot more "open", but
>     should be coloured dither. Someone else please step in. (The
>     dither could be applied before the mdct, but we have to know
>     the amplification. No problem with decoder_example, but it is
>     in the general case).

Is dithering really useful for samples when the current dynamic range
is using a good portion of the total range? I'd expect it to just add
[white if not colored] noise and adding noise is generally frowned
upon. Of course, for very low amplitude, the value of dithering is
understood.

> codebook.[ch]
> bitwise.[ch]
> bookinternal.h
> sharedbook.c
>     decode the first N bits of a huffman word in one step via table
>     look up.

Long ago I did Huffman this way, but found the additional lookup to be
large and negate the benefits if the tree was oddly proportioned. I'd
assume you'd not do this unless it worked, so I'll look and see if I
learn something ;-)

>     The huffman decoding could be much more efficient (i.e., no
>     decode tree necessary -> saves a lot of cache, and decoding
>     time O(log log N) instead of O(log N), where N is alphabet
>     size), if the huffman tree would be left (or right; I prefer
>     left) aligned.

Please describe in more detail what you mean by left/right aligned. I
assume you mean something about the hufftree structure rather than the
bitsex of the bitpacker (bitwise.c)?

There is generally method to my madness, but I'm mostly good at
theoretical complexity optimization, not so much optimizing for raw
practicalities (lack of experience).

>     The code I submit now could be more clean, I know. Seems to
>     work though. Weird.

...however *that* statement does not instill confidence. Are you
concerned about the code cleanliness or some other problem? I worry
about committing things titled 'seems to work. weird.' ;-)

>     unrolled && rerolled the decodevs loops.

Yeah, *that* little bit of ugliness I actually somewhat regret.

> envelope.c
>     if (fabs(...) < min) creates horrible assembler (gcc 2.95, x86),
>     so changed to if (... < min && ... > -min). muchos faster.

Interesting. I'll keep it in mind.

> lsp.c
>     put the fromdB() before linearmap

Good. I'll be squashing this even further.

> scales.h
>     todB_nn() for non-negative values. fabs() is horror.

OK, but in most cases I'm using todB not knowing if I'm >= 0 or just
real.

Monty

--- >8 ----
List archives: http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to
'vorbis-dev-request@xiph.org' containing only the word 'unsubscribe'
in the body. No subject is needed. Unsubscribe messages sent to the
list will be ignored/filtered.
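
As background to the dithering question above, here is a minimal sketch of what dithered float-to-16-bit output can look like. It is an assumption-laden illustration, not the #if 0'ed code from the patch: it uses plain white TPDF dither (the difference of two uniform random values, spanning about one LSB either side), not the coloured dither the patch author asks for, and the names uniform01 and float_to_s16_dithered as well as the small LCG are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tiny LCG so the sketch needs no external random source. */
static uint32_t lcg_state = 1u;

/* Uniform value in [0, 1), built from the top 24 bits of the LCG state. */
float uniform01(void)
{
    lcg_state = lcg_state * 1664525u + 1013904223u;
    return (float)(lcg_state >> 8) * (1.0f / 16777216.0f);
}

/* Convert a sample in roughly [-1, 1] to 16-bit PCM, adding white TPDF
 * dither of up to +/- 1 LSB before rounding. */
int16_t float_to_s16_dithered(float x)
{
    assert(x == x);                       /* reject NaN */
    float r1 = uniform01();
    float r2 = uniform01();
    float v = x * 32767.0f + (r1 - r2);   /* triangular noise in (-1, 1) */

    /* Clip in the float domain so the integer cast below is always safe. */
    if (v > 32767.0f)
        v = 32767.0f;
    if (v < -32768.0f)
        v = -32768.0f;

    /* Round to nearest, ties away from zero. */
    return (int16_t)(int32_t)(v < 0.0f ? v - 0.5f : v + 0.5f);
}
```

For a silent input the output wanders between -1, 0 and 1 instead of sitting at 0, which is exactly the added noise being questioned for signals that already use most of the available range.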
Patch applied. Committing to CVS later with some other changes.

Monty
Two if statements are not faster than one fabs and one if statement. A
floating point compare is very slow, and is subject to branch
mispredictions. fabs is a 1 cycle operation (it just clears the sign
bit in the FPU), and only incurs one floating point compare.

>> > envelope.c
>> >     if (fabs(...) < min) creates horrible assembler (gcc 2.95, x86),
>> >     so changed to if (... < min && ... > -min). muchos faster.
>>
>> Interesting. I'll keep it in mind.
>
>I added -mno-ieee-fp to the compile flags; I _think_ fabs() is ok in
>this case. I'll test; no ieee is a huge speedup on x86 anyway.
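
To make the two forms under discussion concrete, here is a small sketch. The function names are hypothetical and the code is not taken from envelope.c. It only shows that the two tests are logically interchangeable, and what "clears the sign bit" means for an IEEE-754 single. It says nothing about which form is faster, since that depends on the compiler, its flags and the CPU.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Original form: one fabs() followed by one compare. */
int below_min_fabs(float x, float min)
{
    return fabsf(x) < min;
}

/* Form used in the patch: two compares, no fabs(). For non-NaN inputs
 * it gives the same answer as the form above for any min, and both
 * forms return 0 when either argument is NaN. */
int below_min_range(float x, float min)
{
    return x < min && x > -min;
}

/* What fabs() amounts to on an IEEE-754 single: mask off the top bit. */
float clear_sign_bit(float x)
{
    uint32_t bits;
    assert(sizeof bits == sizeof x);
    memcpy(&bits, &x, sizeof bits);
    bits &= 0x7fffffffu;
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

Because the two predicates agree, the choice between them is purely a code-generation question, and the generated assembler is what the disagreement above is about.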