I spent a fair amount of time optimizing tremor for the PS2, mostly by using dual-pipe multiplies in the X[N]PRODnn and window-apply code. Then, just for kicks, I re-enabled _LOW_ACCURACY_, and lo and behold, it was still substantially faster. I also got some gains out of tremor by changing the longs in cookbook and sharedbook to ogg_int32_t, as I did for vorbis. I think _LOW_ACCURACY_ is a win mostly because I'm entirely cache-bound in mdct_backward.

My question is: is it good enough for production work? I diffed a few output files, and the largest differences in the 16-bit data looked to be no more than one or two LSBs. Are there any situations where it fails spectacularly that I should watch out for?

Looking at the lowmem-branch code of tremor, it appears to downsample to 16 bits as the last step of mdct_backward (which makes a lot of sense), but some of the CVS comments lead me to believe that _LOW_ACCURACY_ doesn't work yet. I'm hoping lowmem-branch will be an even bigger win for me, because I can afford CPU time more easily than cache misses (or memory, for that matter).

(For what it's worth, my current test case runs in about 630M cycles in tremor with _LOW_ACCURACY_; without _LOW_ACCURACY_ but with the other PS2-specific optimizations, it runs in 710M cycles; vorbis with some minor floating-point multiply-add optimizations and the longs changed to ogg_int32_t runs in about 800M cycles. I suspect the huge trig lookups in mdct are killing me there.)

-Dave
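
The output comparison mentioned above (the largest difference between two decodes, measured in LSBs of the 16-bit data) could be sketched roughly as follows. This is only an illustration, not the tool actually used for the diff: the function name max_lsb_diff is hypothetical, and it assumes both decodes are already in memory as equal-length int16_t buffers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of tremor): returns the largest absolute
   per-sample difference, in LSBs, between two 16-bit PCM buffers of
   n samples each. */
int32_t max_lsb_diff(const int16_t *a, const int16_t *b, size_t n)
{
    int32_t worst = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        /* Widen to 32 bits first so the subtraction cannot overflow. */
        int32_t d = (int32_t)a[i] - (int32_t)b[i];
        if (d < 0)
            d = -d;
        if (d > worst)
            worst = d;
    }
    return worst;
}
```

A result of 1 or 2 across a whole file would match the "one or two LSBs" observation. A single maximum says nothing about where the error falls or whether it is audible, so it would not by itself rule out a spectacular failure on some other input.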