Hi all,

The decoder is down 30% in execution time, with bit-identical output.

I didn't get to the mdct yet; a 1024-point mdct is a bit much to
brute-force, and I'm not going to hand-unroll the whole thing either
(the machine-unrolled version produced a 1.5M executable;
understandably, it wasn't very fast. Still waiting for processors
with 1.5M L1 code caches ;-) ).

The slowest parts now are:
-- mdct
-- vorbis_lsp_to_curve (and the exp()'s afterwards, in the fromdB
   macro; I eliminated most of 'em, but not all)
-- main (the float to s16 loop)

I hope to send a patch tomorrow, as the sneakernet had some
transmission problems today :-|

Dagdag,

Segher

--- >8 ----
List archives: http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
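For readers unfamiliar with the "float to s16 loop" mentioned above: it is the step that turns the decoder's floating-point samples into signed 16-bit PCM. The following is only a minimal sketch of such a loop, assuming samples nominally in [-1.0, 1.0]; the names float_to_s16 and convert_block are hypothetical and this is not the decoder's actual code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not taken from the Vorbis sources): convert one
   float sample, nominally in [-1.0, 1.0], to signed 16-bit PCM.
   Out-of-range values are clipped instead of being allowed to wrap. */
static int16_t float_to_s16(float sample)
{
    float scaled = sample * 32767.0f;

    if (scaled > 32767.0f)
        return 32767;
    if (scaled < -32768.0f)
        return -32768;

    /* Round to nearest by adding +/-0.5 before the truncating cast. */
    return (int16_t)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
}

/* Convert a block of n samples. The per-sample compare-and-branch is
   the kind of work that makes a loop like this show up in a profile. */
static void convert_block(const float *in, int16_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = float_to_s16(in[i]);
}
```

The two clipping comparisons and the float-to-int cast run once per output sample, which is presumably why a loop of this shape ranks among the slow spots once the heavier stages have been tuned.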
Hi all,

>>>>> "S" == Segher Boessenkool <segher@eastsite.nl> writes:

S> Didn't get the mdct yet; 1024 point mdct is a bit much to
S> brute-force, and I'm not going to hand-unroll the whole thing
S> either (the machine-unrolled version produced a 1.5M
S> executable; understandably, it wasn't very fast. Still waiting
S> for processors with 1.5M L1 code caches ;-)

I have just made a quick hack on vorbis's mdct code. There is NO
UNROLLING in it :). The decoding speed gain is only 5% on my Celeron
box, and the output is not identical. I think this is because of
floating-point rounding, but there may be a bug.

The attached patch is against yesterday's CVS nightly snapshot.
Please report your speed gain and results.

---
Takehiro TOMINAGA // may the source be with you!

Attachment: mdct-patch (application/octet-stream, 8315 bytes)
http://lists.xiph.org/pipermail/vorbis-dev/attachments/20000823/cf6d0180/mdct-patch-0001.obj
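One way to tell rounding noise from a real bug when the output is not identical is to measure the largest per-sample difference between the two decodes. The sketch below assumes both outputs are already in memory as s16 buffers of equal length; max_abs_diff_s16 is a hypothetical helper, not something in the Vorbis sources or in the attached patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: return the largest absolute difference between
   corresponding samples of two s16 buffers of length n. */
static int max_abs_diff_s16(const int16_t *a, const int16_t *b, size_t n)
{
    int worst = 0;

    for (size_t i = 0; i < n; i++) {
        /* Widen to int first so the subtraction cannot overflow. */
        int d = abs((int)a[i] - (int)b[i]);
        if (d > worst)
            worst = d;
    }
    return worst;
}
```

A maximum difference of one or two steps would be consistent with floating-point rounding; much larger differences would point more toward a bug, though the threshold is a judgment call.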