Hi all,
The decoder is down 30% in execution time, with bit-identical output.
Didn't get the mdct yet; 1024 point mdct is a bit much to brute-force,
and I'm not going to hand-unroll the whole thing either (the machine-
unrolled version produced a 1.5M executable; understandably, it wasn't
very fast. Still waiting for processors with 1.5M L1 code caches ;-)
Slowest parts now are:
-- mdct
-- vorbis_lsp_to_curve (and the exp()'s afterwards, in the fromdB macro;
eliminated most of 'em, but not all).
-- main (the float to s16 loop)
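
For readers who have not looked at that last item: a float to s16 loop of the kind mentioned above might look roughly like the sketch below. This is only an illustration, not the decoder's actual code; the name float_to_s16, the 32767 scale factor and the round-half-away-from-zero choice are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not taken from the decoder source): convert float
   samples, nominally in [-1.0, 1.0], to signed 16-bit PCM with clipping. */
static void float_to_s16(const float *in, int16_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        float v = in[i] * 32767.0f;          /* assumed scale factor */

        /* Round to nearest by adding +/-0.5 before the truncating cast. */
        v += (v >= 0.0f) ? 0.5f : -0.5f;

        /* Clip before casting, so out-of-range input never overflows. */
        if (v > 32767.0f)
            v = 32767.0f;
        if (v < -32768.0f)
            v = -32768.0f;

        out[i] = (int16_t)v;
    }
}
```

The two compares and the float-to-int cast run once per sample, which is probably why a loop like this shows up in a profile once the heavier stages get faster.
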
Hope to send a patch tomorrow, as the sneakernet had some transmission
problems today :-|
Dagdag,
Segher
--- >8 ----
List archives: http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
Hi all,

>>>>> "S" == Segher Boessenkool <segher@eastsite.nl> writes:

S> Didn't get the mdct yet; 1024 point mdct is a bit much to
S> brute-force, and I'm not going to hand-unroll the whole thing
S> either (the machine-unrolled version produced a 1.5M
S> executable; understandably, it wasn't very fast. Still waiting
S> for processors with 1.5M L1 code caches ;-)

I have just made a quick hack on vorbis's mdct code. NO UNROLLING
is there :). The decoding speed gain is only 5% on my Celeron box,
and the output is not identical. I think this is because of a
floating point rounding problem, but there may be a bug.

The attached patch is for yesterday's CVS nightly snapshot. Please
report your gain and the result.

---
Takehiro TOMINAGA // may the source be with you!

-------------- next part --------------
A non-text attachment was scrubbed...
Name: mdct-patch
Type: application/octet-stream
Size: 8315 bytes
Desc: not available
Url : http://lists.xiph.org/pipermail/vorbis-dev/attachments/20000823/cf6d0180/mdct-patch-0001.obj
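
When the output is not bit-identical, one rough way to tell rounding noise from a real bug is to decode the same file with both builds and look at the largest per-sample difference. The sketch below assumes both decodes are already loaded as 16-bit PCM buffers of equal length; the name max_abs_diff_s16 is made up for the example and is not part of the vorbis sources.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return the largest absolute difference between two
   s16 PCM buffers of the same length n. */
static int max_abs_diff_s16(const int16_t *a, const int16_t *b, size_t n)
{
    int worst = 0;

    for (size_t i = 0; i < n; i++) {
        /* Widen to int first so the subtraction cannot overflow. */
        int d = (int)a[i] - (int)b[i];
        if (d < 0)
            d = -d;
        if (d > worst)
            worst = d;
    }
    return worst;
}
```

A worst case of one or two steps would be consistent with rounding differences; large values suggest an actual bug. This is a heuristic only, not proof either way.
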