2008 Apr 04
Resampler experimental speedups
...ly give a marginal
improvement (5% or so). Note that no loop unrolling has been done; for
the direct case unrolling 4 times will reduce instruction count noticeably.
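To illustrate what "unrolling 4 times" could mean for the direct case, here is a minimal sketch, not the code from the patch. It assumes the direct case boils down to a dot product of filter taps with input samples; the function names dot_simple and dot_unrolled4 and the float sample type are made up for the example.

```c
#include <stddef.h>

/* Hypothetical direct-case kernel: one output sample is the dot product
   of n filter taps with n input samples. Names and types are assumptions
   made for this sketch, not taken from the resampler source. */
float dot_simple(const float *taps, const float *in, size_t n)
{
    float sum = 0.0f;
    for (size_t j = 0; j < n; j++)
        sum += taps[j] * in[j];
    return sum;
}

/* Same computation unrolled 4 times with four independent accumulators,
   so there is one loop test per four multiply-adds. */
float dot_unrolled4(const float *taps, const float *in, size_t n)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t j = 0;

    for (; j + 4 <= n; j += 4) {
        s0 += taps[j]     * in[j];
        s1 += taps[j + 1] * in[j + 1];
        s2 += taps[j + 2] * in[j + 2];
        s3 += taps[j + 3] * in[j + 3];
    }

    float sum = (s0 + s1) + (s2 + s3);

    /* Remainder loop for lengths that are not a multiple of 4. */
    for (; j < n; j++)
        sum += taps[j] * in[j];

    return sum;
}
```

Splitting the sum over several accumulators changes the order of the floating-point additions, so the result can differ in the last bits from the simple loop. That reordering is also what a compiler has to do to vectorize such a reduction, which is why a flag like -ffast-math matters here.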
Using '-ftree-vectorize -ffast-math -O3' and a profile run:
Original: Direct 3419k, Interpolate 9255k
This version: Direct 1629k, Interpolate 8588k
My loop transformations allow GCC to recognize the direct case as
vectorizable, giving a very nice speedup (3419k down to 1629k, roughly
2.1x). For interpolate, we're again hurt by the loop doing too much work
(9255k down to 8588k, about 7%). Note though that GCC
currently does not vectorize the inner loop for interpolate as it'...