search for: fstride_2

Displaying 2 results from an estimated 2 matches for "fstride_2".

2014 Nov 09
0
[RFC PATCH v1] arm: kf_bfly4: Introduce ARM neon intrinsics
...vcreate_f32(ONES_MINUS_ONE);
+ float32x2_t minusones_2 = vcreate_f32(MINUS_ONE);
+ float32x4_t ones = vcombine_f32(ones_2, ones_2);
+ float32x4_t minusones = vcombine_f32(minusones_2, minusones_2);
+ float32x4_t t;
+ float32x4x2_t tv;
+ float *tw1, *tw2, *tw3;
+ float *tw1_2, *tw2_2, *tw3_2;
+ int fstride_2 = 2*fstride;
+ int fs_tw1 = 2*fstride_2;
+ int fs_tw2 = 4*fstride_2;
+ int fs_tw3 = 6*fstride_2;
+ int fs_x = 3*fstride_2;
+ const int m1 = 2*m;
+ const int m2 = 4*m; // 2*(2*m)
+ const int m3 = 6*m; // 3*(2*m)
+ kiss_fft_cpx *Fout_beg = Fout;
+ float32x4_t tw[3];
+ float32x2_t tw_2[6];
+ float *ai...
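The `int fstride_2 = 2*fstride;` line in the snippet above is consistent with walking interleaved complex data through `float *` pointers, where each complex element occupies two floats, so a stride counted in complex elements doubles when counted in floats. The following is only a minimal sketch of that indexing idea, not the patch's code: the `cpx` struct and both helper functions are hypothetical names, and the layout (real part followed by imaginary part, no padding) is an assumption modeled on a float build of `kiss_fft_cpx`.

```c
#include <assert.h>

/* Hypothetical stand-in for a float-build complex type:
 * real part first, imaginary part second, no padding assumed. */
typedef struct {
    float r;
    float i;
} cpx;

/* Return the real part of tw[k * fstride], addressed through a float
 * pointer. One complex element spans two floats, so the stride in
 * floats is twice the stride in complex elements. */
static float real_via_float_ptr(const cpx *tw, int fstride, int k)
{
    assert(sizeof(cpx) == 2 * sizeof(float)); /* layout assumption */
    const float *f = (const float *)tw;
    int fstride_2 = 2 * fstride;              /* stride counted in floats */
    return f[k * fstride_2];
}

/* Return the imaginary part of tw[k * fstride]; it sits one float
 * after the real part. */
static float imag_via_float_ptr(const cpx *tw, int fstride, int k)
{
    assert(sizeof(cpx) == 2 * sizeof(float)); /* layout assumption */
    const float *f = (const float *)tw;
    int fstride_2 = 2 * fstride;
    return f[k * fstride_2 + 1];
}
```

With `fstride = 2` and `k = 3`, both helpers read element `tw[6]`, the same element that complex-typed indexing `tw[k * fstride]` would reach.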
2014 Nov 09
3
[RFC PATCH v1] arm: kf_bfly4: Introduce ARM neon intrinsics
Hello, This patch introduces ARM NEON intrinsics to optimize the kf_bfly4 routine in the CELT part of libopus. Using the NEON-optimized kf_bfly4(_neon) routine improved the performance of the opus_fft_impl function by about 21.4%. The end use case, decoding an Ogg Opus music file, saw a performance improvement of about 4.47%. This patch has 2 components: i. Actual NEON code to improve