search for: xcorr3to1_kernel_neon_float

Displaying 4 results from an estimated 4 matches for "xcorr3to1_kernel_neon_float".

2014 Dec 09
[RFC PATCH v2] armv7: celt_pitch_xcorr: Introduce ARM neon intrinsics
...
> + SUMM = vmlaq_lane_f32(SUMM, YY[0], XX_2, 0);
> + YY[0] = vld1q_f32(yi++);
> + case 1:
> + XX_2 = vld1_dup_f32(xi++);
> + SUMM = vmlaq_lane_f32(SUMM, YY[0], XX_2, 0);
> + }
> +
> + vst1q_f32(sum, SUMM);
> +}
> +
> +/*
> + * Function: xcorr3to1_kernel_neon_float
> + * ---------------------------------
> + * Computes single correlation values and stores in *sum
> + */
> +void xcorr3to1_kernel_neon_float(const float *x, const float *y,
> + float *sum, int len) {

I had to think quite a bit about what "3to1" meant (since...
2014 Dec 07
[RFC PATCH v2] cover: armv7: celt_pitch_xcorr: Introduce ARM neon intrinsics
Hi,

Optimizes celt_pitch_xcorr for floating point.

Changes from RFCv1:
- Rebased on top of commit aad281878 ("Fix celt_pitch_xcorr_c signature."), which got rid of the ugly code around CELT_PITCH_XCORR_IMPL for passing the "arch" parameter.
- Unified with the --enable-intrinsics option used by x86.
- Modified the algorithm to be more in line with the algorithm in celt_pitch_xcorr_arm.s.

Viswanath Puttagunta
2014 Dec 07
[RFC PATCH v2] armv7: celt_pitch_xcorr: Introduce ARM neon intrinsics
...d1q_f32(yi++);
+ case 2:
+ XX_2 = vld1_dup_f32(xi++);
+ SUMM = vmlaq_lane_f32(SUMM, YY[0], XX_2, 0);
+ YY[0] = vld1q_f32(yi++);
+ case 1:
+ XX_2 = vld1_dup_f32(xi++);
+ SUMM = vmlaq_lane_f32(SUMM, YY[0], XX_2, 0);
+ }
+
+ vst1q_f32(sum, SUMM);
+}
+
+/*
+ * Function: xcorr3to1_kernel_neon_float
+ * ---------------------------------
+ * Computes single correlation values and stores in *sum
+ */
+void xcorr3to1_kernel_neon_float(const float *x, const float *y,
+ float *sum, int len) {
+ int i;
+ float32x4_t XX[4];
+ float32x4_t YY[4];
+ float32x4_t SUMM;
+ float32x2_...
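For orientation, the comment in the snippet above says the kernel "computes single correlation values and stores in *sum". A minimal scalar sketch of that idea, assuming it means one dot product of x and y over len samples, might look like the following. The name xcorr_single_scalar_ref is hypothetical and is not part of the patch; only the parameter list is borrowed from the signature shown in the snippet, and the NEON code in the patch may differ in accumulation order and rounding.

```c
#include <stddef.h>

/* Hypothetical scalar reference, NOT taken from the patch.
 * Assumption: a "single correlation value" is the dot product
 * of x[0..len-1] and y[0..len-1], written to *sum. */
void xcorr_single_scalar_ref(const float *x, const float *y,
                             float *sum, int len) {
    float acc = 0.0f;
    int i;
    for (i = 0; i < len; i++) {
        acc += x[i] * y[i];   /* multiply-accumulate one sample pair */
    }
    *sum = acc;               /* one output value, unlike a 4-wide kernel */
}
```

A reference like this is mainly useful for checking a vectorized kernel against a plain loop on small inputs, with a tolerance to allow for floating-point reordering.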
2014 Dec 07
[RFC PATCH v2] cover: armv7: celt_pitch_xcorr: Introduce ARM neon intrinsics
From: Viswanath Puttagunta <viswanath.puttagunta at linaro.org>

Hi,

Optimizes celt_pitch_xcorr for floating point.

Changes from RFCv1:
- Rebased on top of commit aad281878 ("Fix celt_pitch_xcorr_c signature."), which got rid of the ugly code around CELT_PITCH_XCORR_IMPL for passing the "arch" parameter.
- Unified with the --enable-intrinsics option used by x86.
- Modified algorithm to be more