search for: rnum_s16x4

Displaying 3 results from an estimated 3 matches for "rnum_s16x4".

2016 Jul 14
0
[PATCH 2/5] Optimize fixed-point celt_fir_c() for ARM NEON
...+ for (i=0;i<N-7;i+=8) + { + int16x8_t x_s16x8 = vld1q_s16(_x+i); + int32x4_t sum0_s32x4 = vshll_n_s16(vget_low_s16 (x_s16x8), SIG_SHIFT); + int32x4_t sum1_s32x4 = vshll_n_s16(vget_high_s16(x_s16x8), SIG_SHIFT); + for (j=0;j<ord;j+=4) + { + const int16x4_t rnum_s16x4 = vld1_s16(rnum+j); + x_s16x8 = vld1q_s16(x+i+j+0); + sum0_s32x4 = vmlal_lane_s16(sum0_s32x4, vget_low_s16 (x_s16x8), rnum_s16x4, 0); + sum1_s32x4 = vmlal_lane_s16(sum1_s32x4, vget_high_s16(x_s16x8), rnum_s16x4, 0); + x_s16x8 = vld1q_s16(x+i+j+1); + sum0_s32x...
2016 Jun 17
5
ARM NEON optimization -- celt_fir()
Hi all, This is Linfeng Zhang from Google. I'll work on ARM NEON optimization in the next few months. I'm submitting 2 patches in the following couple of emails, which have the new created celt_fir_neon(). I revised celt_fir_c() to not pass in argument "mem" in Patch 1. If there are concerns to this change, please let me know. Many thanks to your comments. Linfeng Zhang
2016 Jul 14
6
Several patches of ARM NEON optimization
I rebased my previous 3 patches to the current master with minor changes. Patches 1 to 3 replace all my previous submitted patches. Patches 4 and 5 are new. Thanks, Linfeng Zhang