thr3ads.net - search: "interpolate_product

2009 Oct 26

1

[PATCH] Fix miscompile of SSE resampler

...erpolate_single(SpeexResamplerState *st, spx_uint3 sum = MULT16_32_Q15(interp[0],SHR32(accum[0], 1)) + MULT16_32_Q15(interp[1],SHR32(accum[1], 1)) + MULT16_32_Q15(interp[2],SHR32(accum[2], 1)) + MULT16_32_Q15(interp[3],SHR32(accum[3], 1)); #else cubic_coef(frac, interp); - sum = interpolate_product_single(iptr, st->sinc_table + st->oversample + 4 - offset - 2, N, st->oversample, interp); + interpolate_product_single(&sum, iptr, st->sinc_table + st->oversample + 4 - offset - 2, N, st->oversample, interp); #endif out[out_stride * out_sample++] = SATURATE32(PS...

[PATCH 3/5] resample: Add NEON optimized inner_product_single for fixed point

2011 Sep 01

0

...erp); sum = MULT16_32_Q15(interp[0],SHR32(accum[0], 1)) + MULT16_32_Q15(interp[1],SHR32(accum[1], 1)) + MULT16_32_Q15(interp[2],SHR32(accum[2], 1)) + MULT16_32_Q15(interp[3],SHR32(accum[3], 1)); + sum = SATURATE32PSHR(sum, 15, 32767); #else cubic_coef(frac, interp); sum = interpolate_product_single(iptr, st->sinc_table + st->oversample + 4 - offset - 2, N, st->oversample, interp); #endif - out[out_stride * out_sample++] = SATURATE32(PSHR32(sum, 14), 32767); + out[out_stride * out_sample++] = sum; last_sample += int_advance; samp_frac_num += frac_advan...
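The output line in the snippet above, SATURATE32(PSHR32(sum, 14), 32767), combines a rounding right shift with a symmetric clamp. A minimal sketch of that arithmetic follows. The helper names pshr32, saturate32 and to_sample are hypothetical stand-ins written for illustration, not the project's macros, and the 64-bit intermediate is an assumption made here to avoid overflow when adding the rounding term.

```c
#include <stdint.h>

/* Hypothetical helper: right shift by `shift` bits with rounding to
   nearest, done by adding half of the final LSB before shifting.
   Right-shifting a negative value is implementation-defined in C;
   common compilers perform an arithmetic shift. */
static int32_t pshr32(int32_t a, int shift)
{
    int64_t half = ((int64_t)1 << shift) >> 1;
    return (int32_t)(((int64_t)a + half) >> shift);
}

/* Hypothetical helper: clamp x to the symmetric range [-limit, limit]. */
static int32_t saturate32(int32_t x, int32_t limit)
{
    if (x > limit)
        return limit;
    if (x < -limit)
        return -limit;
    return x;
}

/* Convert an accumulator to a 16-bit sample the way the snippet's
   output line does: round away 14 fractional bits, then clamp. */
static int16_t to_sample(int32_t sum)
{
    return (int16_t)saturate32(pshr32(sum, 14), 32767);
}
```

Clamping after the shift keeps an accumulator that overshoots the 16-bit range from wrapping around when it is narrowed to a sample.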

Resampler (no api)

2008 May 03

2

...a+i), _mm_loadu_ps(b+i))); + sum = _mm_add_ps(sum, _mm_mul_ps(_mm_loadu_ps(a+i+4), _mm_loadu_ps(b+i+4))); + } + sum = _mm_add_ps(sum, _mm_movehl_ps(sum, sum)); + sum = _mm_add_ss(sum, _mm_shuffle_ps(sum, sum, 0x55)); + _mm_store_ss(&ret, sum); + return ret; +} + +#define OVERRIDE_INTERPOLATE_PRODUCT_SINGLE +static inline float interpolate_product_single(const float *a, const float *b, unsigned int len, const spx_uint32_t oversample, float *frac) { + int i; + float ret; + __m128 sum = _mm_setzero_ps(); + __m128 f = _mm_loadu_ps(frac); + for(i=0;i<len;i+=2) + { + sum = _mm_add_ps(sum, _mm_m...
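The SSE snippet above is cut off before the loop body of interpolate_product_single, so the following is only a plain-C sketch of what such a routine plausibly computes, based on its signature: four running dot products of the input against four adjacent filter taps spaced `oversample` apart, blended by the four weights in `frac`. The function name interpolate_product_scalar and the exact indexing of `b` are assumptions made for illustration, not the patch's code.

```c
#include <stddef.h>

/* Hypothetical scalar reference. For each input sample a[i], accumulate
   its product with four consecutive filter taps starting at
   b[i * oversample], then combine the four accumulators using the
   interpolation weights frac[0..3]. */
static float interpolate_product_scalar(const float *a, const float *b,
                                        size_t len, size_t oversample,
                                        const float *frac)
{
    float accum[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    size_t i, k;

    for (i = 0; i < len; i++) {
        for (k = 0; k < 4; k++)
            accum[k] += a[i] * b[i * oversample + k];
    }

    /* Weighted sum of the four partial dot products. */
    return frac[0] * accum[0] + frac[1] * accum[1]
         + frac[2] * accum[2] + frac[3] * accum[3];
}
```

A vectorized version would hold the four accumulators in one 128-bit register and finish with a horizontal add, which is the role of the _mm_movehl_ps and _mm_shuffle_ps steps visible in the snippet.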

Resampler, memory only variant

2008 May 03

0

...a+i), _mm_loadu_ps(b+i))); + sum = _mm_add_ps(sum, _mm_mul_ps(_mm_loadu_ps(a+i+4), _mm_loadu_ps(b+i+4))); + } + sum = _mm_add_ps(sum, _mm_movehl_ps(sum, sum)); + sum = _mm_add_ss(sum, _mm_shuffle_ps(sum, sum, 0x55)); + _mm_store_ss(&ret, sum); + return ret; +} + +#define OVERRIDE_INTERPOLATE_PRODUCT_SINGLE +static inline float interpolate_product_single(const float *a, const float *b, unsigned int len, const spx_uint32_t oversample, float *frac) { + int i; + float ret; + __m128 sum = _mm_setzero_ps(); + __m128 f = _mm_loadu_ps(frac); + for(i=0;i<len;i+=2) + { + sum = _mm_add_ps(sum, _mm_m...

[PATCH 0/5] ARM NEON optimization for samplerate converter

2011 Sep 01

6

From: Jyri Sarha <jsarha at ti.com> I optimized the Speex resampler for NEON-capable ARM CPUs. The first patch should speed up resampling on any platform that can spare the increased memory usage. It would be nice to have these merged to the master branch. Please let me know if there is anything I can do to help the merge. The patches have been rebased on top of master branch in
