Displaying 20 results from an estimated 20 matches for "s_mul".

2014 Feb 21
2
Make check failure on clone from 31 January
..., t1; + /* We swap real and imag because we're using an FFT instead of an IFFT. */ + re = yp0[1]; + im = yp0[0]; + t0 = t[i]; + t1 = t[N4+i]; + /* We'd scale up by 2 here, but instead it's done when mixing the windows */ + yp0[0] = S_MUL(re,t0) + S_MUL(im,t1); + yp0[1] = S_MUL(re,t1) - S_MUL(im,t0); + } } /* Mirror on both sides for TDAC */ Regards, Marcello On 05/02/2014 18:46, "Gregory Maxwell" <gmaxwell at gmail.com> wrote: >On Wed, Feb 5, 2014 at 8:05 AM, Marcello Caramma (mcaram...
2015 Mar 04
0
[RFC PATCH v1] armv7(float): Optimize decode usecase using NE10 library
...alar * OPUS_RESTRICT xp1 = in; + const kiss_fft_scalar * OPUS_RESTRICT xp2 = in+stride*(N2-1); + kiss_fft_scalar * OPUS_RESTRICT yp = f; + const kiss_twiddle_scalar * OPUS_RESTRICT t = &trig[0]; + for(i=0;i<N4;i++) + { + kiss_fft_scalar yr, yi; + yr = S_MUL(*xp2, t[i]) + S_MUL(*xp1, t[N4+i]); + yi = S_MUL(*xp1, t[i]) - S_MUL(*xp2, t[N4+i]); + yp[2*i] = yr; + yp[2*i+1] = yi; + xp1+=2*stride; + xp2-=2*stride; + } + } + + opus_ifft(st, (kiss_fft_cpx *)f, (kiss_fft_cpx*)(out+(overlap>>1)), arch); + +...
2015 Apr 28
0
[RFC PATCH v1 2/8] armv7(float): Optimize decode usecase using NE10 library
...alar * OPUS_RESTRICT xp1 = in; + const kiss_fft_scalar * OPUS_RESTRICT xp2 = in+stride*(N2-1); + kiss_fft_scalar * OPUS_RESTRICT yp = f; + const kiss_twiddle_scalar * OPUS_RESTRICT t = &trig[0]; + for(i=0;i<N4;i++) + { + kiss_fft_scalar yr, yi; + yr = S_MUL(*xp2, t[i]) + S_MUL(*xp1, t[N4+i]); + yi = S_MUL(*xp1, t[i]) - S_MUL(*xp2, t[N4+i]); + yp[2*i] = yr; + yp[2*i+1] = yi; + xp1+=2*stride; + xp2-=2*stride; + } + } + + opus_ifft(st, (kiss_fft_cpx *)f, (kiss_fft_cpx*)(out+(overlap>>1)), arch); + +...
2015 Mar 04
1
[RFC PATCH v1] Decode(float) optimize using libNe10
Hello All, I extended the libNE10 optimizations for float to cover mdct_backwards/opus_ifft. I now get about a 14.26% improvement for the decode use case on my Beaglebone Black. Please see [1] for measurements. Questions: 1. Since this patch needs to go in after the Encode patch [2], should I submit this as a patch series? 2. Since Jonathan Lennox posted an intrinsics cleanup patch [3], should
2014 Feb 24
1
Make check failure on clone from 31 January
...because we're using an FFT instead of > an IFFT. */ > + re = yp0[1]; > + im = yp0[0]; > + t0 = t[i]; > + t1 = t[N4+i]; > + /* We'd scale up by 2 here, but instead it's done when mixing > the windows */ > + yp0[0] = S_MUL(re,t0) + S_MUL(im,t1); > + yp0[1] = S_MUL(re,t1) - S_MUL(im,t0); > + } > } > > /* Mirror on both sides for TDAC */ > > > Regards, > > Marcello > > > > On 05/02/2014 18:46, "Gregory Maxwell" <gmaxwell at gmail.com> wrot...
2014 Feb 22
0
Make check failure on clone from 31 January
...because we're using an FFT instead of > an IFFT. */ > + re = yp0[1]; > + im = yp0[0]; > + t0 = t[i]; > + t1 = t[N4+i]; > + /* We'd scale up by 2 here, but instead it's done when mixing > the windows */ > + yp0[0] = S_MUL(re,t0) + S_MUL(im,t1); > + yp0[1] = S_MUL(re,t1) - S_MUL(im,t0); > + } > } > > /* Mirror on both sides for TDAC */ > > > Regards, > > Marcello > > > > On 05/02/2014 18:46, "Gregory Maxwell" <gmaxwell at gmail.com> wrot...
2014 Feb 05
4
Make check failure on clone from 31 January
Hi, Apologies if this is a known issue, but running make on revision e3187444692195957eb66989622c7b1ad8448b06 fails one of the tests when using the fixed-point configuration (floating point is OK) on my Linux x86 machine. Note that libopus1.1, as extracted from the tarball, is OK. Specifically, the tests that fail are in celt/tests/test_unit_mdct: nfft=32 inverse=0,snr = 85.341197 nfft=32 inverse=1,snr =
2015 Jan 20
0
[RFC PATCH v1 2/2] armv7(float): Optimize encode usecase using NE10 library
...celt/tests/test_unit_dft.c | 14 +- celt/tests/test_unit_mdct.c | 19 +- celt_headers.mk | 3 + celt_sources.mk | 6 + configure.ac | 81 ++++++++ src/analysis.c | 2 +- src/opus_multistream_encoder.c | 3 +- 25 files changed, 1278 insertions(+), 24 deletions(-) create mode 100644 celt/arm/arm_celt_ne10_fft_map.c create mode 100644 celt/arm/arm_celt_ne10_mdct_map.c create mode 100644 celt/arm/celt_ne10_fft.c create mode 100644 celt/arm/celt_ne10_mdct.c create mode...
2015 Jan 20
6
[RFC PATCH v1 0/2] Encode optimize using libNE10
...celt/tests/test_unit_mdct.c | 19 +- celt/x86/x86cpu.c | 22 +- celt_headers.mk | 3 + celt_sources.mk | 6 + configure.ac | 81 ++++++++ src/analysis.c | 2 +- src/opus_multistream_encoder.c | 3 +- 27 files changed, 1307 insertions(+), 36 deletions(-) create mode 100644 celt/arm/arm_celt_ne10_fft_map.c create mode 100644 celt/arm/arm_celt_ne10_mdct_map.c create mode 100644 celt/arm/celt_ne10_fft.c create mode 100644 celt/arm/celt_ne10_mdct.c create mode...
2015 Feb 04
0
[RFC PATCH v2] armv7(float): Optimize encode usecase using NE10 library
..._headers.mk | 3 + celt_sources.mk | 4 + configure.ac | 81 +++++++ src/analysis.c | 8 +- src/analysis.h | 2 +- src/opus_encoder.c | 2 +- src/opus_multistream_encoder.c | 9 +- 29 files changed, 1422 insertions(+), 105 deletions(-) create mode 100644 celt/arm/celt_ne10_fft.c create mode 100644 celt/arm/celt_ne10_mdct.c create mode 100644 celt/arm/fft_arm.h create mode 100644 celt/arm/mdct_arm.h create mode 100644 celt/dump_modes/du...
2015 Mar 03
0
[RFC PATCHv3] armv7(float): Optimize encode usecase using NE10 library
...celt_headers.mk | 3 + celt_sources.mk | 4 + configure.ac | 81 +++++++ src/analysis.c | 8 +- src/analysis.h | 2 +- src/opus_encoder.c | 2 +- src/opus_multistream_encoder.c | 9 +- 29 files changed, 1423 insertions(+), 105 deletions(-) create mode 100644 celt/arm/celt_ne10_fft.c create mode 100644 celt/arm/celt_ne10_mdct.c create mode 100644 celt/arm/fft_arm.h create mode 100644 celt/arm/mdct_arm.h create mode 100644 celt/dump_modes/dum...
2015 May 08
0
[[RFC PATCH v2]: Ne10 fft fixed and previous 1/8] armv7(float): Optimize encode usecase using NE10 library
...celt_headers.mk | 3 + celt_sources.mk | 4 + configure.ac | 81 +++++++ src/analysis.c | 8 +- src/analysis.h | 2 +- src/opus_encoder.c | 2 +- src/opus_multistream_encoder.c | 9 +- 30 files changed, 1435 insertions(+), 105 deletions(-) create mode 100644 celt/arm/celt_ne10_fft.c create mode 100644 celt/arm/celt_ne10_mdct.c create mode 100644 celt/arm/fft_arm.h create mode 100644 celt/arm/mdct_arm.h create mode 100644 celt/dump_modes/dum...
2015 Mar 18
5
[RFC PATCH v1 0/4] Enable aarch64 intrinsics/Ne10
...c_FIX_sse.h | 17 ++ silk/x86/main_sse.h | 48 ++++ silk/x86/x86_silk_map.c | 25 +- src/analysis.c | 8 +- src/analysis.h | 2 +- src/opus_encoder.c | 2 +- src/opus_multistream_encoder.c | 9 +- win32/VS2010/celt.vcxproj | 17 +- win32/VS2010/celt.vcxproj.filters | 27 +++ win32/VS2010/silk_common.vcxproj | 17 +- win32/VS2010/silk_common.vcxproj.filters | 23 +- win32/VS2010/silk_fixed.vcxproj | 13 +- win32...
2015 Mar 03
1
[RFC PATCH v4] Enable optimize using libNe10
...celt_headers.mk | 3 + celt_sources.mk | 4 + configure.ac | 81 +++++++ src/analysis.c | 8 +- src/analysis.h | 2 +- src/opus_encoder.c | 2 +- src/opus_multistream_encoder.c | 9 +- 29 files changed, 1423 insertions(+), 105 deletions(-) create mode 100644 celt/arm/celt_ne10_fft.c create mode 100644 celt/arm/celt_ne10_mdct.c create mode 100644 celt/arm/fft_arm.h create mode 100644 celt/arm/mdct_arm.h create mode 100644 celt/dump_modes/dum...
2015 Feb 04
4
[RFC PATCH v2] Encode optimize using libNe10
..._headers.mk | 3 + celt_sources.mk | 4 + configure.ac | 81 +++++++ src/analysis.c | 8 +- src/analysis.h | 2 +- src/opus_encoder.c | 2 +- src/opus_multistream_encoder.c | 9 +- 29 files changed, 1422 insertions(+), 105 deletions(-) create mode 100644 celt/arm/celt_ne10_fft.c create mode 100644 celt/arm/celt_ne10_mdct.c create mode 100644 celt/arm/fft_arm.h create mode 100644 celt/arm/mdct_arm.h create mode 100644 celt/dump_modes/du...
2015 Mar 31
6
[RFC PATCH v1 0/5] aarch64: celt_pitch_xcorr: Fixed point series
...c_FIX_sse.h | 17 ++ silk/x86/main_sse.h | 48 ++++ silk/x86/x86_silk_map.c | 25 +- src/analysis.c | 8 +- src/analysis.h | 2 +- src/opus_encoder.c | 2 +- src/opus_multistream_encoder.c | 9 +- win32/VS2010/celt.vcxproj | 17 +- win32/VS2010/celt.vcxproj.filters | 27 +++ win32/VS2010/silk_common.vcxproj | 17 +- win32/VS2010/silk_common.vcxproj.filters | 23 +- win32/VS2010/silk_fixed.vcxproj | 13 +- win32...
2015 Mar 03
2
[RFC PATCHv3] Encode optimize using libNe10
...celt_headers.mk | 3 + celt_sources.mk | 4 + configure.ac | 81 +++++++ src/analysis.c | 8 +- src/analysis.h | 2 +- src/opus_encoder.c | 2 +- src/opus_multistream_encoder.c | 9 +- 29 files changed, 1423 insertions(+), 105 deletions(-) create mode 100644 celt/arm/celt_ne10_fft.c create mode 100644 celt/arm/celt_ne10_mdct.c create mode 100644 celt/arm/fft_arm.h create mode 100644 celt/arm/mdct_arm.h create mode 100644 celt/dump_modes/dum...
2015 May 08
8
[RFC PATCH v2]: Ne10 fft fixed and previous 0/8]
...c_FIX_sse.h | 17 ++ silk/x86/main_sse.h | 48 ++++ silk/x86/x86_silk_map.c | 25 +- src/analysis.c | 8 +- src/analysis.h | 2 +- src/opus_encoder.c | 2 +- src/opus_multistream_encoder.c | 9 +- win32/VS2010/celt.vcxproj | 17 +- win32/VS2010/celt.vcxproj.filters | 27 +++ win32/VS2010/silk_common.vcxproj | 17 +- win32/VS2010/silk_common.vcxproj.filters | 23 +- win32/VS2010/silk_fixed.vcxproj | 13 +- win32...
2015 May 15
11
[RFC V3 0/8] Ne10 fft fixed and previous
...c_FIX_sse.h | 17 ++ silk/x86/main_sse.h | 48 ++++ silk/x86/x86_silk_map.c | 25 +- src/analysis.c | 8 +- src/analysis.h | 2 +- src/opus_encoder.c | 2 +- src/opus_multistream_encoder.c | 9 +- win32/VS2010/celt.vcxproj | 17 +- win32/VS2010/celt.vcxproj.filters | 27 +++ win32/VS2010/silk_common.vcxproj | 17 +- win32/VS2010/silk_common.vcxproj.filters | 23 +- win32/VS2010/silk_fixed.vcxproj | 13 +- win32...
2015 Apr 28
10
[RFC PATCH v1 0/8] Ne10 fft fixed and previous
...c_FIX_sse.h | 17 ++ silk/x86/main_sse.h | 48 ++++ silk/x86/x86_silk_map.c | 25 +- src/analysis.c | 8 +- src/analysis.h | 2 +- src/opus_encoder.c | 2 +- src/opus_multistream_encoder.c | 9 +- win32/VS2010/celt.vcxproj | 17 +- win32/VS2010/celt.vcxproj.filters | 27 +++ win32/VS2010/silk_common.vcxproj | 17 +- win32/VS2010/silk_common.vcxproj.filters | 23 +- win32/VS2010/silk_fixed.vcxproj | 13 +- win32...