search for: celt_inner_prod

Displaying 20 results from an estimated 24 matches for "celt_inner_prod".

2017 Jun 01
4
celt_inner_prod() and dual_inner_prod() NEON intrinsics
Hi, Attached are 5 patches related to celt_inner_prod() and dual_inner_prod() NEON intrinsics optimization. In 0004-Optimize-floating-point-celt_inner_prod-and-dual_inn.patch, the optimization changed the order of floating-point inner products, which will change the results. I created celt_inner_prod_neon_float_c_simulation() and dual_inner_prod_neon...
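The snippet above notes that changing the order of floating-point inner products changes the results. A minimal sketch of that effect in plain C, assuming IEEE-754 single precision with no extended intermediate precision; the function names are hypothetical and this is not the code from the patches:

```c
#include <assert.h>

/* Reference order: one accumulator, elements added left to right. */
static float inner_prod_sequential(const float *x, const float *y, int n)
{
    float acc = 0.0f;
    int i;
    assert(n >= 0);
    for (i = 0; i < n; i++)
        acc += x[i] * y[i];
    return acc;
}

/* Hypothetical scalar model of a 4-lane SIMD loop: lane k accumulates
   elements k, k+4, k+8, ...; the lanes are combined pairwise at the end.
   Assumes n is a multiple of 4 to keep the sketch short. */
static float inner_prod_4lanes(const float *x, const float *y, int n)
{
    float lane[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    int i, k;
    assert(n >= 0 && n % 4 == 0);
    for (i = 0; i < n; i += 4)
        for (k = 0; k < 4; k++)
            lane[k] += x[i + k] * y[i + k];
    return (lane[0] + lane[1]) + (lane[2] + lane[3]);
}
```

With x = {1e8, 1, 1, 1, -1e8, 1, 1, 1} and y all ones, the sequential loop loses the first three small terms to rounding, while the 4-lane order cancels the two large terms first, so the two functions should return different sums under the stated assumptions.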
2017 Jun 05
4
celt_inner_prod() and dual_inner_prod() NEON intrinsics
...ny) for each of your patches? Also, are these > all the patches you intend to merge for 1.2 or are there more upcoming > ones? > > Cheers, > > Jean-Marc > > On 01/06/17 06:33 PM, Linfeng Zhang wrote: > > Hi, > > > > Attached are 5 patches related to celt_inner_prod() > > and dual_inner_prod() NEON intrinsics optimization. > > > > In 0004-Optimize-floating-point-celt_inner_prod-and-dual_inn.patch, the > > optimization changed the order of floating-point inner products, which > > will change the results. I > > created celt_in...
2017 Jun 06
2
celt_inner_prod() and dual_inner_prod() NEON intrinsics
...ge them. Cheers, Jean-Marc > > Out of curiosity, what’s the CPU in the Chromebook you’re using to > test? > >> On Jun 1, 2017, at 6:33 PM, Linfeng Zhang <linfengz at google.com> >> wrote: >> >> Hi, >> >> Attached are 5 patches related to celt_inner_prod() and >> dual_inner_prod() NEON intrinsics optimization. >> >> In 0004-Optimize-floating-point-celt_inner_prod-and-dual_inn.patch, >> the optimization changed the order of floating-point inner >> products, which will change the results. I created >> celt_inner_p...
2015 Nov 05
2
AVX Optimizations
Yes, Thank you. I'll follow up with the AVX code and tests for pitch code. Radu -----Original Message----- From: opus-bounces at xiph.org [mailto:opus-bounces at xiph.org] On Behalf Of Timothy B. Terriberry Sent: Thursday, November 5, 2015 10:31 AM To: opus at xiph.org Subject: Re: [opus] AVX Optimizations Velea, Radu wrote: > I've created a pull request[1] to enable configuration
2017 Jun 05
0
celt_inner_prod() and dual_inner_prod() NEON intrinsics
On 05/06/17 03:28 PM, Linfeng Zhang wrote: > For fixed-point ARM, only > 0003-Optimize-fixed-point-celt_inner_prod-and-dual_inner_.patch changes > the performance. > For floating-point ARM, only > 0004-Optimize-floating-point-celt_inner_prod-and-dual_inn.patch changes the performance. Got any numbers? Cheers, Jean-Marc > Patch 1 and 2 are code clean-up and can only affect x86 performance. >...
2017 Jun 06
0
celt_inner_prod() and dual_inner_prod() NEON intrinsics
..._map tables, for the same reason we didn’t want it in the arm_silk_map tables. Out of curiosity, what’s the CPU in the Chromebook you’re using to test? > On Jun 1, 2017, at 6:33 PM, Linfeng Zhang <linfengz at google.com> wrote: > > Hi, > > Attached are 5 patches related to celt_inner_prod() and dual_inner_prod() NEON intrinsics optimization. > > In 0004-Optimize-floating-point-celt_inner_prod-and-dual_inn.patch, the optimization changed the order of floating-point inner products, which will change the results. I created celt_inner_prod_neon_float_c_simulation() and dual_inner...
2017 Jun 06
3
celt_inner_prod() and dual_inner_prod() NEON intrinsics
...;linfengz at google.com > <mailto:linfengz at google.com>> wrote: > > Hi Jean-Marc, > > I attached the new version in inner_prod_5patches_v2.zip which > synced to the current master. > > For fixed-point ARM, only > 0003-Optimize-fixed-point-celt_inner_prod-and-dual_inner_.patch > changes the performance. > For floating-point ARM, only > 0004-Optimize-floating-point-celt_inner_prod-and-dual_inn.patch changes the performance. > Patch 1 and 2 are code clean-up and can onl...
2017 Jun 06
0
celt_inner_prod() and dual_inner_prod() NEON intrinsics
...> Out of curiosity, what’s the CPU in the Chromebook you’re using to > > test? > > > >> On Jun 1, 2017, at 6:33 PM, Linfeng Zhang <linfengz at google.com> > >> wrote: > >> > >> Hi, > >> > >> Attached are 5 patches related to celt_inner_prod() and > >> dual_inner_prod() NEON intrinsics optimization. > >> > >> In 0004-Optimize-floating-point-celt_inner_prod-and-dual_inn.patch, > >> the optimization changed the order of floating-point inner > >> products, which will change the results. I creat...
2017 Jun 06
0
celt_inner_prod() and dual_inner_prod() NEON intrinsics
...to:linfengz at google.com>> wrote: > > > > Hi Jean-Marc, > > > > I attached the new version in inner_prod_5patches_v2.zip which > > synced to the current master. > > > > For fixed-point ARM, only > > 0003-Optimize-fixed-point-celt_inner_prod-and-dual_inner_.patch > > changes the performance. > > For floating-point ARM, only > > 0004-Optimize-floating-point-celt_inner_prod-and-dual_inn.patch changes the performance. > > Patch 1 and 2 are...
2017 Jun 02
0
celt_inner_prod() and dual_inner_prod() NEON intrinsics
...e know what's the expected effect on performance (if any) for each of your patches? Also, are these all the patches you intend to merge for 1.2 or are there more upcoming ones? Cheers, Jean-Marc On 01/06/17 06:33 PM, Linfeng Zhang wrote: > Hi, > > Attached are 5 patches related to celt_inner_prod() > and dual_inner_prod() NEON intrinsics optimization. > > In 0004-Optimize-floating-point-celt_inner_prod-and-dual_inn.patch, the > optimization changed the order of floating-point inner products, which > will change the results. I > created celt_inner_prod_neon_float_c_simulat...
2017 Jun 05
0
celt_inner_prod() and dual_inner_prod() NEON intrinsics
...e >> all the patches you intend to merge for 1.2 or are there more upcoming >> ones? >> >> Cheers, >> >> Jean-Marc >> >> On 01/06/17 06:33 PM, Linfeng Zhang wrote: >> > Hi, >> > >> > Attached are 5 patches related to celt_inner_prod() >> > and dual_inner_prod() NEON intrinsics optimization. >> > >> > In 0004-Optimize-floating-point-celt_inner_prod-and-dual_inn.patch, the >> > optimization changed the order of floating-point inner products, which >> > will change the results. I >&...
2015 Nov 05
0
AVX Optimizations
...Y_HAVE_SSE4_1(xcorr_kernel) /* avx */ }; #endif #if (defined(OPUS_X86_MAY_HAVE_SSE4_1) && !defined(OPUS_X86_PRESUME_SSE4_1)) || \ (!defined(OPUS_X86_MAY_HAVE_SSE_4_1) && defined(OPUS_X86_MAY_HAVE_SSE2) && !defined(OPUS_X86_PRESUME_SSE2)) opus_val32 (*const CELT_INNER_PROD_IMPL[OPUS_ARCHMASK + 1])( const opus_val16 *x, const opus_val16 *y, int N ) = { celt_inner_prod_c, /* non-sse */ celt_inner_prod_c, MAY_HAVE_SSE2(celt_inner_prod), MAY_HAVE_SSE4_1(celt_inner_prod), /* sse4.1 */ + MAY_H...
2015 Nov 05
2
AVX Optimizations
...MAY_HAVE_SSE4_1(xcorr_kernel) /* avx */ }; #endif #if (defined(OPUS_X86_MAY_HAVE_SSE4_1) && !defined(OPUS_X86_PRESUME_SSE4_1)) || \ (!defined(OPUS_X86_MAY_HAVE_SSE_4_1) && defined(OPUS_X86_MAY_HAVE_SSE2) && !defined(OPUS_X86_PRESUME_SSE2)) opus_val32 (*const CELT_INNER_PROD_IMPL[OPUS_ARCHMASK + 1])( const opus_val16 *x, const opus_val16 *y, int N ) = { celt_inner_prod_c, /* non-sse */ celt_inner_prod_c, MAY_HAVE_SSE2(celt_inner_prod), MAY_HAVE_SSE4_1(celt_inner_prod), /* sse4.1 */ + MAY_H...
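The diff fragment above declares a table of function pointers sized OPUS_ARCHMASK + 1, with the C version in the low slots and optimized versions in the higher ones. A self-contained sketch of that run-time dispatch pattern follows. All `demo_`/`DEMO_` names are hypothetical stand-ins, and indexing the table with `arch & DEMO_ARCHMASK` is an assumption about how such a table is typically used, not something shown in the snippet:

```c
#include <assert.h>

#define DEMO_ARCHMASK 3    /* hypothetical stand-in for OPUS_ARCHMASK */

typedef float demo_val16;  /* stand-ins for opus_val16 / opus_val32 */
typedef float demo_val32;  /* (float build assumed) */

/* Portable C fallback. */
static demo_val32 demo_inner_prod_c(const demo_val16 *x,
                                    const demo_val16 *y, int N)
{
    demo_val32 xy = 0;
    int i;
    for (i = 0; i < N; i++)
        xy += x[i] * y[i];
    return xy;
}

/* Placeholder for an optimized variant; here it only unrolls by two. */
static demo_val32 demo_inner_prod_opt(const demo_val16 *x,
                                      const demo_val16 *y, int N)
{
    demo_val32 xy0 = 0, xy1 = 0;
    int i;
    for (i = 0; i + 1 < N; i += 2) {
        xy0 += x[i] * y[i];
        xy1 += x[i + 1] * y[i + 1];
    }
    if (i < N)
        xy0 += x[i] * y[i];
    return xy0 + xy1;
}

/* One entry per detected CPU level; the low levels fall back to C. */
static demo_val32 (*const DEMO_INNER_PROD_IMPL[DEMO_ARCHMASK + 1])(
    const demo_val16 *x, const demo_val16 *y, int N) = {
    demo_inner_prod_c,   /* level 0 */
    demo_inner_prod_c,   /* level 1 */
    demo_inner_prod_opt, /* level 2 */
    demo_inner_prod_opt  /* level 3 */
};

/* Select the implementation at run time from the detected level. */
static demo_val32 demo_inner_prod(const demo_val16 *x,
                                  const demo_val16 *y, int N, int arch)
{
    assert(N >= 0);
    return DEMO_INNER_PROD_IMPL[arch & DEMO_ARCHMASK](x, y, N);
}
```

Masking the index keeps any `arch` value inside the table, which is one reason to size the array as mask + 1.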
2017 Jun 06
0
celt_inner_prod() and dual_inner_prod() NEON intrinsics
Thanks, Ulrich! Yes, using celt_assert(1.0 + celt_inner_prod_neon_float_c_simulation(x, y, N) == 1.0 + xy); celt_assert(1.0 + xy1_c == 1.0 + *xy1); celt_assert(1.0 + xy2_c == 1.0 + *xy2); can avoid the usage of VERY_SMALL. Hi Jean-Marc, I added { const opus_val32 xy_c = celt_inner_prod_neon_float_c_simulation(x, y, N);...
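One reading of why the `1.0 + a == 1.0 + b` form in the snippet above helps: the double literal 1.0 promotes both sums to double, so differences far below about 1.1e-16 are rounded away before the comparison, and tiny near-zero results compare equal without a separate threshold. A minimal sketch with a hypothetical helper name, assuming IEEE-754 doubles:

```c
#include <assert.h>

/* Hypothetical helper: compares two values after adding 1.0 to each.
   Both sums are evaluated in double, so any difference far below
   2^-53 (about 1.1e-16) is rounded away before the comparison. */
static int equal_after_adding_one(float a, float b)
{
    return 1.0 + a == 1.0 + b;
}

/* Usage sketch: tiny near-zero values compare equal even though they
   differ as floats; clearly different values still compare unequal. */
static void demo_checks(void)
{
    assert(equal_after_adding_one(1e-20f, 3e-20f));
    assert(!equal_after_adding_one(0.5f, 0.25f));
}
```

The comparison is still exact for values whose difference survives the addition, so it does not act as a general tolerance.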
2017 Jun 06
4
Antw: Re: celt_inner_prod() and dual_inner_prod() NEON intrinsics
...; wrote: >> > >> > Hi Jean-Marc, >> > >> > I attached the new version in inner_prod_5patches_v2.zip which >> > synced to the current master. >> > >> > For fixed-point ARM, only >> > 0003-Optimize-fixed-point-celt_inner_prod-and-dual_inner_.patch >> > changes the performance. >> > For floating-point ARM, only >> > 0004-Optimize-floating-point-celt_inner_prod-and-dual_inn.patch changes the performance. >> >...
2016 Sep 13
4
[PATCH 12/15] Replace call of celt_inner_prod_c() (step 1)
Should call celt_inner_prod(). --- celt/bands.c | 7 ++++--- celt/bands.h | 2 +- celt/celt_encoder.c | 6 +++--- celt/pitch.c | 2 +- src/opus_multistream_encoder.c | 2 +- 5 files changed, 10 insertions(+), 9 deletions(-) diff --git a/celt/bands.c b/celt/ban...
2015 Mar 13
1
[RFC PATCH v3] Intrinsics/RTCD related fixes. Mostly x86.
...+#ifndef OVERRIDE_DUAL_INNER_PROD +# define dual_inner_prod(x, y01, y02, N, xy1, xy2, arch) \ + ((void)(arch),dual_inner_prod_c(x, y01, y02, N, xy1, xy2)) #endif /*We make sure a C version is always available for cases where the overhead of @@ -169,6 +172,12 @@ static OPUS_INLINE opus_val32 celt_inner_prod_c(const opus_val16 *x, ((void)(arch),celt_inner_prod_c(x, y, N)) #endif +#ifdef NON_STATIC_COMB_FILTER_CONST_C +void comb_filter_const_c(opus_val32 *y, opus_val32 *x, int T, int N, + opus_val16 g10, opus_val16 g11, opus_val16 g12); +#endif + + #ifdef FIXED_POINT opus_val32 #else @@ -...
2015 Mar 12
1
[RFC PATCHv2] Intrinsics/RTCD related fixes. Mostly x86.
...+#ifndef OVERRIDE_DUAL_INNER_PROD +# define dual_inner_prod(x, y01, y02, N, xy1, xy2, arch) \ + ((void)(arch),dual_inner_prod_c(x, y01, y02, N, xy1, xy2)) #endif /*We make sure a C version is always available for cases where the overhead of @@ -169,6 +172,12 @@ static OPUS_INLINE opus_val32 celt_inner_prod_c(const opus_val16 *x, ((void)(arch),celt_inner_prod_c(x, y, N)) #endif +#ifdef NON_STATIC_COMB_FILTER_CONST_C +void comb_filter_const_c(opus_val32 *y, opus_val32 *x, int T, int N, + opus_val16 g10, opus_val16 g11, opus_val16 g12); +#endif + + #ifdef FIXED_POINT opus_val32 #else @@ -...
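The diff fragment above defines a default macro that discards `arch` and calls the C version unless an override macro is already defined. A small sketch of that compile-time fallback pattern, using hypothetical `sketch_`/`SKETCH_` names in place of the Opus identifiers:

```c
#include <assert.h>

/* Portable C version that is always available. */
static float sketch_inner_prod_c(const float *x, const float *y, int N)
{
    float xy = 0.0f;
    int i;
    assert(N >= 0);
    for (i = 0; i < N; i++)
        xy += x[i] * y[i];
    return xy;
}

/* A platform header could define SKETCH_OVERRIDE_INNER_PROD together
   with its own sketch_inner_prod() before this point. Otherwise the
   default ignores arch and forwards to the C version; the (void)(arch)
   cast evaluates the argument and avoids unused-value warnings. */
#ifndef SKETCH_OVERRIDE_INNER_PROD
# define sketch_inner_prod(x, y, N, arch) \
    ((void)(arch), sketch_inner_prod_c(x, y, N))
#endif
```

Callers always pass `arch`, so the same call site compiles whether or not a platform supplies an override.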
2015 Mar 02
13
Patch cleaning up Opus x86 intrinsics configury
The attached patch cleans up Opus's x86 intrinsics configury. It: * Makes --enable-intrinsics work with clang and other non-GCC compilers * Enables RTCD for the floating-point-mode SSE code in Celt. * Disables use of RTCD in cases where the compiler targets an instruction set by default. * Enables the SSE4.1 Silk optimizations that apply to the common parts of Silk when Opus is built in
2015 Mar 18
5
[RFC PATCH v1 0/4] Enable aarch64 intrinsics/Ne10
Hi All, Since I continue to base my work on top of Jonathan's patch, and my previous Ne10 fft/ifft/mdct_forward/backward patches, I thought it would be better to just post all new patches as a patch series. Please let me know if anyone disagrees with this approach. You can see wip branch of all latest patches at https://git.linaro.org/people/viswanath.puttagunta/opus.git Branch: