Displaying 16 results from an estimated 16 matches for "pitch_sse4_1".
2017 Aug 18
1
[PATCH] fix alignment exceptions
...rsion I used for this is:
| Android clang version 5.0.300080 (based on LLVM 5.0.300080)
| Target: x86_64-unknown-linux
If we think enough people use older versions of clang, a version of the
patch that looked at __clang_major__ and friends seems fair.
-- Ray
826% diff -c *old *new
*** pitch_sse4_1.s-old 2017-08-18 13:51:39.359084637 -0700
--- pitch_sse4_1.s-new 2017-08-18 13:51:54.595106450 -0700
***************
*** 73,80 ****
cmpl $4, %eax
jl .LBB0_8
# BB#7:
! movdqa (%edx,%edi,2), %xmm2
! movdqa (%esi,%edi,2), %xmm1
addl $4, %edi
movdqa %xmm2, %xmm3
pmullw %xmm1, %xmm2
--- 73,8...
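The suggestion above about looking at __clang_major__ could be sketched roughly as follows. This is only an illustration: __clang__ and __clang_major__ are real predefined clang macros, but SKETCH_CLANG_GATE, sketch_load_strategy() and the threshold of 5 are hypothetical placeholders. The thread only says the patch was tried with clang 5.0.300080; it does not say which versions need which treatment.

```c
#include <stddef.h>

/* ASSUMPTION: the version threshold (5) is a placeholder, not taken from
 * the thread. SKETCH_CLANG_GATE is a hypothetical macro name. */
#if defined(__clang__) && defined(__clang_major__)
# if __clang_major__ >= 5
#  define SKETCH_CLANG_GATE 1
# else
#  define SKETCH_CLANG_GATE 0
# endif
#else
# define SKETCH_CLANG_GATE 0   /* not clang: take the default path */
#endif

/* Hypothetical helper that reports which code path the gate selects. */
static const char *sketch_load_strategy(void)
{
    return SKETCH_CLANG_GATE ? "version-specific path" : "default path";
}
```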
2017 Aug 18
2
[PATCH] fix alignment exceptions
We see the MOVQ instruction but this patch deliberately uses it rather than
MOVDQA (128-bit aligned load). In the trace below, the final invocation is
not 128-bit aligned, but MOVDQA requires that alignment (the calling
function was pitch_sse4_1.c:90, in the 4-way N - i >= 4 loop).
07-31 11:00:13.469 210 2540 D opus_sse1: RBE
celt_inner_prod_sse4_1: x 0xeff3deb0 y 0xeff3deb0 N 32
07-31 11:00:13.469 210 2540 D opus_sse1: RBE
celt_inner_prod_sse4_1: x 0xeff3d7b0 y 0xeff3d7b0 N 32
07...
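To illustrate the alignment point in the message above, here is a minimal sketch of a 4-way inner-product step that reads four 16-bit samples with a 64-bit load, which has no 16-byte alignment requirement. It is not the Opus source: inner_prod_sketch() is a hypothetical name, and only standard SSE2 intrinsics are used, with a scalar fallback when SSE2 is not available.

```c
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper, not the Opus implementation. */
static int32_t inner_prod_sketch(const int16_t *x, const int16_t *y, int n)
{
    int32_t sum = 0;
    int i = 0;
#if defined(__SSE2__)
    __m128i acc = _mm_setzero_si128();
    int32_t lanes[4];
    /* 4-way loop: runs while at least 4 samples remain (N - i >= 4). */
    for (; i + 4 <= n; i += 4) {
        /* _mm_loadl_epi64 reads 64 bits (a MOVQ-style load) and does not
         * require the 16-byte alignment that an aligned 128-bit load
         * (_mm_load_si128, MOVDQA) would. */
        __m128i vx = _mm_loadl_epi64((const __m128i *)(x + i));
        __m128i vy = _mm_loadl_epi64((const __m128i *)(y + i));
        /* The upper 64 bits are zero, so only lanes 0 and 1 receive
         * non-zero pairwise products. */
        acc = _mm_add_epi32(acc, _mm_madd_epi16(vx, vy));
    }
    _mm_storeu_si128((__m128i *)lanes, acc);
    sum = lanes[0] + lanes[1] + lanes[2] + lanes[3];
#endif
    /* Scalar tail; also the whole computation when SSE2 is unavailable. */
    for (; i < n; i++)
        sum += (int32_t)x[i] * y[i];
    return sum;
}
```

Calling it with pointers offset by one sample from the start of a buffer exercises the unaligned case that an aligned 128-bit load would fault on.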
2017 Aug 22
0
[PATCH] fix alignment exceptions
...rsion I used for this is:
| Android clang version 5.0.300080 (based on LLVM 5.0.300080)
| Target: x86_64-unknown-linux
If we think enough people use older versions of clang, a version of the patch that looked at __clang_major__ and friends seems fair.
-- Ray
826% diff -c *old *new
*** pitch_sse4_1.s-old 2017-08-18 13:51:39.359084637 -0700
--- pitch_sse4_1.s-new 2017-08-18 13:51:54.595106450 -0700
***************
*** 73,80 ****
cmpl $4, %eax
jl .LBB0_8
# BB#7:
! movdqa (%edx,%edi,2), %xmm2
! movdqa (%esi,%edi,2), %xmm1
addl $4, %edi
movdqa %xmm2, %xmm3
pmullw %xmm1, %xmm2
--- 73,8...
2015 Aug 03
0
[PATCH 00/10] Patched cleaning up Opus x86 intrinsics configury
....c | 4 +
celt/x86/celt_lpc_sse.h | 12 +-
celt/x86/pitch_sse.c | 334 +++++++++++++------------------
celt/x86/pitch_sse.h | 261 ++++++++++--------------
celt/x86/pitch_sse2.c | 95 +++++++++
celt/x86/pitch_sse4_1.c | 195 ++++++++++++++++++
celt/x86/x86_celt_map.c | 76 ++++++-
celt/x86/x86cpu.c | 47 ++++-
celt/x86/x86cpu.h | 26 ++-
celt_sources.mk | 5 +-
configure.ac...
2016 Sep 01
1
[PATCH] vs2015: include files added in 76674fea
...t_guts.h" />
<ClInclude Include="..\..\include\opus.h" />
@@ -913,6 +914,7 @@
<ClCompile Include="..\..\celt\x86\pitch_sse.c" />
<ClCompile Include="..\..\celt\x86\pitch_sse2.c" />
<ClCompile Include="..\..\celt\x86\pitch_sse4_1.c" />
+ <ClCompile Include="..\..\celt\x86\vq_sse2.c" />
<ClCompile Include="..\..\celt\x86\x86cpu.c" />
<ClCompile Include="..\..\celt\x86\x86_celt_map.c" />
<ClCompile Include="..\..\silk\A2NLSF.c" />
diff -...
2015 Mar 13
1
[RFC PATCH v3] Intrinsics/RTCD related fixes. Mostly x86.
...e.c | 4 +
celt/x86/celt_lpc_sse.h | 12 +-
celt/x86/pitch_sse.c | 334 +++++++++++++------------------
celt/x86/pitch_sse.h | 256 ++++++++++-------------
celt/x86/pitch_sse2.c | 95 +++++++++
celt/x86/pitch_sse4_1.c | 195 ++++++++++++++++++
celt/x86/x86_celt_map.c | 76 ++++++-
celt/x86/x86cpu.c | 47 ++++-
celt/x86/x86cpu.h | 26 ++-
celt_sources.mk | 5 +-
configure.ac...
2015 Mar 12
1
[RFC PATCHv2] Intrinsics/RTCD related fixes. Mostly x86.
...e.c | 4 +
celt/x86/celt_lpc_sse.h | 12 +-
celt/x86/pitch_sse.c | 334 +++++++++++++------------------
celt/x86/pitch_sse.h | 256 ++++++++++-------------
celt/x86/pitch_sse2.c | 95 +++++++++
celt/x86/pitch_sse4_1.c | 195 ++++++++++++++++++
celt/x86/x86_celt_map.c | 76 ++++++-
celt/x86/x86cpu.c | 47 ++++-
celt/x86/x86cpu.h | 26 ++-
celt_sources.mk | 5 +-
configure.ac...
2015 Mar 02
13
Patch cleaning up Opus x86 intrinsics configury
The attached patch cleans up Opus's x86 intrinsics configury.
It:
* Makes --enable-intrinsics work with clang and other non-GCC compilers
* Enables RTCD for the floating-point-mode SSE code in Celt.
* Disables use of RTCD in cases where the compiler targets an instruction set by default.
* Enables the SSE4.1 Silk optimizations that apply to the common parts of Silk when Opus is built in
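The RTCD (run-time CPU detection) items in the list above can be pictured with a small dispatch-table sketch. This is an assumption-laden illustration, not the code in x86cpu.c or x86_celt_map.c: every name here (sketch_arch, inner_prod_table, detect_arch_stub, SKETCH_INNER_PROD) is hypothetical, the "SSE" variants are plain C stand-ins, and detection is stubbed rather than done with CPUID.

```c
#include <stdint.h>

/* Hypothetical capability levels, lowest to highest. */
enum sketch_arch { ARCH_C = 0, ARCH_SSE2 = 1, ARCH_SSE4_1 = 2, ARCH_COUNT = 3 };

typedef int32_t (*inner_prod_fn)(const int16_t *, const int16_t *, int);

/* Plain C reference implementation. */
static int32_t inner_prod_c(const int16_t *x, const int16_t *y, int n)
{
    int32_t sum = 0;
    int i;
    for (i = 0; i < n; i++)
        sum += (int32_t)x[i] * y[i];
    return sum;
}

/* Stand-ins for vectorised variants; a real build would use intrinsics. */
static int32_t inner_prod_sse2_stub(const int16_t *x, const int16_t *y, int n)
{
    return inner_prod_c(x, y, n);
}

static int32_t inner_prod_sse4_1_stub(const int16_t *x, const int16_t *y, int n)
{
    return inner_prod_c(x, y, n);
}

/* RTCD: one table entry per capability level, selected at run time. */
static const inner_prod_fn inner_prod_table[ARCH_COUNT] = {
    inner_prod_c, inner_prod_sse2_stub, inner_prod_sse4_1_stub
};

/* Stub: real detection would query the CPU (for example via CPUID). */
static enum sketch_arch detect_arch_stub(void)
{
    return ARCH_C;
}

/* If the compiler already targets SSE4.1 by default, skip the table and
 * call the variant directly; otherwise dispatch through the table. */
#if defined(__SSE4_1__)
# define SKETCH_INNER_PROD(arch, x, y, n) \
    ((void)(arch), inner_prod_sse4_1_stub((x), (y), (n)))
#else
# define SKETCH_INNER_PROD(arch, x, y, n) \
    (inner_prod_table[(arch)]((x), (y), (n)))
#endif
```

The compile-time branch mirrors the third bullet: when the target instruction set is guaranteed by the compiler flags, run-time dispatch adds nothing and can be bypassed.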
2017 Aug 18
2
[PATCH] fix alignment exceptions
Hi,
Please find attached a patch to fix alignment exceptions. Without this
change, we were seeing occasional alignment faults when using this with
clang.
Thanks,
Felicia
2015 Mar 18
5
[RFC PATCH v1 0/4] Enable aarch64 intrinsics/Ne10
...elt_lpc_sse.c | 4 +
celt/x86/celt_lpc_sse.h | 12 +-
celt/x86/pitch_sse.c | 334 ++++++++++---------------
celt/x86/pitch_sse.h | 256 ++++++++------------
celt/x86/pitch_sse2.c | 95 ++++++++
celt/x86/pitch_sse4_1.c | 195 +++++++++++++++
celt/x86/x86_celt_map.c | 76 +++++-
celt/x86/x86cpu.c | 47 +++-
celt/x86/x86cpu.h | 26 +-
celt_headers.mk | 3 +
celt_sources.mk | 9 +...
2015 Mar 31
6
[RFC PATCH v1 0/5] aarch64: celt_pitch_xcorr: Fixed point series
...elt_lpc_sse.c | 4 +
celt/x86/celt_lpc_sse.h | 12 +-
celt/x86/pitch_sse.c | 334 ++++++++++---------------
celt/x86/pitch_sse.h | 256 ++++++++------------
celt/x86/pitch_sse2.c | 95 ++++++++
celt/x86/pitch_sse4_1.c | 195 +++++++++++++++
celt/x86/x86_celt_map.c | 76 +++++-
celt/x86/x86cpu.c | 47 +++-
celt/x86/x86cpu.h | 26 +-
celt_headers.mk | 3 +
celt_sources.mk | 9 +...
2015 May 08
8
[RFC PATCH v2]: Ne10 fft fixed and previous 0/8]
...elt_lpc_sse.c | 4 +
celt/x86/celt_lpc_sse.h | 12 +-
celt/x86/pitch_sse.c | 334 ++++++++++---------------
celt/x86/pitch_sse.h | 256 ++++++++------------
celt/x86/pitch_sse2.c | 95 ++++++++
celt/x86/pitch_sse4_1.c | 195 +++++++++++++++
celt/x86/x86_celt_map.c | 76 +++++-
celt/x86/x86cpu.c | 47 +++-
celt/x86/x86cpu.h | 26 +-
celt_headers.mk | 4 +
celt_sources.mk | 9 +...
2015 May 15
11
[RFC V3 0/8] Ne10 fft fixed and previous
...elt_lpc_sse.c | 4 +
celt/x86/celt_lpc_sse.h | 12 +-
celt/x86/pitch_sse.c | 334 ++++++++++---------------
celt/x86/pitch_sse.h | 256 ++++++++------------
celt/x86/pitch_sse2.c | 95 ++++++++
celt/x86/pitch_sse4_1.c | 195 +++++++++++++++
celt/x86/x86_celt_map.c | 76 +++++-
celt/x86/x86cpu.c | 47 +++-
celt/x86/x86cpu.h | 26 +-
celt_headers.mk | 4 +
celt_sources.mk | 9 +...
2015 Apr 28
10
[RFC PATCH v1 0/8] Ne10 fft fixed and previous
...elt_lpc_sse.c | 4 +
celt/x86/celt_lpc_sse.h | 12 +-
celt/x86/pitch_sse.c | 334 ++++++++++---------------
celt/x86/pitch_sse.h | 256 ++++++++------------
celt/x86/pitch_sse2.c | 95 ++++++++
celt/x86/pitch_sse4_1.c | 195 +++++++++++++++
celt/x86/x86_celt_map.c | 76 +++++-
celt/x86/x86cpu.c | 47 +++-
celt/x86/x86cpu.h | 26 +-
celt_headers.mk | 4 +
celt_sources.mk | 9 +...
2016 Jun 17
5
ARM NEON optimization -- celt_fir()
Hi all,
This is Linfeng Zhang from Google. I'll work on ARM NEON optimization in the
next few months.
I'm submitting 2 patches in the following couple of emails, which contain the
newly created celt_fir_neon().
I revised celt_fir_c() to not pass in argument "mem" in Patch 1. If there are
concerns about this change, please let me know.
Many thanks for your comments.
Linfeng Zhang
2016 Jul 01
1
silk_warped_autocorrelation_FIX() NEON optimization
Hi all,
I'm sending patch "Optimize silk_warped_autocorrelation_FIX() for ARM NEON" in a separate email.
It is based on Tim’s aarch64v8 branch https://git.xiph.org/?p=users/tterribe/opus.git;a=shortlog;h=refs/heads/aarch64v8
Thanks for your comments.
Linfeng