search for: sse4.1

Displaying 20 results from an estimated 163 matches for "sse4.1".

2014 Dec 02
2
[LLVMdev] Should more vector [zs]extloads be legal for X86 SSE4.1?
Hi Chandler, all, Why aren't the vector [zs]extloads introduced by SSE4.1/AVX2 declared legal? Is it a simple oversight, or did I miss a deeper reason? While cleaning up PMOV*X patterns, I stumbled upon this braindead testcase: %0 = load <8 x i8>* %src, align 1 %1 = zext <8 x i8> %0 to <8 x i16> turning into: pmovzxbw (%rsi), %xmm0
2015 Nov 05
2
AVX Optimizations
Yes, Thank you. I'll follow up with the AVX code and tests for pitch code. Radu -----Original Message----- From: opus-bounces at xiph.org [mailto:opus-bounces at xiph.org] On Behalf Of Timothy B. Terriberry Sent: Thursday, November 5, 2015 10:31 AM To: opus at xiph.org Subject: Re: [opus] AVX Optimizations Velea, Radu wrote: > I've created a pull request[1] to enable configuration
2009 Aug 14
0
[LLVMdev] udis86 sse4.1 and 4.2?
To disassemble jit code I typically use the udis86 support. Are there patches floating around to support SSE4.1 and SSE4.2 in this? I'd like to use it on a nehalem based machine and investigate the llvm code generation for SSE4.2 in a jit context. thanks bill -------------- next part -------------- An HTML attachment was scrubbed... URL:
2014 Mar 11
2
x86_64 SSE2/SSE41 optim not used
Hi Guys, In stream_decoder.c when assigning lpc restore function, only IA32 processor benefits from SS2 and SSE4.1 optimization. Shouldn't it be the case for x86_64 processor as well ? Thanks, -- Olivier TRISTAN uvi.net -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.xiph.org/pipermail/flac-dev/attachments/20140311/1d49b5c2/attachment.htm
2015 Nov 20
1
[PATCH] Fix x86 build if we presume SSE4.1 (and earlier), but not AVX.
--- celt/cpu_support.h | 3 ++- celt/x86/x86cpu.c | 3 ++- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/celt/cpu_support.h b/celt/cpu_support.h index 133abbf..68fc606 100644 --- a/celt/cpu_support.h +++ b/celt/cpu_support.h @@ -45,7 +45,8 @@ #elif (defined(OPUS_X86_MAY_HAVE_SSE) && !defined(OPUS_X86_PRESUME_SSE)) || \ (defined(OPUS_X86_MAY_HAVE_SSE2) &&
2015 Mar 04
2
Patch cleaning up Opus x86 intrinsics configury
Viswenath, My patch should be against the tip, but it?s the very recent tip, including some changes this past Friday (27 Feb). I mentioned in the IRC room a problem I discovered in creating my patch, and then later improved the fix Tim had made for the problem. Where do you get conflicts merging it to tip? In terms of merging, you posted your patch before I posted mine, so probably I should be
2015 Nov 05
0
AVX Optimizations
Velea, Radu wrote: > Yes, > > Thank you. I'll follow up with the AVX code and tests for pitch code. Actually, I lied. Because you update opus_select_arch(), you can now return a value for arch (4) that is larger than the maximum we currently support (3). This doesn't actually cause failures, because we mask with OPUS_ARCHMASK, but it does mean that a CPU with AVX will invoke
2008 Nov 20
4
[LLVMdev] changing -mattr behavior with mmx and sse
Hi, When setting -mattr option on X86, I would like to treat MMX separately from SSE levels. This would allow a client who sets the attributes directly to set the SSE level independent of MMX, e.g., llc -march=x86 -mattr=sse41, one would get sse4.1 with mmx disabled while llc -march=x86 -mattr=mmx -mattr=sse42 will get mmx and sse42. If anyone objects to this change, please let me
2014 May 13
1
Performance tests of the current version (git-b1b6caf)
Current sources (git-b1b6caf) were compiled with GCC 4.8.2 and GCC 4.9.0 with various -msseN options (the default is -msse2). Then I took two WAV files (one is 16-bit and the other is 24-bit) and compressed them using best compression mode. The results are in the table below. (please remember that the resulting value is an encoding time, not encoding speed) CPU: Intel Core i7 950 (up to SSE4.2)
2017 Feb 18
2
[PATCH 4/5] SIMD: accelerate decoding of 16-bit FLAC
This patch adds 2 new functions, FLAC__lpc_restore_signal_intrin_sse41() and FLAC__lpc_restore_signal_16_intrin_sse41(). The decoding speed of Subset-compatible 16-bit FLAC files is slightly increased on SSE4.1-compatible CPUs. -------------- next part -------------- A non-text attachment was scrubbed... Name: 04_add_new_intrin_func.patch Type: application/octet-stream Size: 9851 bytes Desc: not
2015 Nov 05
2
AVX Optimizations
Sorry. I missed that. Good observation. Please go ahead and correct the patch. Thanks, Radu -----Original Message----- From: opus-bounces at xiph.org [mailto:opus-bounces at xiph.org] On Behalf Of Timothy B. Terriberry Sent: Thursday, November 5, 2015 11:08 AM To: opus at xiph.org Subject: Re: [opus] AVX Optimizations Velea, Radu wrote: > Yes, > > Thank you. I'll follow up with
2015 Nov 26
2
Test failed!!
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256 Hi Jesus, Thanks for the report. As far as I can tell, what's happening is that when intrinsics are enabled, we compile all tests with -msse4.1, even when it's only run-time detected. In most cases, that doesn't cause any issue, but sometimes the compiler will take the C code and generate an SSEx instruction on its own. I think this is
2015 Mar 04
2
Patch cleaning up Opus x86 intrinsics configury
On Mar 3, 2015, at 11:08 PM, Viswanath Puttagunta <viswanath.puttagunta at linaro.org<mailto:viswanath.puttagunta at linaro.org>> wrote: On 3 March 2015 at 21:59, Jonathan Lennox <jonathan at vidyo.com<mailto:jonathan at vidyo.com>> wrote: Viswenath, My patch should be against the tip, but it?s the very recent tip, including some changes this past Friday (27 Feb). I
2015 Nov 26
2
Opus 1.1.1 is out!
Hi everyone, After much waiting, Opus 1.1.1 is finally here. The main changes are: - x86 SSE, SSE2 and SSE4.1 optimizations contributed by Cisco, - MIPS optimizations contributed by Imagination Technologies, - ARM Neon optimizations contributed by Linaro and ARM, - many architecture-independent optimizations, - memory footprint reductions, and - several minor bug fixes. The quality of the
2015 Mar 02
13
Patch cleaning up Opus x86 intrinsics configury
The attached patch cleans up Opus's x86 intrinsics configury. It: * Makes ?enable-intrinsics work with clang and other non-GCC compilers * Enables RTCD for the floating-point-mode SSE code in Celt. * Disables use of RTCD in cases where the compiler targets an instruction set by default. * Enables the SSE4.1 Silk optimizations that apply to the common parts of Silk when Opus is built in
2017 May 08
2
LLVM and Xeon Skylake v5
getProcessTriple just determines operation system, and architecture. It doesn't deal with specific instruction set features. The CPU should be controlled by MCPU on the EngineBuilder i think. The CPU autodetection code lives in getHostCPUName in lib/Support/Host.cpp, but I don't think the JIT calls into. I think its expected the user would call it or pass a specific CPU string to the MCPU
2015 Mar 12
1
[RFC PATCHv2] Intrinsics/RTCD related fixes. Mostly x86.
From: Jonathan Lennox <jonathan at vidyo.com> * Makes ?enable-intrinsics work with clang and other non-GCC compilers * Enables RTCD for the floating-point-mode SSE code in Celt. * Disables use of RTCD in cases where the compiler targets an instruction set by default. * Enables the SSE4.1 Silk optimizations that apply to the common parts of Silk when Opus is built in floating-point mode, not
2015 Mar 13
1
[RFC PATCH v3] Intrinsics/RTCD related fixes. Mostly x86.
From: Jonathan Lennox <jonathan at vidyo.com> * Makes ?enable-intrinsics work with clang and other non-GCC compilers * Enables RTCD for the floating-point-mode SSE code in Celt. * Disables use of RTCD in cases where the compiler targets an instruction set by default. * Enables the SSE4.1 Silk optimizations that apply to the common parts of Silk when Opus is built in floating-point mode, not
2015 Mar 07
1
Patch cleaning up Opus x86 intrinsics configury
Hello Jonathan, Just FYI, I started doing review of your patch and will get back to you in few days. After review, I would like to rebase your patch (as necessary) myself and do some testing.. and re-submit. Regards, Vish On 4 March 2015 at 09:00, Viswanath Puttagunta <viswanath.puttagunta at linaro.org> wrote: > > On 3 March 2015 at 22:17, Jonathan Lennox <jonathan at
2013 Sep 15
2
PATCH: x86-64 support and SSE intrinscis code
Erik de Castro Lopo <mle+la at mega-nerd.com> wrote: > The biggest of these tweaks weas to disable the intrinsics version > fero FLAC__CPU_IA32 because I couldn't get this to compile on > i386-linux (and we have the nasm versions). Still open to re-enabling > this if someone can get it to work. I know you're a skilled programmer, but... maybe you forgot to add -msse