thr3ads.net - search: "sse4.1"

Displaying 20 results from an estimated 163 matches for "sse4.1".

[LLVMdev] Should more vector [zs]extloads be legal for X86 SSE4.1?

2014 Dec 02

Hi Chandler, all, Why aren't the vector [zs]extloads introduced by SSE4.1/AVX2 declared legal? Is it a simple oversight, or did I miss a deeper reason? While cleaning up PMOV*X patterns, I stumbled upon this braindead testcase:

    %0 = load <8 x i8>* %src, align 1
    %1 = zext <8 x i8> %0 to <8 x i16>

turning into:

    pmovzxbw (%rsi), %xmm0
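For readers less familiar with the IR, a minimal C sketch of what the zext in the testcase computes may help. The function names are hypothetical, and the sign-extending variant is added only for contrast; nothing here is taken from the LLVM sources.

```c
#include <stdint.h>

/* Zero-extend eight bytes to eight 16-bit lanes: the scalar meaning of
 * `zext <8 x i8> ... to <8 x i16>`. The function name is hypothetical. */
static void zext_8xi8_to_8xi16(const uint8_t *src, uint16_t *dst)
{
    for (int i = 0; i < 8; i++)
        dst[i] = src[i];          /* high byte of each lane becomes 0 */
}

/* Sign-extending counterpart (sext), shown for contrast only. */
static void sext_8xi8_to_8xi16(const int8_t *src, int16_t *dst)
{
    for (int i = 0; i < 8; i++)
        dst[i] = src[i];          /* high byte copies the sign bit */
}
```

A byte of 0xFF widens to 0x00FF under zero extension and to 0xFFFF under sign extension; PMOVZXBW and PMOVSXBW are the SSE4.1 instructions for these two cases.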

AVX Optimizations

2015 Nov 05

Yes, Thank you. I'll follow up with the AVX code and tests for pitch code. Radu -----Original Message----- From: opus-bounces at xiph.org [mailto:opus-bounces at xiph.org] On Behalf Of Timothy B. Terriberry Sent: Thursday, November 5, 2015 10:31 AM To: opus at xiph.org Subject: Re: [opus] AVX Optimizations Velea, Radu wrote: > I've created a pull request[1] to enable configuration

[LLVMdev] udis86 sse4.1 and 4.2?

2009 Aug 14

To disassemble JIT code I typically use the udis86 support. Are there patches floating around to support SSE4.1 and SSE4.2 in this? I'd like to use it on a Nehalem-based machine and investigate the LLVM code generation for SSE4.2 in a JIT context. Thanks, Bill

x86_64 SSE2/SSE41 optim not used

2014 Mar 11

Hi guys, In stream_decoder.c, when the LPC restore function is assigned, only IA32 processors benefit from the SSE2 and SSE4.1 optimizations. Shouldn't that be the case for x86_64 processors as well? Thanks, -- Olivier TRISTAN uvi.net

[PATCH] Fix x86 build if we presume SSE4.1 (and earlier), but not AVX.

2015 Nov 20

---
 celt/cpu_support.h | 3 ++-
 celt/x86/x86cpu.c  | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/celt/cpu_support.h b/celt/cpu_support.h
index 133abbf..68fc606 100644
--- a/celt/cpu_support.h
+++ b/celt/cpu_support.h
@@ -45,7 +45,8 @@
 #elif (defined(OPUS_X86_MAY_HAVE_SSE) && !defined(OPUS_X86_PRESUME_SSE)) || \
   (defined(OPUS_X86_MAY_HAVE_SSE2) &&

Patch cleaning up Opus x86 intrinsics configury

2015 Mar 04

Viswanath, My patch should be against the tip, but it's the very recent tip, including some changes this past Friday (27 Feb). I mentioned in the IRC room a problem I discovered in creating my patch, and then later improved the fix Tim had made for the problem. Where do you get conflicts merging it to tip? In terms of merging, you posted your patch before I posted mine, so probably I should be

AVX Optimizations

2015 Nov 05

Velea, Radu wrote: > Yes, > > Thank you. I'll follow up with the AVX code and tests for pitch code. Actually, I lied. Because you update opus_select_arch(), you can now return a value for arch (4) that is larger than the maximum we currently support (3). This doesn't actually cause failures, because we mask with OPUS_ARCHMASK, but it does mean that a CPU with AVX will invoke
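The masking behaviour described in this message can be illustrated with a small sketch. It assumes a mask value of 3 (the snippet gives the maximum supported arch as 3 but does not state the value of OPUS_ARCHMASK), and ARCH_MASK, ARCH_MAX, select_index and clamp_arch are hypothetical names, not Opus code.

```c
#define ARCH_MASK 3  /* assumed value; stands in for OPUS_ARCHMASK */
#define ARCH_MAX  3  /* largest arch value currently supported, per the message */

/* Index used for a function-table lookup: the arch value is masked,
 * so an out-of-range value wraps around instead of reading past the table. */
static int select_index(int arch)
{
    return arch & ARCH_MASK;
}

/* One possible guard (an assumption, not necessarily the fix that was
 * applied): clamp the detected arch to the largest supported value. */
static int clamp_arch(int arch)
{
    return arch > ARCH_MAX ? ARCH_MAX : arch;
}
```

With these assumed values, an arch of 4 masks to index 0 rather than faulting, which is consistent with the remark that the mask prevents failures. Which implementation sits at that index is not shown in the snippet.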

[LLVMdev] changing -mattr behavior with mmx and sse

2008 Nov 20

Hi, When setting -mattr option on X86, I would like to treat MMX separately from SSE levels. This would allow a client who sets the attributes directly to set the SSE level independent of MMX, e.g., llc -march=x86 -mattr=sse41, one would get sse4.1 with mmx disabled while llc -march=x86 -mattr=mmx -mattr=sse42 will get mmx and sse42. If anyone objects to this change, please let me

Performance tests of the current version (git-b1b6caf)

2014 May 13

Current sources (git-b1b6caf) were compiled with GCC 4.8.2 and GCC 4.9.0 with various -msseN options (the default is -msse2). Then I took two WAV files (one is 16-bit and the other is 24-bit) and compressed them using best compression mode. The results are in the table below. (please remember that the resulting value is an encoding time, not encoding speed) CPU: Intel Core i7 950 (up to SSE4.2)

[PATCH 4/5] SIMD: accelerate decoding of 16-bit FLAC

2017 Feb 18

This patch adds 2 new functions, FLAC__lpc_restore_signal_intrin_sse41() and FLAC__lpc_restore_signal_16_intrin_sse41(). The decoding speed of Subset-compatible 16-bit FLAC files is slightly increased on SSE4.1-compatible CPUs. -------------- next part -------------- A non-text attachment was scrubbed... Name: 04_add_new_intrin_func.patch Type: application/octet-stream Size: 9851 bytes Desc: not

AVX Optimizations

2015 Nov 05

Sorry. I missed that. Good observation. Please go ahead and correct the patch. Thanks, Radu -----Original Message----- From: opus-bounces at xiph.org [mailto:opus-bounces at xiph.org] On Behalf Of Timothy B. Terriberry Sent: Thursday, November 5, 2015 11:08 AM To: opus at xiph.org Subject: Re: [opus] AVX Optimizations Velea, Radu wrote: > Yes, > > Thank you. I'll follow up with

Test failed!!

2015 Nov 26

Hi Jesus, Thanks for the report. As far as I can tell, what's happening is that when intrinsics are enabled, we compile all tests with -msse4.1, even when it's only run-time detected. In most cases, that doesn't cause any issue, but sometimes the compiler will take the C code and generate an SSEx instruction on its own. I think this is
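The distinction between compile-time flags and run-time detection can be sketched as follows. This is not Opus code: mul4, mul4_c and mul4_sse41 are hypothetical names, and the sketch assumes GCC or clang on x86, using the per-function target attribute and __builtin_cpu_supports instead of a file-wide -msse4.1. On other compilers or architectures it falls back to plain C.

```c
#include <stdint.h>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <smmintrin.h>
#define HAVE_X86_DISPATCH 1

/* Only this function is compiled with SSE4.1 enabled, so the compiler
 * cannot emit SSE4.1 instructions anywhere else in the file. */
__attribute__((target("sse4.1")))
static void mul4_sse41(const int32_t *a, const int32_t *b, int32_t *out)
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_mullo_epi32(va, vb)); /* PMULLD */
}
#endif

/* Plain C version, built with the default instruction set. */
static void mul4_c(const int32_t *a, const int32_t *b, int32_t *out)
{
    for (int i = 0; i < 4; i++)
        out[i] = a[i] * b[i];
}

/* Run-time dispatch: take the SSE4.1 path only if the CPU reports it. */
static void mul4(const int32_t *a, const int32_t *b, int32_t *out)
{
#ifdef HAVE_X86_DISPATCH
    if (__builtin_cpu_supports("sse4.1")) {
        mul4_sse41(a, b, out);
        return;
    }
#endif
    mul4_c(a, b, out);
}
```

Because the SSE4.1 code is confined to one function, the rest of the translation unit stays safe to run on CPUs without SSE4.1, which is the property lost when a whole test is built with -msse4.1.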

Patch cleaning up Opus x86 intrinsics configury

2015 Mar 04

On Mar 3, 2015, at 11:08 PM, Viswanath Puttagunta <viswanath.puttagunta at linaro.org> wrote: On 3 March 2015 at 21:59, Jonathan Lennox <jonathan at vidyo.com> wrote: Viswanath, My patch should be against the tip, but it's the very recent tip, including some changes this past Friday (27 Feb). I

Opus 1.1.1 is out!

2015 Nov 26

Hi everyone, After much waiting, Opus 1.1.1 is finally here. The main changes are: - x86 SSE, SSE2 and SSE4.1 optimizations contributed by Cisco, - MIPS optimizations contributed by Imagination Technologies, - ARM Neon optimizations contributed by Linaro and ARM, - many architecture-independent optimizations, - memory footprint reductions, and - several minor bug fixes. The quality of the

Patch cleaning up Opus x86 intrinsics configury

2015 Mar 02

The attached patch cleans up Opus's x86 intrinsics configury. It:
* Makes --enable-intrinsics work with clang and other non-GCC compilers
* Enables RTCD for the floating-point-mode SSE code in Celt.
* Disables use of RTCD in cases where the compiler targets an instruction set by default.
* Enables the SSE4.1 Silk optimizations that apply to the common parts of Silk when Opus is built in

LLVM and Xeon Skylake v5

2017 May 08

getProcessTriple just determines the operating system and architecture. It doesn't deal with specific instruction set features. The CPU should be controlled by MCPU on the EngineBuilder, I think. The CPU autodetection code lives in getHostCPUName in lib/Support/Host.cpp, but I don't think the JIT calls into it. I think it's expected the user would call it or pass a specific CPU string to the MCPU

[RFC PATCHv2] Intrinsics/RTCD related fixes. Mostly x86.

2015 Mar 12

From: Jonathan Lennox <jonathan at vidyo.com>
* Makes --enable-intrinsics work with clang and other non-GCC compilers
* Enables RTCD for the floating-point-mode SSE code in Celt.
* Disables use of RTCD in cases where the compiler targets an instruction set by default.
* Enables the SSE4.1 Silk optimizations that apply to the common parts of Silk when Opus is built in floating-point mode, not

[RFC PATCH v3] Intrinsics/RTCD related fixes. Mostly x86.

2015 Mar 13

Patch cleaning up Opus x86 intrinsics configury

2015 Mar 07

Hello Jonathan, Just FYI, I started reviewing your patch and will get back to you in a few days. After the review, I would like to rebase your patch (as necessary) myself, do some testing, and re-submit. Regards, Vish On 4 March 2015 at 09:00, Viswanath Puttagunta <viswanath.puttagunta at linaro.org> wrote: > > On 3 March 2015 at 22:17, Jonathan Lennox <jonathan at

PATCH: x86-64 support and SSE intrinsics code

2013 Sep 15

Erik de Castro Lopo <mle+la at mega-nerd.com> wrote: > The biggest of these tweaks was to disable the intrinsics version > for FLAC__CPU_IA32 because I couldn't get this to compile on > i386-linux (and we have the nasm versions). Still open to re-enabling > this if someone can get it to work. I know you're a skilled programmer, but... maybe you forgot to add -msse
