thr3ads.net - search: "smmintrin"

Displaying 20 results from an estimated 20 matches for "smmintrin".

[PATCH 1/2] Modify autoconf tests for intrinsics to stop clang from optimizing them away.

2016 May 31

...86_MAY_HAVE_SSE2" = x"1" && test x"$OPUS_X86_PRESUME_SSE2" != x"1"],
@@ -557,11 +564,13 @@ AS_IF([test x"$enable_intrinsics" = x"yes"],[
 [OPUS_X86_MAY_HAVE_SSE4_1],
 [OPUS_X86_PRESUME_SSE4_1],
 [[#include <smmintrin.h>
+ #include <time.h>
 ]],
 [[
- static __m128i mtest;
- mtest = _mm_setzero_si128();
- mtest = _mm_cmpeq_epi64(mtest, mtest);
+ __m128i mtest;
+ mtest = _mm_set1_epi32((int)time(NULL));
+ mtest...
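The idea behind this patch can be sketched in C as follows. This is a minimal illustration, not the actual Opus configure test: the function name `sse41_probe`, the choice of `_mm_mul_epi32` as the SSE4.1 operation, and the `target` attribute (used here so the sketch builds without `-msse4.1` on reasonably recent GCC or clang for x86) are all assumptions. The point is that the intrinsic's input is not a compile-time constant and its result is returned, so the compiler has no opportunity to fold the SSE4.1 instruction away.

```c
#include <smmintrin.h> /* SSE4.1 intrinsics */

/* Hypothetical probe body (assumption: x86 target, GCC or clang).
 * The seed is supplied by the caller, so the compiler cannot
 * constant-fold the SSE4.1 multiply below. */
__attribute__((target("sse4.1")))
int sse41_probe(int seed)
{
    __m128i mtest = _mm_set1_epi32(seed);  /* broadcast a run-time value */
    mtest = _mm_mul_epi32(mtest, mtest);   /* SSE4.1: signed 32x32 -> 64-bit multiply */
    return _mm_cvtsi128_si32(mtest);       /* low 32 bits of seed * seed */
}
```

A configure-style test would call this with something like `(int)time(NULL)` and return the result from `main`, so that the value stays live all the way to the program's exit status.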

[LLVMdev] long double type on ARM

2009 Sep 30

...aders="mmintrin.h"
 ;;
...
i[34567]86-*-*)
 cpu_type=i386
 # LLVM LOCAL begin
 out_cxx_file=i386/llvm-i386.cpp
 # LLVM LOCAL end
 # APPLE LOCAL begin 5612787 mainline sse4
 extra_headers="mmintrin.h mm3dnow.h xmmintrin.h emmintrin.h pmmintrin.h tmmintrin.h ammintrin.h smmintrin.h nmmintrin.h"
(The out_cxx_file variable is empty for the ARM target.)
I wonder if the llvm-gcc 4.2 front-end supports bitcode conversion for the ARM target.
Thank you.
Best regards,
Jin-Gu Kang
________________________________
From: Bob Wilson [bob.wilson at apple.com]
Sent: Thursday, October...

[LLVMdev] long double type on ARM

2009 Sep 30

...> i[34567]86-*-*)
> cpu_type=i386
> # LLVM LOCAL begin
> out_cxx_file=i386/llvm-i386.cpp
> # LLVM LOCAL end
> # APPLE LOCAL begin 5612787 mainline sse4
> extra_headers="mmintrin.h mm3dnow.h xmmintrin.h emmintrin.h
> pmmintrin.h tmmintrin.h ammintrin.h smmintrin.h
> nmmintrin.h"
>
> (out_cxx_file variable is empty for ARM target)
> I wonder if llvm-gcc 4.2 front-end support bitcode conversion for
> ARM target.
>
> Thank you.
>
> Best regards,
>
> Jin-Gu Kang
>
> From: Bob Wilson [bob.wilson at apple....

Test failed!!

2015 Nov 26

Hi Jesus, Thanks for the report. As far as I can tell, when intrinsics are enabled we compile all tests with -msse4.1, even when SSE4.1 support is only detected at run time. In most cases that doesn't cause any issue, but sometimes the compiler will take the plain C code and generate an SSEx instruction on its own. I think this is
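One way to keep the compiler from emitting SSE4.1 instructions in ordinary C code is to limit the instruction set to the functions that need it and select them at run time. The sketch below is an illustration under assumptions, not what Opus does: the names `max_i32`, `max_i32_c` and `max_i32_sse4_1` are hypothetical, and it relies on the GCC/clang `target` attribute and `__builtin_cpu_supports` builtin on x86.

```c
#include <assert.h>
#include <smmintrin.h> /* SSE4.1: _mm_max_epi32 */
#include <stdint.h>

/* Plain C fallback, built with the default instruction set. */
static int32_t max_i32_c(const int32_t *x, int n)
{
    int32_t m = INT32_MIN;
    for (int i = 0; i < n; i++)
        if (x[i] > m) m = x[i];
    return m;
}

/* Only this function may contain SSE4.1 instructions. */
__attribute__((target("sse4.1")))
static int32_t max_i32_sse4_1(const int32_t *x, int n)
{
    __m128i best = _mm_set1_epi32(INT32_MIN);
    int i = 0;
    for (; i + 4 <= n; i += 4)  /* four lanes per iteration */
        best = _mm_max_epi32(best, _mm_loadu_si128((const __m128i *)(x + i)));
    int32_t lanes[4];
    _mm_storeu_si128((__m128i *)lanes, best);
    int32_t m = lanes[0];
    for (int k = 1; k < 4; k++)
        if (lanes[k] > m) m = lanes[k];
    for (; i < n; i++)          /* scalar tail */
        if (x[i] > m) m = x[i];
    return m;
}

/* Run-time dispatch: the SSE4.1 path is taken only if the CPU reports it. */
int32_t max_i32(const int32_t *x, int n)
{
    assert(n >= 0);
    if (__builtin_cpu_supports("sse4.1"))
        return max_i32_sse4_1(x, n);
    return max_i32_c(x, n);
}
```

With this layout the rest of the translation unit, including any test code around it, is compiled without `-msse4.1`, so a binary built this way should still start on a CPU that lacks the extension.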

Test failed!!

2015 Nov 27

...sive
make[1]: Entering directory `/tmp/opus-1.1.1'
make[2]: Entering directory `/tmp/opus-1.1.1'
 CC celt/tests/test_unit_mathops.o
In file included from ./celt/x86/celt_lpc_sse.c:34:0,
 from celt/tests/test_unit_mathops.c:63:
/usr/lib/gcc/x86_64-linux-gnu/4.8/include/smmintrin.h:31:3: error: #error "SSE4.1 instruction set not enabled"
 # error "SSE4.1 instruction set not enabled"
 ^
make[2]: *** [celt/tests/test_unit_mathops.o] Error 1
make[2]: Leaving directory `/tmp/opus-1.1.1'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/...

[LLVMdev] Unexpected spilling of vector register during lane extraction on some x86_64 targets

2014 Oct 13

...86_64 xmm register, the backend may spill that register in order to load scalars. The effect was observed on two targets: corei7-avx and btver1 (I haven't checked other targets). Here's a test case with spilling/no-spilling code put on conditional compile:
#if __SSE4_1__ != 0
#include <smmintrin.h>
#else
#include <emmintrin.h>
#endif
#include <stdint.h>
#include <assert.h>
#if SPILLING_ENSUES == 1
static int32_t geti(const __m128i v, const size_t i) {
 switch (i) {
 case 0: return _mm_cvtsi128_si32(v);
 case 1: return _mm_cvtsi128_si32(_mm_shuffle_epi32(v, 0xe5));
 case 2...
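For readers unfamiliar with lane extraction, the two broad strategies at play can be sketched as below. This is an illustration only and does not reproduce the poster's test case, which is truncated above: the helper names `lane_by_shuffle` and `lane_by_store` are hypothetical, and the shuffle immediates for lanes 2 and 3 are chosen here, not taken from the post. The first helper stays in registers using SSE2 shuffles; the second goes through memory, which is roughly what a spill amounts to.

```c
#include <assert.h>
#include <emmintrin.h> /* SSE2 */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read 32-bit lane i of v by shuffling it into
 * lane 0 and moving lane 0 to a general-purpose register. */
static int32_t lane_by_shuffle(__m128i v, size_t i)
{
    assert(i < 4);
    switch (i) {
    case 0:  return _mm_cvtsi128_si32(v);
    case 1:  return _mm_cvtsi128_si32(_mm_shuffle_epi32(v, 0xe5)); /* lane 1 -> lane 0 */
    case 2:  return _mm_cvtsi128_si32(_mm_shuffle_epi32(v, 0xe6)); /* lane 2 -> lane 0 */
    default: return _mm_cvtsi128_si32(_mm_shuffle_epi32(v, 0xe7)); /* lane 3 -> lane 0 */
    }
}

/* Hypothetical helper: store the whole vector to the stack and index it,
 * i.e. an explicit round trip through memory. */
static int32_t lane_by_store(__m128i v, size_t i)
{
    int32_t lanes[4];
    assert(i < 4);
    _mm_storeu_si128((__m128i *)lanes, v);
    return lanes[i];
}
```

Both helpers return the same value for a given lane; they differ only in whether the vector has to leave the register file.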

Test failed!!

2015 Nov 27

...ive make[1]: Entering directory
> `/tmp/opus-1.1.1' make[2]: Entering directory `/tmp/opus-1.1.1' CC
> celt/tests/test_unit_mathops.o In file included from
> ./celt/x86/celt_lpc_sse.c:34:0, from
> celt/tests/test_unit_mathops.c:63:
> /usr/lib/gcc/x86_64-linux-gnu/4.8/include/smmintrin.h:31:3: error:
> #error "SSE4.1 instruction set not enabled" # error "SSE4.1
> instruction set not enabled" ^ make[2]: ***
> [celt/tests/test_unit_mathops.o] Error 1 make[2]: Leaving directory
> `/tmp/opus-1.1.1' make[1]: *** [all-recursive] Error 1 make[1]:
>...

[RFC PATCH v3] Intrinsics/RTCD related fixes. Mostly x86.

2015 Mar 13

...ig.h"
 #endif
-#include <xmmintrin.h>
-#include <emmintrin.h>
-
 #include "macros.h"
 #include "celt_lpc.h"
 #include "stack_alloc.h"
 #include "mathops.h"
 #include "pitch.h"
-#if defined(OPUS_X86_MAY_HAVE_SSE4_1)
-#include <smmintrin.h>
-#include "x86cpu.h"
-
-opus_val32 celt_inner_prod_sse4_1(const opus_val16 *x, const opus_val16 *y,
- int N)
-{
- opus_int i, dataSize16;
- opus_int32 sum;
- __m128i inVec1_76543210, inVec1_FEDCBA98, acc1;
- __m128i inVec2_76543210, inVec2_FEDCBA98, acc2;
- __m1...

[RFC PATCHv2] Intrinsics/RTCD related fixes. Mostly x86.

2015 Mar 12

[LLVMdev] long double type on ARM

2009 Sep 30

Unlike llvm itself, llvm-gcc needs to be configured for a particular target architecture. It looks like you're using a copy of llvm-gcc that was built to generate x86 code.
On Sep 30, 2009, at 6:27 AM, Jin Gu Kang wrote:
> Dear LLVM members.
>
> I am compiling coreutils-7.4 package for ARM linux using LLVM 2.5
> version.
>
> When i compiled 'od' program in

[LLVMdev] Compiling llvm and Clang on Linux

2012 Jul 11

It's an undocumented FAQ if you are using RHEL5 (or a clone).
- Install gcc44-c++
- Build with CC=gcc44 CXX=g++44
- You may need "CC=clang -std=gnu89" to use clang with its glibc.
Have fun!
P.S. AFAIK, clang can be built more easily on CentOS 6.
...Takumi
2012/7/11 Sitvanit Ruah <RUAH at il.ibm.com>:
>
> Hello all,
> I am new to this mailing list so I hope this is

Patch cleaning up Opus x86 intrinsics configury

2015 Mar 02

The attached patch cleans up Opus's x86 intrinsics configury. It:
* Makes --enable-intrinsics work with clang and other non-GCC compilers
* Enables RTCD for the floating-point-mode SSE code in Celt.
* Disables use of RTCD in cases where the compiler targets an instruction set by default.
* Enables the SSE4.1 Silk optimizations that apply to the common parts of Silk when Opus is built in
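The third bullet, skipping run-time CPU detection (RTCD) when the compiler already targets the instruction set, can be sketched as below. This is a hedged illustration, not code from the patch: the `DEMO_` macro and the function names are hypothetical, and it assumes an x86 build with GCC or clang, which define `__SSE4_1__` when SSE4.1 is enabled by default and provide `__builtin_cpu_supports`.

```c
/* Hypothetical "presume" macro: 1 when the compiler already targets
 * SSE4.1 for every function in this translation unit. */
#if defined(__SSE4_1__)
# define DEMO_PRESUME_SSE4_1 1
#else
# define DEMO_PRESUME_SSE4_1 0
#endif

/* Returns 1 when a run-time CPU check is still needed, 0 when the
 * build already presumes SSE4.1. */
int demo_needs_rtcd(void)
{
    return !DEMO_PRESUME_SSE4_1;
}

/* Returns 1 when the SSE4.1 code path may be used. */
int demo_use_sse4_1(void)
{
#if DEMO_PRESUME_SSE4_1
    return 1;                                      /* decided at compile time */
#else
    return __builtin_cpu_supports("sse4.1") != 0;  /* decided at run time */
#endif
}
```

When the presumed case applies, the check collapses to a constant and the dispatch overhead disappears, which is the motivation the bullet describes.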

[LLVMdev] Compiling llvm and Clang on Linux

2012 Jul 12

.../include/mmintrin.h
/usr/lib/gcc/i386-redhat-linux6E/4.4.6/include/nmmintrin.h
/usr/lib/gcc/i386-redhat-linux6E/4.4.6/include/omp.h
/usr/lib/gcc/i386-redhat-linux6E/4.4.6/include/pmmintrin.h
/usr/lib/gcc/i386-redhat-linux6E/4.4.6/include/popcntintrin.h
/usr/lib/gcc/i386-redhat-linux6E/4.4.6/include/smmintrin.h
/usr/lib/gcc/i386-redhat-linux6E/4.4.6/include/stdarg.h
/usr/lib/gcc/i386-redhat-linux6E/4.4.6/include/stdbool.h
/usr/lib/gcc/i386-redhat-linux6E/4.4.6/include/stddef.h
/usr/lib/gcc/i386-redhat-linux6E/4.4.6/include/stdfix.h
/usr/lib/gcc/i386-redhat-linux6E/4.4.6/include/syslimits.h
/usr/lib/gcc/...

[LLVMdev] Compiling llvm and Clang on Linux

2012 Jul 11

Hello all, I am new to this mailing list so I hope this is the right place to post the following question. We are considering using Clang front end for our tool. I tried to compile LLVM (using configure followed by make from the llvm top directory) on LINUX X86 with gcc 4.1.2. I got several compilation error messages of the form /usr/lib/gcc/i386-redhat-linux/4.1.2/../../../../include/c+

[LLVMdev] long double type on ARM

2009 Sep 30

Dear LLVM members. I am compiling the coreutils-7.4 package for ARM Linux using LLVM version 2.5. When I compiled the 'od' program in the coreutils package using LLVM 2.5, I saw an error message during llc processing.
> llvm-gcc -emit-llvm ./od.c -c -o ./od.bc -other-options...
> llc -march=arm ./od.bc -f -o ./od.s
llc:

[RFC PATCH v1 0/4] Enable aarch64 intrinsics/Ne10

2015 Mar 18

Hi All, Since I continue to base my work on top of Jonathan's patch and my previous Ne10 fft/ifft/mdct_forward/backward patches, I thought it would be better to post all new patches as a single patch series. Please let me know if anyone disagrees with this approach. You can see the wip branch with all the latest patches at https://git.linaro.org/people/viswanath.puttagunta/opus.git Branch:

[RFC PATCH v1 0/5] aarch64: celt_pitch_xcorr: Fixed point series

2015 Mar 31

Hi Timothy, As I mentioned earlier [1], I have now fixed the compile issues with fixed point and am resubmitting the patch. I also have a new patch that does intrinsics optimizations for celt_pitch_xcorr targeting aarch64. You can find my latest work-in-progress branch at [2]. For reference, you can use the Ne10 pre-built libraries at [3]. Note that I am working with Phil at ARM to get my patch at [4]

[RFC PATCH v2]: Ne10 fft fixed and previous 0/8]

2015 May 08

Hi All, As per Timothy's suggestion, disabling mdct_forward for fixed point. Only affects armv7,armv8: Extend fixed fft NE10 optimizations to mdct. The rest of the patches are the same as in [1]. For reference, the latest wip code for opus is at [2]. Still working with the NE10 team at ARM to get corner cases of mdct_forward. Will update with another patch when the issue in NE10 gets fixed. Regards, Vish [1]:

[RFC V3 0/8] Ne10 fft fixed and previous

2015 May 15

Hi All, Changes from RFC v2 [1]
armv7,armv8: Extend fixed fft NE10 optimizations to mdct
- Overflow issue fixed by Phil at ARM. Ne10 wip at [2]. Should be upstream soon.
- So, re-enabled using fixed fft for mdct_forward, which was disabled in RFCv2
armv7,armv8: Optimize fixed point fft using NE10 library
- Thanks to Jonathan Lennox, fixed some build issues on iOS and some copy-paste errors
Rest

[RFC PATCH v1 0/8] Ne10 fft fixed and previous

2015 Apr 28

Hello Timothy / Jean-Marc / opus-dev, This patch series is a follow-up to the work I posted in [1]. In addition to what was posted in [1], this patch series mainly integrates the fixed-point FFT implementations in the NE10 library into opus. You can view my opus wip code at [2]. Note that while I found some issues both with the NE10 library (fixed fft) and with the Linaro toolchain (armv8 intrinsics), the work
