thr3ads.net - similar to: "[LLVMdev] instcombine does silly things with vector x+x"

Displaying 20 results from an estimated 6000 matches similar to: "[LLVMdev] instcombine does silly things with vector x+x"

[LLVMdev] instcombine does silly things with vector x+x

2011 Oct 28

On Oct 28, 2011, at 2:13 PM, andrew adams wrote:
> Consider the following function which doubles a <16 x i8> vector:
>
> define <16 x i8> @test(<16 x i8> %a) {
>   %b = add <16 x i8> %a, %a
>   ret <16 x i8> %b
> }
>
> If I compile it for x86 with llc like so:
>
> llc paddb.ll -filetype=asm -o=/dev/stdout
>
> I get a
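The quoted function doubles each of the 16 byte lanes with 8-bit wraparound. A minimal Python sketch of that lane-wise behaviour follows; the helper name `double_v16i8` is made up for illustration and is not an LLVM API, and lanes are assumed to be given as unsigned values in 0..255.

```python
def double_v16i8(a):
    """Model `add <16 x i8> %a, %a`: lane-wise x + x, wrapping modulo 256."""
    if len(a) != 16:
        raise ValueError("expected exactly 16 lanes")
    return [(x + x) & 0xFF for x in a]


lanes = list(range(120, 136))
doubled = double_v16i8(lanes)

# Under 8-bit wraparound, x + x and x << 1 agree in every lane.
assert doubled == [(x << 1) & 0xFF for x in lanes]
```

The agreement with a left shift is ordinary modular arithmetic; the truncated snippet does not say which form the compiler ends up emitting.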

[LLVMdev] instcombine does silly things with vector x+x

2011 Oct 30

Opened pr11266. I will try to make time to work on it.
-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Chris Lattner
Sent: Saturday, October 29, 2011 01:04
To: andrew adams
Cc: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] instcombine does silly things with vector x+x
On Oct 28, 2011, at 2:13 PM, andrew adams wrote:
>

[LLVMdev] Unexpected spilling of vector register during lane extraction on some x86_64 targets

2014 Oct 13

Hello, Depending on how I extract integer lanes from an x86_64 xmm register, the backend may spill that register in order to load scalars. The effect was observed on two targets: corei7-avx and btver1 (I haven't checked other targets). Here's a test case with the spilling and non-spilling code selected by conditional compilation:
#if __SSE4_1__ != 0
#include <smmintrin.h>
#else
#include

[PATCH 1/2] Modify autoconf tests for intrinsics to stop clang from optimizing them away.

2016 May 31

---
 configure.ac | 30 +++++++++++++++++++++---------
 1 file changed, 21 insertions(+), 9 deletions(-)
diff --git a/configure.ac b/configure.ac
index a67aa37..c722556 100644
--- a/configure.ac
+++ b/configure.ac
@@ -472,6 +472,7 @@ AS_IF([test x"$enable_intrinsics" = x"yes"],[
 [[ static float32x4_t A0, A1, SUMM; SUMM = vmlaq_f32(SUMM, A0,

[RFC PATCH v3] Intrinsics/RTCD related fixes. Mostly x86.

2015 Mar 13

From: Jonathan Lennox <jonathan at vidyo.com>
* Makes --enable-intrinsics work with clang and other non-GCC compilers
* Enables RTCD for the floating-point-mode SSE code in Celt.
* Disables use of RTCD in cases where the compiler targets an instruction set by default.
* Enables the SSE4.1 Silk optimizations that apply to the common parts of Silk when Opus is built in floating-point mode, not

[RFC PATCHv2] Intrinsics/RTCD related fixes. Mostly x86.

2015 Mar 12

[LLVMdev] Help on DAG pattern matching string

2009 Jul 04

Hello, I'm new to LLVM and I'm using it to translate from LLVM to another language rather than emitting actual machine code. The target language has instructions that operate on pointers which aren't naturally exposed in LLVM. Here's what I've done to add pointer support for an instruction called PADD that takes a pointer and an offset and returns the new pointer value:

[LLVMdev] [RFC] Integer Saturation Intrinsics

2015 Jan 14

Hi all, The patches linked below introduce a new family of intrinsics, for integer saturation: @llvm.usat, and @llvm.ssat (unsigned/signed). Quoting the added documentation: %r = call i32 @llvm.ssat.i32(i32 %x, i32 %n) is equivalent to the expression min(max(x, -2^(n-1)), 2^(n-1)-1), itself implementable as the following IR: %min_sint_n = i32 ... ; the min. signed integer of
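The min/max expression quoted for signed saturation can be checked with a few lines of Python. This is only a sketch of the formula as written in the snippet, using arbitrary-precision integers; the function name `ssat` mirrors the intrinsic for readability but is a stand-in, not the intrinsic itself, and the `n >= 1` check is an assumption added here.

```python
def ssat(x: int, n: int) -> int:
    """Clamp x to the signed n-bit range: min(max(x, -2^(n-1)), 2^(n-1)-1)."""
    if n < 1:
        raise ValueError("bit width n must be at least 1")
    lo = -(1 << (n - 1))      # minimum signed n-bit integer
    hi = (1 << (n - 1)) - 1   # maximum signed n-bit integer
    return min(max(x, lo), hi)


assert ssat(200, 8) == 127     # clamped at the top of the i8 range
assert ssat(-200, 8) == -128   # clamped at the bottom of the i8 range
assert ssat(5, 8) == 5         # in-range values pass through unchanged
```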

Patch cleaning up Opus x86 intrinsics configury

2015 Mar 02

The attached patch cleans up Opus's x86 intrinsics configury. It:
* Makes --enable-intrinsics work with clang and other non-GCC compilers
* Enables RTCD for the floating-point-mode SSE code in Celt.
* Disables use of RTCD in cases where the compiler targets an instruction set by default.
* Enables the SSE4.1 Silk optimizations that apply to the common parts of Silk when Opus is built in

[LLVMdev] Help on DAG pattern matching string

2009 Jul 04

Are there any other patterns in your TD file? If so, then one of the ones before this pattern will match everything, and this pattern will never be matched.
-bw
On Jul 3, 2009, at 8:27 PM, Javier Martinez wrote:
> Hello,
>
> I'm new to LLVM and I'm using it to translate from LLVM to another
> language rather than emitting actual machine code. The target language
> has

[LLVMdev] Help on DAG pattern matching string

2009 Jul 06

Hi Bill, Yes, there are other patterns. I tried commenting out all the other instruction definitions and I still get this error. After debugging TblGen I found that the second pattern is being generated as a variant of the first. I think the reason is that the PADD instruction is inheriting the commutative property from ADD defined in TargetSelectionDAG.td. The variant ends up being the same

[LLVMdev] [clang] SSE2 intrinsics (emmintrin.h): _mm_movpi64_pi64 should be _mm_movpi64_epi64?

2013 Nov 22

Hi there, I've recently encountered a piece of code that uses some SSE2 intrinsics and builds with gcc46, but not clang: clang can't find _mm_movpi64_epi64(), while gcc46 defines it in its lib/gcc46/gcc/.../4.6.3/include/emmintrin.h: extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm_movpi64_epi64 (__m64 __A) { return _mm_set_epi64

Fwd: User defined panel functions in lattice

2012 Apr 19

Hi ilai, Thank you for your suggestions. I do not know what happened yesterday; I must have omitted a few changes in going from R to email. Apologies for the double posting: I had trouble sending it, as my ISP reported that I was not connected for email although the web worked. I was trying to get panel.Locfit to work in a number of situations. 1. Conditioned by Farm (3 panels) with 2

User defined panel functions in lattice

2012 Apr 19

Hi, I have a problem with passing line and symbol parameters to user-defined panel functions. I had a look at the archives and created a panel function based on what was shown there and on panel.loess. I could not get panel.locfit to work for what I intend it for. There is another layer to work with before success, as lp() is called from locfit. xx <- structure(list(Farm = c("A",

[PATCH 3a/3] Add shadow VRAM

2006 Mar 16

This is a slightly modified version of the original VGA patch that removes changes to the configure script to check for SSE2 capabilities. SSE2 is now only checked at run time. Signed-off-by: Don Dugger <donald.d.dugger@intel.com> -- Don Dugger "Censeo Toto nos in Kansa esse decisse." - D. Gale Donald.D.Dugger@intel.com Ph: (303)440-1368 diff -r c445d4a0dd76

[LLVMdev] Aliasing confusion

2011 Oct 07

Hi all, I'm having trouble understanding how llvm determines if pointers alias. Consider the following two functions that each do a redundant load:
define float @A(float * noalias %ptr1) {
  %ptr2 = getelementptr float* %ptr1, i32 1024
  %val1a = load float* %ptr1
  store float %val1a, float* %ptr2
  %val1b = load float* %ptr1
  ret float %val1b
}
define float @B(float * noalias %ptr1,

[RFC PATCH v1 0/4] Enable aarch64 intrinsics/Ne10

2015 Mar 18

Hi All, Since I continue to base my work on top of Jonathan's patch, and my previous Ne10 fft/ifft/mdct_forward/backward patches, I thought it would be better to just post all new patches as a patch series. Please let me know if anyone disagrees with this approach. You can see wip branch of all latest patches at https://git.linaro.org/people/viswanath.puttagunta/opus.git Branch:

[RFC PATCH v1 0/5] aarch64: celt_pitch_xcorr: Fixed point series

2015 Mar 31

Hi Timothy, As I mentioned earlier [1], I have now fixed the compile issues with fixed point and am resubmitting the patch. I also have a new patch that does intrinsics optimizations for celt_pitch_xcorr targeting aarch64. You can find my latest work-in-progress branch at [2]. For reference, you can use the Ne10 pre-built libraries at [3]. Note that I am working with Phil at ARM to get my patch at [4]

[LLVMdev] Aliasing confusion

2011 Oct 07

On Fri, Oct 7, 2011 at 2:15 PM, andrew adams <andrew.b.adams at gmail.com> wrote:
> Hi all,
>
> I'm having trouble understanding how llvm determines if pointers
> alias. Consider the following two functions that each do a redundant
> load:
>
> define float @A(float * noalias %ptr1) {
>   %ptr2 = getelementptr float* %ptr1, i32 1024
>   %val1a = load float*

[LLVMdev] Case where VSETCC DAGCombiner hack doesn't work

2009 Jul 23

On Jul 21, 2009, at 11:14 PM, Eli Friedman wrote:
> Testcase (compile with clang >= r76726):
> #include <emmintrin.h>
> __m128i a(__m128 a, __m128 b) { return a==a & b==b; }
>
> CodeGen ends up scalarizing the comparison, which is really bad, and
> AFAIK different from what we did before vsetcc was removed. The ideal
> code is a single cmpordps, although I
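The testcase relies on `x == x` being false only when `x` is NaN, so `a==a & b==b` is true in a lane exactly when neither input is NaN, which is the "ordered" comparison that cmpordps performs. A rough Python model of that lane-wise result, assuming four float lanes and an all-ones 32-bit mask for "true" (the function name `cmpord_lanes` is invented for this sketch):

```python
import math


def cmpord_lanes(a, b):
    """Per lane: all-ones mask if neither input is NaN, else zero."""
    return [0xFFFFFFFF if (x == x and y == y) else 0 for x, y in zip(a, b)]


a = [1.0, math.nan, 3.0, math.nan]
b = [1.0, 2.0, math.nan, math.nan]

# Only the first lane has two non-NaN inputs.
assert cmpord_lanes(a, b) == [0xFFFFFFFF, 0, 0, 0]
```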
