thr3ads.net - similar to: "[LLVMdev] Merging AVX"

Displaying 20 results from an estimated 1000 matches similar to: "[LLVMdev] Merging AVX"

2010 Jul 09

[LLVMdev] [PATCH] Start of SIMD Reorg

Now that Bruno is putting in some AVX stuff, it's a good motivator to move my x86 SIMD reorg work into trunk (and got management to agree to prioritize it - Thanks Bruno! :) ). Attached is the first patch of many to accomplish this. The overall goal is to have all x86 SIMD instructions share a set of common patterns so that we can have a more maintainable machine description (e.g. SS, SD,

[LLVMdev] [PATCH] Start of SIMD Reorg

2010 Jul 10

[LLVMdev] [PATCH] Start of SIMD Reorg

Hi David, On Fri, Jul 9, 2010 at 3:25 PM, David Greene <dag at cray.com> wrote: > Now that Bruno is putting in some AVX stuff, it's a good motivator to > move my x86 SIMD reorg work into trunk (and got management to agree to > prioritize it - Thanks Bruno! :) ). > > Attached is the first patch of many to accomplish this. The overall > goal is to have all x86 SIMD

[LLVMdev] [PATCH] Start of SIMD Reorg

2010 Jul 12

[LLVMdev] [PATCH] Start of SIMD Reorg

Bruno Cardoso Lopes <bruno.cardoso at gmail.com> writes: >> This patch merely moves some common pattern fragments (memop, >> alignedload, etc.) to a file separate from X86InstrSSE.td so that all >> current x86 SIMD implementations can still use the classes while the >> transition happens. >> >> Ok to commit? > > I'm Ok with this patch. So

[LLVMdev] use AVX automatically if present

2012 May 24

[LLVMdev] use AVX automatically if present

I wonder why AVX is not used automatically if available at the host machine. In contrast to that, SSE41 instructions (like pmulld) are automatically used if the host machine supports SSE41. E.g. $ cat avx.ll define void @_fun1(<8 x float>*, <8 x float>*) { _L1: %x = load <8 x float>* %0 %y = load <8 x float>* %1 %z = fadd <8 x float> %x, %y store

[LLVMdev] X86 LowerVECTOR_SHUFFLE Question

2011 Feb 26

[LLVMdev] X86 LowerVECTOR_SHUFFLE Question

David Greene <dag at cray.com> writes: > In ToT, LowerVECTOR_SHUFFLE for x86 has this code: > > if (X86::isUNPCKLMask(SVOp)) > getTargetShuffleNode(getUNPCKLOpcode(VT) dl, VT, V1, V2, DAG); > > why would this not be: > > if (X86::isUNPCKLMask(SVOp)) > return SVOp; Ok, I discovered that Bruno did this in revisions 112934, 112942 and 113020 but the logs

[LLVMdev] RFC: AVX Pattern Specification [LONG]

2009 Apr 30

[LLVMdev] RFC: AVX Pattern Specification [LONG]

Here's the big RFC. A I've gone through and designed patterns for AVX, I quickly realized that the existing SSE pattern specification, while functional, is less than ideal in terms of maintenance. In particular, a number of nearly-identical patterns are specified all over for nearly-identical instructions. For example: let Constraints = "$src1 = $dst" in { multiclass

[LLVMdev] Moving AVX Upstream

2009 Nov 02

[LLVMdev] Moving AVX Upstream

Hey everyone, I'm at the point with our local AVX tree that I'm ready to move some stuff upstream. We've got most of the basic stuff implemented. The more esoteric stuff still has to be done. Because the more esoteric stuff might require some extensive changes to the existing AVX infrastructure, I suspect there might be quite a bit of church until we get things stabilized. Due to

[LLVMdev] X86 LowerVECTOR_SHUFFLE Question

2011 Feb 25

[LLVMdev] X86 LowerVECTOR_SHUFFLE Question

In ToT, LowerVECTOR_SHUFFLE for x86 has this code: if (X86::isUNPCKLMask(SVOp)) getTargetShuffleNode(getUNPCKLOpcode(VT) dl, VT, V1, V2, DAG); why would this not be: if (X86::isUNPCKLMask(SVOp)) return SVOp; I'm trying to add support for VUNPCKL and am getting into trouble because the existing code ends up creating: VUNPCKLPS load load which is badness come selection

[LLVMdev] RFC: AVX Pattern Specification [LONG]

2009 May 01

[LLVMdev] RFC: AVX Pattern Specification [LONG]

On Apr 30, 2009, at 3:59 PM, David Greene wrote: > Here's the big RFC. > > A I've gone through and designed patterns for AVX, I quickly > realized that the > existing SSE pattern specification, while functional, is less than > ideal in > terms of maintenance. In particular, a number of nearly-identical > patterns > are specified all over for

[LLVMdev] RFC: AVX Pattern Specification [LONG]

2009 May 01

[LLVMdev] RFC: AVX Pattern Specification [LONG]

On Apr 30, 2009, at 3:59 PM, David Greene wrote: > Here's the big RFC. > > > Of course we would not transition away from X86InstrSSE.td until > X86InstrSIMD.td is proven to cover all current uses of SSE correctly. > > The pros of the scheme: > > * Unify all "important" x86 SIMD instructions into one framework and > provide > consistency While

[LLVMdev] Poor code generation for odd sized vectors

2011 Sep 27

[LLVMdev] Poor code generation for odd sized vectors

Hi all, I'm compiling LLCM IR code like this on x86-64: define linkonce ccc <16 x float> @vector_add_float(<16 x float> %a.78, <16 x float> %a.79) align 8 { entry: %result.80 = fadd <16 x float> %a.78, %a.79 ret <18 x float> %result.80 } This works really well when the vector length (16 in the above) is an integer multiple of the SSE vector

[LLVMdev] x86-64 backend generates aligned ADDPS with unaligned address

2015 Jul 29

[LLVMdev] x86-64 backend generates aligned ADDPS with unaligned address

When I compile attached IR with LLVM 3.6 llc -march=x86-64 -o f.S f.ll it generates an aligned ADDPS with unaligned address. See attached f.S, here an extract: addq $12, %r9 # $12 is not a multiple of 4, thus for xmm0 this is unaligned xorl %esi, %esi .align 16, 0x90 .LBB0_1: # %loop2

[LLVMdev] Moving AVX Upstream

2009 Nov 02

[LLVMdev] Moving AVX Upstream

On Nov 2, 2009, at 11:48 AM, David Greene wrote: > Hey everyone, > > I'm at the point with our local AVX tree that I'm ready to move some > stuff upstream. We've got most of the basic stuff implemented. The > more esoteric stuff still has to be done. > > Because the more esoteric stuff might require some extensive changes > to > the existing AVX

[LLVMdev] Excessive register spilling in large automatically generated functions, such as is found in FFTW

2012 Jul 06

[LLVMdev] Excessive register spilling in large automatically generated functions, such as is found in FFTW

On Sat, Jul 7, 2012 at 12:25 AM, Anthony Blake <amb33 at cs.waikato.ac.nz> wrote: > On Fri, Jul 6, 2012 at 6:39 PM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote: >> On Jul 5, 2012, at 9:06 PM, Anthony Blake <amb33 at cs.waikato.ac.nz> wrote: >>> [...] >>> movaps 32(%rdi), %xmm3 >>> movaps 48(%rdi), %xmm2 >>>

[LLVMdev] x86-64 backend generates aligned ADDPS with unaligned address

2015 Jul 29

[LLVMdev] x86-64 backend generates aligned ADDPS with unaligned address

This load instruction assumes the default ABI alignment for the <4 x float> type, which is 16: %15 = load <4 x float>* %14 You can set the alignment of loads to something lower than 16 in your frontend, and this will make LLVM use movups instructions: %15 = load <4 x float>* %14, align 4 If some LLVM mid-level pass is introducing this load without proving that the vector is

[LLVMdev] Excessive register spilling in large automatically generated functions, such as is found in FFTW

2012 Jul 06

[LLVMdev] Excessive register spilling in large automatically generated functions, such as is found in FFTW

On Fri, Jul 6, 2012 at 6:39 PM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote: > > On Jul 5, 2012, at 9:06 PM, Anthony Blake <amb33 at cs.waikato.ac.nz> wrote: > >> I've noticed that LLVM tends to generate suboptimal code and spill an >> excessive amount of registers in large functions, such as in those >> that are automatically generated by FFTW. >

[LLVMdev] passing vector of booleans to functions

2013 Feb 26

[LLVMdev] passing vector of booleans to functions

Hi all, I'm currently trying to figure out the best way to pass vector of booleans to other functions. Take this small example: define <4 x float> @vcmp_add(<4 x float> %a, <4 x float> %b) { entry: %cmp = fcmp olt <4 x float> %a, %b %add = fadd <4 x float> %a, %b %sel = select <4 x i1> %cmp, <4 x float> %add, <4 x float> %a ret <4 x

[LLVMdev] Moving AVX Upstream

2009 Nov 02

[LLVMdev] Moving AVX Upstream

On Monday 02 November 2009 13:55, Tanya Lattner wrote: > You should do incremental development on trunk. If you create a > branch, no one is going to look at those changes. Ok. but I want to be very clear what that means. It means for each AVX instruction I rip out ALL of the existing SSE support for it. So when ADD gets implemented, ADD goes away from X86InstSSE.td. As things progress,

[LLVMdev] Pattern matching questions

2007 Jan 09

[LLVMdev] Pattern matching questions

On Tue, 9 Jan 2007, Evan Cheng wrote: >> - How does one deal with multiple instruction sequences in a pattern? >> To load a constant is a two instruction sequence, but both >> instructions only take two operands (assume that r3 is a 32-bit >> register): >> >> ilhu $3, 45 # r3 = (45 << 16) >> iohl $3, 5 # r3 |= 5

[LLVMdev] predicates vs. requirements [TableGen, X86InstrInfo.td]

2014 Sep 19

[LLVMdev] predicates vs. requirements [TableGen, X86InstrInfo.td]

> -----Original Message----- > From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] > On Behalf Of Tom Stellard > Sent: 19 September 2014 01:36 > To: Sanjay Patel > Cc: llvmdev at cs.uiuc.edu > Subject: Re: [LLVMdev] predicates vs. requirements [TableGen, > X86InstrInfo.td] > > On Thu, Sep 18, 2014 at 03:25:07PM -0600, Sanjay Patel wrote: >

similar to: [LLVMdev] Merging AVX