similar to: [LLVMdev] Merging AVX

Displaying 20 results from an estimated 1000 matches similar to: "[LLVMdev] Merging AVX"

2010 Jul 09
3
[LLVMdev] [PATCH] Start of SIMD Reorg
Now that Bruno is putting in some AVX stuff, it's a good motivator to move my x86 SIMD reorg work into trunk (and got management to agree to prioritize it - Thanks Bruno! :) ). Attached is the first patch of many to accomplish this. The overall goal is to have all x86 SIMD instructions share a set of common patterns so that we can have a more maintainable machine description (e.g. SS, SD,
2010 Jul 10
0
[LLVMdev] [PATCH] Start of SIMD Reorg
Hi David, On Fri, Jul 9, 2010 at 3:25 PM, David Greene <dag at cray.com> wrote: > Now that Bruno is putting in some AVX stuff, it's a good motivator to > move my x86 SIMD reorg work into trunk (and got management to agree to > prioritize it - Thanks Bruno! :) ). > > Attached is the first patch of many to accomplish this.  The overall > goal is to have all x86 SIMD
2010 Jul 12
2
[LLVMdev] [PATCH] Start of SIMD Reorg
Bruno Cardoso Lopes <bruno.cardoso at gmail.com> writes: >> This patch merely moves some common pattern fragments (memop, >> alignedload, etc.) to a file separate from X86InstrSSE.td so that all >> current x86 SIMD implementations can still use the classes while the >> transition happens. >> >> Ok to commit? > > I'm Ok with this patch. So
2012 May 24
4
[LLVMdev] use AVX automatically if present
I wonder why AVX is not used automatically if available at the host machine. In contrast to that, SSE41 instructions (like pmulld) are automatically used if the host machine supports SSE41. E.g. $ cat avx.ll define void @_fun1(<8 x float>*, <8 x float>*) { _L1: %x = load <8 x float>* %0 %y = load <8 x float>* %1 %z = fadd <8 x float> %x, %y store
2011 Feb 26
0
[LLVMdev] X86 LowerVECTOR_SHUFFLE Question
David Greene <dag at cray.com> writes: > In ToT, LowerVECTOR_SHUFFLE for x86 has this code: > > if (X86::isUNPCKLMask(SVOp)) > getTargetShuffleNode(getUNPCKLOpcode(VT), dl, VT, V1, V2, DAG); > > why would this not be: > > if (X86::isUNPCKLMask(SVOp)) > return SVOp; Ok, I discovered that Bruno did this in revisions 112934, 112942 and 113020 but the logs
2009 Apr 30
6
[LLVMdev] RFC: AVX Pattern Specification [LONG]
Here's the big RFC. As I've gone through and designed patterns for AVX, I quickly realized that the existing SSE pattern specification, while functional, is less than ideal in terms of maintenance. In particular, a number of nearly-identical patterns are specified all over for nearly-identical instructions. For example: let Constraints = "$src1 = $dst" in { multiclass
2009 Nov 02
2
[LLVMdev] Moving AVX Upstream
Hey everyone, I'm at the point with our local AVX tree that I'm ready to move some stuff upstream. We've got most of the basic stuff implemented. The more esoteric stuff still has to be done. Because the more esoteric stuff might require some extensive changes to the existing AVX infrastructure, I suspect there might be quite a bit of churn until we get things stabilized. Due to
2011 Feb 25
2
[LLVMdev] X86 LowerVECTOR_SHUFFLE Question
In ToT, LowerVECTOR_SHUFFLE for x86 has this code: if (X86::isUNPCKLMask(SVOp)) getTargetShuffleNode(getUNPCKLOpcode(VT), dl, VT, V1, V2, DAG); why would this not be: if (X86::isUNPCKLMask(SVOp)) return SVOp; I'm trying to add support for VUNPCKL and am getting into trouble because the existing code ends up creating: VUNPCKLPS load load which is badness come selection
2009 May 01
0
[LLVMdev] RFC: AVX Pattern Specification [LONG]
On Apr 30, 2009, at 3:59 PM, David Greene wrote: > Here's the big RFC. > > As I've gone through and designed patterns for AVX, I quickly > realized that the > existing SSE pattern specification, while functional, is less than > ideal in > terms of maintenance. In particular, a number of nearly-identical > patterns > are specified all over for
2009 May 01
0
[LLVMdev] RFC: AVX Pattern Specification [LONG]
On Apr 30, 2009, at 3:59 PM, David Greene wrote: > Here's the big RFC. > > > Of course we would not transition away from X86InstrSSE.td until > X86InstrSIMD.td is proven to cover all current uses of SSE correctly. > > The pros of the scheme: > > * Unify all "important" x86 SIMD instructions into one framework and > provide > consistency While
2011 Sep 27
2
[LLVMdev] Poor code generation for odd sized vectors
Hi all, I'm compiling LLVM IR code like this on x86-64: define linkonce ccc <16 x float> @vector_add_float(<16 x float> %a.78, <16 x float> %a.79) align 8 { entry: %result.80 = fadd <16 x float> %a.78, %a.79 ret <16 x float> %result.80 } This works really well when the vector length (16 in the above) is an integer multiple of the SSE vector
2015 Jul 29
2
[LLVMdev] x86-64 backend generates aligned ADDPS with unaligned address
When I compile the attached IR with LLVM 3.6 llc -march=x86-64 -o f.S f.ll it generates an aligned ADDPS with an unaligned address. See attached f.S, here an extract: addq $12, %r9 # $12 is not a multiple of 16, thus for xmm0 this is unaligned xorl %esi, %esi .align 16, 0x90 .LBB0_1: # %loop2
2009 Nov 02
0
[LLVMdev] Moving AVX Upstream
On Nov 2, 2009, at 11:48 AM, David Greene wrote: > Hey everyone, > > I'm at the point with our local AVX tree that I'm ready to move some > stuff upstream. We've got most of the basic stuff implemented. The > more esoteric stuff still has to be done. > > Because the more esoteric stuff might require some extensive changes > to > the existing AVX
2012 Jul 06
0
[LLVMdev] Excessive register spilling in large automatically generated functions, such as is found in FFTW
On Sat, Jul 7, 2012 at 12:25 AM, Anthony Blake <amb33 at cs.waikato.ac.nz> wrote: > On Fri, Jul 6, 2012 at 6:39 PM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote: >> On Jul 5, 2012, at 9:06 PM, Anthony Blake <amb33 at cs.waikato.ac.nz> wrote: >>> [...] >>> movaps 32(%rdi), %xmm3 >>> movaps 48(%rdi), %xmm2 >>>
2015 Jul 29
0
[LLVMdev] x86-64 backend generates aligned ADDPS with unaligned address
This load instruction assumes the default ABI alignment for the <4 x float> type, which is 16: %15 = load <4 x float>* %14 You can set the alignment of loads to something lower than 16 in your frontend, and this will make LLVM use movups instructions: %15 = load <4 x float>* %14, align 4 If some LLVM mid-level pass is introducing this load without proving that the vector is
2012 Jul 06
2
[LLVMdev] Excessive register spilling in large automatically generated functions, such as is found in FFTW
On Fri, Jul 6, 2012 at 6:39 PM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote: > > On Jul 5, 2012, at 9:06 PM, Anthony Blake <amb33 at cs.waikato.ac.nz> wrote: > >> I've noticed that LLVM tends to generate suboptimal code and spill an >> excessive amount of registers in large functions, such as in those >> that are automatically generated by FFTW. >
2013 Feb 26
2
[LLVMdev] passing vector of booleans to functions
Hi all, I'm currently trying to figure out the best way to pass vector of booleans to other functions. Take this small example: define <4 x float> @vcmp_add(<4 x float> %a, <4 x float> %b) { entry: %cmp = fcmp olt <4 x float> %a, %b %add = fadd <4 x float> %a, %b %sel = select <4 x i1> %cmp, <4 x float> %add, <4 x float> %a ret <4 x
2009 Nov 02
2
[LLVMdev] Moving AVX Upstream
On Monday 02 November 2009 13:55, Tanya Lattner wrote: > You should do incremental development on trunk. If you create a > branch, no one is going to look at those changes. Ok, but I want to be very clear what that means. It means for each AVX instruction I rip out ALL of the existing SSE support for it. So when ADD gets implemented, ADD goes away from X86InstrSSE.td. As things progress,
2007 Jan 09
2
[LLVMdev] Pattern matching questions
On Tue, 9 Jan 2007, Evan Cheng wrote: >> - How does one deal with multiple instruction sequences in a pattern? >> To load a constant is a two instruction sequence, but both >> instructions only take two operands (assume that r3 is a 32-bit >> register): >> >> ilhu $3, 45 # r3 = (45 << 16) >> iohl $3, 5 # r3 |= 5
2014 Sep 19
2
[LLVMdev] predicates vs. requirements [TableGen, X86InstrInfo.td]
> -----Original Message----- > From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] > On Behalf Of Tom Stellard > Sent: 19 September 2014 01:36 > To: Sanjay Patel > Cc: llvmdev at cs.uiuc.edu > Subject: Re: [LLVMdev] predicates vs. requirements [TableGen, > X86InstrInfo.td] > > On Thu, Sep 18, 2014 at 03:25:07PM -0600, Sanjay Patel wrote: >