search for: vfmaddsd

Displaying 10 results from an estimated 10 matches for "vfmaddsd".

2012 Jul 25
6
[LLVMdev] X86 FMA4
We're migrating to LLVM 3.1 and trying to use the upstream FMA patterns. Why is VFMADDSD4 defined with vector types? Is this simply because the gcc intrinsic uses vector types? It's quite unnatural if you have a compiler that generates FMAs as opposed to requiring user intrinsics. -Dave
2012 Jul 26
0
[LLVMdev] X86 FMA4
Ah, bad example. This is a general problem for all (maybe most) SSE and AVX SS/SD patterns though, which is why I mentioned Sandybridge. You can swap out VFMADDSD in my example for VADDSD or whatever you like. I have a lion's share of such a change implemented already and performance is greatly affected. If the community is interested in this change, I would be happy to prepare a patch. -Cameron On Thu, Jul 26, 2012 at 2:27 PM, Jan Sjodin <jan_sjod...
2012 Jul 26
1
[LLVMdev] X86 FMA4
Hey Jan and Dave, It's not obvious, but there is a significant scalar performance issue following the GCC intrinsics. Let's look at the VFMADDSD pattern. We're operating on scalars with undefineds as the remaining vector elements of the operands. This sounds okay, but when one looks closer...
vmovsd fp4_+1088(%rip), %xmm3 # fpppp.f:647
vmovaps %xmm3, 18560(%rsp) # fpppp.f:647 <= 16-byte spill
vfmaddsd...
2012 Jul 27
2
[LLVMdev] X86 FMA4
...eel it is a safe assumption to make that vmovsd has the same stats as well. Michael On Jul 26, 2012, at 11:46 AM, Cameron McInally wrote: > Ah, bad example. This is a general problem for all (maybe most) SSE and AVX SS/SD patterns though, which is why I mentioned Sandybridge. You can swap out VFMADDSD in my example for VADDSD or whatever you like. > > I have a lion's share of such a change implemented already and performance is greatly affected. If the community is interested in this change, I would be happy to prepare a patch. > > -Cameron > > On Thu, Jul 26, 2012 at 2:...
2012 Nov 08
0
[LLVMdev] X86 Tablegen Description and VEX.W
...emOp4" like those of "rm" or "rr" ? > Hey Anitha, The VEX.W bit is used to denote operand order. In other words, this bit allows for a memop to be used as either the second or third source operand of an FMA instruction, offering greater flexibility. To conceptualize:
VFMADDSD xmm1, xmm2, xmm3/mem64, xmm4 VEX.W == 0
VFMADDSD xmm1, xmm2, xmm3, xmm4/mem64 VEX.W == 1
So, logically, one could create the rr pattern with the VEX.W bit set or not. The MemOp4 flag is a similar mechanism for setting the ModRM byte, indicating that the second and third source operands have b...
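The two operand forms quoted in the snippet above can be modelled in a few lines. This is only an illustrative sketch: `fma4_operands` and `memory_operand_index` are hypothetical helper names, not anything from LLVM or its TableGen files, and the operand lists are copied from the snippet rather than derived from an encoder.

```python
# Hypothetical helpers (not LLVM code): model which VFMADDSD source operand
# may be a memory operand for a given VEX.W value, using the two forms
# quoted in the snippet.

def fma4_operands(vex_w):
    """Return the VFMADDSD operand list for VEX.W == 0 or VEX.W == 1."""
    if vex_w not in (0, 1):
        raise ValueError("VEX.W is a single bit: expected 0 or 1")
    if vex_w == 0:
        # Memory is allowed in the second source operand.
        return ["xmm1", "xmm2", "xmm3/mem64", "xmm4"]
    # Memory is allowed in the third source operand.
    return ["xmm1", "xmm2", "xmm3", "xmm4/mem64"]


def memory_operand_index(vex_w):
    """Return the position (0 = destination) of the operand that may be memory."""
    operands = fma4_operands(vex_w)
    return next(i for i, op in enumerate(operands) if "mem64" in op)


if __name__ == "__main__":
    for w in (0, 1):
        print("VEX.W == %d: VFMADDSD %s" % (w, ", ".join(fma4_operands(w))))
```

For a register-register form neither operand is memory, which is why the snippet notes that the rr pattern could logically be created with the bit set or not.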
2012 Jul 27
0
[LLVMdev] X86 FMA4
...ake that vmovsd has the same stats as well. > > Michael > > On Jul 26, 2012, at 11:46 AM, Cameron McInally wrote: > > Ah, bad example. This is a general problem for all (maybe most) SSE and > AVX SS/SD patterns though, which is why I mentioned Sandybridge. You can > swap out VFMADDSD in my example for VADDSD or whatever you like. > > I have a lion's share of such a change implemented already and performance > is greatly affected. If the community is interested in this change, I would > be happy to prepare a patch. > > -Cameron > > On Thu, Jul 26, 201...
2012 Nov 08
2
[LLVMdev] X86 Tablegen Description and VEX.W
Hi, A question from r162999 changes: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrFMA.td?r1=162999&r2=162998&pathrev=162999 For the multiclass "fma4s", why is "mr" not inherited from "VEX_W" and "MemOp4" like those of "rm" or "rr" ? multiclass fma4s< > ... def mr : FMA4<opc, MRMSrcMem, (outs
2012 Jul 26
0
[LLVMdev] X86 FMA4
...ssage ----- > From: "dag at cray.com" <dag at cray.com> > To: llvmdev at cs.uiuc.edu > Cc: > Sent: Wednesday, July 25, 2012 3:26 PM > Subject: [LLVMdev] X86 FMA4 > > We're migrating to LLVM 3.1 and trying to use the upstream FMA patterns. > > Why is VFMADDSD4 defined with vector types? Is this simply because the > gcc intrinsic uses vector types? It's quite unnatural if you have a > compiler that generates FMAs as opposed to requiring user intrinsics. > > -Dave > _____________________________________...
2012 Jul 27
3
[LLVMdev] X86 FMA4
...o make that vmovsd has the same stats as well. > > Michael > > On Jul 26, 2012, at 11:46 AM, Cameron McInally wrote: > >> Ah, bad example. This is a general problem for all (maybe most) SSE and AVX SS/SD patterns though, which is why I mentioned Sandybridge. You can swap out VFMADDSD in my example for VADDSD or whatever you like. >> >> I have a lion's share of such a change implemented already and performance is greatly affected. If the community is interested in this change, I would be happy to prepare a patch. >> >> -Cameron >> >> On...
2012 Nov 08
2
[LLVMdev] X86 Tablegen Description and VEX.W
..."rr" ? > > > Hey Anitha, > > The VEX.W bit is used to denote operand order. In other words, this bit > allows for a memop to be used as either the second or third source operand > of an FMA instruction, offering greater flexibility. > > To conceptualize: > > VFMADDSD xmm1, xmm2, xmm3/mem64, xmm4 VEX.W == 0 > VFMADDSD xmm1, xmm2, xmm3, xmm4/mem64 VEX.W == 1 > > So, logically, one could create the rr pattern with the VEX.W bit set or > not. > I actually have confusion in mapping the role of vex_w during instruction selection. For the moment, l...