thr3ads.net - similar to: "[LLVMdev] X86 Tablegen Description and VEX.W"

[LLVMdev] X86 Tablegen Description and VEX.W

2012 Nov 08

On Wed, Nov 7, 2012 at 10:52 PM, Anitha Boyapati <anitha.boyapati at gmail.com> wrote:
...
> For the multiclass "fma4s", why is "mr" not inherited from "VEX_W" and
> "MemOp4" like those of "rm" or "rr"?

Hey Anitha,

The VEX.W bit is used to denote operand order. In other words, this bit allows for a memop to be used as
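To make the role of the W bit concrete, here is a minimal Python sketch of how the 3-byte VEX prefix is laid out. The helper name vex3_prefix and its parameter names are hypothetical; the field layout follows the published VEX encoding (0xC4, then inverted R/X/B plus the opcode map, then W, inverted vvvv, L, and pp). The example values (map 0F3A, 66 prefix, xmm1 as the first source) are chosen for illustration and are not taken from the thread.

```python
def vex3_prefix(r, x, b, mmmmm, w, vvvv, l, pp):
    """Build a 3-byte VEX prefix (hypothetical helper, for illustration only).

    R, X, B and vvvv are stored in inverted (one's complement) form;
    W occupies bit 7 of the third byte.
    """
    byte1 = ((~r & 1) << 7) | ((~x & 1) << 6) | ((~b & 1) << 5) | (mmmmm & 0x1F)
    byte2 = ((w & 1) << 7) | ((~vvvv & 0xF) << 3) | ((l & 1) << 2) | (pp & 0x3)
    return bytes([0xC4, byte1, byte2])


# Same fields, differing only in W (assumed example values, see lead-in).
prefix_w0 = vex3_prefix(r=0, x=0, b=0, mmmmm=0b00011, w=0, vvvv=1, l=0, pp=0b01)
prefix_w1 = vex3_prefix(r=0, x=0, b=0, mmmmm=0b00011, w=1, vvvv=1, l=0, pp=0b01)

print(prefix_w0.hex(), prefix_w1.hex())
```

Flipping W changes only bit 7 of the third prefix byte. For the FMA4 register/memory forms, that single bit is what tells the decoder which of the last two source operands is taken from ModRM.rm (and may therefore be a memory operand) and which is taken from the trailing immediate byte.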

[LLVMdev] X86 Tablegen Description and VEX.W

2012 Nov 08

On 8 November 2012 11:12, Cameron McInally <cameron.mcinally at nyu.edu> wrote:
> On Wed, Nov 7, 2012 at 10:52 PM, Anitha Boyapati <anitha.boyapati at gmail.com> wrote:
> ...
>> For the multiclass "fma4s", why is "mr" not inherited from "VEX_W" and
>> "MemOp4" like those of "rm" or "rr"?

[LLVMdev] Operand order in dag pattern matching in td files

2012 Nov 16

Hi,

I have a simple question w.r.t the order of operands used in dag pattern matching in target files. Some of them seem intuitive. But I want to get it clarified anyway. I am using a pattern from X86InstrFMA.td in the below example. Consider FMA3 pattern (simplified).

    let Constraints = "$src1 = $dst" in {
    multiclass fma3s_rm<bits<8> opc, string OpcodeStr, X86MemOperand

[LLVMdev] Operand order in dag pattern matching in td files

2012 Nov 16

On 16 November 2012 13:41, Anitha B Gollamudi <anitha.boyapati at gmail.com> wrote:
> Hi,
>
> I have a simple question w.r.t the order of operands used in dag
> pattern matching in target files. Some of them seem intuitive. But I
> want to get it clarified anyway. I am using a pattern from
> X86InstrFMA.td in the below example. Consider FMA3 pattern
> (simplified).

[LLVMdev] Operand order in dag pattern matching in td files

2012 Nov 16

You've unfortunately chosen a complex example. Your second question needs to be answered first: null_frag causes the pattern to be dropped. Now, having covered that, the reason the operands are in the order they are is that the only instruction that doesn't use null_frag is this one:

    defm r213 : fma3s_rm<opc213, !strconcat(OpStr, !strconcat("213", PackTy)),
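The digits in names such as r213 say which operands are multiplied and which is added. A rough Python sketch of the three FMA3 orderings follows, assuming the usual Intel convention in which operand 1 is also the destination (which is what the "$src1 = $dst" constraint expresses). The function name fma3 is hypothetical, and plain Python arithmetic is not fused, so rounding can differ from the hardware instruction.

```python
def fma3(form, src1, src2, src3):
    """Return the value written to the destination for a given FMA3 form.

    The three digits name the operands in the order
    (multiplicand, multiplier, addend). Illustration only: the multiply
    and add here are rounded separately, unlike a real fused operation.
    """
    if form == "132":
        return src1 * src3 + src2
    if form == "213":
        return src2 * src1 + src3
    if form == "231":
        return src2 * src3 + src1
    raise ValueError("unknown FMA3 form: " + form)


results = {form: fma3(form, 2.0, 3.0, 5.0) for form in ("132", "213", "231")}
print(results)
```

With distinct inputs the three forms give three different results, which is why the operand order written in a selection pattern has to agree with the form of the instruction that the pattern is attached to.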

[LLVMdev] X86 Tablegen Description and VEX.W

2012 Nov 08

On Thu, Nov 8, 2012 at 1:34 AM, Anitha Boyapati <anitha.boyapati at gmail.com> wrote:
...
> I actually have confusion in mapping the role of vex_w during
> instruction selection. For the moment, lets just consider vex_w and
> not memop.
>
> [1]. What does " def rr : FMA4<>, VEX_W" mean? As per tablegen
> description, "rr" now inherits FMA4 and

[LLVMdev] X86 FMA4

2012 Jul 25

We're migrating to LLVM 3.1 and trying to use the upstream FMA patterns. Why is VFMADDSD4 defined with vector types? Is this simply because the gcc intrinsic uses vector types? It's quite unnatural if you have a compiler that generates FMAs as opposed to requiring user intrinsics. -Dave

[LLVMdev] X86 FMA4

2012 Jul 26

Ah, bad example. This is a general problem for all (maybe most) SSE and AVX SS/SD patterns though, which is why I mentioned Sandybridge. You can swap out VFMADDSD in my example for VADDSD or whatever you like. I have a lion's share of such a change implemented already and performance is greatly affected. If the community is interested in this change, I would be happy to prepare a patch.

[LLVMdev] X86 FMA4

2012 Jul 27

Just looked up the numbers from Agner Fog for Sandy Bridge for vmovaps/etc for loading/storing from memory.

vmovaps - load takes 1 load μop, 3 latency, with a reciprocal throughput of 0.5.
vmovaps - store takes 1 store μop, 1 load μop for address calculation, 3 latency, with a reciprocal throughput of 1.

He does not list vmovsd, but movsd has the same stats as vmovaps, so I feel it is a
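Reciprocal throughput is the average number of cycles per instruction when many independent instances are issued back to back, so the figures quoted above translate directly into per-cycle rates. A small sketch of that arithmetic; the function name per_cycle is hypothetical, and the inputs are the two values quoted in the message.

```python
def per_cycle(reciprocal_throughput):
    """Convert a reciprocal throughput (cycles per instruction) to instructions per cycle."""
    return 1.0 / reciprocal_throughput


loads_per_cycle = per_cycle(0.5)   # vmovaps load, reciprocal throughput 0.5
stores_per_cycle = per_cycle(1.0)  # vmovaps store, reciprocal throughput 1

print(loads_per_cycle, stores_per_cycle)
```

Latency is a separate figure: it bounds how soon a dependent instruction can start, while reciprocal throughput bounds how many independent instructions can be sustained.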

[LLVMdev] X86 FMA4

2012 Jul 26

Hey Jan and Dave,

It's not obvious, but there is a significant scalar performance issue following the GCC intrinsics. Let's look at the VFMADDSD pattern. We're operating on scalars with undefineds as the remaining vector elements of the operands. This sounds okay, but when one looks closer...

    vmovsd fp4_+1088(%rip), %xmm3    # fpppp.f:647
    vmovaps %xmm3, 18560(%rsp)

[LLVMdev] X86 FMA4

2012 Jul 26

Because the intrinsics use vector types (same as gcc).

- Jan

----- Original Message -----
> From: "dag at cray.com" <dag at cray.com>
> To: llvmdev at cs.uiuc.edu
> Cc:
> Sent: Wednesday, July 25, 2012 3:26 PM
> Subject: [LLVMdev] X86 FMA4
>
> We're migrating to LLVM 3.1 and trying to use the upstream FMA patterns.
>
> Why is VFMADDSD4

[LLVMdev] X86 FMA4

2012 Jul 27

Hey Michael, Thanks for the legwork! It appears that the stats you listed are for movaps [SSE], not vmovaps [AVX]. I would *assume* that vmovaps(m128) is closer to vmovaps(m256), since they are both AVX instructions. Although, yes, I agree that this is not clear from Agner's report. Please correct me if I am misunderstanding. As I am sure you are aware, we cannot use SSE (movaps)

[LLVMdev] Help needed on debugging llvm

2013 Mar 12

I'm still slightly confused. Is the error now fixed or is there still a bug in LLVM's integrated assembler?

On Mon, Mar 11, 2013 at 4:49 AM, Anitha B Gollamudi <anitha.boyapati at gmail.com> wrote:
> On 11 March 2013 17:00, Duncan Sands <baldrick at free.fr> wrote:
> > Hi Anitha,
> >
> >> Ah, I am taking back my above words w.r.t encoding.

[LLVMdev] Help needed on debugging llvm

2013 Mar 11

On 11 March 2013 10:06, Anitha B Gollamudi <anitha.boyapati at gmail.com> wrote:
> On 23 January 2013 00:20, Craig Topper <craig.topper at gmail.com> wrote:
>> Are you still having issues with FMA4? I wonder if PR15040 is related. A
>> fix was just committed.

Unfortunately r173176 does not fix this. I have updated the trunk and ran... Miscompare still persists.

[LLVMdev] Help needed on debugging llvm

2013 Mar 13

Can you send the binaries compiled with and without the integrated assembler? Maybe I can figure out the encoding problem. I've been unsuccessful figuring it out myself so far.

On Tue, Mar 12, 2013 at 12:34 AM, Anitha B Gollamudi <anitha.boyapati at gmail.com> wrote:
> On 12 March 2013 09:51, Craig Topper <craig.topper at gmail.com> wrote:
> > I'm still slightly

[LLVMdev] Help needed on debugging llvm

2013 Mar 12

On 12 March 2013 09:51, Craig Topper <craig.topper at gmail.com> wrote:
> I'm still slightly confused. Is the error now fixed or is there still a bug
> in LLVM's integrated assembler?

The error is not fixed yet (even with the fix mentioned in PR15040, http://llvm.org/bugs/show_bug.cgi?id=15040#c4). With the updated trunk, clang still gives an error when FMA4 is enabled but

[LLVMdev] Error running spec benchmark with FMA4 on X86

2012 Sep 06

Hi All,

I am facing a miscompare error when running povray (and a few other C/C++ benchmarks) from the SPEC CPU2006 suite with FMA4 enabled (and FMA3 disabled). I have used -ffp-contract=fast to turn on this option. (Compilation options and targets pasted below.)

>>>>>>>>
clang version 3.2 (trunk 163295:163308) (llvm/trunk 163295)
Target: x86_64-unknown-linux-gnu
Thread model: posix

[LLVMdev] Help needed on debugging llvm

2013 Mar 11

Hi Anitha,

> Ah, I am taking back my above words w.r.t encoding. -no-integrated-as
> does fix the issue! This definitely points towards FMA4 encoding in
> clang's integrated assembler. This fits into the analysis as well -
> dragonegg *might not* be using integrated assembler at all.

you are right, dragonegg does not use the integrated assembler.

Ciao, Duncan.

[LLVMdev] X86 disassembler & assembler mismatch

2014 Dec 26

hi,

some instructions mismatch between assembler & disassembler, like below. it seems this happens with all SSECC related instructions?

thanks,
Jun

$ echo "cmpps xmm1, xmm2, 23" | ./Release+Asserts/bin/llvm-mc -assemble -triple=x86_64 --output-asm-variant=1 -x86-asm-syntax=intel -show-encoding
        .text
        cmpps xmm1, xmm2, 23    # encoding: [0x0f,0xc2,0xca,0x17]
$
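The four encoding bytes shown above can be taken apart by hand, which helps when comparing what the assembler emitted with what a disassembler prints. The sketch below decodes the ModRM byte and the trailing immediate. The helper name decode_modrm is hypothetical, and the final masking step is only an assumption about what treating the operand as a 3-bit immediate could imply; the snippet does not show the disassembler's actual output.

```python
# Bytes reported by llvm-mc -show-encoding for "cmpps xmm1, xmm2, 23".
encoding = [0x0F, 0xC2, 0xCA, 0x17]


def decode_modrm(byte):
    """Split a ModRM byte into its (mod, reg, rm) fields."""
    return (byte >> 6) & 0x3, (byte >> 3) & 0x7, byte & 0x7


opcode = encoding[:2]                      # 0F C2 is the cmpps opcode
mod, reg, rm = decode_modrm(encoding[2])   # register-direct form
imm8 = encoding[3]                         # the full 8-bit immediate

# Assumption for illustration: value left if only the low 3 bits were kept.
imm3 = imm8 & 0x7

print(mod, reg, rm, imm8, imm3)
```

Here mod is 3 (register-direct), reg selects xmm1 and rm selects xmm2, matching the operands in the assembly text. The immediate 23 does not fit in three bits, so any stage that keeps only three bits cannot reproduce the original value.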

[LLVMdev] X86 disassembler & assembler mismatch

2014 Dec 26

The IMM3/IMM5 come from here

X86RecognizableInstr.cpp
943   TYPE("SSECC", TYPE_IMM3)
944:  TYPE("AVXCC", TYPE_IMM5)

On Thu, Dec 25, 2014 at 8:22 PM, Jun Koi <junkoi2004 at gmail.com> wrote:
> On Fri, Dec 26, 2014 at 11:54 AM, Jun Koi <junkoi2004 at gmail.com> wrote:
>> hi,
>>
>> some instructions
