search for: vr128

Displaying 20 results from an estimated 99 matches for "vr128".

2011 Sep 22
3
[LLVMdev] Patch to synthesize x86 hadd instructions; need help with the tablegen bits
...AVX 128-bit versions too? I added the avx checks to the same file (in which case calling it sse3-haddsub.ll is not so great). > 4) Your tablegen modifications are totally fine, for the intrinsics just do: > > let Predicates = [HasSSE3] in { > def : Pat<(int_x86_sse3_hadd_ps (v4f32 VR128:$src1), VR128:$src2), > (HADDPSrr VR128:$src1, VR128:$src2)>; > def : Pat<(int_x86_sse3_hadd_ps (v4f32 VR128:$src1), (memop addr:$src2)), > (HADDPSrm VR128:$src1, addr:$src2)>; > ... > > and > > let Predicates = [HasAVX] in { > def : Pat<...
2012 Jul 26
2
[LLVMdev] X86 sub_ss and sub_sd sub-register indexes
...> XMM0 They are supposed to represent the 32-bit and 64-bit low parts of the xmm registers, but since we don't define explicit registers for those sub-registers, we are left with idempotent sub-register indexes. We have three different register classes for the xmm registers: FR32, FR64, and VR128. The sub_ss and sub_sd indexes used to play a role in selecting the right register class, but not any longer. That is all derived from the instruction descriptions now. As far as I can tell, all sub-register operations involving sub_ss and sub_sd can simply be replaced with COPY_TO_REGCLASS: de...
2012 Jul 26
0
[LLVMdev] X86 sub_ss and sub_sd sub-register indexes
...onfused. Below you note that they are used in patterns, so they are certainly mentioned more than just in the code above. > As far as I can tell, all sub-register operations involving sub_ss and > sub_sd can simply be replaced with COPY_TO_REGCLASS: > > def : Pat<(v4i32 (X86Movsd VR128:$src1, VR128:$src2)), > (VMOVSDrr VR128:$src1, (EXTRACT_SUBREG (v4i32 VR128:$src2), > sub_sd))>; > > Becomes: > > def : Pat<(v4i32 (X86Movsd VR128:$src1, VR128:$src2)), > (VMOVSDrr VR128:$src1, (...
2012 Jul 26
2
[LLVMdev] X86 sub_ss and sub_sd sub-register indexes
..., at 9:43 AM, dag at cray.com wrote: > Jakob Stoklund Olesen <jolesen at apple.com> writes: > >> As far as I can tell, all sub-register operations involving sub_ss and >> sub_sd can simply be replaced with COPY_TO_REGCLASS: >> >> def : Pat<(v4i32 (X86Movsd VR128:$src1, VR128:$src2)), >> (VMOVSDrr VR128:$src1, (EXTRACT_SUBREG (v4i32 VR128:$src2), >> sub_sd))>; >> >> Becomes: >> >> def : Pat<(v4i32 (X86Movsd VR128:$src1, VR128:$src2)), >>...
2011 Sep 21
0
[LLVMdev] Patch to synthesize x86 hadd instructions; need help with the tablegen bits
...rizontal.ll to sse3-haddsub.ll 3) Can you duplicate the testcase file to something like avx-haddsub.ll, and check for the AVX 128-bit versions too? 4) Your tablegen modifications are totally fine, for the intrinsics just do: let Predicates = [HasSSE3] in { def : Pat<(int_x86_sse3_hadd_ps (v4f32 VR128:$src1), VR128:$src2), (HADDPSrr VR128:$src1, VR128:$src2)>; def : Pat<(int_x86_sse3_hadd_ps (v4f32 VR128:$src1), (memop addr:$src2)), (HADDPSrm VR128:$src1, addr:$src2)>; ... and let Predicates = [HasAVX] in { def : Pat<(int_x86_sse3_hadd_ps (v4f32 VR128:$src1), VR...
2011 Sep 22
0
[LLVMdev] Patch to synthesize x86 hadd instructions; need help with the tablegen bits
...AVX 128-bit versions too? I added the avx checks to the same file (in which case calling it sse3-haddsub.ll is not so great). > 4) Your tablegen modifications are totally fine, for the intrinsics just do: > > let Predicates = [HasSSE3] in { > def : Pat<(int_x86_sse3_hadd_ps (v4f32 VR128:$src1), VR128:$src2), > (HADDPSrr VR128:$src1, VR128:$src2)>; def : > Pat<(int_x86_sse3_hadd_ps (v4f32 VR128:$src1), (memop addr:$src2)), > (HADDPSrm VR128:$src1, addr:$src2)>; ... > > and > > let Predicates = [HasAVX] in { > def : Pat<(int...
2011 Sep 21
2
[LLVMdev] Patch to synthesize x86 hadd instructions; need help with the tablegen bits
This patch synthesizes haddps/haddpd/hsubps/hsubpd instructions from floating point additions and subtractions of appropriate vector shuffles. To do this I introduced new x86 FHADD and FHSUB opcodes. These need to be wired up somehow in the .td file to the appropriate instructions. Since I have no idea how tablegen works I just hacked it in horribly. It works, but breaks support for the hadd
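The idea in the result above, recognizing a horizontal add as an ordinary vector addition of two shuffles, can be illustrated with a small sketch. It is plain Python standing in for the vector semantics, not LLVM code. The helper names are hypothetical, and the shuffle masks are only one pair that produces the HADDPS result; the snippet does not show which masks the patch actually matches.

```python
def shuffle(a, b, mask):
    """Pick lanes from the concatenation of a and b, like an LLVM shufflevector."""
    both = list(a) + list(b)
    return [both[i] for i in mask]


def haddps(a, b):
    """Reference semantics of SSE3 HADDPS on two 4-lane vectors."""
    return [a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3]]


def add_of_shuffles(a, b):
    """The same result written as an fadd of two shuffles (assumed masks)."""
    evens = shuffle(a, b, [0, 2, 4, 6])  # a0, a2, b0, b2
    odds = shuffle(a, b, [1, 3, 5, 7])   # a1, a3, b1, b3
    return [x + y for x, y in zip(evens, odds)]


a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
assert haddps(a, b) == add_of_shuffles(a, b)
```

A pattern matcher that sees the second form can emit a single horizontal add instead of two shuffles and an add.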
2009 Mar 24
2
[LLVMdev] Reducing .td redundancy
...ins FR32:$src1, FR32: $src2), !strconcat(OpcodeStr, "ps"\t{$src2, $dst|$dst, $src2}"), [(set FR32:$dst, (!SOME_CONCAT("x86f", OpNode) FR32: $src1, FR32:$src2))]>; // Vector operation def PSrr : PSI<opc, MRMSrcReg, (outs VR128:$dst), (ins VR128:$src1, VR128: $src2), !strconcat(OpcodeStr, "ps"\t{$src2, $dst|$dst, $src2}"), [(set VR128:$dst, v2i64 (OpNode VR128:$src1, VR128: $src2))]>; // Bitconverted vector operation def PSrm : PSI<opc, MRMSrcMem,...
2012 Jul 09
2
[LLVMdev] question on table gen TIED_TO constraint
I need to implement an instruction which has 2 read-write registers, so I added let Constraints = "$src1 = $dst, $mask = $mask_wb" in { ... def rm : AVX28I<opc, MRMSrcMem, (outs VR128:$dst, VR128:$mask_wb), (ins VR128:$src1, v128mem:$src2, VR128:$mask), ... } There is a problem since MRMSrcMem assumes the 2nd physical operand is a memory operand. See the section about MRMSrcMem in RecognizableInstr::emitInstructionSpecifier. And the above gives us $dst, $mask_wb, $sr...
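To make the operand layout in the result above easier to follow, here is a rough Python sketch that parses a Constraints string of that shape and records which input operand is tied to which output. The helper `parse_constraints` and the index map are hypothetical illustrations, not TableGen's implementation; the only assumption carried over from LLVM is that outputs are listed before inputs.

```python
def parse_constraints(text):
    """Split a string such as "$a = $b, $c = $d" into (left, right) pairs."""
    pairs = []
    for clause in text.split(","):
        left, right = (side.strip() for side in clause.split("="))
        pairs.append((left, right))
    return pairs


# Operand names taken from the snippet: outputs first, then inputs.
outs = ["$dst", "$mask_wb"]
ins = ["$src1", "$src2", "$mask"]
operands = outs + ins

tied_to = {}
for left, right in parse_constraints("$src1 = $dst, $mask = $mask_wb"):
    # Whichever side is an input gets mapped to the output it is tied to.
    use, out = (left, right) if left in ins else (right, left)
    tied_to[operands.index(use)] = operands.index(out)

print(operands)  # flat operand list
print(tied_to)   # input index -> tied output index
```

With two tied pairs, the memory operand `$src2` is the fourth entry of the flat list, which is the kind of position mismatch the post describes for MRMSrcMem.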
2008 Nov 17
2
[LLVMdev] Patterns with Multiple Stores
I want to write a pattern that looks something like this: def : Pat<(unalignedstore (v2f64 VR128:$src), addr:$dst), (MOVSDmr ADD64ri8(addr:$dst, imm:8), ( SHUFPDrri (VR128:$src, (MOVSDmr addr:$dst, FR64:$src))), imm:3) So I want to convert an unaligned vector store to a scalar store, a shuffle and a scalar store. There are several questions I have: - Is the imm:3 syntax...
2007 Dec 12
2
[LLVMdev] Bogus X86-64 Patterns
Tracking down a problem with one of our benchmark codes, we've discovered that some of the patterns in X86InstrX86-64.td are wrong. Specifically: def MOV64toPQIrm : RPDI<0x6E, MRMSrcMem, (outs VR128:$dst), (ins i64mem:$src), "mov{d|q}\t{$src, $dst|$dst, $src}", [(set VR128:$dst, (v2i64 (scalar_to_vector (loadi64 addr:$src))))]>; def MOVPQIto64mr : RPDI<0x7E, MRMDestMem, (outs), (ins i64mem:$dst, VR128:...
2008 Nov 17
0
[LLVMdev] Patterns with Multiple Stores
On Monday 17 November 2008 14:28, David Greene wrote: > I want to write a pattern that looks something like this: > > def : Pat<(unalignedstore (v2f64 VR128:$src), addr:$dst), > (MOVSDmr ADD64ri8(addr:$dst, imm:8), ( SHUFPDrri (VR128:$src, > (MOVSDmr addr:$dst, FR64:$src))), imm:3) > > So I want to convert an unaligned vector store to a scalar store, a shuffle > and a scalar store. I got a little further with this:...
2012 Jul 26
0
[LLVMdev] X86 sub_ss and sub_sd sub-register indexes
...<jolesen at apple.com> writes: >> What happens if the result of the above pattern using COPY_TO_REGCLASS >> is spilled? Will we get a 64-bit store or a 128-bit store? > > This behavior isn't affected by the change. FR64 registers are spilled > with 64-bit stores, and VR128 registers are spilled with 128-bit > stores. > > When the register coalescer removes a copy between VR128 and FR64 > registers, it chooses the larger spill size for the result. This is > the same for sub-register copies and full register copies. So if I understand this correctly, a...
2010 Aug 04
2
[LLVMdev] x86 Vector Shuffle Patterns
...$src1, node:$src2), [{ return X86::isVPERM2F128Mask(cast<ShuffleVectorSDNode>(N)); }], SHUFFLE_get_vperm2f128_imm>; I don't understand completely how the new system all works. Take a simple SHUFPS match: def SHUFPSrri : PSIi8<0xC6, MRMSrcReg, (outs VR128:$dst), (ins VR128:$src1, VR128:$src2, i8imm:$src3), "shufps\t{$src3, $src2, $dst|$dst, $src2, $src3}", [(set VR128:$dst, (v4f32 (shufp:$src3 VR128:$src1, VR128:$src2)))]>; "...
2009 Apr 30
6
[LLVMdev] RFC: AVX Pattern Specification [LONG]
...(ins FR32:$src1, f32mem:$src2), !strconcat(OpcodeStr, "ss\t{$src2, $dst|$dst, $src2}"), [(set FR32:$dst, (OpNode FR32:$src1, (load addr:$src2)))]>; // Vector operation, reg+reg. def PSrr : PSI<opc, MRMSrcReg, (outs VR128:$dst), (ins VR128:$src1, VR128:$src2), !strconcat(OpcodeStr, "ps\t{$src2, $dst|$dst, $src2}"), [(set VR128:$dst, (v4f32 (OpNode VR128:$src1, VR128:$src2)))]> { let isComm...
2009 Mar 24
0
[LLVMdev] Reducing .td redundancy
...!strconcat(OpcodeStr, "ps"\t{$src2, $dst|$dst, > $src2}"), > [(set FR32:$dst, (!SOME_CONCAT("x86f", OpNode) > FR32: > $src1, FR32:$src2))]>; > > // Vector operation > def PSrr : PSI<opc, MRMSrcReg, (outs VR128:$dst), (ins VR128:$src1, > VR128: > $src2), > !strconcat(OpcodeStr, "ps"\t{$src2, $dst|$dst, > $src2}"), > [(set VR128:$dst, v2i64 (OpNode VR128:$src1, > VR128: > $src2))]>; > > // Bitconverted vector oper...
2009 Nov 03
1
[LLVMdev] Pat<> & tlbgen
Can someone explain the magic behind the Pat<> construct and tblgen? From X86InstrSSE.td: def : Pat<(v4f32 (vector_shuffle VR128:$src, (undef), MOVDDUP_shuffle_mask)), (MOVLHPSrr VR128:$src, VR128:$src)>, Requires<[HasSSE1]>; Where's the code in tblgen to emit the matching code for this? I'm trying to extend it so that Pat<> can be used as a general subclass for AVX: class Base<dag pat,...
2009 Apr 28
1
[LLVMdev] Register class intersection
...isters - it also holds information about spill size and alignment. Value types are no longer interesting once the selection DAG has been destroyed. X86 has the weird examples as usual: Classes RFP32, RFP64, and RFP80 are identical (FP0-6) except for the spill size. The same goes for FR64 and VR128 (XMM0-15). The coalescer will join these classes as follows: RFP32 + RFP64 -> RFP64 FR64 + VR128 -> VR128 This seems perfectly reasonable - choose the larger spill size and avoid losing data. TableGen thinks these classes are unrelated - it currently defines register subclasses as fol...
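The joining rule described in the result above (same registers, keep the larger spill size) can be sketched in a few lines of Python. The `RegClass` type and `join` function are hypothetical, not LLVM's data structures. The 64-bit and 128-bit spill sizes for FR64 and VR128 match what the list posts state; the 32-bit and 64-bit sizes for RFP32 and RFP64 are assumed from the class names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegClass:
    name: str
    regs: frozenset
    spill_bits: int


def join(a, b):
    """Join two classes over the same registers, keeping the larger spill size."""
    if a.regs != b.regs:
        raise ValueError("sketch only handles classes with identical register sets")
    return a if a.spill_bits >= b.spill_bits else b


xmm = frozenset(f"XMM{i}" for i in range(16))  # XMM0-15
fp = frozenset(f"FP{i}" for i in range(7))     # FP0-6

FR64 = RegClass("FR64", xmm, 64)
VR128 = RegClass("VR128", xmm, 128)
RFP32 = RegClass("RFP32", fp, 32)   # spill size assumed from the name
RFP64 = RegClass("RFP64", fp, 64)   # spill size assumed from the name

assert join(FR64, VR128).name == "VR128"
assert join(RFP32, RFP64).name == "RFP64"
```

Choosing the larger spill size means a value that lived in VR128 is never spilled with a store that drops its upper bits.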
2012 Jul 10
0
[LLVMdev] question on table gen TIED_TO constraint
On Jul 9, 2012, at 4:15 PM, Manman Ren <mren at apple.com> wrote: > > I need to implement an instruction which has 2 read-write registers, so I added > let Constraints = "$src1 = $dst, $mask = $mask_wb" in { > ... > def rm : AVX28I<opc, MRMSrcMem, (outs VR128:$dst, VR128:$mask_wb), > (ins VR128:$src1, v128mem:$src2, VR128:$mask), > ... > } > There is a problem since MRMSrcMem assumes the 2nd physical operand is a memory operand. > See the section about MRMSrcMem in RecognizableInstr::emitInstructionSpecifier. Can this be fixed...
2012 Jul 10
2
[LLVMdev] question on table gen TIED_TO constraint
...at 4:15 PM, Manman Ren <mren at apple.com> wrote: > >> >> I need to implement an instruction which has 2 read-write registers, so I added >> let Constraints = "$src1 = $dst, $mask = $mask_wb" in { >> ... >> def rm : AVX28I<opc, MRMSrcMem, (outs VR128:$dst, VR128:$mask_wb), >> (ins VR128:$src1, v128mem:$src2, VR128:$mask), >> ... >> } >> There is a problem since MRMSrcMem assumes the 2nd physical operand is a memory operand. >> See the section about MRMSrcMem in RecognizableInstr::emitInstructionSpecifier....