Displaying 20 results from an estimated 99 matches for "vr128".
2011 Sep 22
3
[LLVMdev] Patch to synthesize x86 hadd instructions; need help with the tablegen bits
...AVX 128-bit versions too?
I added the avx checks to the same file (in which case calling it
sse3-haddsub.ll is not so great).
> 4) Your tablegen modifications are totally fine, for the intrinsics just do:
>
> let Predicates = [HasSSE3] in {
> def : Pat<(int_x86_sse3_hadd_ps (v4f32 VR128:$src1), VR128:$src2),
> (HADDPSrr VR128:$src1, VR128:$src2)>;
> def : Pat<(int_x86_sse3_hadd_ps (v4f32 VR128:$src1), (memop addr:$src2)),
> (HADDPSrm VR128:$src1, addr:$src2)>;
> ...
>
> and
>
> let Predicates = [HasAVX] in {
> def : Pat<...
2012 Jul 26
2
[LLVMdev] X86 sub_ss and sub_sd sub-register indexes
...> XMM0
They are supposed to represent the 32-bit and 64-bit low parts of the xmm registers, but since we don't define explicit registers for those sub-registers, we are left with idempotent sub-register indexes.
We have three different register classes for the xmm registers: FR32, FR64, and VR128. The sub_ss and sub_sd indexes used to play a role in selecting the right register class, but not any longer. That is all derived from the instruction descriptions now.
As far as I can tell, all sub-register operations involving sub_ss and sub_sd can simply be replaced with COPY_TO_REGCLASS:
de...
2012 Jul 26
0
[LLVMdev] X86 sub_ss and sub_sd sub-register indexes
...onfused. Below you note that they are used in patterns, so they
are certainly mentioned more than just in the code above.
> As far as I can tell, all sub-register operations involving sub_ss and
> sub_sd can simply be replaced with COPY_TO_REGCLASS:
>
> def : Pat<(v4i32 (X86Movsd VR128:$src1, VR128:$src2)),
> (VMOVSDrr VR128:$src1, (EXTRACT_SUBREG (v4i32 VR128:$src2),
> sub_sd))>;
>
> Becomes:
>
> def : Pat<(v4i32 (X86Movsd VR128:$src1, VR128:$src2)),
> (VMOVSDrr VR128:$src1, (...
2012 Jul 26
2
[LLVMdev] X86 sub_ss and sub_sd sub-register indexes
..., at 9:43 AM, dag at cray.com wrote:
> Jakob Stoklund Olesen <jolesen at apple.com> writes:
>
>> As far as I can tell, all sub-register operations involving sub_ss and
>> sub_sd can simply be replaced with COPY_TO_REGCLASS:
>>
>> def : Pat<(v4i32 (X86Movsd VR128:$src1, VR128:$src2)),
>> (VMOVSDrr VR128:$src1, (EXTRACT_SUBREG (v4i32 VR128:$src2),
>> sub_sd))>;
>>
>> Becomes:
>>
>> def : Pat<(v4i32 (X86Movsd VR128:$src1, VR128:$src2)),
>>...
2011 Sep 21
0
[LLVMdev] Patch to synthesize x86 hadd instructions; need help with the tablegen bits
...rizontal.ll to sse3-haddsub.ll
3) Can you duplicate the testcase file to something like
avx-haddsub.ll, and check for the AVX 128-bit versions too?
4) Your tablegen modifications are totally fine, for the intrinsics just do:
let Predicates = [HasSSE3] in {
def : Pat<(int_x86_sse3_hadd_ps (v4f32 VR128:$src1), VR128:$src2),
(HADDPSrr VR128:$src1, VR128:$src2)>;
def : Pat<(int_x86_sse3_hadd_ps (v4f32 VR128:$src1), (memop addr:$src2)),
(HADDPSrm VR128:$src1, addr:$src2)>;
...
and
let Predicates = [HasAVX] in {
def : Pat<(int_x86_sse3_hadd_ps (v4f32 VR128:$src1), VR...
2011 Sep 22
0
[LLVMdev] Patch to synthesize x86 hadd instructions; need help with the tablegen bits
...AVX 128-bit versions too?
I added the avx checks to the same file (in which case calling it sse3-haddsub.ll is not so great).
> 4) Your tablegen modifications are totally fine, for the intrinsics just do:
>
> let Predicates = [HasSSE3] in {
> def : Pat<(int_x86_sse3_hadd_ps (v4f32 VR128:$src1), VR128:$src2),
> (HADDPSrr VR128:$src1, VR128:$src2)>;
> def : Pat<(int_x86_sse3_hadd_ps (v4f32 VR128:$src1), (memop addr:$src2)),
> (HADDPSrm VR128:$src1, addr:$src2)>;
> ...
>
> and
>
> let Predicates = [HasAVX] in {
> def : Pat<(int...
2011 Sep 21
2
[LLVMdev] Patch to synthesize x86 hadd instructions; need help with the tablegen bits
This patch synthesizes haddps/haddpd/hsubps/hsubpd instructions from floating
point additions and subtractions of appropriate vector shuffles. To do this I
introduced new x86 FHADD and FHSUB opcodes. These need to be wired up somehow
in the .td file to the appropriate instructions. Since I have no idea how
tablegen works I just hacked it in horribly. It works, but breaks support for
the hadd
2009 Mar 24
2
[LLVMdev] Reducing .td redundancy
...ins FR32:$src1, FR32:$src2),
!strconcat(OpcodeStr, "ps\t{$src2, $dst|$dst, $src2}"),
[(set FR32:$dst, (!SOME_CONCAT("x86f", OpNode) FR32:$src1, FR32:$src2))]>;
// Vector operation
def PSrr : PSI<opc, MRMSrcReg, (outs VR128:$dst), (ins VR128:$src1, VR128:$src2),
!strconcat(OpcodeStr, "ps\t{$src2, $dst|$dst, $src2}"),
[(set VR128:$dst, (v2i64 (OpNode VR128:$src1, VR128:$src2)))]>;
// Bitconverted vector operation
def PSrm : PSI<opc, MRMSrcMem,...
2012 Jul 09
2
[LLVMdev] question on table gen TIED_TO constraint
I need to implement an instruction which has 2 read-write registers, so I added
let Constraints = "$src1 = $dst, $mask = $mask_wb" in {
...
def rm : AVX28I<opc, MRMSrcMem, (outs VR128:$dst, VR128:$mask_wb),
(ins VR128:$src1, v128mem:$src2, VR128:$mask),
...
}
There is a problem since MRMSrcMem assumes the 2nd physical operand is a memory operand.
See the section about MRMSrcMem in RecognizableInstr::emitInstructionSpecifier.
And the above gives us $dst, $mask_wb, $sr...
2008 Nov 17
2
[LLVMdev] Patterns with Multiple Stores
I want to write a pattern that looks something like this:
def : Pat<(unalignedstore (v2f64 VR128:$src), addr:$dst),
(MOVSDmr ADD64ri8(addr:$dst, imm:8), ( SHUFPDrri (VR128:$src,
(MOVSDmr addr:$dst, FR64:$src))), imm:3)
So I want to convert an unaligned vector store to a scalar store, a shuffle
and a scalar store.
There are several questions I have:
- Is the imm:3 syntax...
2007 Dec 12
2
[LLVMdev] Bogus X86-64 Patterns
Tracking down a problem with one of our benchmark codes, we've discovered that
some of the patterns in X86InstrX86-64.td are wrong. Specifically:
def MOV64toPQIrm : RPDI<0x6E, MRMSrcMem, (outs VR128:$dst), (ins i64mem:$src),
"mov{d|q}\t{$src, $dst|$dst, $src}",
[(set VR128:$dst,
(v2i64 (scalar_to_vector (loadi64 addr:$src))))]>;
def MOVPQIto64mr : RPDI<0x7E, MRMDestMem, (outs), (ins i64mem:$dst, VR128:...
2008 Nov 17
0
[LLVMdev] Patterns with Multiple Stores
On Monday 17 November 2008 14:28, David Greene wrote:
> I want to write a pattern that looks something like this:
>
> def : Pat<(unalignedstore (v2f64 VR128:$src), addr:$dst),
> (MOVSDmr ADD64ri8(addr:$dst, imm:8), ( SHUFPDrri (VR128:$src,
> (MOVSDmr addr:$dst, FR64:$src))), imm:3)
>
> So I want to convert an unaligned vector store to a scalar store, a shuffle
> and a scalar store.
I got a little further with this:...
2012 Jul 26
0
[LLVMdev] X86 sub_ss and sub_sd sub-register indexes
...<jolesen at apple.com> writes:
>> What happens if the result of the above pattern using COPY_TO_REGCLASS
>> is spilled? Will we get a 64-bit store or a 128-bit store?
>
> This behavior isn't affected by the change. FR64 registers are spilled
> with 64-bit stores, and VR128 registers are spilled with 128-bit
> stores.
>
> When the register coalescer removes a copy between VR128 and FR64
> registers, it chooses the larger spill size for the result. This is
> the same for sub-register copies and full register copies.
So if I understand this correctly, a...
2010 Aug 04
2
[LLVMdev] x86 Vector Shuffle Patterns
...$src1, node:$src2), [{
return X86::isVPERM2F128Mask(cast<ShuffleVectorSDNode>(N));
}], SHUFFLE_get_vperm2f128_imm>;
I don't completely understand how the new system works. Take a
simple SHUFPS match:
def SHUFPSrri : PSIi8<0xC6, MRMSrcReg,
(outs VR128:$dst), (ins VR128:$src1,
VR128:$src2, i8imm:$src3),
"shufps\t{$src3, $src2, $dst|$dst, $src2, $src3}",
[(set VR128:$dst,
(v4f32 (shufp:$src3 VR128:$src1, VR128:$src2)))]>;
"...
2009 Apr 30
6
[LLVMdev] RFC: AVX Pattern Specification [LONG]
...(ins FR32:$src1, f32mem:$src2),
!strconcat(OpcodeStr, "ss\t{$src2, $dst|$dst, $src2}"),
[(set FR32:$dst, (OpNode FR32:$src1, (load addr:$src2)))]>;
// Vector operation, reg+reg.
def PSrr : PSI<opc, MRMSrcReg, (outs VR128:$dst),
(ins VR128:$src1, VR128:$src2),
!strconcat(OpcodeStr, "ps\t{$src2, $dst|$dst, $src2}"),
[(set VR128:$dst, (v4f32 (OpNode VR128:$src1,
VR128:$src2)))]> {
let isComm...
2009 Mar 24
0
[LLVMdev] Reducing .td redundancy
...!strconcat(OpcodeStr, "ps\t{$src2, $dst|$dst, $src2}"),
> [(set FR32:$dst, (!SOME_CONCAT("x86f", OpNode) FR32:$src1, FR32:$src2))]>;
>
> // Vector operation
> def PSrr : PSI<opc, MRMSrcReg, (outs VR128:$dst), (ins VR128:$src1, VR128:$src2),
> !strconcat(OpcodeStr, "ps\t{$src2, $dst|$dst, $src2}"),
> [(set VR128:$dst, (v2i64 (OpNode VR128:$src1, VR128:$src2)))]>;
>
> // Bitconverted vector oper...
2009 Nov 03
1
[LLVMdev] Pat<> & tlbgen
Can someone explain the magic behind the Pat<> construct and tblgen?
From X86InstrSSE.td:
def : Pat<(v4f32 (vector_shuffle VR128:$src, (undef), MOVDDUP_shuffle_mask)),
(MOVLHPSrr VR128:$src, VR128:$src)>, Requires<[HasSSE1]>;
Where's the code in tblgen to emit the matching code for this? I'm trying to
extend it so that Pat<> can be used as a general subclass for AVX:
class Base<dag pat,...
2009 Apr 28
1
[LLVMdev] Register class intersection
...isters - it also holds information about spill size and alignment.
Value types are no longer interesting once the selection DAG has been
destroyed. X86 has the weird examples as usual:
Classes RFP32, RFP64, and RFP80 are identical (FP0-6) except for the
spill size.
The same goes for FR64 and VR128 (XMM0-15).
The coalescer will join these classes as follows:
RFP32 + RFP64 -> RFP64
FR64 + VR128 -> VR128
This seems perfectly reasonable - choose the larger spill size and
avoid losing data.
TableGen thinks these classes are unrelated - it currently defines
register subclasses as fol...
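The joining rule quoted in this snippet (same registers, keep the larger spill size) can be illustrated with a small sketch. This is not LLVM's coalescer code: the `RegClass` struct and `joinClasses` function are hypothetical names invented for illustration, and the 32-bit and 64-bit spill sizes for RFP32 and RFP64 in the usage note are assumed from the class names.

```cpp
#include <string>

// Hypothetical, simplified stand-in for a register class. It models only
// the two properties the snippet discusses: which registers the class
// covers and how large a spill slot is.
struct RegClass {
    std::string name;    // e.g. "FR64" or "VR128"
    std::string regs;    // covered registers, e.g. "XMM0-15"
    unsigned spillBits;  // spill slot size in bits
};

// Join two classes that cover the same registers by keeping the one with
// the larger spill size, so no data is lost if the joined value is spilled.
// Precondition (not checked here): a.regs == b.regs.
inline const RegClass &joinClasses(const RegClass &a, const RegClass &b) {
    return a.spillBits >= b.spillBits ? a : b;
}
```

Under these assumptions, joining `{"FR64", "XMM0-15", 64}` with `{"VR128", "XMM0-15", 128}` selects VR128, and joining RFP32 with RFP64 selects RFP64, matching the two cases listed in the snippet.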
2012 Jul 10
0
[LLVMdev] question on table gen TIED_TO constraint
On Jul 9, 2012, at 4:15 PM, Manman Ren <mren at apple.com> wrote:
>
> I need to implement an instruction which has 2 read-write registers, so I added
> let Constraints = "$src1 = $dst, $mask = $mask_wb" in {
> ...
> def rm : AVX28I<opc, MRMSrcMem, (outs VR128:$dst, VR128:$mask_wb),
> (ins VR128:$src1, v128mem:$src2, VR128:$mask),
> ...
> }
> There is a problem since MRMSrcMem assumes the 2nd physical operand is a memory operand.
> See the section about MRMSrcMem in RecognizableInstr::emitInstructionSpecifier.
Can this be fixed...
2012 Jul 10
2
[LLVMdev] question on table gen TIED_TO constraint
...at 4:15 PM, Manman Ren <mren at apple.com> wrote:
>
>>
>> I need to implement an instruction which has 2 read-write registers, so I added
>> let Constraints = "$src1 = $dst, $mask = $mask_wb" in {
>> ...
>> def rm : AVX28I<opc, MRMSrcMem, (outs VR128:$dst, VR128:$mask_wb),
>> (ins VR128:$src1, v128mem:$src2, VR128:$mask),
>> ...
>> }
>> There is a problem since MRMSrcMem assumes the 2nd physical operand is a memory operand.
>> See the section about MRMSrcMem in RecognizableInstr::emitInstructionSpecifier....