thr3ads.net - search: "vector

2013 Jul 23

3

[LLVMdev] Vector DAG Patterns

...here elements of a vector are anded/ored together. My approach thus far has been to extract the sub elements of the vector and and/or those elements. This is OK for vectors of four i32s, but becomes cumbersome for v16i8s. Example instruction: andx $dst $v1 Pattern: [(set RC:$dst, (and (i32 (vector_extract(vt VC:$src), 0 ) ), (and (i32 (vector_extract(vt VC:$src), 1 ) ), (and (i32 (vector_extract(vt VC:$src), 2 ) ), (i32 (vector_extract(vt VC:$src), 3 ) ) ) ) ) )] Is there a better way to do this? Regards --- Conor Mac Aoidh
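
The nesting in the pattern above is a right fold over the element indices, so the text can be generated rather than typed by hand. The following is a minimal Python sketch under that assumption; `nested_reduce_pattern` and `andx_pattern` are hypothetical helper names, not part of LLVM, and the sketch only addresses the verbosity of writing the pattern. It says nothing about whether tblgen accepts the result or whether instruction selection matches it.

```python
def nested_reduce_pattern(op, elt_ty, num_elts, vec="vt VC:$src"):
    """Build the nested (op e0, (op e1, ... e_{n-1})) pattern as text.

    op, elt_ty and vec are pasted in verbatim; all names are illustrative.
    """
    if num_elts < 1:
        raise ValueError("num_elts must be at least 1")

    def extract(i):
        # One scalar element pulled out of the vector operand.
        return f"({elt_ty} (vector_extract ({vec}), {i}))"

    # Right fold: start from the last element and wrap outwards.
    expr = extract(num_elts - 1)
    for i in range(num_elts - 2, -1, -1):
        expr = f"({op} {extract(i)}, {expr})"
    return expr


def andx_pattern(elt_ty, num_elts):
    """Wrap the reduction in the [(set RC:$dst, ...)] form used in the post."""
    return f"[(set RC:$dst, {nested_reduce_pattern('and', elt_ty, num_elts)})]"


if __name__ == "__main__":
    # Prints the 16-element i8 variant of the andx pattern.
    print(andx_pattern("i8", 16))
```

Calling `andx_pattern("i32", 4)` reproduces the shape of the four-element pattern quoted above, with normalized spacing.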

[LLVMdev] Vector DAG Patterns

2013 Jul 26

0

To elaborate, it is not only cumbersome writing these patterns for vectors of 16 characters (v16i8), it does not work. When I compile with this pattern for an andx operation on v16i8: [(set RC:$dst, (and (i8 (vector_extract(vt VC:$src), 0 ) ), (and (i8 (vector_extract(vt VC:$src), 1 ) ), (and (i8 (vector_extract(vt VC:$src), 2 ) ), (and (i8 (vector_extract(vt VC:$src), 3 ) ), (and (i8 (vector_extract(vt VC:$src), 4 ) ), (and (i8 (vector_extract(vt VC:$sr...

sum elements in the vector

2016 May 18

3

...class HORIZ_Op4<SDNode opc, RegisterClass regVT, ValueType rt, ValueType vt, string asmstr> : SHAVE_Instr<(outs regVT:$dst), (ins VRF128:$src), !strconcat(asmstr, " $dst $src"), [(set regVT:$dst, (opc (rt (vector_extract(vt VRF128:$src), 0 ) ), (opc (rt (vector_extract(vt VRF128:$src), 1 ) ), (opc (rt (vector_extract(vt VRF128:$src), 2 ) ), (rt (vector_extract(vt VRF128:$src), 3 ) ) )...

[LLVMdev] Multi-Instruction Patterns

2008 Sep 24

0

...r-register / sub-register relationship. > > I'm not seeing how this is "conceptually correct." It's a vector > extract, not > a subregister. It's just that we want to reuse the same register. It is though. Sub-register is a machine specific concept. It means vector_extract can be modeled as subreg_extract on this machine. Nothing is wrong with that. > > > Perhaps the answer is to add vector extract support to the > coalescer, in > the same way you added subregister support. I don't understand the > nitty > gritty of that, though. I...

[LLVMdev] Bogus X86-64 Patterns

2007 Dec 12

2

...[(set VR128:$dst, (v2i64 (scalar_to_vector (loadi64 addr:$src))))]>; def MOVPQIto64mr : RPDI<0x7E, MRMDestMem, (outs), (ins i64mem:$dst, VR128:$src), "mov{d|q}\t{$src, $dst|$dst, $src}", [(store (i64 (vector_extract (v2i64 VR128:$src), (iPTR 0))), addr:$dst)]>; These say that for an AT&T-style assembler, output movd and for an Intel assembler, output movq. The problem is that such movs to and from memory must either use movq or put a rex prefix before the movd....

[LLVMdev] Multi-Instruction Patterns

2008 Sep 24

2

On Wednesday 24 September 2008 02:10, Evan Cheng wrote: > > I wrote a pattern that looks something like the above in form, but how > > do I tell the selection DAG to prefer my pattern over another that > > already exists. I can't easily just disable that other pattern > > because > > it generates Machine Instruction opcode enums that are assumed to be > >

[LLVMdev] Question on tablegen

2009 May 06

2

Hello, I am trying to create a machine instruction for "extractelement". I want to translate r <- extractelement v, 0 to mov r, v.x I was looking at the dag I can use and I found vector_extract. The inputs for this SDNode are a register and an iPTR constant. With that, I need to create 4 separate def's to extract element 0, 1, 2, and 3 and translate to v.x, v.y, v.z, and v.w. I was wondering if I can use the dag's 2nd input as an index into a list of strings and form the assembly i...

Question on pattern matching extractelt

2019 Nov 28

2

Hi, I have an issue with pattern matching. I have the following SelectionDAG: t13: i32 = extract_vector_elt t2, Constant:i64<1> That I am trying to match with the following pattern: def : Pat<(extractelt (v4i16 SingleReg:$v), 1), (SRADd1 SingleReg:$v, (i64 16))>; But for some reason the pattern does not match. It seems to be due to the fact extract_vector_elt's result

[LLVMdev] Need Advice on AVX

2009 Nov 24

0

...> pattern explicitly? > > It depends what you want to do. But I guess you need to model > subreg access properly... Modeling subregisters isn't hard. Do you have some guidance as to when one method is preferable? I am leaning toward using the modifier since conceptually, a vector_extract of element zero on a v4i64 makes sense with AVX (so it is "legal"). You just have to emit the register name as "xmm" rather than "ymm." Why write an additional complicated pattern for this case? -Dave

[LLVMdev] Need Advice on AVX

2009 Nov 26

1

...>> It depends what you want to do. But I guess you need to model >> subreg access properly... > > Modeling subregisters isn't hard. Do you have some guidance as to when > one method is preferable? I am leaning toward using the modifier since > conceptually, a vector_extract of element zero on a v4i64 makes sense with AVX > (so it is "legal"). You just have to emit the register name as "xmm" rather > than "ymm." Why write an additional complicated pattern for this case? Please don't use asmprinter modifiers. I'm trying...

[LLVMdev] Need Advice on AVX

2009 Nov 24

2

Hello, David > How does ${dst:subreg32} work? This is just a modifier provided to the asmprinting code. Here, it seems, a 16-bit register is passed to the asmprinter, but it sees the modifier and grabs the 32-bit superreg. > Can one do the same for sources? Yes, this is just a modifier for printing, nothing more... > Is it preferable to use the source modifier or write an EXTRACT_SUBREG > pattern

[LLVMdev] X86 sub_ss and sub_sd sub-register indexes

2012 Jul 26

0

...stores. > > When the register coalescer removes a copy between VR128 and FR64 > registers, it chooses the larger spill size for the result. This is > the same for sub-register copies and full register copies. So if I understand this correctly, a pattern like this: def : Pat<(f64 (vector_extract (v2f64 VR128:$src), (iPTR 0))), (f64 (EXTRACT_SUBREG (v2f64 VR128:$src), sub_sd))>; will currently use a 128-bit store if it is spilled? That's really not good. If the 128-bit register is not ever used as a 128-bit register, shouldn't the coalescer pick the 64- or 32-bit r...

[LLVMdev] More AVX Advice Needed

2009 Dec 02

1

...o need for an IR intrinsic. > X86ISD::INSERTPS is an extra instruction for ISel; it's used inside > the custom lowering for INSERT_VECTOR_ELT and VECTOR_SHUFFLE. Yes, that's how I found out about it. :) Why not just use ISD::INSERT_VECTOR_ELT? And what's the difference between vector_extract and extractelt in TargetSelectionDAG.td? Ditto vector_insert vs. insertelt. -Dave

[LLVMdev] Question on tablegen

2009 May 06

0

.... Dan On May 5, 2009, at 10:23 PM, Manjunath Kudlur wrote: > Hello, > > I am trying to create a machine instruction for "extractelement". I > want to translate > r <- extractelement v, 0 > to > mov r, v.x > > I was looking at the dag I can use and I found vector_extract. The > inputs for this SDNode are a register and an iPTR constant. With that, > I need to create 4 separate def's to extract element 0, 1, 2, and 3 > and translate to v.x, v.y, v.z, and v.w. I was wondering if I can use > the dag's 2nd input as an index into a list of strings and...

[LLVMdev] Multi-Instruction Patterns

2008 Sep 24

1

...Evan Cheng wrote: > > I'm not seeing how this is "conceptually correct." It's a vector > > extract, not > > a subregister. It's just that we want to reuse the same register. > > It is though. Sub-register is a machine specific concept. It means > vector_extract can be modeled as subreg_extract on this machine. > Nothing is wrong with that. I didn't mean to imply anything was "wrong." It just strikes me as kind of strange, in a mind-warping kind of way. :) > > Perhaps the answer is to add vector extract support to the > >...

[LLVMdev] X86 sub_ss and sub_sd sub-register indexes

2012 Jul 26

2

On Jul 26, 2012, at 9:43 AM, dag at cray.com wrote: > Jakob Stoklund Olesen <jolesen at apple.com> writes: > >> As far as I can tell, all sub-register operations involving sub_ss and >> sub_sd can simply be replaced with COPY_TO_REGCLASS: >> >> def : Pat<(v4i32 (X86Movsd VR128:$src1, VR128:$src2)), >> (VMOVSDrr VR128:$src1,

[LLVMdev] Question on tablegen

2009 May 08

2

...njunath Kudlur wrote: > >> Hello, >> >> I am trying to create a machine instruction for "extractelement". I >> want to translate >> r <- extractelement v, 0 >> to >> mov r, v.x >> >> I was looking at the dag I can use and I found vector_extract. The >> inputs for this SDNode are a register and an iPTR constant. With that, >> I need to create 4 separate def's to extract element 0, 1, 2, and 3 >> and translate to v.x, v.y, v.z, and v.w. I was wondering if I can use >> the dag's 2nd input as an index into a lis...

[LLVMdev] Multi-Instruction Patterns

2008 Sep 24

3

...tions. I saw this in >> X86InstrSSE.td: >> >> // FIXME: may not be able to eliminate this movss with coalescing the >> src and >> // dest register classes are different. We really want to write this >> pattern >> // like this: >> // def : Pat<(f32 (vector_extract (v4f32 VR128:$src), (iPTR 0))), >> // (f32 FR32:$src)>; >> >> (this is actually a very useful and important pattern, I wish it was >> available!) > > Right. It would be nice to be able to eliminate the unnecessary > movss. It hasn't shown up on my...

[LLVMdev] Multi-Instruction Patterns

2008 Sep 24

2

...n as transforming from one DAG to another, not down to machine instructions. I saw this in X86InstrSSE.td: // FIXME: may not be able to eliminate this movss with coalescing the src and // dest register classes are different. We really want to write this pattern // like this: // def : Pat<(f32 (vector_extract (v4f32 VR128:$src), (iPTR 0))), // (f32 FR32:$src)>; (this is actually a very useful and important pattern, I wish it was available!) I had actually written my pattern in a similar style before I found this. When I tried to build, tblgen complained about the pattern being of an unkn...

[LLVMdev] Question on tablegen

2009 May 08

0

...njunath Kudlur wrote: > >> Hello, >> >> I am trying to create a machine instruction for "extractelement". I >> want to translate >> r <- extractelement v, 0 >> to >> mov r, v.x >> >> I was looking at the dag I can use and I found vector_extract. The >> inputs for this SDNode are a register and an iPTR constant. With that, >> I need to create 4 separate def's to extract element 0, 1, 2, and 3 >> and translate to v.x, v.y, v.z, and v.w. I was wondering if I can use >> the dag's 2nd input as an index into a lis...
