thr3ads.net - search: "v2f64"

Displaying 20 results from an estimated 96 matches for "v2f64".

[LLVMdev] Making Sense of ISel DAG Output

2008 Oct 02

On Thursday 02 October 2008 11:37, David Greene wrote: > I'll try to write a small example and send it in a bit. Ok, here's what I'm trying to do: let AddedComplexity = 40 in { def : Pat<(v2f64 (vector_shuffle (v2f64 (scalar_to_vector (loadf64 addr:$src1))), (v2f64 (scalar_to_vector (loadf64 addr:$src2))), SHUFP_shuffle_mask:$sm)), (SHUFPDrri (v2f64 (MOVSD2PDrm addr:$src1)), (v2f64 (MOVSD2...

[LLVMdev] problem compiling x86 intrinsic function

2009 Dec 29

I am trying to compile this little C-program: ================= typedef double v2f64 __attribute__((ext_vector_type(2))); int sse2_cmp_sd(v2f64, v2f64, char ) asm("llvm.x86.sse2.cmp.sd"); int main() { static int i; static float x[10]; static float y[10]; v2f64 m1; v2f64 m2; int j; i = sse2_cmp_sd(m1,m2,'z'); ========================== I ex...

[LLVMdev] Patterns with Multiple Stores

2008 Nov 17

On Monday 17 November 2008 14:28, David Greene wrote: > I want to write a pattern that looks something like this: > > def : Pat<(unalignedstore (v2f64 VR128:$src), addr:$dst), > (MOVSDmr ADD64ri8(addr:$dst, imm:8), ( SHUFPDrri (VR128:$src, > (MOVSDmr addr:$dst, FR64:$src))), imm:3) > > So I want to convert an unaligned vector store to a scalar store, a shuffle > and a scalar store. I got a little further with t...

[LLVMdev] Patterns with Multiple Stores

2008 Nov 17

I want to write a pattern that looks something like this: def : Pat<(unalignedstore (v2f64 VR128:$src), addr:$dst), (MOVSDmr ADD64ri8(addr:$dst, imm:8), ( SHUFPDrri (VR128:$src, (MOVSDmr addr:$dst, FR64:$src))), imm:3) So I want to convert an unaligned vector store to a scalar store, a shuffle and a scalar store. There are several questions I have: - Is the imm:3...

[LLVMdev] Legal action type for BUILD_VECTOR

2011 Sep 30

...g-point operations (IBM's double-hummer instruction set used on the BG/P supercomputers). In this instruction set, each of the existing floating-point registers becomes the first of two vector elements. I am having trouble optimizing the BUILD_VECTOR operation for the case where I am building a v2f64 vector out of two operands. I tried writing this pattern as: def : Pat<(v2f64 (build_vector F8RC:$A, F8RC:$B)), (FSMFP (INSERT_SUBREG (v2f64 (IMPLICIT_DEF)), F8RC:$A, sub_64), (INSERT_SUBREG (v2f64 (IMPLICIT_DEF)), F8RC:$B, sub_64))>; Where the FSMFP instruction c...

[LLVMdev] Making Sense of ISel DAG Output

2008 Oct 02

I'm debugging some X86 patterns and I want to understand the debug dumps from isel better. Here's some example output: 0x391bc40: i64,ch = load 0x3922c50, 0x391b8d0, 0x38dc530 <0x39053e0:0> <sext i32> alignment=4 srcLineNum= 10 0x3922c50: <multiple use> 0x391bc40: <multiple use> 0x3856ab0: <multiple use> 0x3914520: i64 =

[LLVMdev] Making Sense of ISel DAG Output

2008 Oct 07

...ring selection. Some parts of selection know how to clean up > nodes that become dead during selection, but my guess is that > it's missing some cases. Ok, as far as I can tell, here's what's happening. I have the following pattern: let AddedComplexity = 40 in { def : Pat<(v2f64 (vector_shuffle (v2f64 (scalar_to_vector (loadf64 addr:$src1))), (v2f64 (scalar_to_vector (loadf64 addr:$src2))), SHUFP_shuffle_mask:$sm)), (SHUFPDrri (v2f64 (MOVSD2PDrm addr:$src1)), (v2f64...

[LLVMdev] problem compiling x86 intrinsic function

2009 Dec 29

On Dec 29, 2009, at 5:50 AM, fima rabin wrote: > I am trying to compile this little C-program: > ================= > typedef double v2f64 __attribute__((ext_vector_type(2))); > > int sse2_cmp_sd(v2f64, v2f64, char ) asm("llvm.x86.sse2.cmp.sd"); We used to support this, but there are problems with it. I actually just went to go implement this again, which illustrated why this is a bad idea. If you apply this patc...

[LLVMdev] predicates vs. requirements [TableGen, X86InstrInfo.td]

2014 Sep 18

I tried to add an 'OptForSize' requirement to a pattern in X86InstrSSE.td, but it appears to be ignored. However, the condition was detected when specified as a predicate. So this doesn't work: def : Pat<(v2f64 (X86VBroadcast (loadf64 addr:$src))), (VMOVDDUPrm addr:$src)>, Requires<[OptForSize]>; But this does: let Predicates = [OptForSize] in { def : Pat<(v2f64 (X86VBroadcast (loadf64 addr:$src))), (VMOVDDUPrm addr:$src)>; } I see both forms used on so...

[LLVMdev] Patterns with Multiple Stores

2008 Nov 18

On Nov 17, 2008, at 3:50 PM, David Greene wrote: > On Monday 17 November 2008 14:28, David Greene wrote: >> I want to write a pattern that looks something like this: >> >> def : Pat<(unalignedstore (v2f64 VR128:$src), addr:$dst), >> (MOVSDmr ADD64ri8(addr:$dst, imm:8), ( SHUFPDrri >> (VR128:$src, >> (MOVSDmr addr:$dst, FR64:$src))), imm:3) >> >> So I want to convert an unaligned vector store to a scalar store, a >> shuffle >> and a scal...

[LLVMdev] Opinions Wanted: New asm Comments

2011 Jul 11

I have a patch I'd like to commit that adds commentary to asm files about which TableGen pattern generated a particular instruction. The output looks like this: cvtpd2ps %xmm0, %xmm0 # source.c:39 # Src: (intrinsic_wo_chain:v4f32 927:iPTR, VR128:v2f64:$src) # Dst: (Int_CVTPD2PSrr:v4f32 VR128:v2f64:$src) This is enormously helpful when trying to track down codegen bugs but clutters the asm file pretty badly for "ordinary" users. Right now I have this under control of a separate -asm-pattern fla...

[LLVMdev] Making Sense of ISel DAG Output

2008 Oct 07

...know how to clean up >> nodes that become dead during selection, but my guess is that >> it's missing some cases. > > Ok, as far as I can tell, here's what's happening. > > I have the following pattern: > > let AddedComplexity = 40 in { > def : Pat<(v2f64 (vector_shuffle (v2f64 (scalar_to_vector (loadf64 > addr: > $src1))), > (v2f64 (scalar_to_vector (loadf64 > addr: > $src2))), > SHUFP_shuffle_mask:$sm)), > (SHUFPDrri (v2f64 (MOVSD2PDrm addr:$src...

[LLVMdev] predicates vs. requirements [TableGen, X86InstrInfo.td]

2014 Sep 19

...jay Patel wrote: > > I tried to add an 'OptForSize' requirement to a pattern in X86InstrSSE.td, > > but it appears to be ignored. However, the condition was detected when > > specified as a predicate. > > > > So this doesn't work: > > def : Pat<(v2f64 (X86VBroadcast (loadf64 addr:$src))), (VMOVDDUPrm > addr: > > $src)>, > > Requires<[OptForSize]>; > > > > But this does: > > let Predicates = [OptForSize] in { > > def : Pat<(v2f64 (X86VBroadcast (loadf64 addr:$src)))...

[LLVMdev] X86 sub_ss and sub_sd sub-register indexes

2012 Jul 26

...> When the register coalescer removes a copy between VR128 and FR64 > registers, it chooses the larger spill size for the result. This is > the same for sub-register copies and full register copies. So if I understand this correctly, a pattern like this: def : Pat<(f64 (vector_extract (v2f64 VR128:$src), (iPTR 0))), (f64 (EXTRACT_SUBREG (v2f64 VR128:$src), sub_sd))>; will currently use a 128-bit store if it is spilled? That's really not good. If the 128-bit register is not ever used as a 128-bit register, shouldn't the coalescer pick the 64- or 32-bit register...

[LLVMdev] Making Sense of ISel DAG Output

2008 Oct 03

On Fri, October 3, 2008 9:10 am, David Greene wrote: > On Thursday 02 October 2008 19:32, Dan Gohman wrote: > >> Looking at your dump() output above, it looks like the pre-selection >> loads have multiple uses, so even though you've managed to match a >> larger pattern that incorporates them, they still need to exist to >> satisfy some other users. > > Yes,

[LLVMdev] SelectionDAG legality: isel creating cycles

2010 Feb 22

I've run into a situation in isel where it seems like the selector is generating a cycle in the DAG. I have something like this: 0x215f140: v2f64 = llvm.x86.sse2.min.sd 0x215efd0, 0x21606d0, 0x215eb80 [0] 0x215efd0: i64 = Constant <647> [0] 0x21606d0: v2f64 = scalar_to_vector 0x213b8f0 [0] 0x213b8f0: f64,ch = load 0x213b780, 0x213aa90, 0x213b610 <0x2113690:0> alignment=8 [0] 0x213b780: ch = Prefetch 0x213aee0:1, 0x213b1c0,...

[LLVMdev] Making Sense of ISel DAG Output

2008 Oct 03

On Thursday 02 October 2008 19:32, Dan Gohman wrote: > Looking at your dump() output above, it looks like the pre-selection > loads have multiple uses, so even though you've managed to match a > larger pattern that incorporates them, they still need to exist to > satisfy some other users. Yes, I looked at that too. It looks like these other uses end up being chains to

[LLVMdev] Making Sense of ISel DAG Output

2008 Oct 02

On Thursday 02 October 2008 12:42, David Greene wrote: > But let's say you _could_ write such a pattern (because I can). The input > DAG looks like this: > > 0x391a220: <multiple use> > 0x391c970: v2f64 = scalar_to_vector 0x391a220 srcLineNum= 10 > 0x391ac10: <multiple use> > 0x391c8b0: v2f64 = scalar_to_vector 0x391ac10 srcLineNum= 10 > 0x3927b10: <multiple use> > 0x3923100: v2f64 = vector_shuffle 0x391c970, 0x391c8b0, > 0x3927b10...

[LLVMdev] Instruction pattern type inference problem

2007 Apr 23

Digging deeper... 1. Is there a good reason that v2f32 types are excluded from the isFloatingPoint filter? Looks like a bug to me. v2f32 = 22, // 2 x f32 v4f32 = 23, // 4 x f32 <== start ?? v2f64 = 24, // 2 x f64 <== end static inline bool isFloatingPoint(ValueType VT) { return (VT >= f32 && VT <= f128) || (VT >= v4f32 && VT <= v2f64); } 2. My problem seems to stem from what appears to be under-constrained typing of patterns. With v...
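
The range check quoted in point 1 can be tried in isolation. The following is only a sketch of what the post describes, not LLVM's actual header: the three vector values (22, 23, 24) are taken from the snippet, while the scalar values for f32 through f128 and the second function, isFloatingPointWithV2F32, are assumptions made up for illustration.

```cpp
#include <cassert>

// Hypothetical miniature of the ValueType enum. Only the three vector
// values come from the post; the scalar values are assumed placeholders.
enum ValueType {
  f32 = 7,    // assumed value
  f64 = 8,    // assumed value
  f80 = 9,    // assumed value
  f128 = 10,  // assumed value
  v2f32 = 22, // 2 x f32 (from the post)
  v4f32 = 23, // 4 x f32 (from the post)
  v2f64 = 24  // 2 x f64 (from the post)
};

// The check as quoted in the post: the vector range starts at v4f32,
// so v2f32 (22) falls just below it.
constexpr bool isFloatingPointAsPosted(ValueType VT) {
  return (VT >= f32 && VT <= f128) || (VT >= v4f32 && VT <= v2f64);
}

// Hypothetical variant with the lower bound moved to v2f32. Whether this
// is the right fix is exactly the question the post raises.
constexpr bool isFloatingPointWithV2F32(ValueType VT) {
  return (VT >= f32 && VT <= f128) || (VT >= v2f32 && VT <= v2f64);
}

// Compile-time checks of the behaviour described in the post.
static_assert(!isFloatingPointAsPosted(v2f32), "v2f32 is excluded as posted");
static_assert(isFloatingPointAsPosted(v4f32), "v4f32 is included as posted");
static_assert(isFloatingPointWithV2F32(v2f32), "widened range includes v2f32");
```

With these assumed values, the quoted filter accepts v4f32 and v2f64 but rejects v2f32, which matches the exclusion the poster is asking about.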

[LLVMdev] X86 sub_ss and sub_sd sub-register indexes

2012 Jul 26

On Jul 26, 2012, at 9:43 AM, dag at cray.com wrote: > Jakob Stoklund Olesen <jolesen at apple.com> writes: > >> As far as I can tell, all sub-register operations involving sub_ss and >> sub_sd can simply be replaced with COPY_TO_REGCLASS: >> >> def : Pat<(v4i32 (X86Movsd VR128:$src1, VR128:$src2)), >> (VMOVSDrr VR128:$src1,
