thr3ads.net - search: "fr64"

Displaying 20 results from an estimated 32 matches for "fr64".

[LLVMdev] help with X86 DAG->DAG Instruction Selection

2013 Feb 08

...SP<imp-def,dead>, %EFLAGS<imp-def,dead>, %ESP<imp-use> ; line 1 %vreg187<def> = COPY %ESP; GR32:%vreg187 ; line 2 MOVSDmr %vreg187, 1, %noreg, 0, %noreg, %vreg36; mem:ST8[Stack] GR32:%vreg187 FR64:%vreg36 ; line 3 %vreg188<def> = MOV32rm %vreg112, 1, %noreg, 252, %noreg; mem:LD4[%108] GR32:%vreg188,%vreg112 %vreg189<def> = MOV32rm %vreg112, 1, %noreg, 256, %noreg; mem:LD4[%111] GR32:%vreg189,%vreg112 %vreg190<def> = MOVSDrm <fi#0>, 1, %noreg, 120, %nore...

[LLVMdev] help with X86 DAG->DAG Instruction Selection

2013 Feb 08

...ead>, %ESP<imp-use> ; line 1 > %vreg187<def> = COPY %ESP; GR32:%vreg187 ; line 2 > MOVSDmr %vreg187, 1, %noreg, 0, %noreg, %vreg36; mem:ST8[Stack] GR32:%vreg187 FR64:%vreg36 ; line 3 > %vreg188<def> = MOV32rm %vreg112, 1, %noreg, 252, %noreg; mem:LD4[%108] GR32:%vreg188,%vreg112 > %vreg189<def> = MOV32rm %vreg112, 1, %noreg, 256, %noreg; mem:LD4[%111] GR32:%vreg189,%vreg112 > %vreg190<def> = MOVSDrm <fi#0>, 1, %...

[LLVMdev] X86 sub_ss and sub_sd sub-register indexes

2012 Jul 26

...28:$src1, (EXTRACT_SUBREG (v4i32 VR128:$src2), >> sub_sd))>; >> >> Becomes: >> >> def : Pat<(v4i32 (X86Movsd VR128:$src1, VR128:$src2)), >> (VMOVSDrr VR128:$src1, (COPY_TO_REGCLASS VR128:$src2, FR64))>; > > A few questions: > > Will COPY_TO_REGCLASS actually generate a copy instruction or can > TableGen/isel fold it away? Both EXTRACT_SUBREG and COPY_TO_REGCLASS are emitted as COPY instructions by InstrEmitter. One as a sub-register copy, one as a full register copy. Both...

[LLVMdev] X86 sub_ss and sub_sd sub-register indexes

2012 Jul 26

...ub_ss) --> XMM0 They are supposed to represent the 32-bit and 64-bit low parts of the xmm registers, but since we don't define explicit registers for those sub-registers, we are left with idempotent sub-register indexes. We have three different register classes for the xmm registers: FR32, FR64, and VR128. The sub_ss and sub_sd indexes used to play a role in selecting the right register class, but not any longer. That is all derived from the instruction descriptions now. As far as I can tell, all sub-register operations involving sub_ss and sub_sd can simply be replaced with COPY_TO_REGC...

[LLVMdev] Register class intersection

2009 Apr 28

...of registers - it also holds information about spill size and alignment. Value types are no longer interesting once the selection DAG has been destroyed. X86 has the weird examples as usual: Classes RFP32, RFP64, and RFP80 are identical (FP0-6) except for the spill size. The same goes for FR64 and VR128 (XMM0-15). The coalescer will join these classes as follows: RFP32 + RFP64 -> RFP64 FR64 + VR128 -> VR128 This seems perfectly reasonable - choose the larger spill size and avoid losing data. TableGen thinks these classes are unrelated - it currently defines register subclas...
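The joining rule described in this snippet can be sketched in a few lines of Python. This is a toy model, not LLVM's coalescer: the `RegClass` type and `join_classes` helper are hypothetical names invented for illustration, and the spill sizes for RFP32 and RFP64 are assumed from the class names. Only the register ranges (FP0-6, XMM0-15) and the two join results come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegClass:
    """Toy register class: a name, a set of registers, and a spill size."""
    name: str
    regs: frozenset
    spill_bits: int  # assumed values, for illustration only


def join_classes(a, b):
    """Pick the class for a coalesced register.

    If both classes cover the same registers, keep the one with the
    larger spill size so no data is lost on a spill. Return None when
    the register sets differ (this sketch does not model that case).
    """
    if a.regs != b.regs:
        return None
    return a if a.spill_bits >= b.spill_bits else b


XMM = frozenset(f"XMM{i}" for i in range(16))  # XMM0-15
FP = frozenset(f"FP{i}" for i in range(7))     # FP0-6

FR64 = RegClass("FR64", XMM, 64)
VR128 = RegClass("VR128", XMM, 128)
RFP32 = RegClass("RFP32", FP, 32)
RFP64 = RegClass("RFP64", FP, 64)

print(join_classes(FR64, VR128).name)   # FR64 + VR128
print(join_classes(RFP32, RFP64).name)  # RFP32 + RFP64
```

The sketch compares register sets for equality only; a real implementation would also need to handle classes that merely overlap, which is the intersection question the thread title refers to.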

[LLVMdev] X86 sub_ss and sub_sd sub-register indexes

2012 Jul 26

...(VMOVSDrr VR128:$src1, (EXTRACT_SUBREG (v4i32 VR128:$src2), > sub_sd))>; > > Becomes: > > def : Pat<(v4i32 (X86Movsd VR128:$src1, VR128:$src2)), > (VMOVSDrr VR128:$src1, (COPY_TO_REGCLASS VR128:$src2, FR64))>; A few questions: Will COPY_TO_REGCLASS actually generate a copy instruction or can TableGen/isel fold it away? What happens if the result of the above pattern using COPY_TO_REGCLASS is spilled? Will we get a 64-bit store or a 128-bit store? -Dave

[LLVMdev] Bogus X86-64 Patterns

2007 Dec 12

...q {$src, $dst|$dst, $src}", + "mov{d|q} {$src, $dst|$dst, $src}", [(store (i64 (vector_extract (v2i64 VR128:$src), (iPTR 0))), addr:$dst)]>; def MOV64toSDrr : RPDI<0x6E, MRMSrcReg, (ops FR64:$dst, GR64:$src), - "movq {$src, $dst|$dst, $src}", + "mov{d|q} {$src, $dst|$dst, $src}", [(set FR64:$dst, (bitconvert GR64:$src))]>; def MOV64toSDrm : RPDI<0x6E, MRMSrcMem, (ops FR64:$dst, i64mem:$src),...

[LLVMdev] Reducing .td redundancy

2009 Mar 24

...SSrr_Int SSI<opc, MRMSrcReg, (outs FR32:$dst), (ins FR32:$src1, FR32: $src2), !strconcat(OpcodeStr, "ss\t{$src2, $dst|$dst, $src2}"), [(set FR32:$dst, (SOME_CONCAT(Intr, _ss) FR32:$src1, FR32: $src2))]> { def SDrr_Int SSI<opc, MRMSrcReg, (outs FR64:$dst), (ins FR64:$src1, FR64: $src2), !strconcat(OpcodeStr, "ss\t{$src2, $dst|$dst, $src2}"), [(set FR32:$dst, (SOME_CONCAT(Intr, _sd) FR32:$src1, FR32: $src2))]> { } defm ADD : myintrinsics<0x58, "add", int_x86_sse2_add>; I want the...

[LLVMdev] Patterns with Multiple Stores

2008 Nov 17

I want to write a pattern that looks something like this: def : Pat<(unalignedstore (v2f64 VR128:$src), addr:$dst), (MOVSDmr ADD64ri8(addr:$dst, imm:8), ( SHUFPDrri (VR128:$src, (MOVSDmr addr:$dst, FR64:$src))), imm:3) So I want to convert an unaligned vector store to a scalar store, a shuffle and a scalar store. I have several questions: - Is the imm:3 syntax correct? Basically I want to hard-code the shuffle mask. - The first MOVSD doesn't really "feed" the SHUFPD. H...

[LLVMdev] Patterns with Multiple Stores

2008 Nov 17

On Monday 17 November 2008 14:28, David Greene wrote: > I want to write a pattern that looks something like this: > > def : Pat<(unalignedstore (v2f64 VR128:$src), addr:$dst), > (MOVSDmr ADD64ri8(addr:$dst, imm:8), ( SHUFPDrri (VR128:$src, > (MOVSDmr addr:$dst, FR64:$src))), imm:3) > > So I want to convert an unaligned vector store to a scalar store, a shuffle > and a scalar store. I got a little further with this: def : Pat<(unalignedstore (v2f64 VR128:$src), addr:$dst), (MOVSDmr (ADD64ri8 (LEA64r addr:$dst), 8), (MOVPD2SDrr (SHUFPDrri...

2019 Oct 25

Hello, I have studied the theory of register allocation and am now exploring how it is implemented. I need a minimal test case for register spilling so that I can analyze the spilling procedure in LLVM. I tried a test case with 20 variables, but all 20 variables are stored on the stack using %rbp. Maybe my live-variable analysis is wrong. Please help me with a minimal test case

[LLVMdev] X86 sub_ss and sub_sd sub-register indexes

2012 Jul 26

Jakob Stoklund Olesen <jolesen at apple.com> writes: >> What happens if the result of the above pattern using COPY_TO_REGCLASS >> is spilled? Will we get a 64-bit store or a 128-bit store? > > This behavior isn't affected by the change. FR64 registers are spilled > with 64-bit stores, and VR128 registers are spilled with 128-bit > stores. > > When the register coalescer removes a copy between VR128 and FR64 > registers, it chooses the larger spill size for the result. This is > the same for sub-register copies and ful...

Missed optimization - spill/load generated instead of reg-to-reg move (and two other questions)

2018 Feb 28

On 02/27/2018 10:21 AM, Alex Wang via llvm-dev wrote: > Hello all! > > I was looking through the results of disassembling a heavily-used > short function > in the program I'm working on, and ended up wondering why LLVM was > generating > that assembly and what changes would be necessary to improve the code. > I asked > on #llvm, but it seems that the people with

Missed optimization - spill/load generated instead of reg-to-reg move (and two other questions)

2018 Feb 27

Hello all! I was looking through the results of disassembling a heavily-used short function in the program I'm working on, and ended up wondering why LLVM was generating that assembly and what changes would be necessary to improve the code. I asked on #llvm, but it seems that the people with the necessary expertise weren't around. Here is a condensed version of the code:

[LLVMdev] Reducing .td redundancy

2009 Mar 24

On Mar 23, 2009, at 5:56 PM, David Greene wrote: > Is it legal to do something like a !strconcat on a non-string > entity? That > is, is there some operation that will let me do this (replace > SOME_CONCAT with > an appropriate operator): I don't get it, can you try a simpler example on me? :) -Chris > > > (WARNING! Hacked-up tablegen ahead!) > >

[LLVMdev] RFC: AVX Pattern Specification [LONG]

2009 Apr 30

...t;$src1 = $dst" in { multiclass basic_sse2_fp_binop_rm<bits<8> opc, string OpcodeStr, SDNode OpNode, Intrinsic F64Int, bit Commutable = 0> { // Scalar operation, reg+reg. def SDrr : SDI<opc, MRMSrcReg, (outs FR64:$dst), (ins FR64:$src1, FR64:$src2), !strconcat(OpcodeStr, "sd\t{$src2, $dst|$dst, $src2}"), [(set FR64:$dst, (OpNode FR64:$src1, FR64:$src2))]> { let isCommutable = Commutable; } // Scalar operation, reg+mem....

[LLVMdev] X86 sub_ss and sub_sd sub-register indexes

2012 Jul 26

...ote: > Jakob Stoklund Olesen <jolesen at apple.com> writes: > >>> What happens if the result of the above pattern using COPY_TO_REGCLASS >>> is spilled? Will we get a 64-bit store or a 128-bit store? >> >> This behavior isn't affected by the change. FR64 registers are spilled >> with 64-bit stores, and VR128 registers are spilled with 128-bit >> stores. >> >> When the register coalescer removes a copy between VR128 and FR64 >> registers, it chooses the larger spill size for the result. This is >> the same for su...

[LLVMdev] Reducing .td redundancy

2009 Mar 24

Is it legal to do something like a !strconcat on a non-string entity? That is, is there some operation that will let me do this (replace SOME_CONCAT with an appropriate operator): (WARNING! Hacked-up tablegen ahead!) multiclass sse_fp_binop_bitwise_rm<bits<8> opc, string OpcodeStr, SDNode OpNode> { // Vector operation emulating scalar (fp)

Mischeduler: Unknown reason for peak register pressure increase

2017 Aug 12

I am working on a project where we are integrating an existing pre-RA scheduler into LLVM, and we are trying to match our peak register pressure values with the machine instruction scheduler's values on X86. I am finding some mismatches in test cases like the one attached. The registers "AH" and "AL" are live-out but not live-in and I don't see that they are defined

[LLVMdev] Patterns with Multiple Stores

2008 Nov 18

...:28, David Greene wrote: >> I want to write a pattern that looks something like this: >> >> def : Pat<(unalignedstore (v2f64 VR128:$src), addr:$dst), >> (MOVSDmr ADD64ri8(addr:$dst, imm:8), ( SHUFPDrri >> (VR128:$src, >> (MOVSDmr addr:$dst, FR64:$src))), imm:3) >> >> So I want to convert an unaligned vector store to a scalar store, a >> shuffle >> and a scalar store. > > I got a little further with this: > > def : Pat<(unalignedstore (v2f64 VR128:$src), addr:$dst), > (MOVSDmr (ADD64ri8 (...
