search for: g8rc

Displaying 20 results from an estimated 31 matches for "g8rc".

2017 Oct 13
2
Machine Scheduler on Power PC: Latency Limit and Register Pressure
...t of the function. Those loads use 12 registers before any of the divides are scheduled. As a result, we end up with significantly higher register pressure after all the loads. -- 0B BB#0: derived from LLVM BB %entry Live Ins: %X3 %X4 16B %vreg1<def> = COPY %X4; G8RC_and_G8RC_NOX0:%vreg1 32B %vreg0<def> = COPY %X3; G8RC_and_G8RC_NOX0:%vreg0 48B %vreg2<def> = LD 0, %vreg0; mem:LD8[%num](tbaa=!4) G8RC:%vreg2 G8RC_and_G8RC_NOX0:%vreg0 64B %vreg3<def> = LD 0, %vreg1; mem:LD8[%den](tbaa=!4) G8RC:%vreg3 G8RC_and...
2012 Jun 07
0
[LLVMdev] Instruction Cleanup Questions
...> > > > and the RA should eliminate trivial copies. On PPC, normal moves are encoded as OR instructions where the two operands being ORed together are the same. These self moves, as it turns out, come from things like this: %vreg18<def> = OR8To4 %vreg16, %vreg16; GPRC:%vreg18 G8RC:%vreg16 This is generated from the pattern: def : Pat<(i32 (trunc G8RC:$in)), (OR8To4 G8RC:$in, G8RC:$in)>; So, as far as RA is concerned, this is a "real" operation (a binary OR which truncates the result to 32-bits (from 64-bit inputs)). In effect, however, this is ju...
2017 Feb 09
2
Improving the split heuristics for the Greedy Register Allocator
...ber of tweaks to the costs > (both existing and experimental ones that I've added). However, despite all > of that, I can't seem to get RA to split the following: > > 1 BB#0: derived from LLVM BB %entry > 2 Live Ins: %X3 > 3 %vreg15<def> = COPY %X3; G8RC:%vreg15 > 4 %vreg4<def> = CMPLDI %vreg15, 0; CRRC:%vreg4 G8RC:%vreg15 > 5 %vreg11:sub_32<def,read-undef> = LI 0; G8RC:%vreg11 > 6 BCC 68, %vreg4, <BB#1>; CRRC:%vreg4 > 7 Successors according to CFG: BB#4(0x30000000 / 0x80000000 = 37....
2012 Jun 07
1
[LLVMdev] Instruction Cleanup Questions
...Finkel <hfinkel at anl.gov> wrote: > On PPC, normal moves are encoded as OR instructions where the two > operands being ORed together are the same. These self moves, as it > turns out, come from things like this: > > %vreg18<def> = OR8To4 %vreg16, %vreg16; GPRC:%vreg18 G8RC:%vreg16 > > This is generated from the pattern: > > def : Pat<(i32 (trunc G8RC:$in)), > (OR8To4 G8RC:$in, G8RC:$in)>; > > So, as far as RA is concerned, this is a "real" operation (a binary OR > which truncates the result to 32-bits (from 64-bit i...
2012 Jun 07
4
[LLVMdev] Instruction Cleanup Questions
Hi Hal, On 07/06/2012 09:57, Chandler Carruth wrote: > On Wed, Jun 6, 2012 at 10:37 PM, Hal Finkel <hfinkel at anl.gov > <mailto:hfinkel at anl.gov>> wrote: > > I am working on cleaning up some PPC code generation. Two questions: > > 1. Which pass is responsible for cleaning up self-moves: > 0x00000000100057c0 <+208>: mr r3,r3 > and
2017 Jan 13
2
Improving the split heuristics for the Greedy Register Allocator
...ve attempted a very large number of tweaks to the costs (both existing and experimental ones that I've added). However, despite all of that, I can't seem to get RA to split the following: 1 BB#0: derived from LLVM BB %entry 2 Live Ins: %X3 3 %vreg15<def> = COPY %X3; G8RC:%vreg15 4 %vreg4<def> = CMPLDI %vreg15, 0; CRRC:%vreg4 G8RC:%vreg15 5 %vreg11:sub_32<def,read-undef> = LI 0; G8RC:%vreg11 6 BCC 68, %vreg4, <BB#1>; CRRC:%vreg4 7 Successors according to CFG: BB#4(0x30000000 / 0x80000000 = 37.50%) BB#1(0x50000000...
2017 Oct 13
3
Machine Scheduler on Power PC: Latency Limit and Register Pressure
...12 registers before any of the divides are scheduled. As a result, we end up with significantly higher register pressure after all the loads. >> -- >> 0B BB#0: derived from LLVM BB %entry >> Live Ins: %X3 %X4 >> 16B %vreg1<def> = COPY %X4; G8RC_and_G8RC_NOX0:%vreg1 >> 32B %vreg0<def> = COPY %X3; G8RC_and_G8RC_NOX0:%vreg0 >> 48B %vreg2<def> = LD 0, %vreg0; mem:LD8[%num](tbaa=!4) G8RC:%vreg2 G8RC_and_G8RC_NOX0:%vreg0 >> 64B %vreg3<def> = LD 0, %vreg1; mem:LD8[%den](tbaa...
2008 Jul 10
2
[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
...st cast both values to const TargetRegisterClass*. > > Evan > > On Jul 10, 2008, at 7:36 AM, Gary Benson wrote: > > Evan Cheng wrote: > > > How about? > > > > > > const TargetRegisterClass *RC = is64Bit ? &PPC:GPRCRegClass : > > > &PPC:G8RCRegClass; > > > unsigned TmpReg = RegInfo.createVirtualRegister(RC); > > > > I tried something like that yesterday: > > > > const TargetRegisterClass *RC = > > is64bit ? &PPC::GPRCRegClass : &PPC::G8RCRegClass; > > > > but I kept getting...
2008 Jul 08
0
[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
PPCTargetLowering::EmitInstrWithCustomInserter has a reference to the current MachineFunction for other purposes. Can you use MachineFunction::getRegInfo instead? Dan On Jul 8, 2008, at 1:56 PM, Gary Benson wrote: > Would it be acceptable to change MachineInstr::getRegInfo from private > to public so I can use it from > PPCTargetLowering::EmitInstrWithCustomInserter? > >
2008 Jul 11
2
[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
...============== --- lib/Target/PowerPC/PPCInstr64Bit.td (revision 53464) +++ lib/Target/PowerPC/PPCInstr64Bit.td (working copy) @@ -116,23 +116,34 @@ def : Pat<(PPCcall_ELF (i64 texternalsym:$dst)), (BL8_ELF texternalsym:$dst)>; -// Atomic operations. -def LDARX : Pseudo<(outs G8RC:$rD), (ins memrr:$ptr, i32imm:$label), - "\nLa${label}_entry:\n\tldarx $rD, $ptr", - [(set G8RC:$rD, (PPClarx xoaddr:$ptr, imm:$label))]>; +// Atomic operations +let usesCustomDAGSchedInserter = 1 in { + let Uses = [CR0] in { + def ATOMIC_LOAD_AD...
2008 Jul 11
0
[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
...nst TargetRegisterClass*. >> >> Evan >> >> On Jul 10, 2008, at 7:36 AM, Gary Benson wrote: >>> Evan Cheng wrote: >>>> How about? >>>> >>>> const TargetRegisterClass *RC = is64Bit ? &PPC:GPRCRegClass : >>>> &PPC:G8RCRegClass; >>>> unsigned TmpReg = RegInfo.createVirtualRegister(RC); >>> >>> I tried something like that yesterday: >>> >>> const TargetRegisterClass *RC = >>> is64bit ? &PPC::GPRCRegClass : &PPC::G8RCRegClass; >>> >>&g...
2008 Jul 10
0
[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
Just cast both values to const TargetRegisterClass*. Evan On Jul 10, 2008, at 7:36 AM, Gary Benson wrote: > Evan Cheng wrote: >> How about? >> >> const TargetRegisterClass *RC = is64Bit ? &PPC:GPRCRegClass : >> &PPC:G8RCRegClass; >> unsigned TmpReg = RegInfo.createVirtualRegister(RC); > > I tried something like that yesterday: > > const TargetRegisterClass *RC = > is64bit ? &PPC::GPRCRegClass : &PPC::G8RCRegClass; > > but I kept getting this error no matter how I arranged it:...
2008 Jul 10
2
[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
Evan Cheng wrote: > How about? > > const TargetRegisterClass *RC = is64Bit ? &PPC:GPRCRegClass : > &PPC:G8RCRegClass; > unsigned TmpReg = RegInfo.createVirtualRegister(RC); I tried something like that yesterday: const TargetRegisterClass *RC = is64bit ? &PPC::GPRCRegClass : &PPC::G8RCRegClass; but I kept getting this error no matter how I arranged it: error: conditional expression b...
2012 Jun 07
1
[LLVMdev] Instruction Cleanup Questions
...el <hfinkel at anl.gov> wrote: > > On PPC, normal moves are encoded as OR instructions where the two > operands being ORed together are the same. These self moves, as it > turns out, come from things like this: > > %vreg18<def> = OR8To4 %vreg16, %vreg16; GPRC:%vreg18 G8RC:%vreg16 > > This is generated from the pattern: > > def : Pat<(i32 (trunc G8RC:$in)), > (OR8To4 G8RC:$in, G8RC:$in)>; > > So, as far as RA is concerned, this is a "real" operation (a binary OR > which truncates the result to 32-bits (from 64-bit i...
2008 Jun 30
0
[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
You need to insert new basic blocks and update CFG to accomplish this. There is a hackish way to do this right now. Add a pseudo instruction to represent this operation and mark it usesCustomDAGSchedInserter. This means the intrinsic is mapped to a single (pseudo) node. But it is then expanded into instructions that can span multiple basic blocks. See
2008 Jul 09
2
[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
...============== --- lib/Target/PowerPC/PPCInstr64Bit.td (revision 52957) +++ lib/Target/PowerPC/PPCInstr64Bit.td (working copy) @@ -116,23 +116,34 @@ def : Pat<(PPCcall_ELF (i64 texternalsym:$dst)), (BL8_ELF texternalsym:$dst)>; -// Atomic operations. -def LDARX : Pseudo<(outs G8RC:$rD), (ins memrr:$ptr, i32imm:$label), - "\nLa${label}_entry:\n\tldarx $rD, $ptr", - [(set G8RC:$rD, (PPClarx xoaddr:$ptr, imm:$label))]>; +// Atomic operations +let usesCustomDAGSchedInserter = 1 in { + let Uses = [CR0] in { + def ATOMIC_LOAD_AD...
2008 Jul 08
2
[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
Would it be acceptable to change MachineInstr::getRegInfo from private to public so I can use it from PPCTargetLowering::EmitInstrWithCustomInserter? Cheers, Gary Evan Cheng wrote: > Look for createVirtualRegister. These are examples in > PPCISelLowering.cpp. > > Evan > On Jul 8, 2008, at 8:24 AM, Gary Benson wrote: > > > Hi Evan, > > > > Evan Cheng wrote:
2008 Jun 30
2
[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
Chris Lattner wrote: > On Jun 27, 2008, at 8:27 AM, Gary Benson wrote: > > def CMP_UNRESw : Pseudo<(outs), (ins GPRC:$rA, GPRC:$rB, i32imm: > > $label), > > "cmpw $rA, $rB\n\tbne- La${label}_exit", > > [(PPCcmp_unres GPRC:$rA, GPRC:$rB, imm: > > $label)]>; > > } > > > > ...and
2008 Jul 02
2
[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
...============== --- lib/Target/PowerPC/PPCInstr64Bit.td (revision 52957) +++ lib/Target/PowerPC/PPCInstr64Bit.td (working copy) @@ -116,23 +116,35 @@ def : Pat<(PPCcall_ELF (i64 texternalsym:$dst)), (BL8_ELF texternalsym:$dst)>; -// Atomic operations. -def LDARX : Pseudo<(outs G8RC:$rD), (ins memrr:$ptr, i32imm:$label), - "\nLa${label}_entry:\n\tldarx $rD, $ptr", - [(set G8RC:$rD, (PPClarx xoaddr:$ptr, imm:$label))]>; +// Atomic operations +let usesCustomDAGSchedInserter = 1 in { + let Uses = [CR0] in { + let Uses = [R0] in...
2012 Jun 08
2
[LLVMdev] Strong vs. default phi elimination and single-reg classes
...pies are introduced (which I don't completely understand), and the register allocator tries to spill the count register. For example, with strong-phi elimination, I get (as a simple example): BB#0: derived from LLVM BB %entry Live Ins: %X3 %vreg2<def> = COPY %X3<kill>; G8RC:%vreg2 %vreg4<def> = LI 2048; GPRC:%vreg4 %vreg3<def> = OR8To4 %vreg2<kill>, %vreg2; GPRC:%vreg3 G8RC:%vreg2 %vreg9<def> = COPY %vreg4<kill>; GPRC:%vreg9,%vreg4 %vreg10<def> = RLDICL %vreg9<kill>, 0, 32; GPRC:%vreg10,%vreg9 %vreg11<def> = MTCTR8r %vre...