On Mon, Mar 25, 2013 at 2:07 PM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote:

>
> On Mar 25, 2013, at 1:41 PM, Akira Hatanaka <ahatanak at gmail.com> wrote:
>
> > Hi Jakob,
> >
> > I believe Hal is trying to enable register scavenger to find two (or
> > more) registers that can be used as temporaries.
> >
> > One problem I see with this approach is that, if you use register
> > scavenger during PEI, you will have to pessimistically set aside two
> > emergency spill slots before you call scavengeRegister, even if it turns
> > out you only need one. Having an extra stack slot might not be a big
> > problem, but still it is nice if we can avoid allocating a slot
> > unnecessarily.
> >
> > I probably won't need these pseudo instructions that are expanded
> > post-RA in the first place if I can tell the register allocators to spill
> > accumulator registers to general purpose integer registers instead of
> > directly to stack and disallow copying between accumulator registers. But I
> > guess that is a much more difficult problem to solve. Is that right?
>
> That depends.
>
> The register allocator can spill across register classes, but it calls the
> functionality "live range splitting" and "register class inflation". Here's
> how you enable it:
>
> - Define a union register class that contains both CPU64Regs and ACRegs.
>
> - Implement TRI::getLargestLegalSuperClass(), and return the new union
>   register class when asked about CPU64Regs or ACRegs (or their sub-classes).
>
> - Teach TII::copyPhysReg() to handle the cross-class copies.
>
> - Teach TII::storeRegToStackSlot() to constrain the register class to
>   CPU64Regs when asked to spill a virtual register from the union register
>   class.
>
> This will use cross-class spilling in most cases, but unfortunately we
> can't guarantee that an ACRegs virtual register will never be spilled. This
> just makes it much less likely to happen.
>

I will look into this. It will probably alleviate the problem.

> Targets are still required to be able to spill all legal register classes.
>
> Instead of scavenging for registers during pseudo-expansion, I would like
> to make it possible to create new virtual registers during spilling. The
> plan is to give TII::storeRegToStackSlot() permission to:
>
> - Insert multiple instructions at the provided iterator, and
>
> - Create new virtual registers, possibly from different register classes.
>
> I think that functionality would solve your problems, right?
>

Yes, it sounds like it will solve the problem.

Using the following example where live ranges of accumulators $vreg_acc0 and $vreg_acc1 conflict,

MULT $vreg_acc0, $vreg_gpr0, $vreg_gpr1
MULT $vreg_acc1, $vreg_gpr2, $vreg_gpr3

(consumer of $vreg_acc1)
(consumer of $vreg_acc0)

if the register allocator can create new virtual registers $vreg_gpr4 and $vreg_gpr5, I think spilling can be avoided:

MULT $vreg_acc0, $vreg_gpr0, $vreg_gpr1
copy $vreg_gpr4, $vreg_acc0:lo // spill lo
copy $vreg_gpr5, $vreg_acc0:hi // spill hi
MULT $vreg_acc1, $vreg_gpr2, $vreg_gpr3

(consumer of $vreg_acc1)
copy $vreg_acc0:lo, $vreg_gpr4 // restore lo
copy $vreg_acc0:hi, $vreg_gpr5 // restore hi
(consumer of $vreg_acc0)

Also, should RA avoid splitting live intervals of accumulators, which creates copy instructions?

> The general idea is that the scavenger should only be used when it is not
> possible to determine at RA time if a register is needed. That would
> typically be because the frame layout is not known yet. If a register is
> always needed, RA should pick it. It is going to do better than the
> scavenger.
>
> Can you use Hal's scavenger tricks until we get this functionality added
> to the register allocators? (Help implementing it is always welcome, of
> course).
>

Yes, I think I can, but I have to understand the details of Hal's patch first.

> /jakob
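The claim in the message above, that copying $vreg_acc0 into two GPRs removes the conflict between the two accumulators, can be checked with a small liveness model. What follows is only a sketch and not LLVM code: the helper names `lanes` and `peak_pressure`, the shortened register names (`acc0` for `$vreg_acc0`, and so on), and the (defs, uses) encoding of each instruction are all assumptions made for illustration. It also assumes straight-line code and treats each accumulator as two lanes, lo and hi.

```python
def lanes(operand):
    """Expand an operand into (register, lane) pairs.

    Accumulators are modelled as two lanes (lo, hi); an operand such as
    "acc0:lo" names one lane, while plain "acc0" names both.
    """
    base, _, sub = operand.partition(":")
    if base.startswith("acc"):
        return {(base, sub)} if sub else {(base, "lo"), (base, "hi")}
    return {(base, "all")}


def peak_pressure(program):
    """Return the peak number of live registers per class.

    `program` is a straight-line list of (defs, uses) pairs. The scan runs
    backward and records the live set in front of every instruction.
    """
    live = set()
    peak = {"acc": 0, "gpr": 0}
    for defs, uses in reversed(program):
        for d in defs:
            live -= lanes(d)          # a definition ends liveness going backward
        for u in uses:
            live |= lanes(u)          # a use makes the lanes live before it
        bases = {base for base, _ in live}
        for cls in peak:
            count = sum(1 for b in bases if b.startswith(cls))
            peak[cls] = max(peak[cls], count)
    return peak


# Original sequence: both accumulators are live across the first consumer.
original = [
    (["acc0"], ["gpr0", "gpr1"]),     # MULT acc0, gpr0, gpr1
    (["acc1"], ["gpr2", "gpr3"]),     # MULT acc1, gpr2, gpr3
    ([], ["acc1"]),                   # consumer of acc1
    ([], ["acc0"]),                   # consumer of acc0
]

# Rewritten sequence: acc0 is parked in gpr4/gpr5 while acc1 is live.
rewritten = [
    (["acc0"], ["gpr0", "gpr1"]),     # MULT acc0, gpr0, gpr1
    (["gpr4"], ["acc0:lo"]),          # copy gpr4, acc0:lo
    (["gpr5"], ["acc0:hi"]),          # copy gpr5, acc0:hi
    (["acc1"], ["gpr2", "gpr3"]),     # MULT acc1, gpr2, gpr3
    ([], ["acc1"]),                   # consumer of acc1
    (["acc0:lo"], ["gpr4"]),          # copy acc0:lo, gpr4
    (["acc0:hi"], ["gpr5"]),          # copy acc0:hi, gpr5
    ([], ["acc0"]),                   # consumer of acc0
]

print(peak_pressure(original))   # {'acc': 2, 'gpr': 4}
print(peak_pressure(rewritten))  # {'acc': 1, 'gpr': 4}
```

In this model the rewritten sequence never needs more than one accumulator at a time, and the peak GPR count does not rise because gpr4 and gpr5 are live at a point where gpr0 and gpr1 are already dead. Whether a real allocator finds free GPRs there depends on the surrounding code, which the model ignores.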
Jakob Stoklund Olesen
2013-Mar-25 23:02 UTC
[LLVMdev] [PATCH] RegScavenger::scavengeRegister
On Mar 25, 2013, at 2:51 PM, Akira Hatanaka <ahatanak at gmail.com> wrote:

> Yes, it sounds like it will solve the problem.
>
> Using the following example where live ranges of accumulators $vreg_acc0 and $vreg_acc1 conflict,
>
> MULT $vreg_acc0, $vreg_gpr0, $vreg_gpr1
> MULT $vreg_acc1, $vreg_gpr2, $vreg_gpr3
>
> (consumer of $vreg_acc1)
> (consumer of $vreg_acc0)
>
> if the register allocator can create new virtual registers $vreg_gpr4 and $vreg_gpr5, I think spilling can be avoided:
>
> MULT $vreg_acc0, $vreg_gpr0, $vreg_gpr1
> copy $vreg_gpr4, $vreg_acc0:lo // spill lo
> copy $vreg_gpr5, $vreg_acc0:hi // spill hi
> MULT $vreg_acc1, $vreg_gpr2, $vreg_gpr3
>
> (consumer of $vreg_acc1)
> copy $vreg_acc0:lo, $vreg_gpr4 // restore lo
> copy $vreg_acc0:hi, $vreg_gpr5 // restore hi
> (consumer of $vreg_acc0)

The cross class spilling doesn't support spilling to multiple registers, though. I thought you could copy the accumulator to a single 64-bit register.

> Also, should RA avoid splitting live intervals of accumulators, which creates copy instructions?

The alternative to live range splitting is spilling, which is usually worse.

/jakob
On Mon, Mar 25, 2013 at 4:02 PM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote:

>
> On Mar 25, 2013, at 2:51 PM, Akira Hatanaka <ahatanak at gmail.com> wrote:
>
> > Yes, it sounds like it will solve the problem.
> >
> > Using the following example where live ranges of accumulators $vreg_acc0
> > and $vreg_acc1 conflict,
> >
> > MULT $vreg_acc0, $vreg_gpr0, $vreg_gpr1
> > MULT $vreg_acc1, $vreg_gpr2, $vreg_gpr3
> >
> > (consumer of $vreg_acc1)
> > (consumer of $vreg_acc0)
> >
> > if the register allocator can create new virtual registers $vreg_gpr4 and
> > $vreg_gpr5, I think spilling can be avoided:
> >
> > MULT $vreg_acc0, $vreg_gpr0, $vreg_gpr1
> > copy $vreg_gpr4, $vreg_acc0:lo // spill lo
> > copy $vreg_gpr5, $vreg_acc0:hi // spill hi
> > MULT $vreg_acc1, $vreg_gpr2, $vreg_gpr3
> >
> > (consumer of $vreg_acc1)
> > copy $vreg_acc0:lo, $vreg_gpr4 // restore lo
> > copy $vreg_acc0:hi, $vreg_gpr5 // restore hi
> > (consumer of $vreg_acc0)
>
> The cross class spilling doesn't support spilling to multiple registers,
> though. I thought you could copy the accumulator to a single 64-bit
> register.
>

The general purpose integer registers on mips32 are 32 bits wide, and accumulators are 64-bit registers consisting of 32-bit hi/lo register pairs. So you will need two instructions to copy two 32-bit GPR registers to a 64-bit accumulator register. If spilling to multiple registers is unsupported, perhaps I can define a new register class consisting of paired GPR registers and pseudo copy instructions?

> > Also, should RA avoid splitting live intervals of accumulators, which
> > creates copy instructions?
>
> The alternative to live range splitting is spilling, which is usually
> worse.
>

Here I was assuming the register allocator will spill accumulator registers to integer registers instead of directly to stack. In that case, splitting might be worse than spilling since a reload requires two GPR-to-accumulator copy instructions while copying one accumulator to another requires four copy instructions (the instruction set doesn't have any accumulator-to-accumulator copy instructions):

copy $vreg_gpr0, $vreg_acc0:lo
copy $vreg_gpr1, $vreg_acc0:hi
copy $vreg_acc1:lo, $vreg_gpr0
copy $vreg_acc1:hi, $vreg_gpr1

> /jakob
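The copy counts in the last paragraph can be spelled out with a few lines of code. This is a rough sketch under the message's own assumptions (a 64-bit accumulator made of lo/hi halves, 32-bit GPRs, no direct accumulator-to-accumulator copy). The function names are hypothetical and the output is only the pseudo-assembly notation used in this thread, not real MIPS instructions or LLVM API calls.

```python
HALVES = ("lo", "hi")


def acc_to_gprs(acc, gprs):
    """Copy both halves of an accumulator into two GPRs (one copy per half)."""
    return [f"copy {gpr}, {acc}:{half}" for gpr, half in zip(gprs, HALVES)]


def gprs_to_acc(acc, gprs):
    """Rebuild an accumulator from two GPRs (one copy per half)."""
    return [f"copy {acc}:{half}, {gpr}" for gpr, half in zip(gprs, HALVES)]


def acc_to_acc(dst, src, gprs):
    """Copy one accumulator to another by bouncing through two GPRs."""
    return acc_to_gprs(src, gprs) + gprs_to_acc(dst, gprs)


temps = ("$vreg_gpr0", "$vreg_gpr1")

reload_seq = gprs_to_acc("$vreg_acc0", temps)
split_seq = acc_to_acc("$vreg_acc1", "$vreg_acc0", temps)

print(len(reload_seq))  # 2 copies for a reload from GPRs
print(len(split_seq))   # 4 copies for an accumulator-to-accumulator copy
for line in split_seq:
    print(line)
```

The four lines printed for `split_seq` match the sequence quoted in the message, which is the basis for the concern that a split-induced accumulator copy costs twice as many instructions as a reload from GPRs.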