Rafael EspĂndola
2006-May-30  11:26 UTC
[LLVMdev] [RFC, ARM] expanding RET to CopyToReg;BRIND
I have changed the way in which the ARM backend generates a function return. Instead of expanding a RET to a CopyToReg;RETFLAG, it now expands into a CopyToReg;BRIND. I haven't commit it yet, but the patch is attached. In my opinion the resulting code is easier to understand, but I have some questions: Why all backends use RETFLAG? Why it is named RETFLAG? Why the Copy that places the result must have a Flag operand? If I understand correctly, the Flag operand exists in nodes that use a flag register (cpsr in ARM). Thanks, Rafael -------------- next part -------------- A non-text attachment was scrubbed... Name: llvm.patch Type: application/octet-stream Size: 2418 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20060530/bc2931b9/attachment.obj>
On Tue, 30 May 2006, [UTF-8] Rafael Esp?ndola wrote:> I have changed the way in which the ARM backend generates a function > return. Instead of expanding a RET to a CopyToReg;RETFLAG, it now > expands into a CopyToReg;BRIND. I haven't commit it yet, but the patch > is attached.Ok, I haven't looked at the code, but you're free to do whatever make sense.> In my opinion the resulting code is easier to understand, but I have > some questions: > > Why all backends use RETFLAG?The backends seem to be doing the following: 1. For 'ret void', a "ISD::RET" node is left along and not lowered. As such, it gets directly pattern matched. On PPC, for example, we have: // Return void support. def : Pat<(ret), (BLR)>; ... which maps it directly to the PPC "blr" instruction. 2. For 'ret value', the targets custom lower the ISD::RET node into some number of CopyToReg nodes (to set up the live outs), then need a node to represent the return. The return node has to be flagged do the copies, so that the scheduler doesn't make the copies wander from the return.> Why it is named RETFLAG?Historical reason. Originally we didn't have nodes that could *optionally* have an input flag. A better design, e.g. on PPC would be to have a PPCISD::RET node, which takes an optional input flag, and always lower RET to it.> Why the Copy that places the result must have a Flag operand? If I > understand correctly, the Flag operand exists in nodes that use a flag > register (cpsr in ARM).Flag in the SelectionDAG stuff is so named because it was originally used for condition codes. However, it has since grown to mean "keep these two nodes always together". In the case of return, you want the scheduler to produce code like this (on PPC): ... R3 = outval_virtreg blr not like this: ... R3 = outval_virtreg ... blr So the copy and blr are flagged together. Another case where flags are useful are for things like the X86 variable shift instruction. There the shift amount is required to be in the CL register, so we generate code like this: CL = shamt_virtreg X = shl Y, CL We don't want the copy and shift to wander apart from each other (e.g. we don't want another shift to get scheduled in between them), so we flag them together. In practice, these copies usually get coallesced away. -Chris -- http://nondot.org/sabre/ http://llvm.org/
Rafael EspĂndola
2006-May-31  10:51 UTC
[LLVMdev] [RFC, ARM] expanding RET to CopyToReg;BRIND
> > Why it is named RETFLAG? > > Historical reason. Originally we didn't have nodes that could > *optionally* have an input flag. A better design, e.g. on PPC would be to > have a PPCISD::RET node, which takes an optional input flag, and always > lower RET to it.I See. I will try to always lower to "(mov)*;bx lr" on ARM.> Flag in the SelectionDAG stuff is so named because it was originally used > for condition codes. However, it has since grown to mean "keep these two > nodes always together". In the case of return, you want the scheduler to > produce code like this (on PPC):That clarifies a lot! Thanks.> ... > R3 = outval_virtreg > blr > > not like this: > > ... > R3 = outval_virtreg > ... > blr > > So the copy and blr are flagged together. > > Another case where flags are useful are for things like the X86 variable > shift instruction. There the shift amount is required to be in the CL > register, so we generate code like this: > > > CL = shamt_virtreg > X = shl Y, CL > > We don't want the copy and shift to wander apart from each other (e.g. we > don't want another shift to get scheduled in between them), so we flag > them together. In practice, these copies usually get coallesced away.In the second case shl explicitly uses CL. Shouldn't the register allocator be smart enough to avoid scheduling an instruction that destroys CL in between them? In the first case, what do you think about making it possible for an instruction to optionally depend on a value? That is, make blr depend on R3 or R3/R4 depending on the type of the return value. Something like a = DAG.getNode(ISD::BRIND, MVT::Other, Copy, LR); a.addUse(PPC::R3)> -ChrisThanks, Rafael
Possibly Parallel Threads
- [LLVMdev] [RFC, ARM] expanding RET to CopyToReg;BRIND
- [LLVMdev] [RFC, ARM] expanding RET to CopyToReg;BRIND
- [LLVMdev] [RFC, ARM] expanding RET to CopyToReg;BRIND
- [LLVMdev] [RFC, ARM] expanding RET to CopyToReg;BRIND
- [LLVMdev] adding an optional flag edge to a BRIND node