John Mosby
2009-Mar-03 04:21 UTC
[LLVMdev] Shrink Wrapping - RFC and initial implementation
On Mon, Mar 2, 2009 at 10:35 AM, Evan Cheng <echeng at apple.com> wrote:> > On Mar 1, 2009, at 2:57 PM, John Mosby wrote: > > Obviously, all of this applies only when spills are done with push/pop, > which is the case on x86. I used this issue to start looking at generalizing > how spills and restores are handled, before looking too closely at other > targets, and developed the workaround for the initial implementation. > > > Spills don't have to be done with push and pop. One possible approach to > take is to perform them with store and load. Then later on PEI will optimize > (some of) them into push and pop. >I am rewriting to use stores/loads via storeRegToStackSlot(), loadRegFromStackSlot(), which will obviate the stack adjustment. I did it this way in an early version, but the architecture is to give the target a chance to generate the spill(restore) code itself, so I conformed to that. As you point out, these can selectively be replaced with push/pop later.> Part of what I'm doing now is estimating the work, which requires going > through the targets. I am not far enough along to send out a proposal. > Moving pro/epi generation out of TRI, perhaps into its own "component" is > one architecture I am looking at. > > > It's not necessary to update all the targets at once. We can ask targets to > opt in to take advantage of shrink wrapping. >That sounds good. I now have a better handle on the differences between the targets: XCore handles frame moves in its spillCalleeSavedRegister(), the ARM code for this is very straightforward. I hope to have an updated patch done tomorrow, with doc. and explicated examples. Thanks, John -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20090302/aa079857/attachment.html>
John Mosby
2009-Mar-05 03:57 UTC
[LLVMdev] Shrink Wrapping - RFC and initial implementation
Here is an updated patch for shrink wrapping with: - spills/restores done with stack slot stores/loads - stack adjustment removed - refactoring (but still in need of more) - spill/restore insertion code unified with spill/restore placement code Documentation available here<http://wiki.github.com/jdmdj/llvm-work/shrink-wrapping-work> illustrates shrink wrapping with loops and discusses a situation in which the pass would be more effective by splitting edges in the Machine CFG (similar to breaking crit. edges in the CFG). Test cases are attached. Thanks, John -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20090304/a485746d/attachment.html> -------------- next part -------------- A non-text attachment was scrubbed... Name: shrink-wrapping.P2.patch Type: application/octet-stream Size: 40567 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20090304/a485746d/attachment.obj> -------------- next part -------------- A non-text attachment was scrubbed... Name: test-sw.tar.gz Type: application/x-gzip Size: 2272 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20090304/a485746d/attachment.bin>
Evan Cheng
2009-Mar-12 17:05 UTC
[LLVMdev] Shrink Wrapping - RFC and initial implementation
Hi John, It looks pretty good. Thanks for working on this. Some comments: 1. Some of the functions that you introduced, e.g. stringifyCSRegSet probably ought to be "static" and ifdef'ed out when NDEBUG is defined. 2. + // DEBUG + if (! MBB->empty() && ! CSRUsed[MBB].intersects(restore)) { + MachineInstr* MI = BeforeI; + DOUT << "adding restore after "; + DEBUG(MI->dump()); + } else { + DOUT << "adding restore to beginning of " + << getBasicBlockName(MBB) << "\n"; + } + // DEBUG Code like this should also be inside ifndef NDEBUG. 3. It can still use more refactoring. :-) 4. clearSets(). It's not clear what sets it's clearing. Perhaps name it something like clearShrinkWrapData()? 5 +void PEI::placeSpillsAndRestores(MachineFunction &Fn) { + + DOUT << "Computing SAVE, RESTORE sets\n"; + + // If not shrink wrapping, all spills go into entry block, + // all restores in return blocks. + if (! ShrinkWrapping) { + // Do spills in the entry block. The shrink wrap version probably should go to its own function. Otherwise, it should exit early when the non-shrink wrapping version is done. That reduces nesting and please those of us who are a bit picky. 6. + // Save entry block, return blocks. + if (MBB->pred_size() == 0) + entryBlock = MBB; Entry block is the Fn.front(). 7. + if (!MBB->empty() && MBB->back().getDesc().isReturn()) + returnBlocks.push_back(MBB); PEI::insertPrologEpilogCode also traverse MBBs and get at return blocks. So these probably ought to be shared, i.e. returnBlocks should be an ivar and determined early in PEI. 8. + for (MachineBasicBlock::iterator I = MBB->begin(); I != MBB->end (); ++I) { + for (unsigned i = 0, e = CSI.size(); i != e; ++i) { + unsigned Reg = CSI[i].getReg(); + // If instruction I reads or modifies Reg, add it to UsedCSRegs, + // CSRUsed map for the current block. + if (I->readsRegister(Reg, TRI) || I->modifiesRegister(Reg, TRI)) { readsRegister and modifiesRegister both scan the operands. It's probably better to manually walk through the operands. 9. + // If all uses of CSRegs are in the entry block, there is nothing + // to shrink wrap: spills go in entry block, restores go in exiting + // blocks, turn off shrink wrapping. + if (allCSRUsesInEntryBlock) { + ShrinkWrapping = false; + DOUT << "All uses of CSRegs are in entry block, nothing to do.\n"; + } + // If we have decided not to shrink wrap, just return now. + if (! ShrinkWrapping) + return true; Why not just return inside if (allCSRUsesInEntryBlock)? 10. +bool PEI::calculateUsedAnticAvail(MachineFunction &Fn) { ... + // Calculate AnticIn, AnticOut using post-order traversal of MCFG. + for (po_iterator<MachineBasicBlock*> + MBBI = po_begin(Fn.getBlockNumbered(0)), + MBBE = po_end(Fn.getBlockNumbered(0)); MBBI != MBBE; ++MBBI) { + MachineBasicBlock* MBB = *MBBI; ... + // Calculate Avail{In,Out} via top-down walk of Machine dominator tree. + for (df_iterator<MachineDomTreeNode*> DI = df_begin(DT.getRootNode ()), + E = df_end(DT.getRootNode()); DI != E; ++DI) { Later in +/// placeSpillsAndRestores - decide which MBBs need spills, restores +/// of CSRs. +/// +void PEI::placeSpillsAndRestores(MachineFunction &Fn) { ... + // Calculate CSRRestore using post-order traversal of Machine-CFG. + for (po_iterator<MachineBasicBlock*> + MBBI = po_begin(Fn.getBlockNumbered(0)), + MBBE = po_end(Fn.getBlockNumbered(0)); MBBI != MBBE; + +MBBI) { This seem to be doing traversal at least one too many times? Can this be improved? 11. Can you explain a bit more about AnticIn, AvailIn, etc.? 12. Let's worry about edge splitting for a later time. :-) 13. After the code is cleaned up, we should consider checking it in and try it out as llcbeta. Do you have any idea of its compile time impact? Thanks, Evan On Mar 4, 2009, at 7:57 PM, John Mosby wrote:> Here is an updated patch for shrink wrapping with: > > - spills/restores done with stack slot stores/loads > - stack adjustment removed > - refactoring (but still in need of more) > - spill/restore insertion code unified with spill/restore placement > code > > Documentation available here illustrates shrink wrapping with loops > and discusses a situation in which the pass would be more effective > by splitting edges in the Machine CFG (similar to breaking crit. > edges in the CFG). > > Test cases are attached. > > Thanks, > John > > <shrink-wrapping.P2.patch><test- > sw.tar.gz>_______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20090312/f5ac8c05/attachment.html>
Possibly Parallel Threads
- [LLVMdev] Shrink Wrapping - RFC and initial implementation
- [LLVMdev] Shrink Wrapping - RFC and initial implementation
- [LLVMdev] Shrink Wrapping - RFC and initial implementation
- [LLVMdev] Shrink Wrapping - RFC and initial implementation
- [LLVMdev] Basic block liveouts