章明 via llvm-dev
2017-Nov-11 08:20 UTC
[llvm-dev] Update control flow graph when splitting a machine basic block?
Thank you for your reply!> Every MachineBasicBlock has a list of successors; you can access it with > the successors() accessor. That's what you should be using for any CFG > analysis.I am aware of these methods of class MachineBasicBlock, which allows one to access a MachineBasicBlock's successors and predecessors in the CFG. But the CFG itself may no longer be valid if a MachineBasicBlock is split between two control flow instructions. The accessors of class MachineBasicBlock do not automatically update the CFG. So there is no way to access the up-to-date CFG.> I don't think this actually has any impact in practice; I mean, I guess > it's an issue in theory, but in practice we don't stick branches into > the middle of basic blocks.I did not expect a branch in the middle of a basic block either, until yesterday LLVM Release 4.0.0 produced the following machine basic block before the pass ARMConstantIslands is run: bb.1.if.end: successors: %bb.3.for.body(0x80000000) liveins: %r4 %r0 = tMOVr %r4, 14, _, debug-location !23 tBL 14, _, $__aeabi_i2d, csr_aapcs, implicit-def dead %lr, implicit %sp, implicit %r0, implicit-def %sp, implicit-def %r0, implicit-def %r1, debug-location !23 tBL 14, _, @sqrt, csr_aapcs, implicit-def dead %lr, implicit %sp, implicit %r0, implicit %r1, implicit-def %sp, implicit-def %r0, implicit-def %r1, debug-location !24 tBL 14, _, $__aeabi_d2iz, csr_aapcs, implicit-def dead %lr, implicit %sp, implicit %r0, implicit %r1, implicit-def %sp, implicit-def %r0, debug-location !25 DBG_VALUE 2, 0, !17, !18, debug-location !27 DBG_VALUE debug-use %r0, debug-use _, !16, !18, debug-location !26 tCMPi8 %r0, 2, 14, _, implicit-def %cpsr, debug-location !32 t2IT 11, 28, implicit-def %itstate %r0 = tMOVi8 _, 1, 11, %cpsr, implicit %r0, implicit %itstate tPOP_RET 11, %cpsr, def %r4, def %r6, def %r7, def %pc, implicit %r0, implicit %r4, implicit killed %itstate, debug-location !44 %r1 = t2MOVi 2, 14, _, _ t2B %bb.3.for.body, 14, _ Note that a terminator tPOP_RET is before a non-terminator t2MOVi. The command line to produce this is as follows: llc -mtriple=thumbv7m-none-none-eabi -mcpu=cortex-m3 -O1 -stop-before=arm-cp-islands -o prime-factorize.mir prime-factorize.ll Attached are the input file prime-factorize.ll and output file prime-factorize.mir. The machine basic block above violates my previous assumption that only terminators and debug instructions may appear after the first terminator in a machine basic block. I don't know how "bad" this could be, i.e., how many non-terminators in a program could be generated between two terminators. So I have to consider split such basic blocks before I could instrument the program with control flow checks. I don't know whether this is a bug or not. If it is not a bug, this apparently isn't a purely theoretical issue. Best regards! Ming Zhang
Friedman, Eli via llvm-dev
2017-Nov-13 20:37 UTC
[llvm-dev] Update control flow graph when splitting a machine basic block?
On 11/11/2017 12:20 AM, 章明 wrote:> Thank you for your reply! > >> Every MachineBasicBlock has a list of successors; you can access it with >> the successors() accessor. That's what you should be using for any CFG >> analysis. > I am aware of these methods of class MachineBasicBlock, which allows one to access a MachineBasicBlock's successors and predecessors in the CFG. > But the CFG itself may no longer be valid if a MachineBasicBlock is split between two control flow instructions. > The accessors of class MachineBasicBlock do not automatically update the CFG. > So there is no way to access the up-to-date CFG. > >> I don't think this actually has any impact in practice; I mean, I guess >> it's an issue in theory, but in practice we don't stick branches into >> the middle of basic blocks. > I did not expect a branch in the middle of a basic block either, until yesterday LLVM Release 4.0.0 produced the following machine basic block before the pass ARMConstantIslands is run: > > bb.1.if.end: > successors: %bb.3.for.body(0x80000000) > liveins: %r4 > > %r0 = tMOVr %r4, 14, _, debug-location !23 > tBL 14, _, $__aeabi_i2d, csr_aapcs, implicit-def dead %lr, implicit %sp, implicit %r0, implicit-def %sp, implicit-def %r0, implicit-def %r1, debug-location !23 > tBL 14, _, @sqrt, csr_aapcs, implicit-def dead %lr, implicit %sp, implicit %r0, implicit %r1, implicit-def %sp, implicit-def %r0, implicit-def %r1, debug-location !24 > tBL 14, _, $__aeabi_d2iz, csr_aapcs, implicit-def dead %lr, implicit %sp, implicit %r0, implicit %r1, implicit-def %sp, implicit-def %r0, debug-location !25 > DBG_VALUE 2, 0, !17, !18, debug-location !27 > DBG_VALUE debug-use %r0, debug-use _, !16, !18, debug-location !26 > tCMPi8 %r0, 2, 14, _, implicit-def %cpsr, debug-location !32 > t2IT 11, 28, implicit-def %itstate > %r0 = tMOVi8 _, 1, 11, %cpsr, implicit %r0, implicit %itstate > tPOP_RET 11, %cpsr, def %r4, def %r6, def %r7, def %pc, implicit %r0, implicit %r4, implicit killed %itstate, debug-location !44 > %r1 = t2MOVi 2, 14, _, _ > t2B %bb.3.for.body, 14, _ > > Note that a terminator tPOP_RET is before a non-terminator t2MOVi. > The command line to produce this is as follows: > > llc -mtriple=thumbv7m-none-none-eabi -mcpu=cortex-m3 -O1 -stop-before=arm-cp-islands -o prime-factorize.mir prime-factorize.ll > > Attached are the input file prime-factorize.ll and output file prime-factorize.mir. > > The machine basic block above violates my previous assumption that only terminators and debug instructions may appear after the first terminator in a machine basic block. > I don't know how "bad" this could be, i.e., how many non-terminators in a program could be generated between two terminators. > So I have to consider split such basic blocks before I could instrument the program with control flow checks. > > I don't know whether this is a bug or not. If it is not a bug, this apparently isn't a purely theoretical issue.It looks like what's happening is that IfConversion makes the return conditional, then that gets merged with the following block. I mean, I would say it's a bug in that there's a "terminator" in a position not at the end of the block, but a return doesn't have any CFG successors, so I'm not sure it has any practical effect. I think we won't merge with the following block for an indirect branch which is not a return. -Eli -- Employee of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
章明 via llvm-dev
2017-Nov-14 02:58 UTC
[llvm-dev] Update control flow graph when splitting a machine basic block?
> It looks like what's happening is that IfConversion makes the return > conditional, then that gets merged with the following block. I mean, I > would say it's a bug in that there's a "terminator" in a position not at > the end of the block, but a return doesn't have any CFG successors, so > I'm not sure it has any practical effect. > > I think we won't merge with the following block for an indirect branch > which is not a return.I believe MachineBasicBlock::getFirstTerminator, MachineBasicBlock::getFirstInstrTerminator, and Methods/functions that use these methods will break on a machine basic block like this, although I have not tested these with such a machine basic block. I don't know whether MachineBasicBlock::getFirstTerminator and MachineBasicBlock::getFirstInstrTerminator are intended to be used in a late stage like just before code emission or not. I am trying to circumvent this issue by splitting machine basic blocks, s.t. if a machine basic block contains a terminator, only terminators or non-real instructions like DBG_VALUE may come after the first terminator. This would probably also involve updating the CFG. I'm trying to update the CFG by inspecting targets of terminators, but I am not sure whether this approach is viable. My code for updating the CFG after splitting a machine basic block is as follows. I intend to run the code before ARMConstantIslands. Could you please give me some advice? Thank you! // This function assumes the successors of the old machine basic // block are correct. If the target of a terminator in a new machine basic // block is not a successor of the old machine basic block, the target is not // treated as a successor of the new machine basic block. If a successor of the // old machine basic block is not the target of any of the terminators in the // new machine basic blocks, the successor of the old machine basic block is // conservatively treated as a common successor of all the new machine basic // blocks. // MBBBefore: New machine basic block before the split point. // MBBAfter: New machine basic block after the split point. // Successors: Successors of the old machine basic block. // MF: Machine function that contains the new machine basic blocks and the // successors of the old machine basic block. void UpdateCFG(MachineBasicBlock &MBBBefore, MachineBasicBlock &MBBAfter, std::unordered_set<MachineBasicBlock *> &Successors, MachineFunction &MF) { const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo(); assert(TII && "The target instruction information must not be a null pointer."); const MachineJumpTableInfo *MJTI = MF.getJumpTableInfo(); assert(MJTI && "The machine jump table information must not be a null pointer."); const std::vector<MachineJumpTableEntry> &JumpTables = MJTI->getJumpTables(); SmallVector<MachineBasicBlock *, 2> MBBs = {&MBBBefore, &MBBAfter}; std::unordered_set<MachineBasicBlock *> RemainingSuccessors = Successors; for (auto MBB : MBBs) { assert(MBB && "A new machine basic block cannot be a null pointer."); // Remove all successors from the new machine basic block. for (auto SuccI = MBB->succ_begin(), SuccE = MBB->succ_end(); SuccI != SuccE; SuccI++) MBB->removeSuccessor(SuccI); bool MayFallthrough = true; // Iterate over all terminators of the new machine basic block. // Note: MachineBasicBlock::getFirstTerminator and // MachineBasicBlock::getFirstInstrTerminator seem to assume that every // machine instructions after the first terminator is either a terminator or // a debug instruction. This assumption does not always hold. // We do not use MachineBasicBlock::terminators here, because it depends on // MachineBasicBlock::getFirstTerminator. for (auto MI = MBB->begin(), MIE = MBB->end(); MI != MIE; MI++) { if (!MI->isTerminator()) continue; // Determine the target of the terminator. // We search for operands that are machine basic blocks or jump table // indices. // Returns and indirect branches are ignored, since their targets are not // machine basic blocks. for (auto Op = MI->operands_begin(), OpE = MI->operands_end(); Op != OpE; Op++) { // FIXME: Could a machine basic block operand of a terminator not be the target of the terminator? // FIXME: Could a terminator other than jump table have more than one targets? if (Op->isMBB()) { // The operand is a machine basic block. MachineBasicBlock *Target = Op->getMBB(); // If the possible target of the terminator is a successor of the old // machine basic block, or if the terminator is in the new machine // basic block before the split point and the operand is the new // machine basic block after the split point, then treat the target of // the terminator as a successor of the new basic block and remove it // from the remaining successors. if (Successors.count(Target) || (MBB == &MBBBefore && Target == &MBBAfter)) { MBB->addSuccessor(Target); RemainingSuccessors.erase(Target); } } if (Op->isJTI()) { // The operand is a jump table index. unsigned JTI = Op->getIndex(); assert(JTI < JumpTables.size() && "The jump table index is out-of-range."); const MachineJumpTableEntry &JT = JumpTables[JTI]; for (MachineBasicBlock *Target : JT.MBBs) { // If the possible target of the terminator is a successor of the // old machine basic block, then treat it as a successor of the new // basic block and remove it from the remaining successors. if (Successors.count(Target)) { MBB->addSuccessor(Target); RemainingSuccessors.erase(Target); } } } } // If the terminator is an unconditional control flow barrier, then the // machine basic block cannot fall through to its layout successor, and // instructions after it cannot be executed. // FIXME: Is this a correct way to determine whether an machine instruction may fall through? if (MI->isBarrier() && !TII->isPredicated(*MI)) { MayFallthrough = false; break; } } // If the new machine basic block may fall through to its layout successor, // then treat the layout successor as a successor of the new machine basic // block. if (MayFallthrough) { MachineFunction::iterator LayoutSuccessor = std::next(MBB->getIterator()); if (LayoutSuccessor != MF.end()) { if ((MBB == &MBBBefore && &*LayoutSuccessor == &MBBAfter) || (MBB == &MBBAfter && Successors.count(&*LayoutSuccessor))) { MBB->addSuccessor(&*LayoutSuccessor); RemainingSuccessors.erase(&*LayoutSuccessor); } } } } // Treat the remaining successors, i.e., successors of the old machine basic // block that are not found to be targets of terminators in the new machine // basic blocks, as common successors of all the new machine basic blocks. for (auto MBB : MBBs) { for (auto Succ : RemainingSuccessors) MBB->addSuccessor(Succ); } }