Andrew Trick
2012-Feb-10 22:48 UTC
[LLVMdev] Question about /llvm/trunk/lib/CodeGen/MachineScheduler.cpp
...this is moving from llvm-commits to llvm-dev. On Feb 10, 2012, at 12:14 PM, Sergei Larin <slarin at codeaurora.org> wrote:> 1) Can a BB presented to the MI scheduler be _not_ terminated (end on a non > terminator MI) under any circumstances? Below you are speaking about "Empty > blocks, or blocks with only a single instruction that not a terminator..." - > What about the BB#4 below? Is it OK?I don't *think* any MI pass should require terminators for single successor blocks laid out sequentially. It seems obvious to me that a block with multiple successors needs a terminator, but I don't know exactly where that's verified. On the other hand, prior to code placement, I often see unnecessary unconditional jumps. So I'm not sure what is considered best practice. Hopefully someone else on the list will clarify. Either way it is definitely not an assumption made by the scheduler. In my previous message, I was pointing out corner cases present in the MachineScheduler::runOnMachineFunction loop--they all have to do with terminators or lack thereof. It is interesting to note that a VLIW target will want to do block placement before final bundling. That's not something I plan to fix short term. I think you can deal with it in your target's pass config and let me know if we need to fix some assumptions in common code.> 2) If above is true, can SUnit node in SchedDAG (constructed by > ScheduleDAGInstrs::BuildSchedGraph) be "dangling" - not having successors > (and/or predecessor)? (see example below).Yes. The DAG can have multiple roots/leaves. We have special Entry/ExitSU that partially model latency of data flow across blocks and into call arguments, but they really only exist as a heuristic.> The reason I keep on asking this - both back ends we support (Hexagon and > ARM) are happily optimizing out fall-through branches away by the time BB > gets to the machine scheduler, resulting in a fragmented DAG... If your > intent is to handle that during actual scheduling - fine, but if any of the > above assumptions is false, please clarify.That sounds like an ok assumption to me.> I also could add here that it seems that the > ScheduleDAGSDNodes::BuildSchedGraph has no difficulties dealing with a > similar situation.The MachineScheduler only operates on the instructions that are mapped to SUnits produces by BuildSchedGraph. So if BuildSchedGraph can handle it so can the scheduler. Here is the only issue I'm aware of... Terminators are not included in the scheduler DAG, but a VLIW target will likely want to expose them to the bundler. One way to handle this would be to split the bundler into a separate pass. I don't actually think that makes sense. Firstly, it doesn't so much solve the problem as move it to another place. More importantly, the MachineScheduler pass should be flexible enough such we can integrate any target-specific transforms that want to make use of the preRA DAG. Think of MachineScheduler as some target-specific set of preRA DAG clients. My solution for this is to allow the bundler to operate at a wider scope than the DAG. In other words, we can maintain bundler state across scheduling barriers. In the long-term, we want DAGs to span terminators under certain conditions, but it's still useful to be able to throw in scheduling barriers at times without breaking the bundler. In the next few days I'll fixup the interface to allow that, and move things into a header so you can start checking in a Hexagon MachineScheduler pass if you want. -Andy> Thanks. > > Sergei > > > BB#4: derived from LLVM BB %cond.end > Predecessors according to CFG: BB#3 BB#2 > %vreg194<def> = ADDr_MPYir_V4 %vreg185, 120, %vreg187<kill>; > IntRegs:%vreg194,%vreg185,%vreg187 > %vreg195<def> = CONST32_set <ga:@raac_sfBandTabShortOffset>; > IntRegs:%vreg195 > .... > %vreg306<def> = TFRI 127; IntRegs:%vreg306 > %vreg307<def> = TFRI 1; IntRegs:%vreg307 > %vreg438<def> = COPY %vreg193; IntRegs:%vreg438,%vreg193 > %vreg362<def> = COPY %vreg193; IntRegs:%vreg362,%vreg193 > Successors according to CFG: BB#5 > > MachineScheduling xxx:BB#4 > From: %vreg194<def> = ADDr_MPYir_V4 %vreg185, 120, %vreg187<kill>; > IntRegs:%vreg194,%vreg185,%vreg187 > To: End Remaining: 0 > > After ScheduleDAGInstrs::BuildSchedGraph (in AddSchedBarrierDeps > ExitMI==NULL): > > SU(0): %vreg362<def> = COPY %vreg193; IntRegs:%vreg362,%vreg193 > # preds left : 1 > # succs left : 0 > # rdefs left : 1 > Latency : 1 > Depth : 0 > Height : 0 > Predecessors: > val #0x3fa00e0 - SU(5): Latency=1 Reg=%vreg193 > > SU(1): %vreg438<def> = COPY %vreg193; IntRegs:%vreg438,%vreg193 > # preds left : 1 > # succs left : 0 > # rdefs left : 1 > Latency : 1 > Depth : 0 > Height : 0 > Predecessors: > val #0x3fa00e0 - SU(5): Latency=1 Reg=%vreg193 > > SU(2): %vreg307<def> = TFRI 1; IntRegs:%vreg307 > # preds left : 0 > # succs left : 0 > # rdefs left : 1 > Latency : 1 > Depth : 0 > Height : 0 > > SU(3): %vreg306<def> = TFRI 127; IntRegs:%vreg306 > # preds left : 0 > # succs left : 0 > # rdefs left : 1 > Latency : 1 > Depth : 0 > Height : 0 > .... > -- > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum. > > >> -----Original Message----- >> From: Andrew Trick [mailto:atrick at apple.com] >> Sent: Monday, February 06, 2012 6:06 PM >> To: Sergei Larin >> Cc: llvm-commits at cs.uiuc.edu >> Subject: Re: Question about >> /llvm/trunk/lib/CodeGen/MachineScheduler.cpp >> >> Sergei, >> >> Anything can be a terminator if the target says so. A terminator can >> only be followed by other terminators in the same block (at least I >> think that's true throughout the backend). >> >> Empty blocks, or blocks with only a single instruction that not a >> terminator or some other scheduling boundary should never be seen by >> the scheduler. Maybe that's your problem. I don't know because I >> haven't seen your debug output. >> >> -Andy >> >> On Feb 6, 2012, at 3:53 PM, Sergei Larin <slarin at codeaurora.org> wrote: >> >>> >>> Andrew, >>> >>> Asserts are "on", nothing is caught. Debug output offers a single >> clue - >>> the BB in question has a non-jump terminator (STrib == store byte). >>> I guess my question then becomes - can a non control flow instruction >> serve >>> as BB terminator (the implicit assumption I was referring to)? If not >> then >>> I'll try to find where the "proper" terminator was lost. >>> >>> Thanks. >>> >>> Sergei >>> >>> -- >>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum. >>> >>> >>>> -----Original Message----- >>>> From: Andrew Trick [mailto:atrick at apple.com] >>>> Sent: Monday, February 06, 2012 5:32 PM >>>> To: Sergei Larin >>>> Cc: llvm-commits at cs.uiuc.edu >>>> Subject: Re: Question about >>>> /llvm/trunk/lib/CodeGen/MachineScheduler.cpp >>>> >>>> Hi Sergei, >>>> >>>> Please make sure you run with assertions enabled and look at -debug- >>>> only=misched output. -verify-machineinstrs may also be useful. >>>> >>>> I'm not sure what the problem is, but crashing is bad. If there are >> any >>>> implicit assumptions, then they should be made explicit. >>>> >>>> I'm not sure I understand the CFG output. A block should end in a >>>> terminator. BB#7 claims to have a successor, but I don't know what >> it's >>>> terminator is. >>>> >>>> -Andy >>>> >>>> On Feb 6, 2012, at 10:56 AM, Sergei Larin <slarin at codeaurora.org> >>>> wrote: >>>> >>>>> >>>>> Andrew, >>>>> >>>>> This is something between a bug report and a question. I do not >>>> provide >>>>> enough context to reproduce the failure, but hope it is fresh >> enough >>>> in your >>>>> memory :) >>>>> >>>>> In lib/CodeGen/MachineScheduler.cpp >>>>> >>>>> In bool MachineScheduler::runOnMachineFunction(MachineFunction &mf) >>>> {} >>>>> >>>>> While iterating over following MBB, for the following fragment >>>>> ... >>>>> BB#6: derived from LLVM BB %sw.bb6, ADDRESS TAKEN >>>>> Predecessors according to CFG: BB#1 >>>>> %vreg10<def> = TFRI 1; IntRegs:%vreg10 >>>>> STrib %vreg1<kill>, 0, %vreg10<kill>; mem:ST1[%Enum_Ref_Par] >>>>> IntRegs:%vreg1,%vreg10 >>>>> JMP <BB#8> >>>>> Successors according to CFG: BB#8 >>>>> >>>>> BB#7: derived from LLVM BB %sw.bb7, ADDRESS TAKEN >>>>> Predecessors according to CFG: BB#1 >>>>> %vreg9<def> = TFRI 2; IntRegs:%vreg9 >>>>> STrib %vreg1<kill>, 0, %vreg9<kill>; mem:ST1[%Enum_Ref_Par] >>>>> IntRegs:%vreg1,%vreg9 >>>>> Successors according to CFG: BB#8 >>>>> >>>>> BB#8: derived from LLVM BB %sw.epilog, ADDRESS TAKEN >>>>> Predecessors according to CFG: BB#0 BB#1 BB#7 BB#6 BB#5 BB#4 BB#2 >>>>> JMPR %PC<imp-def,dead>, %R31<imp-use>, %R0<imp-use,undef> >>>>> ... >>>>> >>>>> BB#7 seems to cause a problem when iterating over it with RegionEnd >>>> pointing >>>>> somewhere wrong. Am I breaking some implicit assumption in BB >>>> formation? >>>>> Want MBB->end() suppose to return for the BB#7? >>>>> >>>>> *** Final schedule *** >>>>> >>>>> MachineScheduling Proc_6:BB#7 >>>>> From: %vreg9<def> = TFRI 2; IntRegs:%vreg9 >>>>> To: >>>>> Program received signal SIGSEGV, Segmentation fault. >>>>> 0x0000000001a0c814 in llvm::MachineOperand::isReg (this=0x7) at >>>>> /local/mnt/workspace/slarin/tools/llvm-mainline- >>>> merged/include/llvm/CodeGen/ >>>>> MachineOperand.h:204 >>>>> 204 bool isReg() const { return OpKind == MO_Register; } >>>>> (gdb) bt >>>>> #0 0x0000000001a0c814 in llvm::MachineOperand::isReg (this=0x7) at >>>>> /local/mnt/workspace/slarin/tools/llvm-mainline- >>>> merged/include/llvm/CodeGen/ >>>>> MachineOperand.h:204 >>>>> #1 0x00000000021ca720 in llvm::MachineInstr::print >> (this=0x3e2c9f0, >>>> OS=..., >>>>> TM=0x0) at >>>>> /local/mnt/workspace/slarin/tools/llvm-mainline- >>>> merged/lib/CodeGen/MachineIn >>>>> str.cpp:1462 >>>>> #2 0x0000000001a970fe in llvm::operator<< (OS=..., MI=...) at >>>>> /local/mnt/workspace/slarin/tools/llvm-mainline- >>>> merged/include/llvm/CodeGen/ >>>>> MachineInstr.h:934 >>>>> #3 0x00000000022e4c18 in (anonymous >>>>> namespace)::MachineScheduler::runOnMachineFunction (this=0x3dcc1c0, >>>> mf=...) >>>>> at >>>>> /local/mnt/workspace/slarin/tools/llvm-mainline- >>>> merged/lib/CodeGen/MachineSc >>>>> heduler.cpp:297 >>>>> #4 0x00000000021c41e5 in llvm::MachineFunctionPass::runOnFunction >>>>> (this=0x3dcc1c0, F=...) at >>>>> /local/mnt/workspace/slarin/tools/llvm-mainline- >>>> merged/lib/CodeGen/MachineFu >>>>> nctionPass.cpp:33 >>>>> #5 0x0000000002789db6 in llvm::FPPassManager::runOnFunction >>>>> (this=0x3dcb820, F=...) at >>>>> /local/mnt/workspace/slarin/tools/llvm-mainline- >>>> merged/lib/VMCore/PassManage >>>>> r.cpp:1498 >>>>> #6 0x000000000278a001 in llvm::FPPassManager::runOnModule >>>> (this=0x3dcb820, >>>>> M=...) at >>>>> /local/mnt/workspace/slarin/tools/llvm-mainline- >>>> merged/lib/VMCore/PassManage >>>>> r.cpp:1520 >>>>> #7 0x000000000278a344 in llvm::MPPassManager::runOnModule >>>> (this=0x3dc8ec0, >>>>> M=...) at >>>>> /local/mnt/workspace/slarin/tools/llvm-mainline- >>>> merged/lib/VMCore/PassManage >>>>> r.cpp:1574 >>>>> #8 0x000000000278a85a in llvm::PassManagerImpl::run >> (this=0x3dc8b70, >>>> M=...) >>>>> at >>>>> /local/mnt/workspace/slarin/tools/llvm-mainline- >>>> merged/lib/VMCore/PassManage >>>>> r.cpp:1658 >>>>> #9 0x000000000278ab55 in llvm::PassManager::run (this=0x3d7f140, >>>> M=...) at >>>>> /local/mnt/workspace/slarin/tools/llvm-mainline- >>>> merged/lib/VMCore/PassManage >>>>> r.cpp:1687 >>>>> #10 0x0000000000c9488d in (anonymous >>>>> namespace)::EmitAssemblyHelper::EmitAssembly (this=0x7fffffffa2e0, >>>>> Action=clang::Backend_EmitAssembly, OS=0x3d13660) >>>>> at >>>>> /local/mnt/workspace/slarin/tools/llvm-mainline- >>>> merged/tools/clang/lib/CodeG >>>>> en/BackendUtil.cpp:497 >>>>> #11 0x0000000000c94968 in clang::EmitBackendOutput (Diags=..., >>>> CGOpts=..., >>>>> TOpts=..., LOpts=..., M=0x3d1a3c0, >>>> Action=clang::Backend_EmitAssembly, >>>>> OS=0x3d13660) >>>>> at >>>>> /local/mnt/workspace/slarin/tools/llvm-mainline- >>>> merged/tools/clang/lib/CodeG >>>>> en/BackendUtil.cpp:509 >>>>> #12 0x0000000000c9042c in >>>> clang::BackendConsumer::HandleTranslationUnit >>>>> (this=0x3d123a0, C=...) >>>>> at >>>>> /local/mnt/workspace/slarin/tools/llvm-mainline- >>>> merged/tools/clang/lib/CodeG >>>>> en/CodeGenAction.cpp:155 >>>>> #13 0x0000000000e3cf61 in clang::ParseAST (S=..., PrintStats=false) >>>> at >>>>> /local/mnt/workspace/slarin/tools/llvm-mainline- >>>> merged/tools/clang/lib/Parse >>>>> /ParseAST.cpp:110 >>>>> #14 0x0000000000b00725 in clang::ASTFrontendAction::ExecuteAction >>>>> (this=0x3cf1740) at >>>>> /local/mnt/workspace/slarin/tools/llvm-mainline- >>>> merged/tools/clang/lib/Front >>>>> end/FrontendAction.cpp:414 >>>>> #15 0x0000000000c8ef9c in clang::CodeGenAction::ExecuteAction >>>>> (this=0x3cf1740) at >>>>> /local/mnt/workspace/slarin/tools/llvm-mainline- >>>> merged/tools/clang/lib/CodeG >>>>> en/CodeGenAction.cpp:407 >>>>> #16 0x0000000000b00377 in clang::FrontendAction::Execute >>>> (this=0x3cf1740) at >>>>> /local/mnt/workspace/slarin/tools/llvm-mainline- >>>> merged/tools/clang/lib/Front >>>>> end/FrontendAction.cpp:334 >>>>> #17 0x0000000000ad8e63 in clang::CompilerInstance::ExecuteAction >>>>> (this=0x3ced2d0, Act=...) >>>>> at >>>>> /local/mnt/workspace/slarin/tools/llvm-mainline- >>>> merged/tools/clang/lib/Front >>>>> end/CompilerInstance.cpp:653 >>>>> #18 0x0000000000aaa8e4 in clang::ExecuteCompilerInvocation >>>> (Clang=0x3ced2d0) >>>>> at >>>>> /local/mnt/workspace/slarin/tools/llvm-mainline- >>>> merged/tools/clang/lib/Front >>>>> endTool/ExecuteCompilerInvocation.cpp:175 >>>>> #19 0x0000000000a99618 in cc1_main (ArgBegin=0x7fffffffaf10, >>>>> ArgEnd=0x7fffffffb108, Argv0=0x3cebaf8 >>>>> "/prj/qct/sunray-austin/scratch/slarin/qdsp6_hex/bin/qc/bin/clang", >>>>> MainAddr=0xaa4654) >>>>> at >>>>> /local/mnt/workspace/slarin/tools/llvm-mainline- >>>> merged/tools/clang/tools/dri >>>>> ver/cc1_main.cpp:165 >>>>> #20 0x0000000000aa5e77 in main (argc_=65, argv_=0x7fffffffc058) at >>>>> /local/mnt/workspace/slarin/tools/llvm-mainline- >>>> merged/tools/clang/tools/dri >>>>> ver/driver.cpp:353 >>>>> >>>>> >>>>> Thanks. >>>>> >>>>> Sergei >>>>> >>> >>> > >
Jakob Stoklund Olesen
2012-Feb-11 00:13 UTC
[LLVMdev] Question about /llvm/trunk/lib/CodeGen/MachineScheduler.cpp
On Feb 10, 2012, at 2:48 PM, Andrew Trick <atrick at apple.com> wrote:> On Feb 10, 2012, at 12:14 PM, Sergei Larin <slarin at codeaurora.org> wrote: >> 1) Can a BB presented to the MI scheduler be _not_ terminated (end on a non >> terminator MI) under any circumstances? Below you are speaking about "Empty >> blocks, or blocks with only a single instruction that not a terminator..." - >> What about the BB#4 below? Is it OK? > > I don't *think* any MI pass should require terminators for single successor blocks laid out sequentially.Right.> It seems obvious to me that a block with multiple successors needs a terminator, but I don't know exactly where that's verified.It's not. In fact, a block can have a landing pad and a fall-through successor, and still not have a terminator. (Invokes ought to be terminators, but they currently aren't).> On the other hand, prior to code placement, I often see unnecessary unconditional jumps. So I'm not sure what is considered best practice. Hopefully someone else on the list will clarify.Most targets try to keep branches minimal at all times, I think. It doesn't always work, so sometimes CodePlacementOpt has to clean it up. /jakob