Hi,

I'm working on an iterated register coalescing graph coloring allocator and am trying to test it with all backends currently available in LLVM. Initial tests with most of the backends are successful, but it turned out that my allocator triggers a specific assertion in the RegScavenger, and only for the ARM target. It looks like the LR register is used for frame pointer related things, but it is STILL available for register allocation according to ARMRegisterInfo.td:

  def GPR : RegisterClass<"ARM", [i32], 32, [R0, R1, R2, R3, R4, R5, R6,
                                             R7, R8, R9, R10, R12, R11,
                                             LR, SP, PC]>

Let me now explain the problem step by step:

1) Here is the function's machine code before register allocation (this code is produced by bugpoint from a bigger test case; if you need the BC file, it is attached):

# Machine code for Insert():
Live Ins: R0 in VR#1025 R1 in VR#1026

entry: 0x8fdac90, LLVM BB @0x8fc2c48, ID#0:
Live Ins: %R0 %R1
        %reg1026<def,dead> = MOVr %R1<kill>, 14, %reg0, %reg0
        %reg1025<def> = MOVr %R0<kill>, 14, %reg0, %reg0
        %reg1024<def> = MOVr %reg1025, 14, %reg0, %reg0
        CMPri %reg1025<kill>, 0, 14, %reg0, %CPSR<imp-def>
        Bcc mbb<UnifiedReturnBlock,0x8fdad70>, 10, %CPSR<kill>
    Successors according to CFG: 0x8fdad00 (#1) 0x8fdad70 (#2)

bb368: 0x8fdad00, LLVM BB @0x8fc2c98, ID#1:
    Predecessors according to CFG: 0x8fdac90 (#0)
        %reg1027<def> = MOVi 0, 14, %reg0, %reg0
        STR %reg1024<kill>, %reg1027<kill>, %reg0, 0, 14, %reg0, Mem:ST(4,4) [0x8fc2d68 + 0]
        BX_RET 14, %reg0

UnifiedReturnBlock: 0x8fdad70, LLVM BB @0x8fc2cc0, ID#2:
    Predecessors according to CFG: 0x8fdac90 (#0)
        BX_RET 14, %reg0

# End machine code for Insert().

2) My register allocator produces the following allocation:

********** REGISTER MAP **********
[reg1024 -> LR]
[reg1025 -> R0]
[reg1026 -> R1]
[reg1027 -> R0]

The interesting bit is that this allocation:
- is different from the linearscan result
- assigns LR to reg1024, even though LR is not the first register in the allocation order for the GPR register class.
Even though it ignores the preferred allocation order, this is not a bug and is quite legal.

BTW, I obtain the set of allocatable registers using the following code at the beginning of runOnMachineFunction() in my register allocator. Is anything wrong with it?

  mri = tm->getRegisterInfo();

  // Prepare regClass2AllowedSet for each register class.
  // This should be done on a per-function basis, because
  // some registers may get included/excluded on a per-function
  // basis (e.g. the frame pointer on X86).
  regClass2AllowedSet.clear();
  regClass2AllowedSet.resize(mri->getNumRegClasses() + 1);

  for (TargetRegisterInfo::regclass_iterator RCI = mri->regclass_begin(),
       RCE = mri->regclass_end(); RCI != RCE; ++RCI) {
    int regClassId = (*RCI)->getID();
    regClass2AllowedSet[regClassId].resize(mri->getNumRegs() + 1);
    // Mark every register in this class's allocation order as allowed.
    for (TargetRegisterClass::iterator I = (*RCI)->allocation_order_begin(*mf),
         E = (*RCI)->allocation_order_end(*mf); I != E; ++I)
      regClass2AllowedSet[regClassId].set(*I);
  }

3) This register allocation results in the following machine code, after replacement of virtual regs by the assigned physical regs:

**** Post Machine Instrs ****
# Machine code for Insert():
Live Ins: R0 in VR#1025 R1 in VR#1026

entry: 0x8fdac90, LLVM BB @0x8fc2c48, ID#0:
Live Ins: %R0 %R1
        %LR<def> = MOVr %R0, 14, %reg0, %reg0
        CMPri %R0<kill>, 0, 14, %reg0, %CPSR<imp-def>
        Bcc mbb<UnifiedReturnBlock,0x8fdad70>, 10, %CPSR<kill>
    Successors according to CFG: 0x8fdad00 (#1) 0x8fdad70 (#2)

bb368: 0x8fdad00, LLVM BB @0x8fc2c98, ID#1:
    Predecessors according to CFG: 0x8fdac90 (#0)
        %R0<def> = MOVi 0, 14, %reg0, %reg0
        STR %LR<kill>, %R0<kill>, %reg0, 0, 14, %reg0, Mem:ST(4,4) [0x8fc2d68 + 0]
        BX_RET 14, %reg0

UnifiedReturnBlock: 0x8fdad70, LLVM BB @0x8fc2cc0, ID#2:
    Predecessors according to CFG: 0x8fdac90 (#0)
        BX_RET 14, %reg0

# End machine code for Insert().

4) Then I get the following assertion:

llc: /opt/llvm/lib/CodeGen/RegisterScavenging.cpp:223: void llvm::RegScavenger::forward(): Assertion `isUsed(Reg) && "Using an undefined register!"' failed.

It is triggered by the PrologEpilogInserter::replaceFrameIndices() function. The undefined register is the LR register. If I dump the function at this point I see the following (the instruction triggering the assertion is marked by ***):

# Machine code for Insert():
  <fi#0>: size is 4 bytes, alignment is 4 bytes, at location [SP-4]
Live Ins: R0 in VR#1025 R1 in VR#1026

entry: 0x8fdac90, LLVM BB @0x8fc2c48, ID#0:
Live Ins: %R0 %R1 %LR
        %SP<def> = SUBri %SP<kill>, 4, 14, %reg0, %reg0
        STR %LR<kill>, %SP, %reg0, 0, 14, %reg0
        %LR<def> = MOVr %R0, 14, %reg0, %reg0
        CMPri %R0<kill>, 0, 14, %reg0, %CPSR<imp-def>
        Bcc mbb<UnifiedReturnBlock,0x8fdad70>, 10, %CPSR<kill>
    Successors according to CFG: 0x8fdad00 (#1) 0x8fdad70 (#2)

bb368: 0x8fdad00, LLVM BB @0x8fc2c98, ID#1:
    Predecessors according to CFG: 0x8fdac90 (#0)
        %R0<def> = MOVi 0, 14, %reg0, %reg0
***     STR %LR<kill>, %R0<kill>, %reg0, 0, 14, %reg0, Mem:ST(4,4) [0x8fc2d68 + 0]
        %LR<def> = LDR <fi#0>, %reg0, 0, 14, %reg0
        %SP<def> = ADDri %SP<kill>, 4, 14, %reg0, %reg0
        BX_RET 14, %reg0

UnifiedReturnBlock: 0x8fdad70, LLVM BB @0x8fc2cc0, ID#2:
    Predecessors according to CFG: 0x8fdac90 (#0)
        %LR<def> = LDR <fi#0>, %reg0, 0, 14, %reg0
        %SP<def> = ADDri %SP<kill>, 4, 14, %reg0, %reg0
        BX_RET 14, %reg0

# End machine code for Insert().

As you can see, PrologEpilogInserter has inserted at the beginning of the function some code for manipulation of the frame pointer and this inserted code uses the LR register.
As far as I understand, ARMRegisterInfo.td should exclude the LR register from the set of allocatable registers for functions that require frame pointer manipulation.
But currently it is not the case, or?

I hope that I provided enough information to explain my problem. I also provided my initial analysis, but may be I'm wrong.

Can someone more knowledgeable in ARM backend and LLVM's register allocation framework have a look at it?
If it is a bug in the ARM backend, could it be fixed?

Thanks,
Roman
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bugpoint-reduced-simplified.bc
Type: application/octet-stream
Size: 536 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20090107/6bbb5738/attachment.obj>

On Jan 7, 2009, at 2:48 AM, Roman Levenstein wrote:

> As you can see, PrologEpilogInserter has inserted at the beginning
> of the function some code for manipulation of the frame pointer and
> this inserted code uses the LR register.
> As far as I understand, ARMRegisterInfo.td should exclude the LR
> register from the set of allocatable registers for functions that
> require frame pointer manipulation.
> But currently it is not the case, or?

No, LR is not the frame pointer. It's the link register (caller address). It should be available as a general purpose register.

The bug is elsewhere. It has to do with kill / dead markers.

        %LR<def> = LDR <fi#0>, %reg0, 0, 14, %reg0
        %SP<def> = ADDri %SP<kill>, 4, 14, %reg0, %reg0
        BX_RET 14, %reg0

LR is restored here but it's not killed before the end of the block is reached. Should BX_RET use it?

Evan

> I hope that I provided enough information to explain my problem. I
> also provided my initial analysis, but may be I'm wrong.
>
> Can someone more knowledgeable in ARM backend and LLVM's register
> allocation framework have a look at it?
> If it is a bug in the ARM backend, could it be fixed?
>
> Thanks,
> Roman
> <bugpoint-reduced-simplified.bc>

Hi Evan,

Thanks for your feedback!

2009/1/7 Evan Cheng <evan.cheng at apple.com>:
>
> On Jan 7, 2009, at 2:48 AM, Roman Levenstein wrote:
>
> As you can see, PrologEpilogInserter has inserted at the beginning
> of the function some code for manipulation of the frame pointer and
> this inserted code uses the LR register.
> As far as I understand, ARMRegisterInfo.td should exclude the LR
> register from the set of allocatable registers for functions that
> require frame pointer manipulation.
> But currently it is not the case, or?
>
> No, LR is not the frame pointer. It's the link register (caller address). It
> should be available as a general purpose register.

OK.

> The bug is elsewhere. It has to do with kill / dead markers.
>         %LR<def> = LDR <fi#0>, %reg0, 0, 14, %reg0
>         %SP<def> = ADDri %SP<kill>, 4, 14, %reg0, %reg0
>         BX_RET 14, %reg0
> LR is restored here but it's not killed before the end of the block is
> reached.

Hmm. I have no idea what the ARM backend does. My register allocator just assigns the registers as I explained in my original mail. Then it lets VirtRegMap.cpp do its job, i.e. it lets it rewrite the code and replace virtual registers by the assigned physical registers. You can see the result in step (3) of my original mail. In my opinion, it still looks correct. Maybe this rewriting process does something wrong?

Then PrologEpilogInserter and some other standard post-RA passes are invoked for the ARM backend. But I have not changed anything there, so I have no idea what happens.

And, BTW, the instructions you mentioned above come after the instruction triggering the assertion, which is:

        STR %LR<kill>, %R0<kill>, %reg0, 0, 14, %reg0, Mem:ST(4,4) [0x8fc2d68 + 0]

> Should BX_RET use it?

I don't know the semantics of BX_RET on the ARM platform. Maybe it uses LR somehow.

BTW, an idea: maybe it is easy to trigger exactly the same behaviour with linear scan if one does the following:
- comment out the dependency on the coalescer, so that it is not invoked
- change the allocation order of the GPR register class for ARM, so that it starts with the LR register.

Any ideas how to proceed with the current situation?

-Roman

> I hope that I provided enough information to explain my problem. I
> also provided my initial analysis, but may be I'm wrong.
>
> Can someone more knowledgeable in ARM backend and LLVM's register
> allocation framework have a look at it?
> If it is a bug in the ARM backend, could it be fixed?
>
> Thanks,
> Roman
> <bugpoint-reduced-simplified.bc>

On Jan 7, 2009, at 2:48 AM, Roman Levenstein wrote:

> bb368: 0x8fdad00, LLVM BB @0x8fc2c98, ID#1:
>    Predecessors according to CFG: 0x8fdac90 (#0)
>        %R0<def> = MOVi 0, 14, %reg0, %reg0
> ***   STR %LR<kill>, %R0<kill>, %reg0, 0, 14, %reg0, Mem:ST(4,4)
> [0x8fc2d68 + 0]
>        %LR<def> = LDR <fi#0>, %reg0, 0, 14, %reg0
>        %SP<def> = ADDri %SP<kill>, 4, 14, %reg0, %reg0
>        BX_RET 14, %reg0

Ok, ignore my earlier email about BX_RET. The issue is LR should be added to livein of BB #1.

**** Post Machine Instrs ****
# Machine code for Insert():
Live Ins: R0 in VR#1025 R1 in VR#1026

entry: 0x8fdac90, LLVM BB @0x8fc2c48, ID#0:
Live Ins: %R0 %R1
        %LR<def> = MOVr %R0, 14, %reg0, %reg0
        CMPri %R0<kill>, 0, 14, %reg0, %CPSR<imp-def>
        Bcc mbb<UnifiedReturnBlock,0x8fdad70>, 10, %CPSR<kill>
    Successors according to CFG: 0x8fdad00 (#1) 0x8fdad70 (#2)

bb368: 0x8fdad00, LLVM BB @0x8fc2c98, ID#1:
    Predecessors according to CFG: 0x8fdac90 (#0)
        %R0<def> = MOVi 0, 14, %reg0, %reg0
        STR %LR<kill>, %R0<kill>, %reg0, 0, 14, %reg0, Mem:ST(4,4) [0x8fc2d68 + 0]
        BX_RET 14, %reg0

Here the STR is using LR, but there isn't a def earlier.

Evan

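To make the live-in point concrete, here is a toy model of a block-local forward walk. It is only a sketch: `Inst`, `Block`, `firstUndefinedUse` and `makeBB368` are hypothetical names, not LLVM classes, and it assumes (as the assertion in RegScavenger::forward() suggests) that the check starts each block from that block's own live-in list and knows nothing about predecessors. SP is simply treated as always in use.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy stand-ins for a machine instruction and a basic block (NOT LLVM types).
struct Inst {
    std::string opcode;
    std::vector<std::string> uses;   // registers read
    std::vector<std::string> kills;  // registers whose last use is here
    std::vector<std::string> defs;   // registers written
};

struct Block {
    std::set<std::string> liveIns;
    std::vector<Inst> insts;
};

// Walks one block forward, seeding the "in use" set from the live-in list
// only. Returns the index of the first instruction that reads a register
// not currently in use, or -1 if every use is covered.
int firstUndefinedUse(const Block &bb) {
    assert(!bb.insts.empty());
    std::set<std::string> used = bb.liveIns;
    used.insert("SP");  // assumption: SP is reserved and always in use
    for (std::size_t i = 0; i < bb.insts.size(); ++i) {
        const Inst &mi = bb.insts[i];
        for (const std::string &r : mi.uses)
            if (used.count(r) == 0) return static_cast<int>(i);
        for (const std::string &r : mi.kills) used.erase(r);
        for (const std::string &r : mi.defs) used.insert(r);
    }
    return -1;
}

// bb368 as it appears in the dump taken when the assertion fires.
Block makeBB368(bool lrIsLiveIn) {
    Block bb;
    if (lrIsLiveIn) bb.liveIns.insert("LR");
    bb.insts = {
        {"MOVi",   {},           {},           {"R0"}},
        {"STR",    {"LR", "R0"}, {"LR", "R0"}, {}},
        {"LDR",    {},           {},           {"LR"}},  // <fi#0> not yet rewritten
        {"ADDri",  {"SP"},       {"SP"},       {"SP"}},
        {"BX_RET", {},           {},           {}},
    };
    return bb;
}
```

With `makeBB368(false)` the walk stops at index 1, the STR marked *** in the dump; with `makeBB368(true)` it runs to the end of the block. In this model the def of LR in the entry block never enters the picture, because nothing carries it across the block boundary except the live-in list.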
2009/1/13 Evan Cheng <echeng at apple.com>:
>
> On Jan 7, 2009, at 2:48 AM, Roman Levenstein wrote:
>
>> bb368: 0x8fdad00, LLVM BB @0x8fc2c98, ID#1:
>>    Predecessors according to CFG: 0x8fdac90 (#0)
>>        %R0<def> = MOVi 0, 14, %reg0, %reg0
>> ***   STR %LR<kill>, %R0<kill>, %reg0, 0, 14, %reg0, Mem:ST(4,4)
>> [0x8fc2d68 + 0]
>>        %LR<def> = LDR <fi#0>, %reg0, 0, 14, %reg0
>>        %SP<def> = ADDri %SP<kill>, 4, 14, %reg0, %reg0
>>        BX_RET 14, %reg0
>
> Ok, ignore my earlier email about BX_RET. The issue is LR should be added to
> livein of BB #1.

Who should do it? Do you mean that the ARM backend/LiveIntervalsAnalysis/LiveVariables should do it, or do you mean that my regalloc should do it?

> **** Post Machine Instrs ****
> # Machine code for Insert():
> Live Ins: R0 in VR#1025 R1 in VR#1026
>
> entry: 0x8fdac90, LLVM BB @0x8fc2c48, ID#0:
> Live Ins: %R0 %R1
>        %LR<def> = MOVr %R0, 14, %reg0, %reg0
>        CMPri %R0<kill>, 0, 14, %reg0, %CPSR<imp-def>
>        Bcc mbb<UnifiedReturnBlock,0x8fdad70>, 10, %CPSR<kill>
>    Successors according to CFG: 0x8fdad00 (#1) 0x8fdad70 (#2)
>
> bb368: 0x8fdad00, LLVM BB @0x8fc2c98, ID#1:
>    Predecessors according to CFG: 0x8fdac90 (#0)
>        %R0<def> = MOVi 0, 14, %reg0, %reg0
>        STR %LR<kill>, %R0<kill>, %reg0, 0, 14, %reg0, Mem:ST(4,4)
> [0x8fc2d68 + 0]
>        BX_RET 14, %reg0
>
> Here the STR is using LR, but there isn't a def earlier.

Maybe I'm overlooking something, but doesn't

        %LR<def> = MOVr %R0, 14, %reg0, %reg0

in MBB#0 define LR? That should be enough, or?

-Roman
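
The thread does not settle who is responsible for the live-in lists. If it turns out to be the allocator's job, one way it could derive them is sketched below. This is a hedged illustration only: `computePhysLiveIns` and `insertExample` are hypothetical helpers, the per-block sets of virtual registers live on entry are assumed to come from whatever liveness information the allocator already has, and the entry block is skipped on the assumption that its live-ins come from the function's own live-in list.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

using PhysRegSet = std::set<std::string>;

// For every block except the entry block (index 0), translate the virtual
// registers that are live on entry into the physical registers assigned to
// them. Virtual registers without an assignment are ignored.
std::vector<PhysRegSet> computePhysLiveIns(
    const std::vector<std::set<unsigned>> &vregLiveIn,
    const std::map<unsigned, std::string> &assignment) {
    assert(!vregLiveIn.empty());
    std::vector<PhysRegSet> liveIns(vregLiveIn.size());
    for (std::size_t bb = 1; bb < vregLiveIn.size(); ++bb) {
        for (unsigned vreg : vregLiveIn[bb]) {
            auto it = assignment.find(vreg);
            if (it != assignment.end()) liveIns[bb].insert(it->second);
        }
    }
    return liveIns;
}

// The Insert() function from this thread: reg1024 is defined in the entry
// block (ID#0) and killed by the STR in bb368 (ID#1), so it is live into
// block 1 only. The assignment is the REGISTER MAP from the original mail.
std::vector<PhysRegSet> insertExample() {
    const std::map<unsigned, std::string> assignment = {
        {1024u, "LR"}, {1025u, "R0"}, {1026u, "R1"}, {1027u, "R0"}};
    const std::vector<std::set<unsigned>> vregLiveIn = {{}, {1024u}, {}};
    return computePhysLiveIns(vregLiveIn, assignment);
}
```

For this function the result gives bb368 a live-in set containing just LR and leaves UnifiedReturnBlock empty, which matches the live-in Evan says is missing from BB #1.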