Alex Bradbury via llvm-dev
2017-Feb-17 11:33 UTC
[llvm-dev] RFC: Setting MachineInstr flags through storeRegToStackSlot
## Problem description One of the responsibilities of a target's implementation of TargetFrameLowering::emitPrologue is to set the frame pointer (if needed). Typically, the frame pointer will be stored to the stack just like the other callee-saved registers, and emitPrologue must insert the instruction to change its value after it was stored to the stack. Mips does this by looking at the number of callee-saved registers spilled for this function and skipping over that many instructions. Other backends such as AArch64 or X86 will skip over all instruction marked with the MachineInstr::FrameSetup flag[1].>From my point of view, skipping over all FrameSetup instructions sounds likethe more robust way to go about things. I haven't fully traced through where the flag is used in LLVM, but it does seem that DwarfDebug::findPrologueEndLoc will return the wrong result if the flag isn't consistently set on frame setup code. The problem is that unless you override TargetFrameLowering::spillCalleeSavedRegisters, then PrologEpilogInserter will just spill callee saved registers through a series of calls to storeRegToStackSlot. This is fine, except storeRegToStackSlot won't set the FrameSetup flag. [1]: Actually X86 will only skip over PUSH instructions with the FrameSetup flag. Is this extra condition necessary? ## Potential solutions 1) Keep the status quo, but modify PrologEpilogInserter so it will mark the last inserted instruction with FrameSetup over calling storeRegToStackSlot. This is exactly what X86 does in its spillCalleeSavedRegisters implementation. This does make the assumption that storeRegtoStackSlot only inserts a single instruction. From what I can see, that is true for all in-tree backends. Arguably, backends like Mips could be improved by modifying their spillCalleeSavedRegisters to set the FrameSetup flag as well. 2) Like the above, but modify storeRegToStackSlot to return the number of MachineInstr inserted and use that value when marking instructions as FrameSetup. This is more invasive, as it will affect all in tree and out-of-tree backends. 3) Add a MachineInstr::MIFlag parameter to storeRegToStackSlot. The storeRegToStackSlot implementation is responsible for applying this flag to all inserted instructions. This could be defaulted to MachineInstr::NoFlags, to minimise changes to function callers, although that defaulted value would have to be replicated in all FooTgtInstrInfo.h headers. For either 2) or 3), the changes are simple, it's just they will affect all backends. I'm very happy to implement the patch, but thought it was worth asking feedback. I'm CCing in Tim+Craig+Simon as AArch64/X86/Mips code owners, as well as Eric as debug info code owner. Any feedback is very welcome. Thanks, Alex
Alex Bradbury via llvm-dev
2017-Feb-17 15:29 UTC
[llvm-dev] RFC: Setting MachineInstr flags through storeRegToStackSlot
On 17 February 2017 at 11:33, Alex Bradbury <asb at asbradbury.org> wrote:> ## Problem description > > One of the responsibilities of a target's implementation of > TargetFrameLowering::emitPrologue is to set the frame pointer (if needed). > Typically, the frame pointer will be stored to the stack just like the other > callee-saved registers, and emitPrologue must insert the instruction to change > its value after it was stored to the stack. Mips does this by looking at the > number of callee-saved registers spilled for this function and skipping over > that many instructions. Other backends such as AArch64 or X86 will skip over > all instruction marked with the MachineInstr::FrameSetup flag[1].I should meant to say, the exact same issue exists for the MachineInstr::FrameDestroy flag, emitEpilogue, and loadRegFromStackSlot. The same resolution should be used for this case as well. Option 2) that I mentioned isn't quite necessary, callers can work out the number of MachineInstrs emitted by looking at the change in the size of the MachineBasicBlock. This does have the disadvantage of forcing all backends using {storeRegTo,loadRegFrom}StackSlot in FooTgtFrameLowering to replicate the necessary loops to mark the inserted instructions. Best, Alex
Quentin Colombet via llvm-dev
2017-Feb-17 19:23 UTC
[llvm-dev] RFC: Setting MachineInstr flags through storeRegToStackSlot
Hi Alex, I believe solution #3 is the one that makes sense the more sense. Cheers, -Quentin> On Feb 17, 2017, at 7:29 AM, Alex Bradbury via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > On 17 February 2017 at 11:33, Alex Bradbury <asb at asbradbury.org> wrote: >> ## Problem description >> >> One of the responsibilities of a target's implementation of >> TargetFrameLowering::emitPrologue is to set the frame pointer (if needed). >> Typically, the frame pointer will be stored to the stack just like the other >> callee-saved registers, and emitPrologue must insert the instruction to change >> its value after it was stored to the stack. Mips does this by looking at the >> number of callee-saved registers spilled for this function and skipping over >> that many instructions. Other backends such as AArch64 or X86 will skip over >> all instruction marked with the MachineInstr::FrameSetup flag[1]. > > I should meant to say, the exact same issue exists for the > MachineInstr::FrameDestroy flag, emitEpilogue, and > loadRegFromStackSlot. The same resolution should be used for this case > as well. > > Option 2) that I mentioned isn't quite necessary, callers can work out > the number of MachineInstrs emitted by looking at the change in the > size of the MachineBasicBlock. This does have the disadvantage of > forcing all backends using {storeRegTo,loadRegFrom}StackSlot in > FooTgtFrameLowering to replicate the necessary loops to mark the > inserted instructions. > > Best, > > Alex > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Simon Dardis via llvm-dev
2017-Feb-17 21:06 UTC
[llvm-dev] Setting MachineInstr flags through storeRegToStackSlot
Alex, Solution 3 does seem like the simplest and most direct solution. Thanks, Simon
Matthias Braun via llvm-dev
2017-Feb-17 23:14 UTC
[llvm-dev] RFC: Setting MachineInstr flags through storeRegToStackSlot
Can someone familiar with debug info comment on whether it matters to have the FrameSetup flag on the callee save spills? We could have a smart spill or shrink wrapping algorithm that delays the callee saves to a later point in the program while executing non-prologue code first. Is this maybe just meant to indicate the point at which the stack and possible frame/base pointers are setup? Saving the callee saves should not necessary be part of that, it probably is on X86 where the pushs/pops are part of getting the final value of the stack pointer in a function without frame pointer, but for the case where you use storeRegToStackSlot() I'm not sure we even need to mark them as FrameSetup... - Matthias> On Feb 17, 2017, at 3:33 AM, Alex Bradbury via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > ## Problem description > > One of the responsibilities of a target's implementation of > TargetFrameLowering::emitPrologue is to set the frame pointer (if needed). > Typically, the frame pointer will be stored to the stack just like the other > callee-saved registers, and emitPrologue must insert the instruction to change > its value after it was stored to the stack. Mips does this by looking at the > number of callee-saved registers spilled for this function and skipping over > that many instructions. Other backends such as AArch64 or X86 will skip over > all instruction marked with the MachineInstr::FrameSetup flag[1]. > > From my point of view, skipping over all FrameSetup instructions sounds like > the more robust way to go about things. I haven't fully traced through where > the flag is used in LLVM, but it does seem that DwarfDebug::findPrologueEndLoc > will return the wrong result if the flag isn't consistently set on frame setup > code. The problem is that unless you override > TargetFrameLowering::spillCalleeSavedRegisters, then PrologEpilogInserter will > just spill callee saved registers through a series of calls to > storeRegToStackSlot. This is fine, except storeRegToStackSlot won't set the > FrameSetup flag. > > [1]: Actually X86 will only skip over PUSH instructions with the FrameSetup > flag. Is this extra condition necessary? > > > ## Potential solutions > > 1) Keep the status quo, but modify PrologEpilogInserter so it will mark the > last inserted instruction with FrameSetup over calling storeRegToStackSlot. > This is exactly what X86 does in its spillCalleeSavedRegisters implementation. > This does make the assumption that storeRegtoStackSlot only inserts a single > instruction. From what I can see, that is true for all in-tree backends. > Arguably, backends like Mips could be improved by modifying their > spillCalleeSavedRegisters to set the FrameSetup flag as well. > > 2) Like the above, but modify storeRegToStackSlot to return the number of > MachineInstr inserted and use that value when marking instructions as > FrameSetup. This is more invasive, as it will affect all in tree and > out-of-tree backends. > > 3) Add a MachineInstr::MIFlag parameter to storeRegToStackSlot. The > storeRegToStackSlot implementation is responsible for applying this flag to > all inserted instructions. This could be defaulted to MachineInstr::NoFlags, > to minimise changes to function callers, although that defaulted value would > have to be replicated in all FooTgtInstrInfo.h headers. > > For either 2) or 3), the changes are simple, it's just they will affect all > backends. I'm very happy to implement the patch, but thought it was worth > asking feedback. I'm CCing in Tim+Craig+Simon as AArch64/X86/Mips code owners, > as well as Eric as debug info code owner. > > > Any feedback is very welcome. > > Thanks, > > Alex > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Krzysztof Parzyszek via llvm-dev
2017-Feb-18 19:36 UTC
[llvm-dev] RFC: Setting MachineInstr flags through storeRegToStackSlot
On 2/17/2017 5:14 PM, Matthias Braun via llvm-dev wrote:> Can someone familiar with debug info comment on whether it matters to have the FrameSetup flag on the callee save spills?I'm not that familiar with debug info, but FWIW, Hexagon doesn't set that flag (or FrameDestroy) on anything and the DWARF info works fine (for the purposes of exception handling). -Krzysztof
Alex Bradbury via llvm-dev
2017-Feb-19 21:44 UTC
[llvm-dev] RFC: Setting MachineInstr flags through storeRegToStackSlot
On 17 February 2017 at 23:14, Matthias Braun <mbraun at apple.com> wrote:> Can someone familiar with debug info comment on whether it matters to have the FrameSetup flag on the callee save spills? We could have a smart spill or shrink wrapping algorithm that delays the callee saves to a later point in the program while executing non-prologue code first.You make a good point, better understanding the importance (or lack of importance) of setting the FrameSetup and FrameDestroy flags for debug would be useful. The flags are useful for finding the end of the callee save spills in emitPrologue and the beginning of the callee save restored in emitEpilogue, but if it's not otherwise useful to have them set that problem could be solved in other ways - for instance passing these locations as arguments to emitPrologue/emitEpilogue. You make a good point about shrink wrapping - DwarfDebug::findPrologueEndLoc seems to be one of the main places the FrameSetup flag is checked. If shrink wrapping is applied, the frame setup may not take place in the entry BB. The function could more accurately described as findFirstNonPrologueLoc. For the non-shrinkwrap case, is there any meaningful drawback for the generated debug info if findPrologueEndLoc returns where callee saved registers are spilled (like on MIPS) rather than the first non-FrameSetup instruction after the spills (AArch64, X86)? Best, Alex
Robinson, Paul via llvm-dev
2017-Feb-21 18:23 UTC
[llvm-dev] RFC: Setting MachineInstr flags through storeRegToStackSlot
> -----Original Message----- > From: mbraun at apple.com [mailto:mbraun at apple.com] > Sent: Friday, February 17, 2017 3:15 PM > To: Alex Bradbury > Cc: llvm-dev; Adrian Prantl; Eric Christopher; Robinson, Paul > Subject: Re: [llvm-dev] RFC: Setting MachineInstr flags through > storeRegToStackSlot > > Can someone familiar with debug info comment on whether it matters to have > the FrameSetup flag on the callee save spills? We could have a smart spill > or shrink wrapping algorithm that delays the callee saves to a later point > in the program while executing non-prologue code first. > > Is this maybe just meant to indicate the point at which the stack and > possible frame/base pointers are setup? Saving the callee saves should not > necessary be part of that, it probably is on X86 where the pushs/pops are > part of getting the final value of the stack pointer in a function without > frame pointer, but for the case where you use storeRegToStackSlot() I'm > not sure we even need to mark them as FrameSetup... > > - MatthiasIn DWARF, the idea is that if the debugger is doing a "step in" operation for a call to the function, where is the most reasonable place to stop execution, that is still "before" the first statement? Debug info for parameters and local variables commonly points to stack frame slots, so generally it's not helpful (for the user) to stop before the stack/frame pointer is set up the way the debug info expects. Stashing callee-saved registers is not necessarily part of this, although if you're using the stack pointer as a frame pointer and doing pushes to save registers, you need to be done fiddling with SP before you can say the prologue is ended. --paulr> > > On Feb 17, 2017, at 3:33 AM, Alex Bradbury via llvm-dev <llvm- > dev at lists.llvm.org> wrote: > > > > ## Problem description > > > > One of the responsibilities of a target's implementation of > > TargetFrameLowering::emitPrologue is to set the frame pointer (if > needed). > > Typically, the frame pointer will be stored to the stack just like the > other > > callee-saved registers, and emitPrologue must insert the instruction to > change > > its value after it was stored to the stack. Mips does this by looking at > the > > number of callee-saved registers spilled for this function and skipping > over > > that many instructions. Other backends such as AArch64 or X86 will skip > over > > all instruction marked with the MachineInstr::FrameSetup flag[1]. > > > > From my point of view, skipping over all FrameSetup instructions sounds > like > > the more robust way to go about things. I haven't fully traced through > where > > the flag is used in LLVM, but it does seem that > DwarfDebug::findPrologueEndLoc > > will return the wrong result if the flag isn't consistently set on frame > setup > > code. The problem is that unless you override > > TargetFrameLowering::spillCalleeSavedRegisters, then > PrologEpilogInserter will > > just spill callee saved registers through a series of calls to > > storeRegToStackSlot. This is fine, except storeRegToStackSlot won't set > the > > FrameSetup flag. > > > > [1]: Actually X86 will only skip over PUSH instructions with the > FrameSetup > > flag. Is this extra condition necessary? > > > > > > ## Potential solutions > > > > 1) Keep the status quo, but modify PrologEpilogInserter so it will mark > the > > last inserted instruction with FrameSetup over calling > storeRegToStackSlot. > > This is exactly what X86 does in its spillCalleeSavedRegisters > implementation. > > This does make the assumption that storeRegtoStackSlot only inserts a > single > > instruction. From what I can see, that is true for all in-tree backends. > > Arguably, backends like Mips could be improved by modifying their > > spillCalleeSavedRegisters to set the FrameSetup flag as well. > > > > 2) Like the above, but modify storeRegToStackSlot to return the number > of > > MachineInstr inserted and use that value when marking instructions as > > FrameSetup. This is more invasive, as it will affect all in tree and > > out-of-tree backends. > > > > 3) Add a MachineInstr::MIFlag parameter to storeRegToStackSlot. The > > storeRegToStackSlot implementation is responsible for applying this flag > to > > all inserted instructions. This could be defaulted to > MachineInstr::NoFlags, > > to minimise changes to function callers, although that defaulted value > would > > have to be replicated in all FooTgtInstrInfo.h headers. > > > > For either 2) or 3), the changes are simple, it's just they will affect > all > > backends. I'm very happy to implement the patch, but thought it was > worth > > asking feedback. I'm CCing in Tim+Craig+Simon as AArch64/X86/Mips code > owners, > > as well as Eric as debug info code owner. > > > > > > Any feedback is very welcome. > > > > Thanks, > > > > Alex > > _______________________________________________ > > LLVM Developers mailing list > > llvm-dev at lists.llvm.org > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev