David Gwynne via llvm-dev
2017-Apr-27 04:13 UTC
[llvm-dev] -msave-args backend support for x86_64
ola, ive been looking at adding support for an -msave-args option for use on x86_64. the short explanation of it is that it makes x86_64 function prologues store their register arguments on the stack. the purpose of this is to make the arguments trivially accessible for things like stack traces with arguments. as per https://blogs.oracle.com/sherrym/entry/obtaining_function_arguments_on_amd64, this was originally implemented by sun in the various compilers they use to support the debugging facilities in their system. ive been looking at doing the same thing on openbsd for the same reasons which you can see at http://marc.info/?l=openbsd-tech&m=149308081923303&w=2 and http://marc.info/?l=openbsd-tech&m=149325930004885&w=2. there's a strong possibility that openbsd will switch to clang and llvm on amd64, so i had a look at implementing this in clang. i know the illumos community is interested in this functionality, presumably as a way forward from the old gcc theyre still using. i am a fair way along but i wanted to ask for advice on how to proceed from this point. ive only been hacking on llvm for a day or so, so id appreciate some advice from people with experience before i head too far down what could be the wrong path. its not obvious to me what what the etiquette is for sending diffs so it's inline below. it is also available at https://mild.embarrassm.net/~dlg/diff/llvm.msave.trunk this does enough that it generally works. it basically shoves some extra PUSHes into the prologue, and tries to account for them properly so other uses of the frame and the epilogue works. so there are some issues with the code: - it (probably) doesnt handle functions that return structs - it doesnt realign the stack pointer after consuming space on it - it doesnt restrict the use of -msave-args to generation of 64bit code. so my questions are: 1. my understanding is if a function returns a struct, the caller is responsible for allocating space for the struct and passes a pointer to the callee via RDI, which takes a register away from arguments. is that true? what's the best way to detect that in X86FrameLowering::emitPrologue()? 2. i copied get64BitArgumentGPRs from X86ISelLowering.cpp. i need this so i know which registers to push onto the stack and in which order. should i move it to X86RegisterInfo.cpp? could someone give me a pointer on following the FIXME, ie, how could i get that stuff from tblgen? if anyone has some tips for me, it would be greatly appreciated. thanks in advance, dlg Index: lib/Target/X86/X86.td ==================================================================--- lib/Target/X86/X86.td (revision 301500) +++ lib/Target/X86/X86.td (working copy) @@ -235,6 +235,9 @@ "LEA instruction with certain arguments is slow">; def FeatureSlowIncDec : SubtargetFeature<"slow-incdec", "SlowIncDec", "true", "INC and DEC instructions are slower than ADD and SUB">; +def FeatureSaveArgs + : SubtargetFeature<"save-args", "SaveArgs", "true", + "Save register arguments on the stack.">; def FeatureSoftFloat : SubtargetFeature<"soft-float", "UseSoftFloat", "true", "Use software floating point features.">; Index: lib/Target/X86/X86FrameLowering.cpp ==================================================================--- lib/Target/X86/X86FrameLowering.cpp (revision 301500) +++ lib/Target/X86/X86FrameLowering.cpp (working copy) @@ -47,6 +47,7 @@ // standard x86_64 and NaCl use 64-bit frame/stack pointers, x32 - 32-bit. Uses64BitFramePtr = STI.isTarget64BitLP64() || STI.isTargetNaCl64(); StackPtr = TRI->getStackRegister(); + SaveArgs = STI.getSaveArgs(); } bool X86FrameLowering::hasReservedCallFrame(const MachineFunction &MF) const { @@ -83,7 +84,7 @@ /// or if frame pointer elimination is disabled. bool X86FrameLowering::hasFP(const MachineFunction &MF) const { const MachineFrameInfo &MFI = MF.getFrameInfo(); - return (MF.getTarget().Options.DisableFramePointerElim(MF) || + return (MF.getTarget().Options.DisableFramePointerElim(MF) || SaveArgs || TRI->needsStackRealignment(MF) || MFI.hasVarSizedObjects() || MFI.isFrameAddressTaken() || MFI.hasOpaqueSPAdjustment() || @@ -850,6 +851,25 @@ MI->getOperand(3).setIsDead(); } +// FIXME: Get this from tablegen. +static ArrayRef<MCPhysReg> get64BitArgumentGPRs(CallingConv::ID CallConv, + const X86Subtarget &Subtarget) { + assert(Subtarget.is64Bit()); + + if (Subtarget.isCallingConvWin64(CallConv)) { + static const MCPhysReg GPR64ArgRegsWin64[] = { + X86::RCX, X86::RDX, X86::R8, X86::R9 + }; + return makeArrayRef(std::begin(GPR64ArgRegsWin64), std::end(GPR64ArgRegsWin64)); + } + + static const MCPhysReg GPR64ArgRegs64Bit[] = { + X86::RDI, X86::RSI, X86::RDX, X86::RCX, X86::R8, X86::R9 + }; + return makeArrayRef(std::begin(GPR64ArgRegs64Bit), std::end(GPR64ArgRegs64Bit)); +} + + /// emitPrologue - Push callee-saved registers onto the stack, which /// automatically adjust the stack pointer. Adjust the stack pointer to allocate /// space for local variables. Also emit labels used by the exception handler to @@ -1123,6 +1143,36 @@ BuildCFI(MBB, MBBI, DL, MCCFIInstruction::createDefCfaRegister( nullptr, DwarfFramePtr)); } + + if (SaveArgs) { + ArrayRef<MCPhysReg> GPRs + get64BitArgumentGPRs(Fn->getCallingConv(), STI); + unsigned arg_size = Fn->arg_size(); + unsigned RI = 0; + int64_t SaveSize = 0; + + for (MCPhysReg Reg : GPRs) { + if (++RI > arg_size) + break; + + SaveSize += SlotSize; + +#if 1 + BuildMI(MBB, MBBI, DL, TII.get(X86::PUSH64r)) + .addReg(Reg) + .setMIFlag(MachineInstr::FrameSetup); +#else + // MOV64mr Reg, -SaveSize(%rbp) + addRegOffset(BuildMI(MBB, MBBI, DL, TII.get(X86::MOV64mr)), + FramePtr, true, -SaveSize) + .addReg(Reg) + .setMIFlag(MachineInstr::FrameSetup); +#endif + } + + StackSize += SaveSize; + MFI.setStackSize(StackSize); + } } // Mark the FramePtr as live-in in every block. Don't do this again for Index: lib/Target/X86/X86FrameLowering.h ==================================================================--- lib/Target/X86/X86FrameLowering.h (revision 301500) +++ lib/Target/X86/X86FrameLowering.h (working copy) @@ -36,6 +36,8 @@ unsigned SlotSize; + bool SaveArgs; + /// Is64Bit implies that x86_64 instructions are available. bool Is64Bit; Index: lib/Target/X86/X86Subtarget.cpp ==================================================================--- lib/Target/X86/X86Subtarget.cpp (revision 301500) +++ lib/Target/X86/X86Subtarget.cpp (working copy) @@ -312,6 +312,7 @@ SlowLEA = false; SlowIncDec = false; stackAlignment = 4; + SaveArgs = false; // FIXME: this is a known good value for Yonah. How about others? MaxInlineSizeThreshold = 128; UseSoftFloat = false; Index: lib/Target/X86/X86Subtarget.h ==================================================================--- lib/Target/X86/X86Subtarget.h (revision 301500) +++ lib/Target/X86/X86Subtarget.h (working copy) @@ -293,6 +293,9 @@ /// entry to the function and which must be maintained by every function. unsigned stackAlignment; + /// Whether function prologues should save register arguments on the stack. + bool SaveArgs; + /// Max. memset / memcpy size that is turned into rep/movs, rep/stos ops. /// unsigned MaxInlineSizeThreshold; @@ -356,6 +359,8 @@ return &getInstrInfo()->getRegisterInfo(); } + bool getSaveArgs() const { return SaveArgs; } + /// Returns the minimum alignment known to hold of the /// stack frame on entry to the function and which must be maintained by every /// function for this subtarget.
David Chisnall via llvm-dev
2017-Apr-27 11:00 UTC
[llvm-dev] -msave-args backend support for x86_64
This seems like a fragile and heavy hammer if your goal is simply to allow debuggers to find the arguments. Wouldn’t it be simpler to mark the arguments as live for the entire call, so that the back end will either kepp them in registers or spill them, depending on register pressure, and update the DWARF frame info so that you can find them? David> On 27 Apr 2017, at 05:13, David Gwynne via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > ola, > > ive been looking at adding support for an -msave-args option for > use on x86_64. the short explanation of it is that it makes x86_64 > function prologues store their register arguments on the stack. the > purpose of this is to make the arguments trivially accessible for > things like stack traces with arguments. > > as per > https://blogs.oracle.com/sherrym/entry/obtaining_function_arguments_on_amd64, > this was originally implemented by sun in the various compilers > they use to support the debugging facilities in their system. ive > been looking at doing the same thing on openbsd for the same reasons > which you can see at > http://marc.info/?l=openbsd-tech&m=149308081923303&w=2 and > http://marc.info/?l=openbsd-tech&m=149325930004885&w=2. > > there's a strong possibility that openbsd will switch to clang and > llvm on amd64, so i had a look at implementing this in clang. i > know the illumos community is interested in this functionality, > presumably as a way forward from the old gcc theyre still using. > > i am a fair way along but i wanted to ask for advice on how to > proceed from this point. ive only been hacking on llvm for a day > or so, so id appreciate some advice from people with experience > before i head too far down what could be the wrong path. > > its not obvious to me what what the etiquette is for sending diffs > so it's inline below. it is also available at > https://mild.embarrassm.net/~dlg/diff/llvm.msave.trunk > > this does enough that it generally works. it basically shoves some > extra PUSHes into the prologue, and tries to account for them > properly so other uses of the frame and the epilogue works. > > so there are some issues with the code: > > - it (probably) doesnt handle functions that return structs > - it doesnt realign the stack pointer after consuming space on it > - it doesnt restrict the use of -msave-args to generation of 64bit > code. > > so my questions are: > > 1. my understanding is if a function returns a struct, the caller > is responsible for allocating space for the struct and passes a > pointer to the callee via RDI, which takes a register away from > arguments. > > is that true? what's the best way to detect that in > X86FrameLowering::emitPrologue()? > > 2. i copied get64BitArgumentGPRs from X86ISelLowering.cpp. > > i need this so i know which registers to push onto the stack and > in which order. > > should i move it to X86RegisterInfo.cpp? could someone give me a > pointer on following the FIXME, ie, how could i get that stuff from > tblgen? > > if anyone has some tips for me, it would be greatly appreciated. > > thanks in advance, > dlg > > Index: lib/Target/X86/X86.td > ==================================================================> --- lib/Target/X86/X86.td (revision 301500) > +++ lib/Target/X86/X86.td (working copy) > @@ -235,6 +235,9 @@ > "LEA instruction with certain arguments is slow">; > def FeatureSlowIncDec : SubtargetFeature<"slow-incdec", "SlowIncDec", "true", > "INC and DEC instructions are slower than ADD and SUB">; > +def FeatureSaveArgs > + : SubtargetFeature<"save-args", "SaveArgs", "true", > + "Save register arguments on the stack.">; > def FeatureSoftFloat > : SubtargetFeature<"soft-float", "UseSoftFloat", "true", > "Use software floating point features.">; > Index: lib/Target/X86/X86FrameLowering.cpp > ==================================================================> --- lib/Target/X86/X86FrameLowering.cpp (revision 301500) > +++ lib/Target/X86/X86FrameLowering.cpp (working copy) > @@ -47,6 +47,7 @@ > // standard x86_64 and NaCl use 64-bit frame/stack pointers, x32 - 32-bit. > Uses64BitFramePtr = STI.isTarget64BitLP64() || STI.isTargetNaCl64(); > StackPtr = TRI->getStackRegister(); > + SaveArgs = STI.getSaveArgs(); > } > > bool X86FrameLowering::hasReservedCallFrame(const MachineFunction &MF) const { > @@ -83,7 +84,7 @@ > /// or if frame pointer elimination is disabled. > bool X86FrameLowering::hasFP(const MachineFunction &MF) const { > const MachineFrameInfo &MFI = MF.getFrameInfo(); > - return (MF.getTarget().Options.DisableFramePointerElim(MF) || > + return (MF.getTarget().Options.DisableFramePointerElim(MF) || SaveArgs || > TRI->needsStackRealignment(MF) || > MFI.hasVarSizedObjects() || > MFI.isFrameAddressTaken() || MFI.hasOpaqueSPAdjustment() || > @@ -850,6 +851,25 @@ > MI->getOperand(3).setIsDead(); > } > > +// FIXME: Get this from tablegen. > +static ArrayRef<MCPhysReg> get64BitArgumentGPRs(CallingConv::ID CallConv, > + const X86Subtarget &Subtarget) { > + assert(Subtarget.is64Bit()); > + > + if (Subtarget.isCallingConvWin64(CallConv)) { > + static const MCPhysReg GPR64ArgRegsWin64[] = { > + X86::RCX, X86::RDX, X86::R8, X86::R9 > + }; > + return makeArrayRef(std::begin(GPR64ArgRegsWin64), std::end(GPR64ArgRegsWin64)); > + } > + > + static const MCPhysReg GPR64ArgRegs64Bit[] = { > + X86::RDI, X86::RSI, X86::RDX, X86::RCX, X86::R8, X86::R9 > + }; > + return makeArrayRef(std::begin(GPR64ArgRegs64Bit), std::end(GPR64ArgRegs64Bit)); > +} > + > + > /// emitPrologue - Push callee-saved registers onto the stack, which > /// automatically adjust the stack pointer. Adjust the stack pointer to allocate > /// space for local variables. Also emit labels used by the exception handler to > @@ -1123,6 +1143,36 @@ > BuildCFI(MBB, MBBI, DL, MCCFIInstruction::createDefCfaRegister( > nullptr, DwarfFramePtr)); > } > + > + if (SaveArgs) { > + ArrayRef<MCPhysReg> GPRs > + get64BitArgumentGPRs(Fn->getCallingConv(), STI); > + unsigned arg_size = Fn->arg_size(); > + unsigned RI = 0; > + int64_t SaveSize = 0; > + > + for (MCPhysReg Reg : GPRs) { > + if (++RI > arg_size) > + break; > + > + SaveSize += SlotSize; > + > +#if 1 > + BuildMI(MBB, MBBI, DL, TII.get(X86::PUSH64r)) > + .addReg(Reg) > + .setMIFlag(MachineInstr::FrameSetup); > +#else > + // MOV64mr Reg, -SaveSize(%rbp) > + addRegOffset(BuildMI(MBB, MBBI, DL, TII.get(X86::MOV64mr)), > + FramePtr, true, -SaveSize) > + .addReg(Reg) > + .setMIFlag(MachineInstr::FrameSetup); > +#endif > + } > + > + StackSize += SaveSize; > + MFI.setStackSize(StackSize); > + } > } > > // Mark the FramePtr as live-in in every block. Don't do this again for > Index: lib/Target/X86/X86FrameLowering.h > ==================================================================> --- lib/Target/X86/X86FrameLowering.h (revision 301500) > +++ lib/Target/X86/X86FrameLowering.h (working copy) > @@ -36,6 +36,8 @@ > > unsigned SlotSize; > > + bool SaveArgs; > + > /// Is64Bit implies that x86_64 instructions are available. > bool Is64Bit; > > Index: lib/Target/X86/X86Subtarget.cpp > ==================================================================> --- lib/Target/X86/X86Subtarget.cpp (revision 301500) > +++ lib/Target/X86/X86Subtarget.cpp (working copy) > @@ -312,6 +312,7 @@ > SlowLEA = false; > SlowIncDec = false; > stackAlignment = 4; > + SaveArgs = false; > // FIXME: this is a known good value for Yonah. How about others? > MaxInlineSizeThreshold = 128; > UseSoftFloat = false; > Index: lib/Target/X86/X86Subtarget.h > ==================================================================> --- lib/Target/X86/X86Subtarget.h (revision 301500) > +++ lib/Target/X86/X86Subtarget.h (working copy) > @@ -293,6 +293,9 @@ > /// entry to the function and which must be maintained by every function. > unsigned stackAlignment; > > + /// Whether function prologues should save register arguments on the stack. > + bool SaveArgs; > + > /// Max. memset / memcpy size that is turned into rep/movs, rep/stos ops. > /// > unsigned MaxInlineSizeThreshold; > @@ -356,6 +359,8 @@ > return &getInstrInfo()->getRegisterInfo(); > } > > + bool getSaveArgs() const { return SaveArgs; } > + > /// Returns the minimum alignment known to hold of the > /// stack frame on entry to the function and which must be maintained by every > /// function for this subtarget. > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
David Gwynne via llvm-dev
2017-Apr-27 12:53 UTC
[llvm-dev] -msave-args backend support for x86_64
> On 27 Apr 2017, at 9:00 pm, David Chisnall <David.Chisnall at cl.cam.ac.uk> wrote: > > This seems like a fragile and heavy hammer if your goal is simply to allow debuggers to find the arguments. Wouldn’t it be simpler to mark the arguments as live for the entire call, so that the back end will either kepp them in registers or spill them, depending on register pressure, and update the DWARF frame info so that you can find them?that assumes that shipping dwarf is reasonable and handling dwarf is simple. in my situation im trying to make an in kernel debugger more useful. the debugger is the thing that happens when the kernel crashes. adding dwarf to the kernel increases its size by more than a factor of 4 (44.7M vs 10.6M), and adding code for dwarf handling would increase that size further. shipping dwarf in a .debuginfo style file makes it hard to access that info during a crash, since the thing that failed is the thing you need to use to get the .debuginfo file. alternatively, emitting a limited series of pushes and adjusting the stack pointer is simple. i’d say it's roughly equivalent to the cost of stack probes, but simpler. it’s also simple to pull apart. the code to handle reading arguments the callee pushed onto the stack is literally an additional 9 lines, before falling through to the existing code that reads the rest of the arguments out of the callers frame. it has been robust in deployment for over a decade in solaris and derivatives like illumos. if anyone could have justified shipping dwarf, it would have been the solaris team with their emphasis on debugging and introspecting their system. the fact that they didn’t is telling in my opinion. cheers, dlg> > David > >> On 27 Apr 2017, at 05:13, David Gwynne via llvm-dev <llvm-dev at lists.llvm.org> wrote: >> >> ola, >> >> ive been looking at adding support for an -msave-args option for >> use on x86_64. the short explanation of it is that it makes x86_64 >> function prologues store their register arguments on the stack. the >> purpose of this is to make the arguments trivially accessible for >> things like stack traces with arguments. >> >> as per >> https://blogs.oracle.com/sherrym/entry/obtaining_function_arguments_on_amd64, >> this was originally implemented by sun in the various compilers >> they use to support the debugging facilities in their system. ive >> been looking at doing the same thing on openbsd for the same reasons >> which you can see at >> http://marc.info/?l=openbsd-tech&m=149308081923303&w=2 and >> http://marc.info/?l=openbsd-tech&m=149325930004885&w=2. >> >> there's a strong possibility that openbsd will switch to clang and >> llvm on amd64, so i had a look at implementing this in clang. i >> know the illumos community is interested in this functionality, >> presumably as a way forward from the old gcc theyre still using. >> >> i am a fair way along but i wanted to ask for advice on how to >> proceed from this point. ive only been hacking on llvm for a day >> or so, so id appreciate some advice from people with experience >> before i head too far down what could be the wrong path. >> >> its not obvious to me what what the etiquette is for sending diffs >> so it's inline below. it is also available at >> https://mild.embarrassm.net/~dlg/diff/llvm.msave.trunk >> >> this does enough that it generally works. it basically shoves some >> extra PUSHes into the prologue, and tries to account for them >> properly so other uses of the frame and the epilogue works. >> >> so there are some issues with the code: >> >> - it (probably) doesnt handle functions that return structs >> - it doesnt realign the stack pointer after consuming space on it >> - it doesnt restrict the use of -msave-args to generation of 64bit >> code. >> >> so my questions are: >> >> 1. my understanding is if a function returns a struct, the caller >> is responsible for allocating space for the struct and passes a >> pointer to the callee via RDI, which takes a register away from >> arguments. >> >> is that true? what's the best way to detect that in >> X86FrameLowering::emitPrologue()? >> >> 2. i copied get64BitArgumentGPRs from X86ISelLowering.cpp. >> >> i need this so i know which registers to push onto the stack and >> in which order. >> >> should i move it to X86RegisterInfo.cpp? could someone give me a >> pointer on following the FIXME, ie, how could i get that stuff from >> tblgen? >> >> if anyone has some tips for me, it would be greatly appreciated. >> >> thanks in advance, >> dlg >> >> Index: lib/Target/X86/X86.td >> ==================================================================>> --- lib/Target/X86/X86.td (revision 301500) >> +++ lib/Target/X86/X86.td (working copy) >> @@ -235,6 +235,9 @@ >> "LEA instruction with certain arguments is slow">; >> def FeatureSlowIncDec : SubtargetFeature<"slow-incdec", "SlowIncDec", "true", >> "INC and DEC instructions are slower than ADD and SUB">; >> +def FeatureSaveArgs >> + : SubtargetFeature<"save-args", "SaveArgs", "true", >> + "Save register arguments on the stack.">; >> def FeatureSoftFloat >> : SubtargetFeature<"soft-float", "UseSoftFloat", "true", >> "Use software floating point features.">; >> Index: lib/Target/X86/X86FrameLowering.cpp >> ==================================================================>> --- lib/Target/X86/X86FrameLowering.cpp (revision 301500) >> +++ lib/Target/X86/X86FrameLowering.cpp (working copy) >> @@ -47,6 +47,7 @@ >> // standard x86_64 and NaCl use 64-bit frame/stack pointers, x32 - 32-bit. >> Uses64BitFramePtr = STI.isTarget64BitLP64() || STI.isTargetNaCl64(); >> StackPtr = TRI->getStackRegister(); >> + SaveArgs = STI.getSaveArgs(); >> } >> >> bool X86FrameLowering::hasReservedCallFrame(const MachineFunction &MF) const { >> @@ -83,7 +84,7 @@ >> /// or if frame pointer elimination is disabled. >> bool X86FrameLowering::hasFP(const MachineFunction &MF) const { >> const MachineFrameInfo &MFI = MF.getFrameInfo(); >> - return (MF.getTarget().Options.DisableFramePointerElim(MF) || >> + return (MF.getTarget().Options.DisableFramePointerElim(MF) || SaveArgs || >> TRI->needsStackRealignment(MF) || >> MFI.hasVarSizedObjects() || >> MFI.isFrameAddressTaken() || MFI.hasOpaqueSPAdjustment() || >> @@ -850,6 +851,25 @@ >> MI->getOperand(3).setIsDead(); >> } >> >> +// FIXME: Get this from tablegen. >> +static ArrayRef<MCPhysReg> get64BitArgumentGPRs(CallingConv::ID CallConv, >> + const X86Subtarget &Subtarget) { >> + assert(Subtarget.is64Bit()); >> + >> + if (Subtarget.isCallingConvWin64(CallConv)) { >> + static const MCPhysReg GPR64ArgRegsWin64[] = { >> + X86::RCX, X86::RDX, X86::R8, X86::R9 >> + }; >> + return makeArrayRef(std::begin(GPR64ArgRegsWin64), std::end(GPR64ArgRegsWin64)); >> + } >> + >> + static const MCPhysReg GPR64ArgRegs64Bit[] = { >> + X86::RDI, X86::RSI, X86::RDX, X86::RCX, X86::R8, X86::R9 >> + }; >> + return makeArrayRef(std::begin(GPR64ArgRegs64Bit), std::end(GPR64ArgRegs64Bit)); >> +} >> + >> + >> /// emitPrologue - Push callee-saved registers onto the stack, which >> /// automatically adjust the stack pointer. Adjust the stack pointer to allocate >> /// space for local variables. Also emit labels used by the exception handler to >> @@ -1123,6 +1143,36 @@ >> BuildCFI(MBB, MBBI, DL, MCCFIInstruction::createDefCfaRegister( >> nullptr, DwarfFramePtr)); >> } >> + >> + if (SaveArgs) { >> + ArrayRef<MCPhysReg> GPRs >> + get64BitArgumentGPRs(Fn->getCallingConv(), STI); >> + unsigned arg_size = Fn->arg_size(); >> + unsigned RI = 0; >> + int64_t SaveSize = 0; >> + >> + for (MCPhysReg Reg : GPRs) { >> + if (++RI > arg_size) >> + break; >> + >> + SaveSize += SlotSize; >> + >> +#if 1 >> + BuildMI(MBB, MBBI, DL, TII.get(X86::PUSH64r)) >> + .addReg(Reg) >> + .setMIFlag(MachineInstr::FrameSetup); >> +#else >> + // MOV64mr Reg, -SaveSize(%rbp) >> + addRegOffset(BuildMI(MBB, MBBI, DL, TII.get(X86::MOV64mr)), >> + FramePtr, true, -SaveSize) >> + .addReg(Reg) >> + .setMIFlag(MachineInstr::FrameSetup); >> +#endif >> + } >> + >> + StackSize += SaveSize; >> + MFI.setStackSize(StackSize); >> + } >> } >> >> // Mark the FramePtr as live-in in every block. Don't do this again for >> Index: lib/Target/X86/X86FrameLowering.h >> ==================================================================>> --- lib/Target/X86/X86FrameLowering.h (revision 301500) >> +++ lib/Target/X86/X86FrameLowering.h (working copy) >> @@ -36,6 +36,8 @@ >> >> unsigned SlotSize; >> >> + bool SaveArgs; >> + >> /// Is64Bit implies that x86_64 instructions are available. >> bool Is64Bit; >> >> Index: lib/Target/X86/X86Subtarget.cpp >> ==================================================================>> --- lib/Target/X86/X86Subtarget.cpp (revision 301500) >> +++ lib/Target/X86/X86Subtarget.cpp (working copy) >> @@ -312,6 +312,7 @@ >> SlowLEA = false; >> SlowIncDec = false; >> stackAlignment = 4; >> + SaveArgs = false; >> // FIXME: this is a known good value for Yonah. How about others? >> MaxInlineSizeThreshold = 128; >> UseSoftFloat = false; >> Index: lib/Target/X86/X86Subtarget.h >> ==================================================================>> --- lib/Target/X86/X86Subtarget.h (revision 301500) >> +++ lib/Target/X86/X86Subtarget.h (working copy) >> @@ -293,6 +293,9 @@ >> /// entry to the function and which must be maintained by every function. >> unsigned stackAlignment; >> >> + /// Whether function prologues should save register arguments on the stack. >> + bool SaveArgs; >> + >> /// Max. memset / memcpy size that is turned into rep/movs, rep/stos ops. >> /// >> unsigned MaxInlineSizeThreshold; >> @@ -356,6 +359,8 @@ >> return &getInstrInfo()->getRegisterInfo(); >> } >> >> + bool getSaveArgs() const { return SaveArgs; } >> + >> /// Returns the minimum alignment known to hold of the >> /// stack frame on entry to the function and which must be maintained by every >> /// function for this subtarget. >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >
Joerg Sonnenberger via llvm-dev
2017-Apr-27 16:00 UTC
[llvm-dev] -msave-args backend support for x86_64
On Thu, Apr 27, 2017 at 12:00:37PM +0100, David Chisnall via llvm-dev wrote:> This seems like a fragile and heavy hammer if your goal is simply to > allow debuggers to find the arguments. Wouldn’t it be simpler to mark > the arguments as live for the entire call, so that the back end will > either kepp them in registers or spill them, depending on register > pressure, and update the DWARF frame info so that you can find them?It is certainly missing frame info annotation, but the basic approach is similar to what is already done for variadic functions on many architectures with register-based argument handling. That leads me to the question on whether this shouldn't be a generic code generation flag. Joerg
Reid Kleckner via llvm-dev
2017-Apr-28 17:29 UTC
[llvm-dev] -msave-args backend support for x86_64
FWIW MSVC has a similar flag called /homeparams: https://msdn.microsoft.com/en-us/library/6exwh0y6.aspx So, I think there is some value in this feature, even ignoring vector and floating point arguments, which won't fit into these general-purpose register parameters. Win64 doesn't have this problem because they don't pass values larger than 8 bytes by value, including vectors. On Win64, we should also store arguments into the 32 byte shadow space allocated by the caller for varargs purposes to match tools expecting MSVC's behavior. Your patch doesn't appear to update the dwarf or seh call frame info, which is very important. I might grab this at some point, since I have some experience with that. On Wed, Apr 26, 2017 at 9:13 PM, David Gwynne via llvm-dev < llvm-dev at lists.llvm.org> wrote:> ola, > > ive been looking at adding support for an -msave-args option for > use on x86_64. the short explanation of it is that it makes x86_64 > function prologues store their register arguments on the stack. the > purpose of this is to make the arguments trivially accessible for > things like stack traces with arguments. > > as per > https://blogs.oracle.com/sherrym/entry/obtaining_ > function_arguments_on_amd64, > this was originally implemented by sun in the various compilers > they use to support the debugging facilities in their system. ive > been looking at doing the same thing on openbsd for the same reasons > which you can see at > http://marc.info/?l=openbsd-tech&m=149308081923303&w=2 and > http://marc.info/?l=openbsd-tech&m=149325930004885&w=2. > > there's a strong possibility that openbsd will switch to clang and > llvm on amd64, so i had a look at implementing this in clang. i > know the illumos community is interested in this functionality, > presumably as a way forward from the old gcc theyre still using. > > i am a fair way along but i wanted to ask for advice on how to > proceed from this point. ive only been hacking on llvm for a day > or so, so id appreciate some advice from people with experience > before i head too far down what could be the wrong path. > > its not obvious to me what what the etiquette is for sending diffs > so it's inline below. it is also available at > https://mild.embarrassm.net/~dlg/diff/llvm.msave.trunk > > this does enough that it generally works. it basically shoves some > extra PUSHes into the prologue, and tries to account for them > properly so other uses of the frame and the epilogue works. > > so there are some issues with the code: > > - it (probably) doesnt handle functions that return structs > - it doesnt realign the stack pointer after consuming space on it > - it doesnt restrict the use of -msave-args to generation of 64bit > code. > > so my questions are: > > 1. my understanding is if a function returns a struct, the caller > is responsible for allocating space for the struct and passes a > pointer to the callee via RDI, which takes a register away from > arguments. > > is that true? what's the best way to detect that in > X86FrameLowering::emitPrologue()? > > 2. i copied get64BitArgumentGPRs from X86ISelLowering.cpp. > > i need this so i know which registers to push onto the stack and > in which order. > > should i move it to X86RegisterInfo.cpp? could someone give me a > pointer on following the FIXME, ie, how could i get that stuff from > tblgen? > > if anyone has some tips for me, it would be greatly appreciated. > > thanks in advance, > dlg > > Index: lib/Target/X86/X86.td > ==================================================================> --- lib/Target/X86/X86.td (revision 301500) > +++ lib/Target/X86/X86.td (working copy) > @@ -235,6 +235,9 @@ > "LEA instruction with certain > arguments is slow">; > def FeatureSlowIncDec : SubtargetFeature<"slow-incdec", "SlowIncDec", > "true", > "INC and DEC instructions are slower > than ADD and SUB">; > +def FeatureSaveArgs > + : SubtargetFeature<"save-args", "SaveArgs", "true", > + "Save register arguments on the stack.">; > def FeatureSoftFloat > : SubtargetFeature<"soft-float", "UseSoftFloat", "true", > "Use software floating point features.">; > Index: lib/Target/X86/X86FrameLowering.cpp > ==================================================================> --- lib/Target/X86/X86FrameLowering.cpp (revision 301500) > +++ lib/Target/X86/X86FrameLowering.cpp (working copy) > @@ -47,6 +47,7 @@ > // standard x86_64 and NaCl use 64-bit frame/stack pointers, x32 - > 32-bit. > Uses64BitFramePtr = STI.isTarget64BitLP64() || STI.isTargetNaCl64(); > StackPtr = TRI->getStackRegister(); > + SaveArgs = STI.getSaveArgs(); > } > > bool X86FrameLowering::hasReservedCallFrame(const MachineFunction &MF) > const { > @@ -83,7 +84,7 @@ > /// or if frame pointer elimination is disabled. > bool X86FrameLowering::hasFP(const MachineFunction &MF) const { > const MachineFrameInfo &MFI = MF.getFrameInfo(); > - return (MF.getTarget().Options.DisableFramePointerElim(MF) || > + return (MF.getTarget().Options.DisableFramePointerElim(MF) || SaveArgs > || > TRI->needsStackRealignment(MF) || > MFI.hasVarSizedObjects() || > MFI.isFrameAddressTaken() || MFI.hasOpaqueSPAdjustment() || > @@ -850,6 +851,25 @@ > MI->getOperand(3).setIsDead(); > } > > +// FIXME: Get this from tablegen. > +static ArrayRef<MCPhysReg> get64BitArgumentGPRs(CallingConv::ID CallConv, > + const X86Subtarget > &Subtarget) { > + assert(Subtarget.is64Bit()); > + > + if (Subtarget.isCallingConvWin64(CallConv)) { > + static const MCPhysReg GPR64ArgRegsWin64[] = { > + X86::RCX, X86::RDX, X86::R8, X86::R9 > + }; > + return makeArrayRef(std::begin(GPR64ArgRegsWin64), > std::end(GPR64ArgRegsWin64)); > + } > + > + static const MCPhysReg GPR64ArgRegs64Bit[] = { > + X86::RDI, X86::RSI, X86::RDX, X86::RCX, X86::R8, X86::R9 > + }; > + return makeArrayRef(std::begin(GPR64ArgRegs64Bit), > std::end(GPR64ArgRegs64Bit)); > +} > + > + > /// emitPrologue - Push callee-saved registers onto the stack, which > /// automatically adjust the stack pointer. Adjust the stack pointer to > allocate > /// space for local variables. Also emit labels used by the exception > handler to > @@ -1123,6 +1143,36 @@ > BuildCFI(MBB, MBBI, DL, MCCFIInstruction::createDefCfaRegister( > nullptr, DwarfFramePtr)); > } > + > + if (SaveArgs) { > + ArrayRef<MCPhysReg> GPRs > + get64BitArgumentGPRs(Fn->getCallingConv(), STI); > + unsigned arg_size = Fn->arg_size(); > + unsigned RI = 0; > + int64_t SaveSize = 0; > + > + for (MCPhysReg Reg : GPRs) { > + if (++RI > arg_size) > + break; > + > + SaveSize += SlotSize; > + > +#if 1 > + BuildMI(MBB, MBBI, DL, TII.get(X86::PUSH64r)) > + .addReg(Reg) > + .setMIFlag(MachineInstr::FrameSetup); > +#else > + // MOV64mr Reg, -SaveSize(%rbp) > + addRegOffset(BuildMI(MBB, MBBI, DL, TII.get(X86::MOV64mr)), > + FramePtr, true, -SaveSize) > + .addReg(Reg) > + .setMIFlag(MachineInstr::FrameSetup); > +#endif > + } > + > + StackSize += SaveSize; > + MFI.setStackSize(StackSize); > + } > } > > // Mark the FramePtr as live-in in every block. Don't do this again > for > Index: lib/Target/X86/X86FrameLowering.h > ==================================================================> --- lib/Target/X86/X86FrameLowering.h (revision 301500) > +++ lib/Target/X86/X86FrameLowering.h (working copy) > @@ -36,6 +36,8 @@ > > unsigned SlotSize; > > + bool SaveArgs; > + > /// Is64Bit implies that x86_64 instructions are available. > bool Is64Bit; > > Index: lib/Target/X86/X86Subtarget.cpp > ==================================================================> --- lib/Target/X86/X86Subtarget.cpp (revision 301500) > +++ lib/Target/X86/X86Subtarget.cpp (working copy) > @@ -312,6 +312,7 @@ > SlowLEA = false; > SlowIncDec = false; > stackAlignment = 4; > + SaveArgs = false; > // FIXME: this is a known good value for Yonah. How about others? > MaxInlineSizeThreshold = 128; > UseSoftFloat = false; > Index: lib/Target/X86/X86Subtarget.h > ==================================================================> --- lib/Target/X86/X86Subtarget.h (revision 301500) > +++ lib/Target/X86/X86Subtarget.h (working copy) > @@ -293,6 +293,9 @@ > /// entry to the function and which must be maintained by every > function. > unsigned stackAlignment; > > + /// Whether function prologues should save register arguments on the > stack. > + bool SaveArgs; > + > /// Max. memset / memcpy size that is turned into rep/movs, rep/stos > ops. > /// > unsigned MaxInlineSizeThreshold; > @@ -356,6 +359,8 @@ > return &getInstrInfo()->getRegisterInfo(); > } > > + bool getSaveArgs() const { return SaveArgs; } > + > /// Returns the minimum alignment known to hold of the > /// stack frame on entry to the function and which must be maintained > by every > /// function for this subtarget. > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170428/2b272077/attachment.html>