Stephen McGruer
2012-Oct-26 10:54 UTC
[LLVMdev] Properly handling mem-loc arguments when prologue adjusts FP.
For my target, I handle incoming memory arguments by creating a store to
memory (in LowerCall, [1]), then creating a fixed object on the stack and
loading from it (in LowerFormalArguments [2]). This approach was based on
MSP430.

I now have the problem that the resulting loads in my output assembly are
done assuming that the call stack looks something like:

    ------
    MemArg
    ------
    MemArg
    ------ <-- Frame Pointer

This isn't true, because during prologue I emit a number of instructions
to store the return address, the old frame pointer value, etc. [3], so that
the stack ends up looking like:

    ------
    MemArg
    ------
    MemArg
    ------
    VarArg
    ------
    Return Addr Register
    ------
    Old Frame Pointer
    ------ <-- Frame Pointer

At this point, the memory-argument instructions like 'ld rX, [fp]' (load
from frame-pointer) now obviously load the wrong data.

How can I 'fix' the load instructions to point to the right locations in
memory? (That is, increase their offset by the space that the prologue adds
to the stack)?

One thing that I've just realized may be notable is that my prologue code
(see [3]) uses a special "st.a" instruction that both stores to a memory
location and decrements the address-argument-register. (i.e. st.a r1, [r2,
4] does r2 = r2 - 4 then stores to [r2]). So if LLVM normally guesses these
things automatically from the instructions, it wouldn't be able to guess
that. But here I'm just conjecturing - may not be relevant!

Thanks,
Stephen

[1]: LowerCall

    ...
    // Arguments that can be passed in a register must be kept in the
    // RegsToPass vector.
    if (VA.isRegLoc()) {
      RegsToPass.push_back(std::make_pair(VA.getLocReg(), Arg));
    } else {
      // Sanity check.
      assert(VA.isMemLoc());

      // Get the stack pointer if needed.
      if (StackPtr.getNode() == 0) {
        StackPtr = DAG.getCopyFromReg(Chain, dl, ARC::SP, getPointerTy());
      }

      SDValue PtrOff = DAG.getNode(ISD::ADD, dl, getPointerTy(), StackPtr,
                                   DAG.getIntPtrConstant(VA.getLocMemOffset()));
      MemOpChains.push_back(DAG.getStore(Chain, dl, Arg, PtrOff,
                                         MachinePointerInfo(), false, false, 0));
    }
    ...

[2]: LowerFormalArguments

    ...
    if (VA.isRegLoc()) {
      // Arguments passed in registers.

      const TargetRegisterClass *RC = ARC::CPURegsRegisterClass;
      unsigned int Register = MF.addLiveIn(VA.getLocReg(), RC);
      EVT RegisterValueType = VA.getLocVT();
      ArgValue = DAG.getCopyFromReg(Chain, dl, Register,
                                    RegisterValueType);

      InVals.push_back(ArgValue);
    } else {
      // Sanity check
      assert(VA.isMemLoc());

      // Load the argument to a virtual register
      unsigned ObjSize = VA.getLocVT().getSizeInBits()/8;

      if (ObjSize != 4) {
        llvm_unreachable("Memory argument is wrong size - not 32 bit!");
      }

      // Create the frame index object for this incoming parameter...
      int FI = MFI->CreateFixedObject(ObjSize, VA.getLocMemOffset(), true);

      // Create the SelectionDAG nodes corresponding to a load from this
      // parameter.
      SDValue FIN = DAG.getFrameIndex(FI, MVT::i32);
      InVals.push_back(DAG.getLoad(VA.getLocVT(), dl, Chain, FIN,
                                   MachinePointerInfo::getFixedStack(FI),
                                   false, false, false, 0));
    }
    ...

[3]: Example of prologue moving stack pointer (which the frame pointer is
then set to.)

    ...
    if (VARegSaveSize) {
      BuildMI(MBB, MBBI, dl, TII.get(ARC::SUBrsi), ARC::SP).addReg(ARC::SP)
        .addImm(VARegSaveSize);
    }

    // Save the return address register, if necessary
    if (MFI->adjustsStack()) {
      BuildMI(MBB, MBBI, dl, TII.get(ARC::STrri_a)).addReg(ARC::SP)
        .addImm(-UNITS_PER_WORD).addReg(ARC::BLINK);
    }

    // Save the caller's frame pointer (if required), and set new FP to this
    // location.
    BuildMI(MBB, MBBI, dl, TII.get(ARC::STrri_a)).addReg(ARC::SP)
      .addImm(-UNITS_PER_WORD).addReg(ARC::FP);
    BuildMI(MBB, MBBI, dl, TII.get(ARC::MOVrr), ARC::FP).addReg(ARC::SP);
    ...
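The mismatch described in this message can be reproduced in a toy model. The following is only a sketch, not LLVM code: `ToyMachine`, `storeWithUpdate`, and `runCallWithOneMemArg` are hypothetical names, the register values are arbitrary, and the pre-decrement behaviour of the store is assumed to be exactly as described above (decrement the base register by 4, then store to the new address). It assumes one 32-bit memory argument at offset 0, no vararg save area, and a function that saves both the return address and the old frame pointer.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Toy model of the stack; not an LLVM or ARC simulator.
struct ToyMachine {
  std::map<std::uint32_t, std::uint32_t> Mem;  // byte address -> 32-bit word
  std::uint32_t SP = 0, FP = 0, BLINK = 0;

  // Assumed "st.a" semantics: Base = Base - 4, then store Value to [Base].
  void storeWithUpdate(std::uint32_t &Base, std::uint32_t Value) {
    Base -= 4;
    Mem[Base] = Value;
  }

  // Load a word that must have been written earlier.
  std::uint32_t load(std::uint32_t Addr) const {
    auto It = Mem.find(Addr);
    assert(It != Mem.end() && "load from an address never written");
    return It->second;
  }
};

// Caller stores one memory argument at [SP + 0]; the callee then runs a
// prologue that saves the return address and the old frame pointer.
inline ToyMachine runCallWithOneMemArg(std::uint32_t ArgValue) {
  ToyMachine M;
  M.SP = 0x1000;     // arbitrary starting stack pointer
  M.FP = 0x2000;     // arbitrary caller frame pointer
  M.BLINK = 0xABCD;  // arbitrary return address

  M.Mem[M.SP + 0] = ArgValue;        // caller side: store at SP + LocMemOffset
  M.storeWithUpdate(M.SP, M.BLINK);  // prologue: save return address
  M.storeWithUpdate(M.SP, M.FP);     // prologue: save caller's frame pointer
  M.FP = M.SP;                       // prologue: FP = SP
  return M;
}
```

In this model, after `runCallWithOneMemArg(42)` a load from `[FP]` returns the saved frame pointer, `[FP + 4]` returns the saved return address, and the argument sits at `[FP + 8]`, which is the extra offset the loads in the generated assembly are missing.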
Stephen McGruer
2012-Oct-28 16:59 UTC
[LLVMdev] Properly handling mem-loc arguments when prologue adjusts FP.
As a small update, I switched out my 'special' store instruction for a
store and then a sub (both of which are fully described), and yet the
resulting assembly still doesn't seem to realize that the SP/FP has moved:

http://pastebin.com/RLj8B8Pe

I considered handling this *in* LowerFormalArguments, by working out the
offset there, but that depends on knowing if the function makes calls or
not, and I'm not sure if that information is around then. Plus it also
feels very hacky!

Stephen
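The offset arithmetic that any such fix would need can be sketched as below. This is an illustration of the calculation only, not a statement of where in LLVM it belongs or of how LLVM resolves frame indices. `PrologueLayout`, `prologueBytes`, and `fpRelativeOffset` are hypothetical names, and `UnitsPerWord` is assumed to be 4 because the memory arguments in the thread are 32-bit. `SavesReturnAddr` stands in for the `MFI->adjustsStack()` condition in the prologue excerpt, which is the piece of information that may not be available when formal arguments are lowered.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: 32-bit words, matching the 4-byte memory arguments in the thread.
constexpr std::int64_t UnitsPerWord = 4;

// Hypothetical summary of what the prologue places between the incoming
// memory arguments and the new frame pointer. Not an LLVM type.
struct PrologueLayout {
  std::int64_t VARegSaveSize;  // bytes reserved for the vararg register save area
  bool SavesReturnAddr;        // return address register is stored
  bool SavesFramePointer;      // caller's frame pointer is stored
};

// Total bytes the prologue adds below the incoming memory arguments.
inline std::int64_t prologueBytes(const PrologueLayout &L) {
  assert(L.VARegSaveSize >= 0 && "save area size cannot be negative");
  std::int64_t Bytes = L.VARegSaveSize;
  if (L.SavesReturnAddr)
    Bytes += UnitsPerWord;
  if (L.SavesFramePointer)
    Bytes += UnitsPerWord;
  return Bytes;
}

// Offset from the new frame pointer to a memory argument whose offset from
// the stack pointer at function entry is LocMemOffset.
inline std::int64_t fpRelativeOffset(std::int64_t LocMemOffset,
                                     const PrologueLayout &L) {
  assert(LocMemOffset >= 0 && "incoming argument offsets are non-negative");
  return LocMemOffset + prologueBytes(L);
}
```

For example, with no vararg save area and both registers saved, `fpRelativeOffset(0, PrologueLayout{0, true, true})` gives 8, so the first memory argument would be read from `[fp, 8]` rather than `[fp]`. With only the frame pointer saved the same argument is at offset 4, which is why the result depends on whether the function makes calls.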