Anton Korobeynikov
2007-Aug-08 19:12 UTC
[LLVMdev] Destination register needs to be valid after callee saved register restore when tail calling
Hello, Arnold.

> with the sentence i tried to express the question whether there is a
> way to persuade the code generator to use another register to load (or
> move) the function pointer to (right before the callee saved register
> restore) but thinking a little further that's nonsense.

Why not define a special op for the callee address and custom lower it?

I really suggest you look into eh_return. It is used in a pretty tricky situation inside the EH runtime: it is used to return from some EH runtime code. We already know how much we should unwind the stack and what the handler is (sounds similar, right?). Also, %eax and %edx are used to return EH data and must be preserved in such a function. So, in general, the code for eh_return looks like this (Intel notation here):

    mov ecx, ebp
    add ecx, offset
    mov [ecx], handler
    .... (just usual epilogue)
    mov esp, ecx
    ret

(In reality we have slightly different code, because we need to compensate for the stack adjustment due to the ebp save in the epilogue, but the idea is the same.)

We cannot just jump to the handler because we need to change %esp here. Also, offset and handler are calculated by the routine's own code, which is why such code looks like the most painless way to do things. Tail call emission has many similar bits, but everything (both handler and offset) is known at codegen time, so we can simplify the code a lot.

> TOT means trunk of today (funny because in german, my native language
> it means death)?

TOT = Top of Tree. But yes, many abbreviations have very funny meanings in different languages (TOT = "that" in Russian :) )

> So what i will be trying then is to emit a copytoreg from the virtual
> register holding the
> function pointer to ecx before the tailcall node.

This seems to be the right way to do it.

> the downside here is that ECX is no longer free for passing function
> arguments. (i am using the x86_fastcall semantics at the moment with
> first two arguments stored in ecx,edx)

This is the most problematic part. Some notes:

1. You cannot emit a tail call to something returning a struct.
2. It can be tricky to emit a call to a function that takes arguments in registers (emitting something target-dependent for the register holding the function pointer seems to be the right way to do it: you can always check which registers are live-in for the function and try to emit something safe).

I'd suggest starting with plain functions using the C calling convention.

> and sorry if i am bothering you with questions whose answer should be
> obvious. i am really a total newbie greenhorn :)

You're welcome :) Many such questions are not so easy to answer.

--
WBR, Anton Korobeynikov
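To make the eh_return sequence described in the message above concrete, here is a minimal sketch that replays it on a toy machine model. It is only an illustration: `ToyMachine` and `ehReturn` are hypothetical names invented here, not LLVM code, and the model deliberately ignores the extra compensation for the ebp save that the real epilogue needs.

```cpp
#include <cstdint>
#include <map>

// Toy model of a few 32-bit x86 registers plus word-addressed memory.
// Hypothetical illustration only; none of this is LLVM code.
struct ToyMachine {
    std::uint32_t esp = 0, ebp = 0, ecx = 0, eip = 0;
    std::map<std::uint32_t, std::uint32_t> memory;  // address -> 32-bit word
};

// Replays the sequence from the message:
//   mov ecx, ebp ; add ecx, offset ; mov [ecx], handler
//   <usual epilogue> ; mov esp, ecx ; ret
void ehReturn(ToyMachine &m, std::uint32_t offset, std::uint32_t handler) {
    m.ecx = m.ebp;              // mov ecx, ebp
    m.ecx += offset;            // add ecx, offset
    m.memory[m.ecx] = handler;  // mov [ecx], handler

    // Usual epilogue (assumed here to be: mov esp, ebp ; pop ebp).
    m.esp = m.ebp;
    m.ebp = m.memory[m.esp];
    m.esp += 4;

    m.esp = m.ecx;              // mov esp, ecx
    m.eip = m.memory[m.esp];    // ret: pop the stored handler into eip
    m.esp += 4;
}
```

The point of the model is that ecx survives the epilogue because it is not one of the registers the epilogue restores, so the handler address written through it is what `ret` ends up popping.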
Arnold Schwaighofer
2007-Aug-08 20:54 UTC
[LLVMdev] Destination register needs to be valid after callee saved register restore when tail calling
On 8 Aug 2007, at 21:12, Anton Korobeynikov wrote:

> Hello, Arnold.
>
>> with the sentence i tried to express the question whether there is a
>> way to persuade the code generator to use another register to load
>> (or
>> move) the function pointer to (right before the callee saved register
>> restore) but thinking a little further that's nonsense.
> Why don't define some special op for callee address and custom
> lower it?

So what you are saying is that instead of

    X86TargetLowering::LowerCCCCallTo(SDOperand Op, SelectionDAG &DAG, unsigned CC)
      ...
      Chain = DAG.getNode(isTailCall ? X86ISD::TAILCALL : X86ISD::CALL,
                          NodeTys, &Ops[0], Ops.size());
      ...

I should do something along the lines of

    X86TargetLowering::LowerCCCCallTo(SDOperand Op, SelectionDAG &DAG, unsigned CC)
      ...
      if (isTailCall) {
        SDOperand CustomCallee =
            DAG.getNode(X86ISD::TCLOWERCALLEEADDRESS, Op.getOperand(6 /* callee */));
        Ops[6] = CustomCallee;
        Chain = DAG.getNode(X86ISD::TAILCALL, NodeTys, &Ops[0], Ops.size());
        ...

and later lower it so that it performs the move {register holding callee} to {legitimate register holding the callee}.

The problem here is that it is not known (when lowering X86ISD::TCLOWERCALLEEADDRESS) whether the call will really be a tail call followed by a return. When I handle it later, while DAG-combining the RET_FLAG instruction, I have all that information (see code below).

> I really suggest you to look into eh_return. It's used in some pretty
> tricky situation inside eh runtime: it is used to return from some eh
> runtime code. We already know, how much we should unwind the stack,
> and
> what is the handler (sounds similar, right?). Also %eax and %edx are
> used to return eh data and should be preserved in such function.
> So, in
> general, code for eh_return looks like (intel notation here):
>
> mov ecx, ebp
> add ecx, offset
> mov [ecx], handler
> .... (just usual epilogue)
> mov esp, ecx
> ret
>
> (in real we have slightly different code, because we need to
> compensate
> stack adjustment due to ebp save in epilogue, but idea is the same)
>
> We cannot just jump to handler because we need to change the %esp
> here.
> Also, offset and handler are calculated by code of routine, that's why
> such code looks like to be the most painless way to do the things. The
> tail call emission has many similar bits, but everything (both handler
> and offset) is known during codegen time, that's why we can
> simplify the
> code much.
>

I'll have a look at it.

At the moment I perform the DAG transformations in the post-legalize phase of the DAG combiner:

    SDOperand X86TargetLowering::PerformDAGCombine(SDNode *N,
                                                   DAGCombinerInfo &DCI) const {
      SelectionDAG &DAG = DCI.DAG;
      switch (N->getOpcode()) {
      default: break;
      case ISD::VECTOR_SHUFFLE:
        return PerformShuffleCombine(N, DAG, Subtarget);
      case ISD::SELECT:
        return PerformSELECTCombine(N, DAG, Subtarget);
      case X86ISD::RET_FLAG:
        return PerformRETCombine(N, DCI, Subtarget, this);
      }
      return SDOperand();
    }

    static SDOperand PerformRETCombine(SDNode *node,
                                       TargetLowering::DAGCombinerInfo &DCI,
                                       const X86Subtarget *Subtarget,
                                       const X86TargetLowering *TLI) {
      SDOperand RetNode;
      bool ReturnRegularCall = false;
      do {
        if (!DCI.isBeforeLegalize() && PerformTailCallOpt) {
          // Check whether there are any instructions between RET and
          // CALLSEQ_END; if yes, break out.
          // Look for the tail call before CALLSEQ_END.
          // Adjust ESP.
          // Remove the instructions after the tail call (the moves from
          // the integer/float result).
          // RetNode = TAILCALL node
          // Here I could insert the aforementioned move
          // {register holding callee} to {legitimate register holding the callee}.
          // At the moment: hack suggested by Dale.
          SDOperand TCKillRegOps[] = {AdjStackChain, AdjStackFlag};
          SDOperand TCKillReg =
              DCI.DAG.getNode(X86ISD::TAILCALL_REG_KILL,
                              DCI.DAG.getVTList(MVT::i32, MVT::Other, MVT::Flag),
                              TCKillRegOps, 2);
          if (RetNode.getNumOperands() == 5) { // no registers for argument passing
            SDOperand OpsTailCall[] = {TCKillReg, RetNode.getOperand(1),
                                       RetNode.getOperand(2), RetNode.getOperand(3),
                                       SDOperand(TCKillReg.Val, 1)};
            RetNode = DCI.DAG.getNode(X86ISD::TAILCALL, TCVTs, OpsTailCall, 3);
          } else if (RetNode.getNumOperands() == 4) {
          }
          ...
          // Update argument stores: store relative to the frame pointer of the
          // current function (the one that is tail calling).
          // Save RET_ADDR to its new location.
          // And probably some more stuff.
        }
      } while (0);
      return RetNode;
    }

>> the downside here is that ECX is no longer free for passing function
>> arguments. (i am using the x86_fastcall semantics at the moment with
>> first two arguments stored in ecx,edx)
> This is the most problematic stuff here. Some notes:
> 1. You cannot emit a tail call to something returning a struct.

I would have added the code for returning a struct (an additional pointer pushed by the caller and popped by the callee, or so my understanding is at the moment), but honestly I have not looked at this part yet.

> 2. It can be tricky to emit a call to function having arguments in
> registers (emitting something target-dependent for register holding
> function pointer seems to be the right way to do the things: you can
> always check, what are livein registers for function and try to emit
> something safe).
>

The implementation I have at the moment only works if the caller and the (tail) callee have the same calling convention, so I don't see or understand where there would be a problem (if ecx/edx are always used for parameter passing). But I am not taking exception handling into account (there would be problems there, if I understand you correctly).

> I'd suggest to start from plain functions with C calling convention.
>
>> and sorry if i am bothering you with questions whose answer should be
>> obvious. i am really a total newbie greenhorn :)
> You're welcome :) Many of such questions are not so easy to answer.
> --
> WBR, Anton Korobeynikov
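The "legitimate register holding the callee" discussed in this message can be illustrated with a small selection routine. This is a minimal sketch under stated assumptions, not LLVM's register allocator or any LLVM API: `pickCalleeAddressRegister` is a hypothetical helper, registers are plain strings, and the candidate list assumes the usual 32-bit x86 conventions in which eax, ecx and edx are caller-saved and therefore not overwritten by the callee-saved register restore.

```cpp
#include <optional>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper (not an LLVM API): choose a register that can carry
// the callee address across the epilogue of a 32-bit x86 function.
std::optional<std::string>
pickCalleeAddressRegister(const std::set<std::string> &argumentRegs) {
    // Caller-saved registers are not reloaded by the callee-saved register
    // restore, so a value placed in one of them survives the epilogue.
    static const std::vector<std::string> callerSaved = {"eax", "ecx", "edx"};
    for (const std::string &reg : callerSaved) {
        if (argumentRegs.count(reg) == 0)
            return reg;  // free: not carrying an outgoing argument
    }
    return std::nullopt;  // every candidate already holds an argument
}
```

Under these assumptions, a fastcall-style callee that takes its first two arguments in ecx and edx still leaves eax available, while a convention that used all three registers for arguments would leave no candidate and the address would have to be passed some other way.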
Arnold Schwaighofer
2007-Aug-09 13:28 UTC
[LLVMdev] Destination register needs to be valid after callee saved register restore when tail calling
On 8 Aug 2007, at 21:12, Anton Korobeynikov wrote:

> Hello, Arnold.
>
>> with the sentence i tried to express the question whether there is a
>> way to persuade the code generator to use another register to load
>> (or
>> move) the function pointer to (right before the callee saved register
>> restore) but thinking a little further that's nonsense.
> Why don't define some special op for callee address and custom
> lower it?
> I really suggest you to look into eh_return. It's used in some pretty
> tricky situation inside eh runtime: it is used to return from some eh
> runtime code. We already know, how much we should unwind the stack,
> and
> what is the handler (sounds similar, right?). Also %eax and %edx are
> used to return eh data and should be preserved in such function.
> So, in
> general, code for eh_return looks like (intel notation here):
>
> mov ecx, ebp
> add ecx, offset
> mov [ecx], handler
> .... (just usual epilogue)
> mov esp, ecx
> ret

Aaaah, yes, now I understand. I was missing the movrr in X86RegisterInfo.cpp. And yes, that's probably a much cleaner way of doing it.

The TC_RETURN would take a register (containing the callee address) or the callee TargetGlobalAddress/TargetExternalSymbol, and the size of the stack adjustment (difference between caller/callee args).

TAILCALL would then be lowered to loading the callee address into a register (if it's dynamic) and increasing esp+4.

In X86RegisterInfo.cpp we would then have:

    if (RetOpcode == X86::TC_RETURN) {
      if (isDynamicCallee(RetOpcode)) {
        add esp, {stack adjustment tailcall}
        mov esp, {register from TAILCALL}
      } else {
        // remove the ret
        jmp _targetfunction
      }
    }

Resulting code:

    mov ecx, _targetfunction   # load callee
    epilogue
    sub esp, 4                 # TAILCALL stack slot for eip
    add esp, 8                 # caller has 2 more args
    mov esp, ecx
    ret

If the target function is known:

    epilogue
    # TAILCALL
    add esp, 8
    jmp _targetfunction

Lowering of TAILCALL would also take care of adjusting the argument stores.
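The stack adjustment mentioned in this message (the difference between the caller's and the callee's argument area) and the known-target epilogue can be sketched as follows. This is only an illustration under stated assumptions: both function names are hypothetical, instructions are emitted as plain strings rather than machine instructions, and only the direct-jump case is modelled.

```cpp
#include <string>
#include <vector>

// Hypothetical sketch, not LLVM code.
// Bytes by which esp must move before the jump: positive when the caller
// received more argument bytes than the tail callee expects.
int tailCallStackAdjustment(int callerArgBytes, int calleeArgBytes) {
    return callerArgBytes - calleeArgBytes;
}

// Instructions that follow the usual epilogue when the target is known at
// codegen time: adjust esp if needed, then jump instead of returning.
std::vector<std::string> emitDirectTailCall(int adjustment,
                                            const std::string &target) {
    std::vector<std::string> code;
    if (adjustment > 0)
        code.push_back("add esp, " + std::to_string(adjustment));
    else if (adjustment < 0)
        code.push_back("sub esp, " + std::to_string(-adjustment));
    code.push_back("jmp " + target);  // replaces the ret
    return code;
}
```

With four 4-byte arguments in the caller and two in the callee, `tailCallStackAdjustment(16, 8)` gives 8, which matches the `add esp, 8` in the known-target example above.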