> > FastCC use to be caller pops arguments so there was no stack > > adjustment after the > > call to qux. Now FastCC has callee pops arguments on return semantics > > so the > > x86 backend inserts a stack adjustment after the call. > > > > _array: > > subl $12, %esp > > movss LCPI1_0, %xmm0 > > mulss 16(%esp), %xmm0 > > movss %xmm0, (%esp) > > call L_qux$stub > > subl $4, %esp << stack adjustment because qux pops > > arguments on return > > mulss LCPI1_0, %xmm0 > > addl $12, %esp > > ret $4 > > Ok. Please make sure for now this only takes effect if tail call > optimization is turned on though. Also, we need to fold the subl into > addl if possible.I think you misunderstood me. I have not added code to insert this subl it is done already by the backend (before my changes) with certain calling conventions (callee pops arguments). There used to be already calling conventions e.g. x86_fastcall that would have cause the stack adjustment only fastcc was not one of them. Now that fastcc can cause tail call optimization i had to change the convention from caller pops arguments to callee pops arguments in order to allow tail call optimization in a general way.> > > > > > >> Index: include/llvm/Target/TargetLowering.h > >> ==================================================================> >> --- include/llvm/Target/TargetLowering.h (revision 42247) > >> +++ include/llvm/Target/TargetLowering.h (working copy) > >> @@ -851,8 +851,18 @@ > >> virtual std::pair<SDOperand, SDOperand> > >> LowerCallTo(SDOperand Chain, const Type *RetTy, bool > >> RetTyIsSigned, > >> bool isVarArg, unsigned CallingConv, bool isTailCall, > >> - SDOperand Callee, ArgListTy &Args, SelectionDAG &DAG); > >> + bool isNextInstRet, SDOperand Callee, ArgListTy &Args, > >> + SelectionDAG &DAG); > >> + // IsEligibleForTailCallElimination - Check whether the call is > >> eligible for > >> + // tailcall elimination > >> + virtual bool IsEligibleForTailCallElimination(SelectionDAG& DAG, > >> + bool IsNextInstRet, > >> + SDOperand Callee, > >> + unsigned CalleeCC) > >> const { > >> + return false; > >> + } > >> + > >> > >> I am not too keen on "isNextInstRet". We should never use instruction > >> ordering before scheduling to determine whether it's possible to > >> perform an optimization. IsEligibleForTailCallElimination() should > >> determine the feasibility on its own, no? > > > > My assumption: > > Tail call optimization is performed if the instruction following the > > call is > > a return and the return uses the result of the call or is a void > > return. > > Ok. But using whether the next instruction is a return at this time is > too restrictive. It's possible there may be instructions that are > between the call and the return that do not interfere with tail call > optimization. > > > > > > > The problem is that at the point (in TargetLowering:LowerCallTo) where > > IsEligibleForTailCallElimination is called the SDOperand node for the > > following instructions (we are looking for a return) has not been > > build. > > So IsEligibleForTailCallElimination can't determine - using > > 'Callee' - > > whether the instruction following the call is a return. That is the > > reason > > why i pass the IsNextInstRet flag to TargetLowering::LowerCallTo. > > Right. But I think the right way to deal with this is to assume all > the tail calls are eligible for tail call optimizations at this point. > Then checks for eligibility and fix the call nodes after all the > instructions are lowered.I am not sure that i understand where this point would be. As I view it the only place would be in LegalizeDAG.cpp in SDOperand SelectionDAGLegalize::LegalizeOp(SDOperand Op) { ... case ISD::CALL: // The only option for this is to custom lower it. Tmp3 = TLI.LowerOperation(Result.getValue(0), DAG); } would change to (pseudo code) SDOperand SelectionDAGLegalize::LegalizeOp(SDOperand Op) { SDOperand Result = Op; ... case ISD::CALL: // The only option for this is to custom lower it. Result = modifyCallSoThatTailAttributeReallyIndicatesWhetherThisCallIsEligibleForTailCallOptimization(Result); Tmp3 = TLI.LowerOperation(Result.getValue(0), DAG); } because any point after that (e.g. scheduling) would be to late because the target specific loweringcode (e.g. LowerCALL -> LowerCCCall)
On Sep 25, 2007, at 12:01 AM, Arnold Schwaighofer wrote:>>> FastCC use to be caller pops arguments so there was no stack >>> adjustment after the >>> call to qux. Now FastCC has callee pops arguments on return >>> semantics >>> so the >>> x86 backend inserts a stack adjustment after the call. >>> >>> _array: >>> subl $12, %esp >>> movss LCPI1_0, %xmm0 >>> mulss 16(%esp), %xmm0 >>> movss %xmm0, (%esp) >>> call L_qux$stub >>> subl $4, %esp << stack adjustment because qux pops >>> arguments on return >>> mulss LCPI1_0, %xmm0 >>> addl $12, %esp >>> ret $4 >> >> Ok. Please make sure for now this only takes effect if tail call >> optimization is turned on though. Also, we need to fold the subl >> into >> addl if possible. > I think you misunderstood me. I have not added code to insert this > subl it is done already by the backend (before my changes) with > certain calling conventions (callee pops arguments). There used to be > already calling conventions e.g. x86_fastcall that would have cause > the stack adjustment only fastcc was not one of them. Now that fastcc > can cause tail call optimization i had to change the convention from > caller pops arguments to callee pops arguments in order to allow tail > call optimization in a general way.Hmmm. Ok. So this is due to X86CallingConv.td changes? Unfortunately that's not controlled by options. Ok then.>> >>> >>> >>>> Index: include/llvm/Target/TargetLowering.h >>>> ==================================================================>>>> --- include/llvm/Target/TargetLowering.h (revision 42247) >>>> +++ include/llvm/Target/TargetLowering.h (working copy) >>>> @@ -851,8 +851,18 @@ >>>> virtual std::pair<SDOperand, SDOperand> >>>> LowerCallTo(SDOperand Chain, const Type *RetTy, bool >>>> RetTyIsSigned, >>>> bool isVarArg, unsigned CallingConv, bool >>>> isTailCall, >>>> - SDOperand Callee, ArgListTy &Args, SelectionDAG >>>> &DAG); >>>> + bool isNextInstRet, SDOperand Callee, ArgListTy >>>> &Args, >>>> + SelectionDAG &DAG); >>>> + // IsEligibleForTailCallElimination - Check whether the call is >>>> eligible for >>>> + // tailcall elimination >>>> + virtual bool IsEligibleForTailCallElimination(SelectionDAG& DAG, >>>> + bool >>>> IsNextInstRet, >>>> + SDOperand Callee, >>>> + unsigned CalleeCC) >>>> const { >>>> + return false; >>>> + } >>>> + >>>> >>>> I am not too keen on "isNextInstRet". We should never use >>>> instruction >>>> ordering before scheduling to determine whether it's possible to >>>> perform an optimization. IsEligibleForTailCallElimination() should >>>> determine the feasibility on its own, no? >>> >>> My assumption: >>> Tail call optimization is performed if the instruction following the >>> call is >>> a return and the return uses the result of the call or is a void >>> return. >> >> Ok. But using whether the next instruction is a return at this >> time is >> too restrictive. It's possible there may be instructions that are >> between the call and the return that do not interfere with tail call >> optimization. >> >>> >>> >>> The problem is that at the point (in TargetLowering:LowerCallTo) >>> where >>> IsEligibleForTailCallElimination is called the SDOperand node for >>> the >>> following instructions (we are looking for a return) has not been >>> build. >>> So IsEligibleForTailCallElimination can't determine - using >>> 'Callee' - >>> whether the instruction following the call is a return. That is the >>> reason >>> why i pass the IsNextInstRet flag to TargetLowering::LowerCallTo. >> >> Right. But I think the right way to deal with this is to assume all >> the tail calls are eligible for tail call optimizations at this >> point. >> Then checks for eligibility and fix the call nodes after all the >> instructions are lowered. > I am not sure that i understand where this point would be. As I view > it the only place would be in LegalizeDAG.cpp inNo, in SelectionDAGISel.cpp:BuildSelectionDAG(). After line 4610, the complete DAG is available. You can add a call here to look at the last call and the terminator to decide whether the call is really a candidate for tail call optimization. You can fix it up at this point. Evan> > SDOperand SelectionDAGLegalize::LegalizeOp(SDOperand Op) { > ... > case ISD::CALL: > // The only option for this is to custom lower it. > Tmp3 = TLI.LowerOperation(Result.getValue(0), DAG); > > } > > would change to (pseudo code) > SDOperand SelectionDAGLegalize::LegalizeOp(SDOperand Op) { > SDOperand Result = Op; > ... > case ISD::CALL: > // The only option for this is to custom lower it. > > Result = > modifyCallSoThatTailAttributeReallyIndicatesWhetherThisCallIsEligibleF > orTailCallOptimization(Result); > > Tmp3 = TLI.LowerOperation(Result.getValue(0), DAG); > > } > > because any point after that (e.g. scheduling) would be to late > because the target specific loweringcode (e.g. LowerCALL -> > LowerCCCall) > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu llvm.cs.uiuc.edu > lists.cs.uiuc.edu/mailman/listinfo/llvmdev
On Tue, 25 Sep 2007, Evan Cheng wrote:>> the stack adjustment only fastcc was not one of them. Now that fastcc >> can cause tail call optimization i had to change the convention from >> caller pops arguments to callee pops arguments in order to allow tail >> call optimization in a general way. > > Hmmm. Ok. So this is due to X86CallingConv.td changes? Unfortunately > that's not controlled by options. Ok then.Sure it can be, you can set up custom predicates, for example the X86CallingConv.td file has: class CCIfSubtarget<string F, CCAction A> : CCIf<!strconcat("State.getTarget().getSubtarget<X86Subtarget>().", F), A>; It would be straight-forward to have a CCIf defined to check some command line argument. I think enabling this as llcbeta for a few nights makes sense before turning it on by default. -Chris -- nondot.org/sabre llvm.org