thr3ads.net - llvm dev - [LLVMdev] RFC: Tail call optimization X86 [Sep 2007]

If this information is useful, please help other people find it:
Share via:

Arnold Schwaighofer

2007-Sep-25 07:01 UTC

[LLVMdev] RFC: Tail call optimization X86

> > FastCC use to be caller pops arguments so there was no stack
> > adjustment after the
> > call to qux. Now FastCC has callee pops arguments on return semantics
> > so the
> > x86 backend inserts a stack adjustment after the call.
> >
> > _array:
> >       subl    $12, %esp
> >         movss   LCPI1_0, %xmm0
> >         mulss   16(%esp), %xmm0
> >         movss   %xmm0, (%esp)
> >         call    L_qux$stub
> >         subl    $4, %esp   << stack adjustment because qux pops
> > arguments on return
> >         mulss   LCPI1_0, %xmm0
> >         addl    $12, %esp
> >         ret     $4
>
> Ok. Please make sure for now this only takes effect if tail call
> optimization is  turned on though. Also, we need to fold the subl into
> addl if possible.I think you misunderstood me. I have not added code to insert this
subl it is done already by the backend (before my changes) with
certain calling conventions (callee pops arguments). There used to be
already calling conventions e.g. x86_fastcall that would have cause
the stack adjustment only fastcc was not one of them. Now that fastcc
can cause tail call optimization i had to change the convention from
caller pops arguments to callee pops arguments in order to allow tail
call optimization in a general way.>
> >
> >
> >> Index: include/llvm/Target/TargetLowering.h
> >>
==================================================================> >>
--- include/llvm/Target/TargetLowering.h     (revision 42247)
> >> +++ include/llvm/Target/TargetLowering.h     (working copy)
> >> @@ -851,8 +851,18 @@
> >>    virtual std::pair<SDOperand, SDOperand>
> >>    LowerCallTo(SDOperand Chain, const Type *RetTy, bool
> >> RetTyIsSigned,
> >>                bool isVarArg, unsigned CallingConv, bool
isTailCall,
> >> -              SDOperand Callee, ArgListTy &Args, SelectionDAG
&DAG);
> >> +              bool isNextInstRet, SDOperand Callee, ArgListTy
&Args,
> >> +              SelectionDAG &DAG);
> >> +  // IsEligibleForTailCallElimination - Check whether the call is
> >> eligible for
> >> +  // tailcall elimination
> >> +  virtual bool IsEligibleForTailCallElimination(SelectionDAG&
DAG,
> >> +                                                bool
IsNextInstRet,
> >> +                                                SDOperand Callee,
> >> +                                                unsigned
CalleeCC)
> >> const {
> >> +    return false;
> >> +  }
> >> +
> >>
> >> I am not too keen on "isNextInstRet". We should never
use instruction
> >> ordering before scheduling to determine whether it's possible
to
> >> perform an optimization. IsEligibleForTailCallElimination() should
> >> determine the feasibility on its own, no?
> >
> > My assumption:
> > Tail call optimization is performed if the instruction following the
> > call is
> > a return and the return uses the result of the call or is a void
> > return.
>
> Ok. But using whether the next instruction is a return at this time is
> too restrictive. It's possible there may be instructions that are
> between the call and the return that do not interfere with tail call
> optimization.
>
> >
> >
> > The problem is that at the point (in TargetLowering:LowerCallTo) where
> > IsEligibleForTailCallElimination is called the SDOperand node for the
> > following instructions (we are looking for a return) has not been
> > build.
> > So IsEligibleForTailCallElimination can't determine - using
> > 'Callee'  -
> > whether the instruction following the call is a return. That is the
> > reason
> > why i pass  the IsNextInstRet flag to TargetLowering::LowerCallTo.
>
> Right. But I think the right way to deal with this is to assume all
> the tail calls are eligible for tail call optimizations at this point.
> Then checks for eligibility and fix the call nodes after all the
> instructions are lowered.I am not sure that i understand where this point would be. As I view
it the only place would be in LegalizeDAG.cpp in

SDOperand SelectionDAGLegalize::LegalizeOp(SDOperand Op) {
...
 case ISD::CALL:
    // The only option for this is to custom lower it.
    Tmp3 = TLI.LowerOperation(Result.getValue(0), DAG);

}

would change to (pseudo code)
SDOperand SelectionDAGLegalize::LegalizeOp(SDOperand Op) {
 SDOperand Result = Op;
...
 case ISD::CALL:
    // The only option for this is to custom lower it.

    Result =
modifyCallSoThatTailAttributeReallyIndicatesWhetherThisCallIsEligibleForTailCallOptimization(Result);

    Tmp3 = TLI.LowerOperation(Result.getValue(0), DAG);

}

because any point after that (e.g. scheduling) would be to late
because the target specific loweringcode (e.g. LowerCALL ->
LowerCCCall)

Evan Cheng

2007-Sep-25 18:14 UTC

head link

[LLVMdev] RFC: Tail call optimization X86

On Sep 25, 2007, at 12:01 AM, Arnold Schwaighofer wrote:
>>> FastCC use to be caller pops arguments so there was no stack
>>> adjustment after the
>>> call to qux. Now FastCC has callee pops arguments on return  
>>> semantics
>>> so the
>>> x86 backend inserts a stack adjustment after the call.
>>>
>>> _array:
>>>       subl    $12, %esp
>>>         movss   LCPI1_0, %xmm0
>>>         mulss   16(%esp), %xmm0
>>>         movss   %xmm0, (%esp)
>>>         call    L_qux$stub
>>>         subl    $4, %esp   << stack adjustment because qux
pops
>>> arguments on return
>>>         mulss   LCPI1_0, %xmm0
>>>         addl    $12, %esp
>>>         ret     $4
>>
>> Ok. Please make sure for now this only takes effect if tail call
>> optimization is  turned on though. Also, we need to fold the subl  
>> into
>> addl if possible.
> I think you misunderstood me. I have not added code to insert this
> subl it is done already by the backend (before my changes) with
> certain calling conventions (callee pops arguments). There used to be
> already calling conventions e.g. x86_fastcall that would have cause
> the stack adjustment only fastcc was not one of them. Now that fastcc
> can cause tail call optimization i had to change the convention from
> caller pops arguments to callee pops arguments in order to allow tail
> call optimization in a general way.
Hmmm. Ok. So this is due to X86CallingConv.td changes? Unfortunately  
that's not controlled by options. Ok then.
>>
>>>
>>>
>>>> Index: include/llvm/Target/TargetLowering.h
>>>>
==================================================================>>>>
--- include/llvm/Target/TargetLowering.h     (revision 42247)
>>>> +++ include/llvm/Target/TargetLowering.h     (working copy)
>>>> @@ -851,8 +851,18 @@
>>>>    virtual std::pair<SDOperand, SDOperand>
>>>>    LowerCallTo(SDOperand Chain, const Type *RetTy, bool
>>>> RetTyIsSigned,
>>>>                bool isVarArg, unsigned CallingConv, bool  
>>>> isTailCall,
>>>> -              SDOperand Callee, ArgListTy &Args,
SelectionDAG
>>>> &DAG);
>>>> +              bool isNextInstRet, SDOperand Callee, ArgListTy
>>>> &Args,
>>>> +              SelectionDAG &DAG);
>>>> +  // IsEligibleForTailCallElimination - Check whether the call
is
>>>> eligible for
>>>> +  // tailcall elimination
>>>> +  virtual bool
IsEligibleForTailCallElimination(SelectionDAG& DAG,
>>>> +                                                bool  
>>>> IsNextInstRet,
>>>> +                                                SDOperand
Callee,
>>>> +                                                unsigned
CalleeCC)
>>>> const {
>>>> +    return false;
>>>> +  }
>>>> +
>>>>
>>>> I am not too keen on "isNextInstRet". We should never
use
>>>> instruction
>>>> ordering before scheduling to determine whether it's
possible to
>>>> perform an optimization. IsEligibleForTailCallElimination()
should
>>>> determine the feasibility on its own, no?
>>>
>>> My assumption:
>>> Tail call optimization is performed if the instruction following
the
>>> call is
>>> a return and the return uses the result of the call or is a void
>>> return.
>>
>> Ok. But using whether the next instruction is a return at this  
>> time is
>> too restrictive. It's possible there may be instructions that are
>> between the call and the return that do not interfere with tail call
>> optimization.
>>
>>>
>>>
>>> The problem is that at the point (in TargetLowering:LowerCallTo)  
>>> where
>>> IsEligibleForTailCallElimination is called the SDOperand node for  
>>> the
>>> following instructions (we are looking for a return) has not been
>>> build.
>>> So IsEligibleForTailCallElimination can't determine - using
>>> 'Callee'  -
>>> whether the instruction following the call is a return. That is the
>>> reason
>>> why i pass  the IsNextInstRet flag to TargetLowering::LowerCallTo.
>>
>> Right. But I think the right way to deal with this is to assume all
>> the tail calls are eligible for tail call optimizations at this  
>> point.
>> Then checks for eligibility and fix the call nodes after all the
>> instructions are lowered.
> I am not sure that i understand where this point would be. As I view
> it the only place would be in LegalizeDAG.cpp in
No, in SelectionDAGISel.cpp:BuildSelectionDAG(). After line 4610, the  
complete DAG is available. You can add a call here to look at the  
last call and the terminator to decide whether the call is really a  
candidate for tail call optimization. You can fix it up at this point.

Evan
>
> SDOperand SelectionDAGLegalize::LegalizeOp(SDOperand Op) {
> ...
>  case ISD::CALL:
>     // The only option for this is to custom lower it.
>     Tmp3 = TLI.LowerOperation(Result.getValue(0), DAG);
>
> }
>
> would change to (pseudo code)
> SDOperand SelectionDAGLegalize::LegalizeOp(SDOperand Op) {
>  SDOperand Result = Op;
> ...
>  case ISD::CALL:
>     // The only option for this is to custom lower it.
>
>     Result =  
> modifyCallSoThatTailAttributeReallyIndicatesWhetherThisCallIsEligibleF 
> orTailCallOptimization(Result);
>
>     Tmp3 = TLI.LowerOperation(Result.getValue(0), DAG);
>
> }
>
> because any point after that (e.g. scheduling) would be to late
> because the target specific loweringcode (e.g. LowerCALL ->
> LowerCCCall)
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Chris Lattner

2007-Sep-26 00:26 UTC

head link

[LLVMdev] RFC: Tail call optimization X86

On Tue, 25 Sep 2007, Evan Cheng wrote:>> the stack adjustment only fastcc was not one of them. Now that fastcc
>> can cause tail call optimization i had to change the convention from
>> caller pops arguments to callee pops arguments in order to allow tail
>> call optimization in a general way.
>
> Hmmm. Ok. So this is due to X86CallingConv.td changes? Unfortunately
> that's not controlled by options. Ok then.
Sure it can be, you can set up custom predicates, for example the 
X86CallingConv.td file has:

class CCIfSubtarget<string F, CCAction A>
  :
CCIf<!strconcat("State.getTarget().getSubtarget<X86Subtarget>().",
F), A>;

It would be straight-forward to have a CCIf defined to check some command 
line argument.  I think enabling this as llcbeta for a few nights makes 
sense before turning it on by default.

-Chris

-- 
http://nondot.org/sabre/
http://llvm.org/

Apparently Analagous Threads

Search for more maybe matching threads

llvm dev - Sep 2007 - [LLVMdev] RFC: Tail call optimization X86

[LLVMdev] RFC: Tail call optimization X86

[LLVMdev] RFC: Tail call optimization X86

[LLVMdev] RFC: Tail call optimization X86

Apparently Analagous Threads