On 24 Sep 2007, at 09:18, Evan Cheng wrote:> +; RUN: llvm-as < %s | llc -march=x86 -mattr=+sse2 -stats -info-
> output-file - | grep asm-printer | grep 9
> +; change preceeding line form ... | grep 8 to ..| grep 9 since
> +; with new fastcc has std call semantics causing a stack adjustment
> +; after the function call
>
> Not sure if I understand this. Can you illustrate with an example?
Sure
The code generated used to be
_array:
subl $12, %esp
movss LCPI1_0, %xmm0
mulss 16(%esp), %xmm0
movss %xmm0, (%esp)
call L_qux$stub
mulss LCPI1_0, %xmm0
addl $12, %esp
ret
FastCC use to be caller pops arguments so there was no stack
adjustment after the
call to qux. Now FastCC has callee pops arguments on return semantics
so the
x86 backend inserts a stack adjustment after the call.
_array:
subl $12, %esp
movss LCPI1_0, %xmm0
mulss 16(%esp), %xmm0
movss %xmm0, (%esp)
call L_qux$stub
subl $4, %esp << stack adjustment because qux pops
arguments on return
mulss LCPI1_0, %xmm0
addl $12, %esp
ret $4
> Index: include/llvm/Target/TargetLowering.h
> ==================================================================> ---
include/llvm/Target/TargetLowering.h (revision 42247)
> +++ include/llvm/Target/TargetLowering.h (working copy)
> @@ -851,8 +851,18 @@
> virtual std::pair<SDOperand, SDOperand>
> LowerCallTo(SDOperand Chain, const Type *RetTy, bool
> RetTyIsSigned,
> bool isVarArg, unsigned CallingConv, bool isTailCall,
> - SDOperand Callee, ArgListTy &Args, SelectionDAG
&DAG);
> + bool isNextInstRet, SDOperand Callee, ArgListTy &Args,
> + SelectionDAG &DAG);
> + // IsEligibleForTailCallElimination - Check whether the call is
> eligible for
> + // tailcall elimination
> + virtual bool IsEligibleForTailCallElimination(SelectionDAG& DAG,
> + bool IsNextInstRet,
> + SDOperand Callee,
> + unsigned CalleeCC)
> const {
> + return false;
> + }
> +
>
> I am not too keen on "isNextInstRet". We should never use
instruction
> ordering before scheduling to determine whether it's possible to
> perform an optimization. IsEligibleForTailCallElimination() should
> determine the feasibility on its own, no?
My assumption:
Tail call optimization is performed if the instruction following the
call is
a return and the return uses the result of the call or is a void return.
The problem is that at the point (in TargetLowering:LowerCallTo) where
IsEligibleForTailCallElimination is called the SDOperand node for the
following instructions (we are looking for a return) has not been build.
So IsEligibleForTailCallElimination can't determine - using 'Callee'
-
whether the instruction following the call is a return. That is the
reason
why i pass the IsNextInstRet flag to TargetLowering::LowerCallTo.
TargetLowering::LowerCallTo(SDOperand Chain, const Type *RetTy,
bool RetTyIsSigned, bool isVarArg,
unsigned CallingConv, bool isTailCall,
bool isNextInstRet, SDOperand Callee,
ArgListTy &Args, SelectionDAG &DAG) {
SmallVector<SDOperand, 32> Ops;
Ops.push_back(Chain); // Op#0 - Chain
Ops.push_back(DAG.getConstant(CallingConv, getPointerTy())); //
Op#1 - CC
Ops.push_back(DAG.getConstant(isVarArg, getPointerTy())); //
Op#2 - VarArg
if (isTailCall &&
IsEligibleForTailCallElimination(DAG, isNextInstRet, Callee,
CallingConv))
Ops.push_back(DAG.getConstant(true, getPointerTy())); // Op#3 -
Tail
The instructions i am refering here to are LLVM IR instructions not
target machine instructions. I could remove isNextInstRet from
IsEligibleForTailCallElimination because in TargetLowering::LowerCallTo
it is clear that we can't do tail call optimization if isNextInstRet
(this time coming from
SelectionDAGLowering::LowerCallTo) is false.
if (isTailCall && isNextInstRet &&
IsEligibleForTailCallElimination(DAG,, Callee, CallingConv))
Note that I interpreted the following paragraph of you:> IsEligibleForTailCallElimination() should be a target hook. This way
> TargetLowering::LowerCallTo() can determine whether a call is
> eligible for tail call elimination and insert the current ISD::CALL
> node. X86TargetLowering::LowerX86_32FastCCCallTo() will not have to
> handle non-tail calls.
in the following way:
I check in TargetLowering:LowerCallTo whether the call
IsEligibleForTailCallElimination
and insert a ISD:CALL node with the tailcall attribute set or not.
Then in X86TargetLowering::LowerCALL i can dispatch the right
lowering code:
> SDOperand X86TargetLowering::LowerCALL(SDOperand Op, SelectionDAG
> &DAG) {
> unsigned CallingConv = cast<ConstantSDNode>(Op.getOperand(1))-
> >getValue();
> bool isTailCall =
cast<ConstantSDNode>(Op.getOperand(3))->getValue
> () != 0;
>
> case CallingConv::Fast:
> if (isTailCall && PerformTailCallOpt)
> return LowerX86_TailCallTo(Op, DAG, CallingConv);
> else
> return LowerCCCCallTo(Op,DAG, CallingConv);
> Some stylistic nitpicks. Please write the comments as:
> /// IsNextInstructionReturn - Check whether..
Will do.> + assert(BI != BB->end() && "Woohooa");
> Better assertion messages please. :-)
>
> Why not write it like this:
>
okay> Also, shouldn't this function be "static"?
okay> Please fix the inconsistency: "TAILCALL" vs. "TAIL
CALL".
>
okay> +SDOperand GetPossiblePreceedingTailCall(SDOperand Chain) {
>
> This should be "static"?
okay> +
> + Chain = DAG.getNode( X86ISD::CALL,
> NodeTys, &Ops[0], Ops.size());
>
> Stylistic nitpick: please merge 2 lines and remove the space after
> '('.
okay> - Chain = DAG.getNode(X86ISD::FP_SET_RESULT, Tys, Ops, 2);
> - Flag = Chain.getValue(1);
> + Chain = DAG.getNode(X86ISD::FP_SET_RESULT, Tys, Ops, 2);
> + Flag = Chain.getValue(1);
>
> What happened here?
Ups corrected.>
> +// Check to see whether the next instruction following the call is a
> return
> +// A function is eligible if caller/callee calling conventions
> match, currently
> +// only fastcc supports tail calls, and the function CALL is
> immediatly followed
> +// by a RET
> +bool X86TargetLowering::IsEligibleForTailCallElimination
> (SelectionDAG& DAG,
> + bool
> IsNextInstRet,
> + SDOperand
> Callee,
> + unsigned
> CalleeCC)
> + const {
>
> Having "const {" on a separate line looks funky. :-)
>
will change ;)> Please remove code that's commented out.
> if( G ==> if (G :-)
> No need to set IsEligible to false since it's initialized to false.
>
;) okay dokey
regards