The patch is against revision 42247. -------------- next part -------------- A non-text attachment was scrubbed... Name: tailcall-src.patch Type: application/octet-stream Size: 62639 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20070923/4770302f/attachment.obj>
Hi Arnold,
This is a very good first step! Thanks! Comments below.
Evan
Index: test/CodeGen/X86/constant-pool-remat-0.ll
==================================================================---
test/CodeGen/X86/constant-pool-remat-0.ll (revision 42247)
+++ test/CodeGen/X86/constant-pool-remat-0.ll (working copy)
@@ -1,8 +1,10 @@
; RUN: llvm-as < %s | llc -march=x86-64 | grep LCPI | count 3
; RUN: llvm-as < %s | llc -march=x86-64 -stats -info-output-file - |
grep asm-printer | grep 6
; RUN: llvm-as < %s | llc -march=x86 -mattr=+sse2 | grep LCPI | count 3
-; RUN: llvm-as < %s | llc -march=x86 -mattr=+sse2 -stats -info-
output-file - | grep asm-printer | grep 8
-
+; RUN: llvm-as < %s | llc -march=x86 -mattr=+sse2 -stats -info-
output-file - | grep asm-printer | grep 9
+; change preceeding line form ... | grep 8 to ..| grep 9 since
+; with new fastcc has std call semantics causing a stack adjustment
+; after the function call
Not sure if I understand this. Can you illustrate with an example?
Index: include/llvm/Target/TargetLowering.h
==================================================================---
include/llvm/Target/TargetLowering.h (revision 42247)
+++ include/llvm/Target/TargetLowering.h (working copy)
@@ -851,8 +851,18 @@
virtual std::pair<SDOperand, SDOperand>
LowerCallTo(SDOperand Chain, const Type *RetTy, bool RetTyIsSigned,
bool isVarArg, unsigned CallingConv, bool isTailCall,
- SDOperand Callee, ArgListTy &Args, SelectionDAG &DAG);
+ bool isNextInstRet, SDOperand Callee, ArgListTy &Args,
+ SelectionDAG &DAG);
+ // IsEligibleForTailCallElimination - Check whether the call is
eligible for
+ // tailcall elimination
+ virtual bool IsEligibleForTailCallElimination(SelectionDAG& DAG,
+ bool IsNextInstRet,
+ SDOperand Callee,
+ unsigned CalleeCC)
const {
+ return false;
+ }
+
I am not too keen on "isNextInstRet". We should never use instruction
ordering before scheduling to determine whether it's possible to
perform an optimization. IsEligibleForTailCallElimination() should
determine the feasibility on its own, no?
+// check whether the instruction following the tailcall is a return
instruction
+// and whether its return type is void or if it uses the value
defined by the
+// tail call
+bool IsNextInstructionReturn(Instruction &I) {
+ bool IsNextInstRet = false;
+ BasicBlock *BB = I.getParent();
+ BasicBlock::iterator BI = &I;
+ assert(BI != BB->end() && "Woohooa");
+ ++BI;
+ if (BI != BB->end()) {
+ ReturnInst *RI = dyn_cast<ReturnInst>(BI);
+ if (RI) {
+ if (RI->getReturnValue()==0 ||
+ std::find(RI->op_begin(), RI->op_end(),
+ (Value*)&I) != RI->op_end()) {
+ IsNextInstRet=true;
+ }
+ }
+ }
+ return IsNextInstRet;
+}
+
Some stylistic nitpicks. Please write the comments as:
/// IsNextInstructionReturn - Check whether..
+ assert(BI != BB->end() && "Woohooa");
Better assertion messages please. :-)
Why not write it like this:
+bool IsNextInstructionReturn(Instruction &I) {
+ BasicBlock *BB = I.getParent();
+ BasicBlock::iterator BI = &I;
+ assert(BI != BB->end() && "Woohooa");
+ ++BI;
+ if (BI != BB->end()) {
+ if (ReturnInst *RI = dyn_cast<ReturnInst>(BI)) {
+ if (RI->getReturnValue()==0 ||
+ std::find(RI->op_begin(), RI->op_end(),
+ (Value*)&I) != RI->op_end())
+ return true;
+ }
+ }
+ return false;
+}
Also, shouldn't this function be "static"?
let isCall = 1, isTerminator = 1, isReturn = 1, isBarrier = 1 in
+ def TAILJMPd : IBr<0xE9, (ins i32imm:$dst), "jmp\t${dst:call} #
TAILCALL",
+ []>;
+let isCall = 1, isTerminator = 1, isReturn = 1, isBarrier = 1 in
+ def TAILJMPr : I<0xFF, MRM4r, (outs), (ins GR32:$dst), "jmp{l}\t{*}
$dst # TAILCALL",
+ []>;
+let isCall = 1, isTerminator = 1, isReturn = 1, isBarrier = 1 in
def TAILJMPm : I<0xFF, MRM4m, (outs), (ins i32mem:$dst),
"jmp\t{*}$dst # TAIL CALL", []>;
Please fix the inconsistency: "TAILCALL" vs. "TAIL CALL".
+SDOperand GetPossiblePreceedingTailCall(SDOperand Chain) {
This should be "static"?
+
+ Chain = DAG.getNode( X86ISD::CALL,
NodeTys, &Ops[0], Ops.size());
Stylistic nitpick: please merge 2 lines and remove the space after '('.
- Chain = DAG.getNode(X86ISD::FP_SET_RESULT, Tys, Ops, 2);
- Flag = Chain.getValue(1);
+ Chain = DAG.getNode(X86ISD::FP_SET_RESULT, Tys, Ops, 2);
+ Flag = Chain.getValue(1);
What happened here?
+// Check to see whether the next instruction following the call is a
return
+// A function is eligible if caller/callee calling conventions
match, currently
+// only fastcc supports tail calls, and the function CALL is
immediatly followed
+// by a RET
+bool X86TargetLowering::IsEligibleForTailCallElimination
(SelectionDAG& DAG,
+ bool
IsNextInstRet,
+ SDOperand
Callee,
+ unsigned
CalleeCC)
+ const {
Having "const {" on a separate line looks funky. :-)
+ bool IsEligible = false;
+ //if (IsNextInstRet && false == Subtarget->is64Bit()) {
Please remove code that's commented out.
+ if( G != 0 &&
+ (GV = G->getGlobal()) &&
+ (GV->hasHiddenVisibility() || GV->hasProtectedVisibility
()))
+ IsEligible=true;
+ else
+ IsEligible=false;
if( G ==> if (G :-)
No need to set IsEligible to false since it's initialized to false.
Evan
On Sep 23, 2007, at 9:58 AM, Arnold Schwaighofer wrote:
> The patch is against revision 42247.
> <tailcall-src.patch>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
On 24 Sep 2007, at 09:18, Evan Cheng wrote:> +; RUN: llvm-as < %s | llc -march=x86 -mattr=+sse2 -stats -info- > output-file - | grep asm-printer | grep 9 > +; change preceeding line form ... | grep 8 to ..| grep 9 since > +; with new fastcc has std call semantics causing a stack adjustment > +; after the function call > > Not sure if I understand this. Can you illustrate with an example?Sure The code generated used to be _array: subl $12, %esp movss LCPI1_0, %xmm0 mulss 16(%esp), %xmm0 movss %xmm0, (%esp) call L_qux$stub mulss LCPI1_0, %xmm0 addl $12, %esp ret FastCC use to be caller pops arguments so there was no stack adjustment after the call to qux. Now FastCC has callee pops arguments on return semantics so the x86 backend inserts a stack adjustment after the call. _array: subl $12, %esp movss LCPI1_0, %xmm0 mulss 16(%esp), %xmm0 movss %xmm0, (%esp) call L_qux$stub subl $4, %esp << stack adjustment because qux pops arguments on return mulss LCPI1_0, %xmm0 addl $12, %esp ret $4> Index: include/llvm/Target/TargetLowering.h > ==================================================================> --- include/llvm/Target/TargetLowering.h (revision 42247) > +++ include/llvm/Target/TargetLowering.h (working copy) > @@ -851,8 +851,18 @@ > virtual std::pair<SDOperand, SDOperand> > LowerCallTo(SDOperand Chain, const Type *RetTy, bool > RetTyIsSigned, > bool isVarArg, unsigned CallingConv, bool isTailCall, > - SDOperand Callee, ArgListTy &Args, SelectionDAG &DAG); > + bool isNextInstRet, SDOperand Callee, ArgListTy &Args, > + SelectionDAG &DAG); > + // IsEligibleForTailCallElimination - Check whether the call is > eligible for > + // tailcall elimination > + virtual bool IsEligibleForTailCallElimination(SelectionDAG& DAG, > + bool IsNextInstRet, > + SDOperand Callee, > + unsigned CalleeCC) > const { > + return false; > + } > + > > I am not too keen on "isNextInstRet". We should never use instruction > ordering before scheduling to determine whether it's possible to > perform an optimization. IsEligibleForTailCallElimination() should > determine the feasibility on its own, no?My assumption: Tail call optimization is performed if the instruction following the call is a return and the return uses the result of the call or is a void return. The problem is that at the point (in TargetLowering:LowerCallTo) where IsEligibleForTailCallElimination is called the SDOperand node for the following instructions (we are looking for a return) has not been build. So IsEligibleForTailCallElimination can't determine - using 'Callee' - whether the instruction following the call is a return. That is the reason why i pass the IsNextInstRet flag to TargetLowering::LowerCallTo. TargetLowering::LowerCallTo(SDOperand Chain, const Type *RetTy, bool RetTyIsSigned, bool isVarArg, unsigned CallingConv, bool isTailCall, bool isNextInstRet, SDOperand Callee, ArgListTy &Args, SelectionDAG &DAG) { SmallVector<SDOperand, 32> Ops; Ops.push_back(Chain); // Op#0 - Chain Ops.push_back(DAG.getConstant(CallingConv, getPointerTy())); // Op#1 - CC Ops.push_back(DAG.getConstant(isVarArg, getPointerTy())); // Op#2 - VarArg if (isTailCall && IsEligibleForTailCallElimination(DAG, isNextInstRet, Callee, CallingConv)) Ops.push_back(DAG.getConstant(true, getPointerTy())); // Op#3 - Tail The instructions i am refering here to are LLVM IR instructions not target machine instructions. I could remove isNextInstRet from IsEligibleForTailCallElimination because in TargetLowering::LowerCallTo it is clear that we can't do tail call optimization if isNextInstRet (this time coming from SelectionDAGLowering::LowerCallTo) is false. if (isTailCall && isNextInstRet && IsEligibleForTailCallElimination(DAG,, Callee, CallingConv)) Note that I interpreted the following paragraph of you:> IsEligibleForTailCallElimination() should be a target hook. This way > TargetLowering::LowerCallTo() can determine whether a call is > eligible for tail call elimination and insert the current ISD::CALL > node. X86TargetLowering::LowerX86_32FastCCCallTo() will not have to > handle non-tail calls.in the following way: I check in TargetLowering:LowerCallTo whether the call IsEligibleForTailCallElimination and insert a ISD:CALL node with the tailcall attribute set or not. Then in X86TargetLowering::LowerCALL i can dispatch the right lowering code:> SDOperand X86TargetLowering::LowerCALL(SDOperand Op, SelectionDAG > &DAG) { > unsigned CallingConv = cast<ConstantSDNode>(Op.getOperand(1))- > >getValue(); > bool isTailCall = cast<ConstantSDNode>(Op.getOperand(3))->getValue > () != 0; > > case CallingConv::Fast: > if (isTailCall && PerformTailCallOpt) > return LowerX86_TailCallTo(Op, DAG, CallingConv); > else > return LowerCCCCallTo(Op,DAG, CallingConv);> Some stylistic nitpicks. Please write the comments as: > /// IsNextInstructionReturn - Check whether..Will do.> + assert(BI != BB->end() && "Woohooa"); > Better assertion messages please. :-) > > Why not write it like this: >okay> Also, shouldn't this function be "static"?okay> Please fix the inconsistency: "TAILCALL" vs. "TAIL CALL". >okay> +SDOperand GetPossiblePreceedingTailCall(SDOperand Chain) { > > This should be "static"?okay> + > + Chain = DAG.getNode( X86ISD::CALL, > NodeTys, &Ops[0], Ops.size()); > > Stylistic nitpick: please merge 2 lines and remove the space after > '('.okay> - Chain = DAG.getNode(X86ISD::FP_SET_RESULT, Tys, Ops, 2); > - Flag = Chain.getValue(1); > + Chain = DAG.getNode(X86ISD::FP_SET_RESULT, Tys, Ops, 2); > + Flag = Chain.getValue(1); > > What happened here?Ups corrected.> > +// Check to see whether the next instruction following the call is a > return > +// A function is eligible if caller/callee calling conventions > match, currently > +// only fastcc supports tail calls, and the function CALL is > immediatly followed > +// by a RET > +bool X86TargetLowering::IsEligibleForTailCallElimination > (SelectionDAG& DAG, > + bool > IsNextInstRet, > + SDOperand > Callee, > + unsigned > CalleeCC) > + const { > > Having "const {" on a separate line looks funky. :-) >will change ;)> Please remove code that's commented out.> if( G ==> if (G :-)> No need to set IsEligible to false since it's initialized to false. >;) okay dokey regards