> Right but if only d() performs the move then what is going on between
> the tail calls and the returns in a, b, and c that inhibits TCO? I
> thought they might be redundant moves (but I've no idea!).
Ah okay, now I understand what you mean. The confusion originates from
the multiple layers of representation that are involved when code is
generated.

You generate LLVM IR. During code generation this is transformed into
SelectionDAG nodes, which are then lowered to native code. A struct
return is represented as multiple output values of the call
SelectionDAG node. Those output values are bundled in a merge_values
SelectionDAG node, which in turn is the input to the return
SelectionDAG node. This merge_values node is what prevents the current
tail call optimization code from performing the optimization.
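For example, a chain of small functions that just forward a struct
return would hit exactly this pattern (a hypothetical reduction of the
kind of code under discussion, not your actual source):

// a(), b() and c() only forward to d(), but because they return a
// small struct the call node can have several value results (e.g. two
// registers on x86-64) plus the chain, and those results get bundled
// into a merge_values node before the return.
struct Pair { long x, y; };

Pair d();                    // the function that does the real work

Pair c() { return d(); }     // tail call in the source, struct return
Pair b() { return c(); }
Pair a() { return b(); }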
This is not a fundamental problem; I simply didn't think of this case
when I implemented it. The code below illustrates part of the problem:
it expects the return node to immediately follow the call node. It
would have to change to allow for a merge_values node in between, and
some code in the backend-specific lowering would also have to change.
static bool CheckTailCallReturnConstraints(CallSDNode *TheCall, SDValue Ret) {
  unsigned NumOps = Ret.getNumOperands();
  // The RET node must use the CALL node's results directly: with one
  // operand the ret must consume one of the call's first two results
  // (its chain); with more operands the ret's chain must be the call's
  // last result and the returned value must be the call's first
  // result. A merge_values node in between makes this check fail.
  if ((NumOps == 1 &&
       (Ret.getOperand(0) == SDValue(TheCall, 1) ||
        Ret.getOperand(0) == SDValue(TheCall, 0))) ||
      (NumOps > 1 &&
       Ret.getOperand(0) == SDValue(TheCall, TheCall->getNumValues() - 1) &&
       Ret.getOperand(1) == SDValue(TheCall, 0)))
    return true;
  return false;
}
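One possible direction, just as a sketch of the idea rather than a
tested patch (the helper name, the header, and the exact operand
layout are my assumptions), would be to look through a merge_values
node whose operands all come from the call before applying the check
above:

#include "llvm/CodeGen/SelectionDAGNodes.h"
using namespace llvm;

// Sketch only: if the value feeding the RET is a MERGE_VALUES node and
// every one of its operands is produced by the tail call itself, the
// merge is just repackaging the call's results, so the constraint
// check could treat it as if the ret used the call directly.
static bool RetValueComesOnlyFromCall(CallSDNode *TheCall, SDValue RetVal) {
  if (RetVal.getOpcode() != ISD::MERGE_VALUES)
    return false;
  for (unsigned i = 0, e = RetVal.getNumOperands(); i != e; ++i)
    if (RetVal.getOperand(i).getNode() != TheCall)
      return false;  // some merged value is not a result of the call
  return true;
}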
The actual platform-specific instructions are generated at a later stage.
regards arnold