John Brawn
2015-Jan-13 10:55 UTC
[LLVMdev] [RFC] [PATCH] add tail call optimization to thumb1-only targets
Forcing tailcalls, even when it isn’t profitable in terms of performance (or space in the case of -Os, though I can’t off the top of my head think of any case where a faster tail call would also be larger), is what the -tailcallopt option is for: see
http://llvm.org/docs/CodeGenerator.html#tail-call-optimization

John

From: bruce.hoult at gmail.com [mailto:bruce.hoult at gmail.com] On Behalf Of Bruce Hoult
Sent: 13 January 2015 00:01
To: John Brawn
Subject: Re: [LLVMdev] [RFC] [PATCH] add tail call optimization to thumb1-only targets

On Tue, Jan 13, 2015 at 3:50 AM, John Brawn <john.brawn at arm.com> wrote:

> During epilog generation, spill register restoring will be done within
> the emit epilogue function.
> If LR happens to be spilled on the stack by the prologue, it's restored
> by use of a scratch register
> just before restoring the other registers.

POP is 1+N cycles whereas LDR is 2 cycles. If we need to LDR lr from the stack then POP r4 then that's 2 (LDR) + 1+1 (POP) + 1 (MOV to lr) + 1 (ADD sp) = 6 cycles, but a POP {r4,lr} is just 3 cycles.

You appear to be using speed as the figure of merit, but that is not the point of tail call optimisation (except incidentally). TCO is to minimise the use of precious stack space, and in fact to allow certain algorithms and program transformations to run in constant stack space.

If a programmer assumes that TCO is available and writes their program using continuation-passing style, and then TCO does not actually happen, that is a correctness issue and the program will overflow the stack and crash very quickly. A few register loads or spills is a distant second consideration.
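
For anyone who wants to check the cycle arithmetic in the quoted message, here is a minimal C sketch. It assumes only the per-instruction costs stated there (POP of N registers = 1+N cycles, LDR = 2, MOV = 1, ADD = 1); the function and macro names are hypothetical and exist only for illustration. Real timings depend on the core and memory system, so treat this as a tally of the figures in the thread, not a model of any particular CPU.

```c
#include <assert.h>

/* Per-instruction costs as quoted in the message (assumed, not measured). */
#define LDR_CYCLES 2
#define MOV_CYCLES 1
#define ADD_CYCLES 1

/* POP of n registers costs 1 + n cycles, per the quoted message. */
int pop_cycles(int n)
{
    assert(n >= 0);
    return 1 + n;
}

/* Epilogue that reloads LR through a scratch register:
 * LDR scratch from the stack, POP the other saved registers,
 * MOV scratch into lr, then ADD to sp to release the LR slot. */
int separate_lr_restore_cycles(int n_other_regs)
{
    return LDR_CYCLES + pop_cycles(n_other_regs) + MOV_CYCLES + ADD_CYCLES;
}

/* Epilogue that restores LR in the same POP as the other registers,
 * written POP {r4,lr} in the message. */
int combined_pop_cycles(int n_other_regs)
{
    return pop_cycles(n_other_regs + 1);
}
```

With one other saved register (r4), the two functions give 6 and 3 cycles respectively, matching the figures in the message.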