Bjoern Haase
2015-Jan-13 19:57 UTC
[LLVMdev] [RFC] [PATCH] add tail call optimization to thumb1-only targets
On 13.01.2015 at 12:59, Bruce Hoult wrote:
> That link says "currently supported on x86/x86-64 and PowerPC". The purpose
> of the current patch appears to be adding support for this flag for Thumb-1?
>
> (with the implication that it is already supported for Thumb-2 and ARM32
> and the documentation is out of date)
>
> I could, of course, be confused.
>
> On Tue, Jan 13, 2015 at 11:55 PM, John Brawn <john.brawn at arm.com> wrote:
>
>> Forcing tailcalls, even when it isn’t profitable in terms of performance
>> (or space in the case of -Os, though I can’t off the top of my head think
>> of any case where a faster tail call would also be larger), is what the
>> -tailcallopt option is for: see
>>
>> http://llvm.org/docs/CodeGenerator.html#tail-call-optimization

The patch enables tail call and sibling call optimization in the form that is currently supported for thumb2 and arm.

Bruce Hoult wrote:
> TCO is to minimise the use of precious stack space, and in fact to allow
> certain algorithms and program transformations to run in constant stack
> space.
>
> If a programmer assumes that TCO is available and writes their program
> using continuation-passing style, and then TCO does not actually happen,
> that is a correctness issue and the program will overflow the stack and
> crash very quickly.

My specific motivation for this patch is that I am dealing with a communication stack whose state machines are written in continuation-passing style, so tail call optimization is important. The level of sibling/tail call optimization currently supported for thumb2 would be sufficient. With the current head version, however, we have a problem on thumb1.

I would therefore very much appreciate having tail call optimization available for thumb1 in LLVM mainline as well. Personally, I would not mind requiring a special compiler switch to activate the support for thumb1, i.e. not activating it as part of the standard optimization levels Os, O2 or O3.

Yours,

Björn.