On Sunday 22 February 2009 20:36:52 Duncan Sands wrote:> Hi Jon, > > > I have written a variety tests of tail calls for my HLVM and all passed > > with flying colors until I wrote this test (which is actually for > > algebraic datatypes) and discovered that it segfaults after ~100k > > iterations through what I think should be a tail call. Here's the IR: > > is this really a tail call?>From what I have understood of the LLVM docs about when tail calls geteliminated on x86 and x64 it should be a tail call, yes. http://llvm.org/docs/CodeGenerator.html#tailcallopt . Caller and callee have the calling convention fastcc. . The call is a tail call - in tail position (ret immediately follows call and ret uses value of call or is void). . Option -tailcallopt is enabled. . No variable argument lists are used. . On x86-64 when generating GOT/PIC code only module-local calls (visibility = hidden or protected) are supported. Those are all satisfied.> I didn't look closely but at a glance it seems to be passing a local stack > variable as a call parameter.In this case, the arguments are a { { i8*, i8* }*, i8* } and a i32. As I understand it, first-class structs are simply unpacked for argument passing so that is equivalent to passing { i8*, i8* }* and i8* and i32. In this case, the first is a pointer to a global variable and the second is a pointer to a malloc'd block. So I don't see why any of the arguments should be inhibiting tail call elimination. I just tested my theory that returning a first-class struct from a function inhibits tail call elimination and it seems that I was correct: altering this function to pass its return struct by pointer in the first argument fixes the stack overflow. Is this a bug in LLVM? -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e
Hi Jon,> >From what I have understood of the LLVM docs about when tail calls get > eliminated on x86 and x64 it should be a tail call, yes. > > http://llvm.org/docs/CodeGenerator.html#tailcallopt > > . Caller and callee have the calling convention fastcc. > . The call is a tail call - in tail position (ret immediately follows call and > ret uses value of call or is void). > . Option -tailcallopt is enabled. > . No variable argument lists are used. > . On x86-64 when generating GOT/PIC code only module-local calls (visibility = > hidden or protected) are supported. > > Those are all satisfied.this list is for the code generator, and it seems obviously incomplete: it makes no mention of local variables (alloca). Probably it is implicitly assuming that the call was marked "tail call" by the LLVM optimizers. So you also need to check under what conditions the LLVM optimizers do that.> > I didn't look closely but at a glance it seems to be passing a local stack > > variable as a call parameter. > > In this case, the arguments are a { { i8*, i8* }*, i8* } and a i32.Maybe, but I'm pretty sure at least one of these values was calculated by mucking around with allocas. Did you add the "tail call" mark yourself? If not, try removing it, and see if the LLVM optimizers add it back.> I just tested my theory that returning a first-class struct from a function > inhibits tail call elimination and it seems that I was correct: altering this > function to pass its return struct by pointer in the first argument fixes the > stack overflow. > > Is this a bug in LLVM?Could be, but first I'd like to be sure that you are not misusing tail calls. Ciao, Duncan.
Hello Duncan and Jon, I am the criminal responsible for the tail call implementation in the backends. On Mon, Feb 23, 2009 at 9:17 AM, Duncan Sands <baldrick at free.fr> wrote:> Hi Jon, > >> >From what I have understood of the LLVM docs about when tail calls get >> eliminated on x86 and x64 it should be a tail call, yes.See below.> this list is for the code generator, and it seems obviously incomplete: it makes > no mention of local variables (alloca). Probably it is implicitly assuming that > the call was marked "tail call" by the LLVM optimizers. So you also need to check > under what conditions the LLVM optimizers do that.The backend does not check the following condition but assumes it to be true.>From <http://llvm.org/docs/LangRef.html#i_call>:The optional "tail" marker indicates whether the callee function accesses any allocas or varargs in the caller. If the "tail" marker is present, the function call is eligible for tail call optimization. Note that calls may be marked "tail" even if they do not occur before a ret instruction. If the llvm IR is generated by a front end (HLVM) the front end must guarantee this condition.>> > I didn't look closely but at a glance it seems to be passing a local stack >> > variable as a call parameter.To me, also only glancing over the code and with a bit rusty llvm ir knowledge, it seems that %14 (the value passed to the tail call)is allocated in the stack frame (alloca) of the calling function. So the above condition is violated. %3 = alloca { { i8*, i8* }*, i8* } %13 = getelementptr { { i8*, i8* }*, i8* }* %3, i32 0 %14 = load { { i8*, i8* }*, i8* }* %13 %17 = tail call fastcc { { i8*, i8* }*, i8* } @init({ { i8*, i8* }*,i8*} %14, i32 %16)>> Is this a bug in LLVM? > Could be, but first I'd like to be sure that you are not misusing tail calls.Assuming that i am right and a pointer to alloca'ed memory is passed this is not a bug but a misuse of the 'tail' instruction. Maybe we should add a link or a note on <http://llvm.org/docs/CodeGenerator.html#tailcallopt> to <http://llvm.org/docs/LangRef.html#i_call> that explains that the call can only be marked as 'tail' if no pointers to alloca'ed memory are passed. regards arnold