Jon Harrop wrote:
> I've come up with the following minimal repro that segfaults on my machine:

Jon, were you able to resolve this?

FWIW, TOT is causing all kinds of weird segfaults related to tail calls in my Pure interpreter, too (at least on x86-64). In my case these seem to be limited to the JIT, however (batch-compiled Pure programs via opt+llc all work fine, even with TCO), so it's probably a different issue. When using JIT compilation, the Pure interpreter works fine with LLVM 2.3 thru 2.6, and also with early revisions of 2.7svn, but it fails most of my test suite with current TOT, even though the generated IR seems to be the same as before.

Have there been any changes to the x86-64 backend of the JIT which might break tail call elimination? I didn't see any announcements about major changes in the JIT on the ml, so I have no idea what might be going wrong there.

Albert

--
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email: Dr.Graef at t-online.de, ag at muwiinfa.geschichte.uni-mainz.de
WWW: http://www.musikinformatik.uni-mainz.de/ag
> Have there been any changes to the x86-64 backend of the JIT which might
> break tail call elimination? I didn't see any announcements about major
> changes in the JIT on the ml, so I have no idea what might be going
> wrong there.

Most probably recent changes (e.g. post-RA scheduling) just uncovered the bugs TCO always had.

--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University
On Sun, Nov 29, 2009 at 2:19 AM, Albert Graef <Dr.Graef at t-online.de> wrote:
> Jon Harrop wrote:
>> I've come up with the following minimal repro that segfaults on my machine:
>
> Jon, were you able to resolve this?
>
> FWIW, TOT is causing all kinds of weird segfaults related to tail calls
> in my Pure interpreter, too (at least on x86-64). In my case these seem
> to be limited to the JIT, however (batch-compiled Pure programs via
> opt+llc all work fine, even with TCO), so it's probably a different
> issue. When using JIT compilation, the Pure interpreter works fine with
> LLVM 2.3 thru 2.6, and also with early revisions of 2.7svn, but it fails
> most of my test suite with current TOT, even though the generated IR
> seems to be the same as before.
>
> Have there been any changes to the x86-64 backend of the JIT which might
> break tail call elimination? I didn't see any announcements about major
> changes in the JIT on the ml, so I have no idea what might be going
> wrong there.

Try batch compiling with the large code model. (llc -code-model=large) If that also causes tail calls to break, then I did something wrong in fixing far calls in the JIT.
On Sunday 29 November 2009 10:19:40 Albert Graef wrote:
> Jon Harrop wrote:
> > I've come up with the following minimal repro that segfaults on my
> > machine:
>
> Jon, were you able to resolve this?

Nope. I just worked around the problem: HLVM was segfaulting on a single stdlib function call (just removing "tail" fixed the segfault, which was trying to read 0x1 according to Valgrind), so I inlined it and now I have no segfaults. I kept the full repro as a .ll though.

Arnold showed that my minimal repro was buggy: I'd confused calling conventions.

--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e
Jeffrey Yasskin wrote:
> Try batch compiling with the large code model. (llc -code-model=large)

Works fine. Anything else that I could try?

--
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email: Dr.Graef at t-online.de, ag at muwiinfa.geschichte.uni-mainz.de
WWW: http://www.musikinformatik.uni-mainz.de/ag
On Nov 29, 2009, at 2:19 AM, Albert Graef wrote:
> Have there been any changes to the x86-64 backend of the JIT which might
> break tail call elimination? I didn't see any announcements about major
> changes in the JIT on the ml, so I have no idea what might be going
> wrong there.

I haven't tested it, but the following pattern in X86Instr64bit.td looks suspicious as it appears to attempt to support direct tailcalls to arbitrary 64-bit immediates:

    def : Pat<(X86tcret GR64:$dst, imm:$off),
              (TCRETURNri64 GR64:$dst, imm:$off)>;

Dan
On Nov 30, 2009, at 1:21 PM, Dan Gohman wrote:
>
> On Nov 29, 2009, at 2:19 AM, Albert Graef wrote:
>
>> Have there been any changes to the x86-64 backend of the JIT which might
>> break tail call elimination? I didn't see any announcements about major
>> changes in the JIT on the ml, so I have no idea what might be going
>> wrong there.
>
> I haven't tested it, but the following pattern in X86Instr64bit.td
> looks suspicious as it appears to attempt to support direct tailcalls
> to arbitrary 64-bit immediates:
>
> def : Pat<(X86tcret GR64:$dst, imm:$off),
>           (TCRETURNri64 GR64:$dst, imm:$off)>;

Arnold pointed out to me that I was mistaken here; this offset is a stack offset, so it's not the kind of thing I was looking for.

With the recent changes to support regular calls where the callee is not within range for a 32-bit immediate on 64-bit targets, my suspicion was that perhaps tailcalls needed similar fixing, but at another glance I don't see anything obviously wrong there.

It would be interesting if someone could look at one of the segfaults in a debugger and determine which address it's trying to jump to, and compare that with the actual address of the intended callee.

Dan
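
[Editorial sketch, not part of the original message.] The comparison Dan suggests can be worked through by hand once the two addresses are read off in a debugger. The Python below is only an illustration of the general x86-64 rule that a direct `jmp rel32` carries a signed 32-bit displacement measured from the end of the instruction; it is not taken from LLVM, and it does not claim that this is the actual cause of the segfaults in this thread. All function names and the example addresses are hypothetical.

```python
# Illustration only: range check for an x86-64 direct jump (opcode E9 + rel32).
# Names and addresses are hypothetical, chosen for this sketch.

INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1
JMP_REL32_LEN = 5  # 1 opcode byte + 4 displacement bytes


def rel32_displacement(insn_addr, target_addr, insn_len=JMP_REL32_LEN):
    """Displacement needed to reach target_addr, measured from the end
    of the jump instruction located at insn_addr."""
    return target_addr - (insn_addr + insn_len)


def fits_rel32(insn_addr, target_addr, insn_len=JMP_REL32_LEN):
    """True if the callee is reachable with a signed 32-bit displacement."""
    disp = rel32_displacement(insn_addr, target_addr, insn_len)
    return INT32_MIN <= disp <= INT32_MAX


def truncated_target(insn_addr, target_addr, insn_len=JMP_REL32_LEN):
    """Address the jump would land on if the displacement were silently
    truncated to its low 32 bits and then sign-extended."""
    disp = rel32_displacement(insn_addr, target_addr, insn_len) & 0xFFFFFFFF
    if disp >= (1 << 31):
        disp -= (1 << 32)
    return (insn_addr + insn_len + disp) & 0xFFFFFFFFFFFFFFFF


if __name__ == "__main__":
    jump_site = 0x400000           # hypothetical address of the tail jump
    near_callee = 0x401000         # hypothetical callee within +/-2 GiB
    far_callee = 0x7F0000000000    # hypothetical callee far outside that range
    for callee in (near_callee, far_callee):
        print(hex(callee),
              "reachable:", fits_rel32(jump_site, callee),
              "would land at:", hex(truncated_target(jump_site, callee)))
```

If the address observed in the debugger matches what `truncated_target` gives for the intended callee, that would point towards an out-of-range direct jump; if it does not, the fault presumably lies elsewhere (for example in argument or stack setup).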
Jeffrey Yasskin wrote:
> Try batch compiling with the large code model. (llc -code-model=large)
> If that also causes tail calls to break, then I did something wrong in
> fixing far calls in the JIT.

Jeffrey, I took a closer look at this now, and all the TCO-related weirdness I see in the Pure interpreter is indeed related to your commit in r88984 ("Make X86-64 in the Large model always emit 64-bit calls"). Up to and including r88983, Pure passes all checks (at least with eager compilation, see below); with r88984 and later, more than half of the checks fail.

This only happens when using dynamic compilation. As I reported earlier, batch compilation works fine, even if the large code model is used. OTOH, dynamic compilation is broken no matter which code model I choose when creating the JIT.

So it seems that r88984 does break fastcc and/or tail calls in the JIT. Maybe you don't see this in UnladenSwallow because it doesn't do tail calls?

There's also some minor breakage which isn't TCO-related (four failed checks in the Pure interpreter) when reenabling lazy compilation with DisableLazyCompilation(false). These seem to go all the way back to your commit of Nick's patch (r84032 = "Keep track of stubs that are created" fails exactly the same checks, while r84031 is fine). Those four Pure checks all involve anonymous closures (lambdas); I still need to look at Nick's patch to figure out what exactly is going on there.

Now I might be able to live without lazy compilation (even though it noticeably slows down some code), but it goes without saying that as a functional language Pure definitely needs TCO. So I can only hope that this will be fixed before the LLVM 2.7 release.

Ok, so what next? Should I submit a bug report? Reopen PR5162?

Albert

--
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email: Dr.Graef at t-online.de, ag at muwiinfa.geschichte.uni-mainz.de
WWW: http://www.musikinformatik.uni-mainz.de/ag