thr3ads.net - llvm dev - [LLVMdev] Possible bug in TCO? [Dec 2009]

Albert Graef

2009-Nov-29 10:19 UTC

[LLVMdev] Possible bug in TCO?

Jon Harrop wrote:
> I've come up with the following minimal repro that segfaults on my
> machine:

Jon, were you able to resolve this?

FWIW, TOT is causing all kinds of weird segfaults related to tail calls
in my Pure interpreter, too (at least on x86-64). In my case these seem
to be limited to the JIT, however (batch-compiled Pure programs via
opt+llc all work fine, even with TCO), so it's probably a different
issue. When using JIT compilation, the Pure interpreter works fine with
LLVM 2.3 thru 2.6, and also with early revisions of 2.7svn, but it fails
most of my test suite with current TOT, even though the generated IR
seems to be the same as before.

Have there been any changes to the x86-64 backend of the JIT which might
break tail call elimination? I didn't see any announcements about major
changes in the JIT on the ml, so I have no idea what might be going
wrong there.

Albert

-- 
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email:  Dr.Graef at t-online.de, ag at muwiinfa.geschichte.uni-mainz.de
WWW:    http://www.musikinformatik.uni-mainz.de/ag

Anton Korobeynikov

2009-Nov-29 11:05 UTC

> Have there been any changes to the x86-64 backend of the JIT which might
> break tail call elimination? I didn't see any announcements about major
> changes in the JIT on the ml, so I have no idea what might be going
> wrong there.

Most probably recent changes (e.g. post-RA scheduling) just uncovered
the bugs TCO always had.

-- 
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University

Jeffrey Yasskin

2009-Nov-29 17:03 UTC

On Sun, Nov 29, 2009 at 2:19 AM, Albert Graef <Dr.Graef at t-online.de> wrote:
> Jon Harrop wrote:
>> I've come up with the following minimal repro that segfaults on my
>> machine:
>
> Jon, were you able to resolve this?
>
> FWIW, TOT is causing all kinds of weird segfaults related to tail calls
> in my Pure interpreter, too (at least on x86-64). In my case these seem
> to be limited to the JIT, however (batch-compiled Pure programs via
> opt+llc all work fine, even with TCO), so it's probably a different
> issue. When using JIT compilation, the Pure interpreter works fine with
> LLVM 2.3 thru 2.6, and also with early revisions of 2.7svn, but it fails
> most of my test suite with current TOT, even though the generated IR
> seems to be the same as before.
>
> Have there been any changes to the x86-64 backend of the JIT which might
> break tail call elimination? I didn't see any announcements about major
> changes in the JIT on the ml, so I have no idea what might be going
> wrong there.
>

Try batch compiling with the large code model (llc -code-model=large).
If that also causes tail calls to break, then I did something wrong in
fixing far calls in the JIT.

Jon Harrop

2009-Nov-29 17:31 UTC

On Sunday 29 November 2009 10:19:40 Albert Graef wrote:
> Jon Harrop wrote:
> > I've come up with the following minimal repro that segfaults on my
> > machine:
>
> Jon, were you able to resolve this?

Nope. I just worked around the problem because HLVM was segfaulting on a
single stdlib function call (just removing "tail" fixed the segfault, which
was trying to read 0x1 according to Valgrind), so I inlined it and now I have
no segfaults. I kept the full repro as a .ll though.

Arnold showed that my minimal repro was buggy: I'd confused calling 
conventions.

-- 
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e

Albert Graef

2009-Nov-29 21:10 UTC

Jeffrey Yasskin wrote:
> Try batch compiling with the large code model. (llc -code-model=large)

Works fine. Anything else that I could try?

-- 
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email:  Dr.Graef at t-online.de, ag at muwiinfa.geschichte.uni-mainz.de
WWW:    http://www.musikinformatik.uni-mainz.de/ag

Dan Gohman

2009-Nov-30 21:21 UTC

On Nov 29, 2009, at 2:19 AM, Albert Graef wrote:
> Have there been any changes to the x86-64 backend of the JIT which might
> break tail call elimination? I didn't see any announcements about major
> changes in the JIT on the ml, so I have no idea what might be going
> wrong there.

I haven't tested it, but the following pattern in X86Instr64bit.td
looks suspicious as it appears to attempt to support direct tailcalls
to arbitrary 64-bit immediates:

def : Pat<(X86tcret GR64:$dst, imm:$off),
          (TCRETURNri64 GR64:$dst, imm:$off)>;

Dan

Dan Gohman

2009-Nov-30 22:46 UTC

On Nov 30, 2009, at 1:21 PM, Dan Gohman wrote:
> 
> On Nov 29, 2009, at 2:19 AM, Albert Graef wrote:
> 
>> Have there been any changes to the x86-64 backend of the JIT which might
>> break tail call elimination? I didn't see any announcements about major
>> changes in the JIT on the ml, so I have no idea what might be going
>> wrong there.
> 
> I haven't tested it, but the following pattern in X86Instr64bit.td
> looks suspicious as it appears to attempt to support direct tailcalls
> to arbitrary 64-bit immediates:
> 
> def : Pat<(X86tcret GR64:$dst, imm:$off),
>          (TCRETURNri64 GR64:$dst, imm:$off)>;

Arnold pointed out to me that I was mistaken here; this offset is a
stack offset, so it's not the kind of thing I was looking for.

With the recent changes to support regular calls where the callee is not
within range for a 32-bit immediate on 64-bit targets, my suspicion was
that perhaps tailcalls needed similar fixing, but at another glance I
don't see anything obviously wrong there. It would be interesting if
someone could look at one of the segfaults in a debugger and determine
which address it's trying to jump to, and compare that with the actual
address of the intended callee.

Dan
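
To make the range issue concrete, here is a minimal sketch of the check behind "not within range for a 32-bit immediate". It assumes the usual 5-byte x86-64 `call rel32` / `jmp rel32` encoding, where the displacement is counted from the end of the instruction. The function names and all addresses are hypothetical and chosen only for illustration; this is not how the LLVM JIT itself decides which call form to emit.

```python
# Sketch only: can a direct call/jmp with a signed 32-bit displacement
# placed at address `site` reach the callee at address `target`?

INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1
REL32_INSN_LEN = 5  # 1 opcode byte (E8 call / E9 jmp) + 4 displacement bytes


def rel32_displacement(site: int, target: int) -> int:
    """Displacement encoded in the instruction, relative to its end."""
    return target - (site + REL32_INSN_LEN)


def fits_rel32(site: int, target: int) -> bool:
    """True if the displacement fits in a signed 32-bit immediate."""
    return INT32_MIN <= rel32_displacement(site, target) <= INT32_MAX


# Hypothetical addresses (assumptions, not taken from the thread).
jit_code = 0x7F0000001000     # a call site inside a JIT code buffer
near_callee = 0x7F0000002000  # another function in the same buffer
far_callee = 0x000000400000   # a function in the host executable

print(fits_rel32(jit_code, near_callee))
print(fits_rel32(jit_code, far_callee))
```

When the check fails, the callee address has to be materialized in a register and reached through an indirect call or jump. Comparing the address a crashing tail call actually jumps to against the intended callee address, as suggested above, would show whether a truncated displacement is involved.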

Albert Graef

2009-Dec-08 07:58 UTC

Jeffrey Yasskin wrote:
> Try batch compiling with the large code model. (llc -code-model=large)
> If that also causes tail calls to break, then I did something wrong in
> fixing far calls in the JIT.

Jeffrey, I took a closer look at this now, and all the TCO-related
weirdness I see in the Pure interpreter is indeed related to your commit
in r88984 ("Make X86-64 in the Large model always emit 64-bit calls").
Up to and including r88983, Pure passes all checks (at least with eager
compilation, see below); with r88984 and later, more than half of the
checks fail. This only happens when using dynamic compilation. As I
reported earlier, batch compilation works fine, even if the large code
model is used. OTOH, dynamic compilation is broken no matter which code
model I choose when creating the JIT.

So it seems that r88984 does break fastcc and/or tail calls in the JIT.
Maybe you don't see this in UnladenSwallow because it doesn't do tail
calls?

There's also some minor breakage which isn't TCO-related (four failed
checks in the Pure interpreter) when reenabling lazy compilation with
DisableLazyCompilation(false). These seem to go all the way back to your
commit of Nick's patch (r84032, "Keep track of stubs that are created",
fails exactly the same checks, while r84031 is fine). Those four Pure
checks all involve anonymous closures (lambdas); I still need to look at
Nick's patch to figure out what exactly is going on there.

Now I might be able to live without lazy compilation (even though it
noticeably slows down some code), but it goes without saying that as a
functional language Pure definitely needs TCO. So I can only hope that
this will be fixed before the LLVM 2.7 release.

Ok, so what next? Should I submit a bug report? Reopen PR5162?

Albert

--
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email: Dr.Graef at t-online.de, ag at muwiinfa.geschichte.uni-mainz.de
WWW: http://www.musikinformatik.uni-mainz.de/ag
