thr3ads.net - llvm dev - [LLVMdev] Proposal: Add a guaranteed tail call marker [Apr 2014]

If this information is useful, please help other people find it:
Share via:

Reid Kleckner

2014-Apr-01 02:35 UTC

[LLVMdev] Proposal: Add a guaranteed tail call marker

Some frontends for LLVM require that LLVM perform tail call optimization
(TCO) for correctness.  Internally, LLVM refers to TCO of a non-recursive
tail call as sibling call optimization, but I'm going to refer to that
generically as TCO.  Often, functional languages like Scheme have a
language-level requirement that TCO occurs for any call in the tail
position, and this is usually why users of LLVM have asked for guaranteed
TCO functionality.

In my case, to implement vtable thunks and virtual member pointers in the
IA32 Microsoft C++ ABI, I cannot simply forward the arguments to the callee
without calling the copy constructor.  If I can use a guaranteed tail call,
I don't have to emit copy constructor calls, and things are much easier.

Currently, in order to get guaranteed TCO, frontends have to enable the
GuaranteedTailCall codegen option and obey a narrow set of rules, which
includes always using fastcc.  This is fairly awkward and doesn't solve my
use case, since the ABI requires a particular convention.

Instead, I propose that we add a new tail call marker, 'musttail', that
guarantees that TCO will occur.  I'm open to other naming suggestions.
 Some strawmen are 'tailonly' or 'guaranteedtail'.  Along with
it, I
propose a set conservative of verifier enforced rules to follow to ensure
that most reasonable backends will be able to perform TCO.  It also ensures
that optimizations, like tail merging, don't accidentally move a call out
of tail position.

First, the prototype of the caller and the callee must be "congruent".
Two
prototypes are congruent if all return and parameter types match except for
the pointer types.  The pointer types can have different pointee types, but
they must have the same address space.  In addition, all the ABI impacting
attributes on the parameters must be the same, meaning byval, inalloca,
inreg, etc, must all match up.

Second, the call must be in tail position.  The call must be immediately
followed by a bitcast or a ret, both of which must use the result of the
call.  If there is a bitcast, it must be followed by a ret which uses the
bitcast.

Importantly, LLVM must perform TCO regardless of the computation necessary
to compute the arguments for the callee, which is not the case today.

I sent a patch to llvm-commits, but I'd like to hear high-level feedback on
llvmdev:
http://llvm-reviews.chandlerc.com/D3240

Thanks!
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140331/71db259e/attachment.html>

Rafael Espíndola

2014-Apr-01 18:03 UTC

head link

[LLVMdev] Proposal: Add a guaranteed tail call marker

On 31 March 2014 22:35, Reid Kleckner <rnk at google.com>
wrote:> Some frontends for LLVM require that LLVM perform tail call optimization
> (TCO) for correctness.  Internally, LLVM refers to TCO of a non-recursive
> tail call as sibling call optimization, but I'm going to refer to that
> generically as TCO.  Often, functional languages like Scheme have a
> language-level requirement that TCO occurs for any call in the tail
> position, and this is usually why users of LLVM have asked for guaranteed
> TCO functionality.
>
> In my case, to implement vtable thunks and virtual member pointers in the
> IA32 Microsoft C++ ABI, I cannot simply forward the arguments to the callee
> without calling the copy constructor.  If I can use a guaranteed tail call,
> I don't have to emit copy constructor calls, and things are much
easier.
>
> Currently, in order to get guaranteed TCO, frontends have to enable the
> GuaranteedTailCall codegen option and obey a narrow set of rules, which
> includes always using fastcc.  This is fairly awkward and doesn't solve
my
> use case, since the ABI requires a particular convention.
>
> Instead, I propose that we add a new tail call marker, 'musttail',
that
> guarantees that TCO will occur.  I'm open to other naming suggestions. 
Some
> strawmen are 'tailonly' or 'guaranteedtail'.  Along with
it, I propose a set
> conservative of verifier enforced rules to follow to ensure that most
> reasonable backends will be able to perform TCO.  It also ensures that
> optimizations, like tail merging, don't accidentally move a call out of
tail
> position.
>
> First, the prototype of the caller and the callee must be
"congruent".  Two
> prototypes are congruent if all return and parameter types match except for
> the pointer types.  The pointer types can have different pointee types, but
> they must have the same address space.  In addition, all the ABI impacting
> attributes on the parameters must be the same, meaning byval, inalloca,
> inreg, etc, must all match up.
>
> Second, the call must be in tail position.  The call must be immediately
> followed by a bitcast or a ret, both of which must use the result of the
> call.  If there is a bitcast, it must be followed by a ret which uses the
> bitcast.
>
> Importantly, LLVM must perform TCO regardless of the computation necessary
> to compute the arguments for the callee, which is not the case today.
>
> I sent a patch to llvm-commits, but I'd like to hear high-level
feedback on
> llvmdev:
> http://llvm-reviews.chandlerc.com/D3240
It would even be invalid to introduce calls to the copy constructor,
no? I don't think the thunks exist at the c++ standard level, so we
cannot create extra copy constructor calls for them. In other words,
we need something like what you are proposing.

Having this also puts us on the path to improve itanium codegen for

struct A {
  virtual void f(int x, ...);
};
struct B {
  virtual void f(int x, ...);
};
struct C : A, B {
  virtual void c();
  virtual void f(int x, ...);
};
void C::f(int x, ...) {  }

Currently we have clang duplicate the body of C::f.  We cannot create
a normal copy since we cannot represent the copy of the varargs, even
when we want llvm to eliminate those copies. We could use a inalloca
argument for the rest of the frame instead of a "...". Since we have a
guarantee that no copy will be done, we don't have to know the size.

Cheers,
Rafael

Reid Kleckner

2014-Apr-01 18:20 UTC

head link

[LLVMdev] Proposal: Add a guaranteed tail call marker

On Tue, Apr 1, 2014 at 11:03 AM, Rafael Espíndola <
rafael.espindola at gmail.com> wrote:
> Having this also puts us on the path to improve itanium codegen for
>
> struct A {
>   virtual void f(int x, ...);
> };
> struct B {
>   virtual void f(int x, ...);
> };
> struct C : A, B {
>   virtual void c();
>   virtual void f(int x, ...);
> };
> void C::f(int x, ...) {  }
>
> Currently we have clang duplicate the body of C::f.  We cannot create
> a normal copy since we cannot represent the copy of the varargs, even
> when we want llvm to eliminate those copies. We could use a inalloca
> argument for the rest of the frame instead of a "...". Since we
have a
> guarantee that no copy will be done, we don't have to know the size.
>
Yep!  Fixing this problem is definitely doable with this feature.  Rather
than using inalloca, I'd add an Ellipsis Value so we can write:
musttail call void @fn(i8* %this_adj, ...)

You wouldn't be able to place '...' anywhere other than as the last
argument of a musttail call, since we can't really lower it otherwise.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140401/f8f9c658/attachment.html>

llvm dev - Apr 2014 - [LLVMdev] Proposal: Add a guaranteed tail call marker

[LLVMdev] Proposal: Add a guaranteed tail call marker

[LLVMdev] Proposal: Add a guaranteed tail call marker

[LLVMdev] Proposal: Add a guaranteed tail call marker