On Mon, Feb 23, 2009 at 2:13 PM, Jon Harrop <jon at ffconsultancy.com> wrote:
> Moreover, I now have evidence that LLVM is not behaving as you expect:
>
> 3. Adjusting the return value from this function into sret form results in
> tail call elimination being performed correctly. Note that this is still
> passing a first-class struct by value as an argument to a function through a
> tail call.

Yes, I was wrong: the problem is of a different kind and has nothing to do
with the alloca. As you say, the struct is passed by value; that is not the
problem. For example, the following code (which contains no allocas) will
also not be tail call optimized:

define fastcc { { i8*, i8* }*, i8* } @init({ { i8*, i8* }*, i8* }, i32) {
entry:
  %2 = tail call fastcc { { i8*, i8* }*, i8* } @init({ { i8*, i8* }*, i8* } %0, i32 %1)
  ret { { i8*, i8* }*, i8* } %2
}

The problem has to do with how struct returns are represented internally by
LLVM (in the SelectionDAG) and how the tail call optimization implementation
checks whether it may perform the tail call. The implementation checks that
the <call> node is immediately followed by a <ret> node. A struct return
causes the insertion of a <merge_values> node between the tail call node and
the return node. Because of the intermediate node the backend believes it
must not optimize the tail call:

[result1, result2] = <call ... >
[merged] = <merge_values [result1, result2]>
<ret [merged]>

So the current situation is that tail calls on functions that return a
struct are not optimized (as you correctly observed ;). Sorry.

regards arnold
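For reference, a minimal sketch of what the sret-form workaround Jon
describes might look like for this function; the @init_sret name and the
exact placement of the sret attribute are illustrative, not taken from the
original code:

; hypothetical sret-form rewrite of @init (the name is made up)
define fastcc void @init_sret({ { i8*, i8* }*, i8* }* sret %agg.result, { { i8*, i8* }*, i8* } %0, i32 %1) {
entry:
  ; the aggregate is written through the hidden sret pointer instead of being
  ; returned in registers, so the ret carries no result operands
  tail call fastcc void @init_sret({ { i8*, i8* }*, i8* }* sret %agg.result, { { i8*, i8* }*, i8* } %0, i32 %1)
  ret void
}

Because the return here is void, no <merge_values> node can appear between
the call and the ret, which is presumably why the tail call is eliminated in
this form, as Jon observed.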
On Monday 23 February 2009 16:14:57 Arnold Schwaighofer wrote:
> The problem has to do with how struct returns are represented
> internally by llvm (in the SelectionDAG) and how the tail call
> optimization implementation checks if it may perform the tail call.
> The implementation checks that the <call> node is immediately followed
> by a <ret> node. A struct return causes the insertion of a
> <merge_values> node between the tail call instruction and the return
> instruction node. Because of the intermediate node the backend
> believes it must not optimize the tail call.
>
> [result1, result2] = <call ... >
> [merged] = <merge_values [result1, result2]>
> <ret [merged]>
>
> So the current situation is that tail calls on functions that return a
> struct are not optimized (as you correctly observed ;). Sorry.

Thanks for the clarification. That makes a lot more sense!

LLVM's support for structs is wonderful, but I don't think they can be
called "first-class structs" until all such arbitrary restrictions have been
removed, even though the workaround (using sret form) is trivial in this
case.

Shall I file this as a bug against LLVM?

Assuming this gets fixed, will it be more efficient than the sret form?

--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e
On Tue, Feb 24, 2009 at 11:50 AM, Jon Harrop <jon at ffconsultancy.com> wrote:
> Thanks for the clarification. That makes a lot more sense!
>
> LLVM's support for structs is wonderful but I don't think they can be
> called "first-class structs" until all such arbitrary restrictions have
> been removed, even though the workaround (using sret form) is trivial in
> this case.
>
> Shall I file this as a bug against LLVM?

Yes, please do, and maybe include the following code, which already exhibits
the error, in the bug report:

define fastcc { { i8*, i8* }*, i8* } @init({ { i8*, i8* }*, i8* }, i32) {
entry:
  %2 = tail call fastcc { { i8*, i8* }*, i8* } @init({ { i8*, i8* }*, i8* } %0, i32 %1)
  ret { { i8*, i8* }*, i8* } %2
}

> Assuming this gets fixed, will it be more efficient than the sret form?

Don't expect this to be fixed soon, as I am currently on a leave of absence
from LLVM, busy with bringing tail calls to another VM and writing a thesis
about it ;).

Whether it will be more efficient I can't answer offhand, sorry. But
probably not, because the generated code should be quite similar. For the
sret version, the move of the result is performed before the return:

  store { i8*, i8* } %15, { i8*, i8* }* %19
  ret i32 0

For the struct-return version, this move would be performed as part of
moving the result from the result registers to whatever virtual register is
expecting the result. If the register allocator decides to merge the virtual
register with the result registers, then no further move is needed.

  %2 = tail call fastcc { { i8*, i8* }*, i8* } @init({ { i8*, i8* }*, i8* } %0, i32 %1)

Note that if you have a series of sequential recursive tail calls, this move
will only be performed once (at the bottom of the recursion, that is, when
the recursion returns), so its impact on performance should be minimal.

regards arnold
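For illustration, a hedged sketch of how a non-tail caller might consume the
aggregate return value; the @use_init name is made up, and the extractvalue
is only meant to show where the copy out of the result registers would land:

define fastcc i8* @use_init({ { i8*, i8* }*, i8* } %s, i32 %n) {
entry:
  ; hypothetical caller: the aggregate comes back in @init's result registers
  %r = call fastcc { { i8*, i8* }*, i8* } @init({ { i8*, i8* }*, i8* } %s, i32 %n)
  ; the move out of the result registers happens when a field is used
  %p = extractvalue { { i8*, i8* }*, i8* } %r, 1
  ret i8* %p
}

If the register allocator assigns %r's components to the physical result
registers, the extractvalue costs nothing extra, which is the merging case
described above.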
On Feb 24, 2009, at 2:50 AM, Jon Harrop wrote:
> On Monday 23 February 2009 16:14:57 Arnold Schwaighofer wrote:
>>
>> So the current situation is that tail calls on functions that return a
>> struct are not optimized (as you correctly observed ;). Sorry.
>
> Thanks for the clarification. That makes a lot more sense!
>
> LLVM's support for structs is wonderful but I don't think they can be
> called "first-class structs" until all such arbitrary restrictions have
> been removed, even though the workaround (using sret form) is trivial in
> this case.

"First-class" in this context means "values can have this type", not
"everything that anyone might want is supported in all backends."

A meta issue is that LLVM IR differs from many virtual machine ISAs in that
it's more a language for expressing programs than a specification of what
some particular machine can actually execute. When a particular LLVM
component doesn't support a particular aspect of the language, that's
considered a bug in the component, and not necessarily a problem in the
language.

This arrangement is convenient for people developing LLVM itself, because
new language features can be added without requiring that all backends and
execution engines support every feature in full generality, which isn't
always immediately useful, and sometimes not even meaningful. However, it's
inconvenient for people using LLVM, because there's no complete list of
which features are actually supported on any given backend. Some features
that are commonly unsupported are now mentioned in LangRef.html, though that
list is certainly not complete.

One way this issue might be addressed is with an LLVM IR conformance test
suite. Covering the whole language would be a big project, but the work
could be done incrementally. I wonder if people would be interested in
working on this.

> Shall I file this as a bug against LLVM?

Yes, thanks.

Dan
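To make the test-suite idea concrete, a conformance-style test for this
particular case might look something like the sketch below. The RUN line
follows the grep-based style used in llvm/test at the time; -march=x86 and
-tailcallopt are real llc options, but the exact grep pattern is an
assumption about what the optimized output would contain:

; RUN: llvm-as < %s | llc -march=x86 -tailcallopt | grep TAILCALL
; hypothetical test: a fastcc tail call returning a first-class struct
; should still be lowered as a tail call (the TAILCALL marker is assumed)
define fastcc { { i8*, i8* }*, i8* } @init({ { i8*, i8* }*, i8* }, i32) {
entry:
  %2 = tail call fastcc { { i8*, i8* }*, i8* } @init({ { i8*, i8* }*, i8* } %0, i32 %1)
  ret { { i8*, i8* }*, i8* } %2
}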