thr3ads.net - llvm dev - [LLVMdev] Broke my tail (call) [Feb 2009]

Arnold Schwaighofer

2009-Feb-24 14:54 UTC

[LLVMdev] Broke my tail (call)

On Tue, Feb 24, 2009 at 11:50 AM, Jon Harrop <jon at ffconsultancy.com> wrote:
> Thanks for the clarification. That makes a lot more sense!
>
> LLVM's support for structs is wonderful but I don't think they can be
> called "first-class structs" until all such arbitrary restrictions have been
> removed, even though the workaround (using sret form) is trivial in this
> case.
>
> Shall I file this as a bug against LLVM?

Yes, please do, and maybe include the following code, which already
exhibits the error, in the bug report.

define fastcc { { i8*, i8* }*, i8* } @init({ { i8*, i8* }*, i8* }, i32) {
entry:
  %2 = tail call fastcc { { i8*, i8* }*, i8* } @init({ { i8*, i8* }*, i8* } %0, i32 %1)
  ret { { i8*, i8* }*, i8* } %2
}

> Assuming this gets fixed, will it be more efficient than the sret form?

Don't expect this to be fixed soon, as I am currently on a leave of
absence from LLVM, busy with bringing tail calls to another VM and
writing a thesis about it ;).

Whether it will be more efficient I can't answer offhand. Sorry. But
probably not, because the generated code should be quite similar.

For the sret version, the move of the result is performed before the return:
  store { i8*, i8* } %15, { i8*, i8* }* %19
  ret i32 0
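
For comparison, here is a minimal sketch of what an sret form of the
@init test case might look like. It is an assumption, not code from this
thread: the name @init_sret, the parameter names, and the base case are
hypothetical, and it returns void where the snippet above returns i32 0.
The syntax is the typed-pointer IR of that period. The caller passes a
pointer to the result slot as the first argument, and the store into that
slot happens once, before the final return.

```llvm
; Hypothetical sret variant of @init (names and base case are illustrative only).
define fastcc void @init_sret({ { i8*, i8* }*, i8* }* noalias sret %result,
                              { { i8*, i8* }*, i8* } %arg, i32 %n) {
entry:
  %done = icmp eq i32 %n, 0
  br i1 %done, label %base, label %recurse

base:
  ; The "move of the result": write the struct into the caller's slot.
  store { { i8*, i8* }*, i8* } %arg, { { i8*, i8* }*, i8* }* %result
  ret void

recurse:
  %m = sub i32 %n, 1
  ; The same result pointer is forwarded, so no struct value is returned.
  tail call fastcc void @init_sret({ { i8*, i8* }*, i8* }* noalias sret %result,
                                   { { i8*, i8* }*, i8* } %arg, i32 %m)
  ret void
}
```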

For the struct return version this would be performed as part of
moving the result from the result registers to whatever virtual
register is expecting the result. If the register allocator decides to
merge the virtual register with the result registers, then no further
move is needed.
  %2 = tail call fastcc { { i8*, i8* }*, i8* } @init({ { i8*, i8* }*, i8* } %0, i32 %1)

Note that if you have a series of sequential recursive tail calls, this
move will only be performed once (at the bottom of the recursion, that
is, when the recursion returns), so its impact on performance should be
minimal.

regards
arnold

Jon Harrop

2009-Feb-24 17:42 UTC

On Tuesday 24 February 2009 14:54:12 Arnold Schwaighofer wrote:
> Whether it will be more efficient I can't answer offhand. Sorry. But
> probably not, because the generated code should be quite similar.
>
> For the sret version the move of the result is performed before the return.
>   store { i8*, i8* } %15, { i8*, i8* }* %19
>   ret i32 0
>
> For the struct return version this would be performed as part of
> moving the result from the result registers to whatever virtual
> register is expecting the result. If the register allocator decides to
> merge the virtual register with the result registers, then no further
> move is needed.
>   %2 = tail call fastcc { { i8*, i8* }*, i8* } @init({ { i8*, i8* }*, i8* } %0, i32 %1)
>
> Note that if you have a series of sequential recursive tail calls, this
> move will only be performed once (at the bottom of the recursion, that
> is, when the recursion returns), so its impact on performance should be
> minimal.

Hmm, that makes it sound as though the moves between a tail call and the
following return are redundant?

-- 
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e

Arnold Schwaighofer

2009-Feb-24 22:19 UTC

>> %2 = tail call fastcc { { i8*, i8* }*, i8* } @init({ { i8*, i8* }*, i8* } %0, i32 %1)
>>
>> Note that if you have a series of sequential recursive tail calls this
>> move will only be performed once (at the bottom of the recursion,
>> respectively when the recursion returns) so its impact on performance
>> should be minimal.
>
> Hmm, that makes it sound as though the moves between a tail call and the
> following return are redundant?

I am not sure I understand what you mean by redundant.

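What I was trying to say is that if you have

define fastcc i32 @a() {
  %1 = tail call fastcc i32 @b()
  ret i32 %1
}

define fastcc i32 @b() {
  %1 = tail call fastcc i32 @c()
  ret i32 %1
}

define fastcc i32 @c() {
  %1 = tail call fastcc i32 @d()
  ret i32 %1
}

define fastcc i32 @d() {
  ret i32 5
}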

only d() will actually perform the return, i.e. the move of the result
to register %eax on x86 or, in the case of a struct return, the move to
whatever registers (or stack slots?) are used to return the elements of
the struct.

regards
arnold

Jon Harrop

2009-Mar-28 05:16 UTC

On Tuesday 24 February 2009 14:54:12 Arnold Schwaighofer wrote:
> On Tue, Feb 24, 2009 at 11:50 AM, Jon Harrop <jon at ffconsultancy.com> wrote:
> > Thanks for the clarification. That makes a lot more sense!
> >
> > LLVM's support for structs is wonderful but I don't think they can be
> > called "first-class structs" until all such arbitrary restrictions have
> > been removed, even though the workaround (using sret form) is trivial in
> > this case.
> >
> > Shall I file this as a bug against LLVM?
>
> Yes please do and maybe include the following code which already
> exhibits the error in the bug report.
>
> define fastcc { { i8*, i8* }*, i8* } @init({ { i8*, i8* }*, i8* }, i32) {
> entry:
>   %2 = tail call fastcc { { i8*, i8* }*, i8* } @init({ { i8*, i8* }*, i8* } %0, i32 %1)
>   ret { { i8*, i8* }*, i8* } %2
> }

I just noticed that altering my representation of the unit type also seems to
break tail calls. Specifically, if I return the i1 or i8 types. Is that
expected to break tail calls as well as returning structs? If so, when
exactly can tail calls be relied upon?

Looks like i64 works fine on 32-bit, so it is not just non-word-sized values 
that are breaking it.

-- 
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e

Arnold Schwaighofer

2009-Mar-28 08:50 UTC

Hello Jon,

Yes, you are right: tail call optimization breaks for i1.

It breaks for the same reason it breaks when structs are returned.

A simple example:

define fastcc i1 @i1test(i32, i32, i32, i32) {
entry:
  %4 = tail call fastcc i1 @i1test(i32 %0, i32 %1, i32 %2, i32 %3)
  ret i1 %4
}

In the intermediate representation, the SelectionDAG, there is a node
that converts the result of i1test from an i8 to an i1. This should not
hinder tail call optimization. But since I don't detect this case
correctly, tail calls are turned off.

A fix for this bug and the struct return bug is in svn (revision 67934).
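
For builds that predate that revision, one possible workaround would be
to widen the return type to a word-sized integer. This is only a sketch
under an assumption drawn from this thread (the report that i64 returns
still work on 32-bit), not something confirmed here, and the name
@i32test is hypothetical.

```llvm
; Hypothetical variant of @i1test that returns i32 instead of i1.
define fastcc i32 @i32test(i32, i32, i32, i32) {
entry:
  ; Same shape as @i1test. With an i32 result there should be no
  ; narrowing conversion between the call and the return on x86.
  %4 = tail call fastcc i32 @i32test(i32 %0, i32 %1, i32 %2, i32 %3)
  ret i32 %4
}
```

Since a unit value carries no information, callers can ignore the
returned integer.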

regards
arnold

On Sat, Mar 28, 2009 at 6:16 AM, Jon Harrop <jon at ffconsultancy.com> wrote:
> I just noticed that altering my representation of the unit type also seems to
> break tail calls. Specifically, if I return the i1 or i8 types. Is that
> expected to break tail calls as well as returning structs? If so, when
> exactly can tail calls be relied upon?

Whenever I did not miss something ;). Sorry!
