On Tue, Feb 24, 2009 at 11:50 AM, Jon Harrop <jon at ffconsultancy.com> wrote:
> Thanks for the clarification. That makes a lot more sense!
>
> LLVM's support for structs is wonderful but I don't think they can be
> called "first-class structs" until all such arbitrary restrictions have been
> removed, even though the workaround (using sret form) is trivial in this
> case.
>
> Shall I file this as a bug against LLVM?

Yes, please do, and maybe include the following code in the bug report;
it already exhibits the error.

  define fastcc { { i8*, i8* }*, i8* } @init({ { i8*, i8* }*, i8* }, i32) {
  entry:
    %2 = tail call fastcc { { i8*, i8* }*, i8* } @init({ { i8*, i8* }*, i8* } %0, i32 %1)
    ret { { i8*, i8* }*, i8* } %2
  }

> Assuming this gets fixed, will it be more efficient than the sret form?

Don't expect this to be fixed soon, as I am currently on a leave of
absence from LLVM, busy with bringing tail calls to another VM and
writing a thesis about it ;).

Whether it will be more efficient I can't answer offhand. Sorry. But
probably not, because the generated code should be quite similar.

For the sret version, the move of the result is performed before the
return:

  store { i8*, i8* } %15, { i8*, i8* }* %19
  ret i32 0

For the struct return version, this would be performed as part of
moving the result from the result registers to whatever virtual
register is expecting the result. If the register allocator decides to
merge the virtual register with the result registers, then no further
move is needed.

  %2 = tail call fastcc { { i8*, i8* }*, i8* } @init({ { i8*, i8* }*, i8* } %0, i32 %1)

Note that if you have a series of sequential recursive tail calls, this
move will only be performed once (at the bottom of the recursion, that
is, when the recursion returns), so its impact on performance should be
minimal.

regards
arnold
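For readers who want the two forms side by side at source level, here is a minimal C sketch. It is an illustration only, not code from the thread: the names `pair`, `closure`, `init_by_value` and `init_sret` are hypothetical, and using the `i32` argument as a countdown is an assumption added so that the recursion terminates (the IR above recurses unconditionally). Whether a given C compiler emits real tail calls for either form depends on the compiler, target and flags.

```c
#include <stddef.h>

/* Hypothetical C counterparts of the IR type { { i8*, i8* }*, i8* }. */
struct pair    { void *first; void *second; };
struct closure { struct pair *env; void *data; };

/* Struct-return form: the result travels back as the return value. */
struct closure init_by_value(struct closure c, int n)
{
    if (n <= 0)
        return c;                      /* bottom of the recursion */
    return init_by_value(c, n - 1);    /* call in tail position */
}

/* sret-style form: the caller passes the address of the result slot. */
void init_sret(struct closure *out, struct closure c, int n)
{
    if (n <= 0) {
        *out = c;                      /* the store happens once, here */
        return;
    }
    init_sret(out, c, n - 1);          /* call in tail position */
}
```

Both functions hand back the same value; the only difference is whether the result is written through `out` or returned directly.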
On Tuesday 24 February 2009 14:54:12 Arnold Schwaighofer wrote:
> Whether it will be more efficient I can't answer offhand. Sorry. But
> probably not, because the generated code should be quite similar.
>
> For the sret version, the move of the result is performed before the
> return:
>
>   store { i8*, i8* } %15, { i8*, i8* }* %19
>   ret i32 0
>
> For the struct return version, this would be performed as part of
> moving the result from the result registers to whatever virtual
> register is expecting the result. If the register allocator decides to
> merge the virtual register with the result registers, then no further
> move is needed.
>
>   %2 = tail call fastcc { { i8*, i8* }*, i8* } @init({ { i8*, i8* }*, i8* } %0, i32 %1)
>
> Note that if you have a series of sequential recursive tail calls, this
> move will only be performed once (at the bottom of the recursion, that
> is, when the recursion returns), so its impact on performance should be
> minimal.

Hmm, that makes it sound as though the moves between a tail call and the
following return are redundant?

-- 
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e
>> Note that if you have a series of sequential recursive tail calls, this
>> move will only be performed once (at the bottom of the recursion, that
>> is, when the recursion returns), so its impact on performance should be
>> minimal.
>
> Hmm, that makes it sound as though the moves between a tail call and the
> following return are redundant?

I am not sure I understand what you mean by redundant. What I was trying
to say is that if you have

  i32 a() {
    %1 = tailcall b()
    ret %1
  }
  i32 b() {
    %1 = tailcall c()
    ret %1
  }
  i32 c() {
    %1 = tailcall d()
    ret %1
  }
  i32 d() {
    ret i32 5
  }

only d() will actually perform the return, i.e. the move of the result
to register %eax on x86 or, in the case of a struct return, the move to
whatever registers (or stack slots?) are used to return the elements of
the struct.

regards
arnold
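The chain above can also be written as a small C sketch. This is a hypothetical rendering for illustration, not code from the thread, and C itself gives no guarantee that the calls become jumps; that depends on the compiler and optimization level. The point it shows is that a(), b() and c() only forward the callee's result, so d() is the only place where a value is actually produced.

```c
/* Hypothetical C rendering of the a -> b -> c -> d chain. */
int d(void) { return 5; }      /* the only function that produces a value */
int c(void) { return d(); }    /* call in tail position: forwards d's result */
int b(void) { return c(); }    /* call in tail position: forwards c's result */
int a(void) { return b(); }    /* call in tail position: forwards b's result */
```

If the calls are compiled as tail calls, a(), b() and c() each jump to their callee, and the value placed in the return register by d() goes straight back to the caller of a().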
On Tuesday 24 February 2009 14:54:12 Arnold Schwaighofer wrote:
> On Tue, Feb 24, 2009 at 11:50 AM, Jon Harrop <jon at ffconsultancy.com> wrote:
> > Thanks for the clarification. That makes a lot more sense!
> >
> > LLVM's support for structs is wonderful but I don't think they can be
> > called "first-class structs" until all such arbitrary restrictions have
> > been removed, even though the workaround (using sret form) is trivial in
> > this case.
> >
> > Shall I file this as a bug against LLVM?
>
> Yes please do and maybe include the following code which already
> exhibits the error in the bug report.
>
>   define fastcc { { i8*, i8* }*, i8* } @init({ { i8*, i8* }*, i8* }, i32) {
>   entry:
>     %2 = tail call fastcc { { i8*, i8* }*, i8* } @init({ { i8*, i8* }*, i8* } %0, i32 %1)
>     ret { { i8*, i8* }*, i8* } %2
>   }

I just noticed that altering my representation of the unit type also
seems to break tail calls. Specifically, if I return the i1 or i8 types.
Is that expected to break tail calls as well as returning structs? If
so, when exactly can tail calls be relied upon?

Looks like i64 works fine on 32-bit, so it is not just non-word-sized
values that are breaking it.

-- 
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e
Hello Jon,

Yes, you are right: tail call optimization breaks for i1. It breaks for
the same reason it breaks when structs are returned. A simple example:

  define fastcc i1 @i1test(i32, i32, i32, i32) {
  entry:
    %4 = tail call fastcc i1 @i1test(i32 %0, i32 %1, i32 %2, i32 %3)
    ret i1 %4
  }

In the intermediate representation, the SelectionDAG, there is a node
that converts the result of i1test from an i8 to an i1. This should not
hinder tail call optimization, but since I don't detect this case
correctly, tail calls are turned off.

A fix for this bug and the struct return bug is in svn (revision 67934).

regards
arnold

On Sat, Mar 28, 2009 at 6:16 AM, Jon Harrop <jon at ffconsultancy.com> wrote:
> I just noticed that altering my representation of the unit type also seems to
> break tail calls. Specifically, if I return the i1 or i8 types. Is that
> expected to break tail calls as well as returning structs? If so, when
> exactly can tail calls be relied upon?

Whenever I did not miss something ;). Sorry!