On Tuesday 22 July 2008 01:23, Chris Lattner wrote:

> David, I'm not sure I follow. It is, of course, very important for us
> that llvm-gcc generate ABI compliant code on x86-64. I'm just saying
> that if struct-return does not provide the ABI required for a specific
> source construct that another lowering would be needed.

Ah, ok. I misunderstood your statement.

> In the case of X86-64, llvm-gcc does use aggregate return (for the
> interesting cases which return things in registers) and it does do the

I don't follow. By "aggregate return" do you mean "structs as first class
values?" That is, llvm-gcc generates a return of a struct by value?

> right thing. However, returning a {i64, i64, i64, i64} by value and
> having it automatically be returned "by pointer" is less interesting,

What do you mean by "less interesting?"

> as we already have a direct way to handle that (and llvm-gcc already
> produces it).

So by, "direct way," you mean, "by using llvm-gcc?" Unfortunately, that
doesn't work for everyone. It seems to me that target-specific issues like
ABI compatibility should be handled by llvm directly.

> AFAIK, llvm-gcc/g++ does an *extremely* good job of matching the
> X86-64 ABI on mainline.

But that's all implemented within llvm-gcc. LLVM codegen right now does not
implement the ABI correctly.

Apologies if I've misunderstood things again. I'm trying to get clarity on
this issue.

-Dave

On Jul 23, 2008, at 9:05 AM, David Greene wrote:

>> In the case of X86-64, llvm-gcc does use aggregate return (for the
>> interesting cases which return things in registers) and it does do
>> the
>
> I don't follow. By "aggregate return" do you mean "structs as first
> class values?" That is, llvm-gcc generates a return of a struct by value?

Yes, consider:

    struct foo { double X; long Y; };
    struct foo test(double *P1, long *P2) {
      struct foo F;
      F.X = *P1;
      F.Y = *P2;
      return F;
    }

we compile this to:

    %struct.foo = type { double, i64 }

    define %struct.foo @test(double* %P1, i64* %P2) nounwind {
    entry:
      load double* %P1, align 8    ; <double>:0 [#uses=1]
      load i64* %P2, align 8       ; <i64>:1 [#uses=1]
      %mrv3 = insertvalue %struct.foo undef, double %0, 0    ; <%struct.foo> [#uses=1]
      %mrv4 = insertvalue %struct.foo %mrv3, i64 %1, 1       ; <%struct.foo> [#uses=1]
      ret %struct.foo %mrv4
    }

which was previously (before first class aggregates got enabled
yesterday):

    define %struct.foo @test(double* %P1, i64* %P2) nounwind {
    entry:
      load double* %P1, align 8    ; <double>:0 [#uses=1]
      load i64* %P2, align 8       ; <i64>:1 [#uses=1]
      ret double %0, i64 %1
    }

and both produce this machine code:

    _test:
            movq    (%rsi), %rax
            movsd   (%rdi), %xmm0
            ret

>> right thing. However, returning a {i64, i64, i64, i64} by value and
>> having it automatically be returned "by pointer" is less interesting,
>
> What do you mean by "less interesting?"

There are already other ways to handle this, rather than returning the
entire aggregate by value. For example, we compile:

    struct foo { double X; long Y, Z; };
    struct foo test(double *P1, long *P2) {
      struct foo F;
      F.X = *P1;
      F.Y = *P2;
      return F;
    }

into:

    %struct.foo = type { double, i64, i64 }

    define void @test(%struct.foo* noalias sret %agg.result, double* %P1, i64* %P2) nounwind {
    entry:
      load double* %P1, align 8    ; <double>:0 [#uses=1]
      load i64* %P2, align 8       ; <i64>:1 [#uses=1]
      getelementptr %struct.foo* %agg.result, i32 0, i32 0    ; <double*>:2 [#uses=1]
      store double %0, double* %2, align 8
      getelementptr %struct.foo* %agg.result, i32 0, i32 1    ; <i64*>:3 [#uses=1]
      store i64 %1, i64* %3, align 8
      ret void
    }

which has no first class aggregates. When the struct is very large
(e.g. containing an array) you REALLY REALLY do not want to use
first-class aggregate return, you want to return explicitly by pointer so
the memcpy is explicit in the IR.

>> AFAIK, llvm-gcc/g++ does an *extremely* good job of matching the
>> X86-64 ABI on mainline.
>
> But that's all implemented within llvm-gcc. LLVM codegen right now
> does not implement the ABI correctly.

Getting the ABI right requires the front-end to do target-specific
work. Without exposing the entire C (and every other language) type
through to the code generator, there is no good solution for this. We
are working to incrementally improve things though. Thinking the code
generator will just magically handle all your ABI issues for you is
wishful thinking :)

-Chris

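As a rough C-level illustration of the two lowerings described above, the sketch below returns the small struct by value and writes the larger one through a caller-supplied pointer, which is approximately what the sret form does by hand. It is not llvm-gcc output. The names `struct small`, `struct big`, `make_small`, and `make_big_sret` are hypothetical, and the size comments assume an LP64 target such as x86-64 Linux, where `long` is 8 bytes.

```c
/* Assumption: LP64 target, so this struct is 16 bytes and can be
   returned in registers under the x86-64 System V ABI. */
struct small { double X; long Y; };

/* Assumption: LP64 target, so this struct is 24 bytes and is
   returned through a hidden pointer under the same ABI. */
struct big { double X; long Y, Z; };

/* By-value return: corresponds to the first-class aggregate form. */
struct small make_small(const double *P1, const long *P2) {
    struct small F;
    F.X = *P1;
    F.Y = *P2;
    return F;
}

/* Hand-written analogue of the sret form: the caller provides the
   storage and the callee stores each field through the pointer.
   Z is set to 0 here only to keep the sketch deterministic. */
void make_big_sret(struct big *result, const double *P1, const long *P2) {
    result->X = *P1;
    result->Y = *P2;
    result->Z = 0;
}
```

A caller of `make_big_sret` declares a `struct big` local and passes its address, so the copy into the result is visible in the source in the same way the stores are visible in the sret IR.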
On Wednesday 23 July 2008 12:22, Chris Lattner wrote:

> and both produce this machine code:
>
> _test:
>         movq    (%rsi), %rax
>         movsd   (%rdi), %xmm0
>         ret

Ok, that's good.

> >> right thing. However, returning a {i64, i64, i64, i64} by value and
> >> having it automatically be returned "by pointer" is less interesting,
> >
> > What do you mean by "less interesting?"
>
> There are already other ways to handle this, rather than returning the
> entire aggregate by value. For example, we compile:
>
> struct foo { double X; long Y, Z; };
> struct foo test(double *P1, long *P2) {
>   struct foo F;
>   F.X = *P1;
>   F.Y = *P2;
>   return F;
> }
>
> into:
>
> %struct.foo = type { double, i64, i64 }
> define void @test(%struct.foo* noalias sret %agg.result, double* %P1, i64* %P2) nounwind {
> entry:
>   load double* %P1, align 8    ; <double>:0 [#uses=1]
>   load i64* %P2, align 8       ; <i64>:1 [#uses=1]
>   getelementptr %struct.foo* %agg.result, i32 0, i32 0    ; <double*>:2 [#uses=1]
>   store double %0, double* %2, align 8
>   getelementptr %struct.foo* %agg.result, i32 0, i32 1    ; <i64*>:3 [#uses=1]
>   store i64 %1, i64* %3, align 8
>   ret void
> }
>
> which has no first class aggregates. When the struct is very large
> (e.g. containing an array) you REALLY REALLY do not want to use
> first-class aggregate return, you want to return explicitly by pointer so
> the memcpy is explicit in the IR.

Ok, I see what you mean. llvm-gcc does the transformation to the hidden
pointer argument.

I'm still not sure this is good. If I hand-write LLVM IR that returns a
large struct by value, the generated code should be correct. Right now it
is not, AFAIK. If you want to make the memcpy explicit, we could do the
lowering in a separate pass. That's fine with me, as long as it's handled
by LLVM so it produces correct code.

> >> AFAIK, llvm-gcc/g++ does an *extremely* good job of matching the
> >> X86-64 ABI on mainline.
> >
> > But that's all implemented within llvm-gcc. LLVM codegen right now
> > does not implement the ABI correctly.
>
> Getting the ABI right requires the front-end to do target-specific
> work. Without exposing the entire C (and every other language) type
> through to the code generator, there is no good solution for this. We
> are working to incrementally improve things though. Thinking the code
> generator will just magically handle all your ABI issues for you is
> wishful thinking :)

I never said anything about magic. You don't need "the entire C (and every
other language) type." What you need is something that tells the code
generator what ABI to use and possibly what the source language was. It
should be possible to then define how each LLVM IR type maps onto the ABI.

-Dave

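To make the idea of "defining how each type maps onto the ABI" concrete, here is a deliberately simplified sketch of a return-value classifier for structs of scalar fields. It covers only a small subset of the x86-64 System V rules: anything larger than 16 bytes goes to memory, and otherwise each 8-byte half is INTEGER if any integer field overlaps it and SSE if only floating-point fields do. It ignores x87 and `long double`, vectors, unaligned fields, and language-specific rules such as C++ non-trivial types. Every name here (`field_kind`, `field`, `ret_class`, `ret_lowering`, `classify_return`) is hypothetical and does not correspond to any LLVM data structure or API.

```c
#include <stddef.h>

/* Hypothetical description of one scalar field in a struct. */
enum field_kind { FIELD_INTEGER, FIELD_FLOAT };

struct field {
    enum field_kind kind;
    size_t offset;  /* byte offset within the struct */
    size_t size;    /* size in bytes */
};

/* Hypothetical result classes for each 8-byte half of the return value. */
enum ret_class { RET_NONE, RET_INTEGER, RET_SSE, RET_MEMORY };

struct ret_lowering { enum ret_class eightbyte[2]; };

/* Simplified classification, assuming naturally aligned scalar fields.
   Structs over 16 bytes are returned in memory (hidden pointer). */
struct ret_lowering classify_return(const struct field *fields,
                                    size_t nfields, size_t total_size) {
    struct ret_lowering r = { { RET_NONE, RET_NONE } };
    size_t i, w;

    if (total_size > 16) {
        r.eightbyte[0] = RET_MEMORY;
        r.eightbyte[1] = RET_MEMORY;
        return r;
    }
    for (i = 0; i < nfields; ++i) {
        size_t first, last;
        if (fields[i].size == 0)
            continue;  /* nothing to classify */
        first = fields[i].offset / 8;
        last = (fields[i].offset + fields[i].size - 1) / 8;
        for (w = first; w <= last && w < 2; ++w) {
            if (fields[i].kind == FIELD_INTEGER)
                r.eightbyte[w] = RET_INTEGER;   /* INTEGER wins a merge */
            else if (r.eightbyte[w] == RET_NONE)
                r.eightbyte[w] = RET_SSE;
        }
    }
    return r;
}
```

For the `{ double, i64 }` struct from the thread this yields SSE for the first half and INTEGER for the second, consistent with the `%xmm0` and `%rax` use in the machine code quoted above; the three-field struct exceeds 16 bytes and is classified as memory. Whether a mapping at this level is enough for every source language is exactly the point under debate in the thread.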
I work with David Greene and I've been looking into the x86-64 ABI. Right
now I'm working with the 2.3 release of LLVM and LLVM-GCC 4.2/2.3 (binary
release for Linux).

I'm unable to replicate the results you posted here - I'm getting wildly
different output. I wonder what I'm doing wrong. I suspect it has something
to do with my not being able to pass it the -m64 or --64 options; when I
try, I get "sorry, unimplemented: 64-bit mode not compiled in".

I want to make sure I'm using the same options and roughly the same version
so that I'm on the same page. I'm running on 64-bit SUSE Linux.

Can you tell me what options to llvm-gcc you used and how you built that
llvm-gcc (i.e. do I need to build an llvm-gcc from source to support 64-bit
mode?)

Thanks!

-Tony S.

Chris Lattner wrote:
> On Jul 23, 2008, at 9:05 AM, David Greene wrote:
>
>>> In the case of X86-64, llvm-gcc does use aggregate return (for the
>>> interesting cases which return things in registers) and it does do
>>> the
>>
>> I don't follow. By "aggregate return" do you mean "structs as first
>> class values?" That is, llvm-gcc generates a return of a struct by value?
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev