Reid Kleckner via llvm-dev
2018-Nov-26 19:53 UTC
[llvm-dev] understanding llvm's codegen for function forwarding
Re-adding llvm-dev -- silly phones not defaulting to reply-all...

There are several things here. The first is that -fno-omit-frame-pointer causes the generation of "push %rbp ; mov %rsp, %rbp". This is required for accurate stack traces, so we can't simplify to just "call / ret" as you suggest without changing the option.

The less obvious one is the spilling of RDI to stack memory and reloading it into RAX, which is what I was raising. The Sys V ABI requires that the address of a struct returned by pointer be returned in RAX, and LLVM complies.

It looks like I misremembered. We've always returned RDI in RAX for sret functions, since 2008 / r50075. However, we never did the right thing in 32-bit. I fixed that in https://bugs.llvm.org/show_bug.cgi?id=23491 / r237639.

We don't yet implement the general optimization of avoiding such spills by reusing the value returned in RAX, which is why we don't get the simple "call / ret" code you suggest.

Finally, we miss the tail call opportunity because today we just give up if sret is present on either the caller or the callee. I think we could refine that to check whether they agree, i.e. whether the sret parameter matches.

On Sat, Nov 24, 2018 at 9:20 AM Andrew Kelley <superjoe30 at gmail.com> wrote:
> On Sat, Nov 24, 2018 at 12:11 PM Reid Kleckner <rnk at google.com> wrote:
> >
> > Llvm is trying to return RDI in RAX. It doesn't trust the callee to do
> > it, because that was a bug that we fixed long ago.
>
> You're saying these extra instructions are working around a bug that
> no longer exists? Can they be removed now?
>
> What was the bug? Why can't the callee be trusted?
> >
> > On Fri, Nov 23, 2018, 11:49 AM Andrew Kelley via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> >>
> >> When compiling this LLVM IR with -O0 (no optimizations)
> >>
> >> define internal fastcc void @bar2(%Bar* nonnull sret) unnamed_addr #2 !dbg !74 {
> >> Entry:
> >>   call fastcc void @bar(%Bar* sret %0), !dbg !79
> >>   ret void, !dbg !81
> >> }
> >>
> >> why does this generate this?
> >>
> >> 0000000000000090 <bar2>:
> >>   90:   55                      push   %rbp
> >>   91:   48 89 e5                mov    %rsp,%rbp
> >>   94:   48 83 ec 10             sub    $0x10,%rsp
> >>   98:   48 89 f8                mov    %rdi,%rax
> >>   9b:   48 89 45 f8             mov    %rax,-0x8(%rbp)
> >>   9f:   e8 0c 00 00 00          callq  b0 <bar>
> >>   a4:   48 8b 45 f8             mov    -0x8(%rbp),%rax
> >>   a8:   48 83 c4 10             add    $0x10,%rsp
> >>   ac:   5d                      pop    %rbp
> >>   ad:   c3                      retq
> >>   ae:   66 90                   xchg   %ax,%ax
> >>
> >> instead of something like this?
> >>
> >> 0000000000000090 <bar2>:
> >>   9f:   e8 0c 00 00 00          callq  b0 <bar>
> >>   ad:   c3                      retq
> >>
> >> when I add `musttail` to the IR it gives me this assembly:
> >>
> >> 00000000000000a0 <bar2>:
> >>   a0:   55                      push   %rbp
> >>   a1:   48 89 e5                mov    %rsp,%rbp
> >>   a4:   48 83 ec 10             sub    $0x10,%rsp
> >>   a8:   48 89 f8                mov    %rdi,%rax
> >>   ab:   48 89 45 f8             mov    %rax,-0x8(%rbp)
> >>   af:   48 83 c4 10             add    $0x10,%rsp
> >>   b3:   5d                      pop    %rbp
> >>   b4:   e9 07 00 00 00          jmpq   c0 <bar>
> >>   b9:   0f 1f 80 00 00 00 00    nopl   0x0(%rax)
> >>
> >> which does not have a call instruction but it has prologue that I
> >> would not expect.
> >>
> >> What's going on here? Is this something that can not really be
> >> improved without optimization passes?
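
For illustration only, here is a minimal sketch of the tail call refinement suggested in the reply ("do they agree, does the sret parameter match"). It is standalone C++, not LLVM code: CallSiteModel, conservativeSRetCheck and refinedSRetCheck are hypothetical names, and the model covers only the sret condition. Real tail call eligibility also depends on calling convention, stack arguments and other factors that are ignored here.

```cpp
// Hypothetical, simplified model of one call site. Not an LLVM class.
struct CallSiteModel {
    bool callerHasSRet;           // the caller takes an sret parameter
    bool calleeHasSRet;           // the callee takes an sret parameter
    bool calleeSRetIsCallerSRet;  // the call passes the caller's own incoming
                                  // sret pointer, unmodified, as the callee's
};

// Behaviour as described in the reply: give up on the tail call if sret
// is present on either the caller or the callee.
constexpr bool conservativeSRetCheck(const CallSiteModel &cs) {
    return !cs.callerHasSRet && !cs.calleeHasSRet;
}

// Assumed refinement: caller and callee must agree on whether sret is used,
// and if both use it, the callee must receive the caller's own sret pointer.
constexpr bool refinedSRetCheck(const CallSiteModel &cs) {
    return cs.callerHasSRet == cs.calleeHasSRet &&
           (!cs.callerHasSRet || cs.calleeSRetIsCallerSRet);
}

// The bar2 -> bar forwarding case from the original question: both functions
// take sret, and bar2 passes its own %0 straight through to bar.
constexpr CallSiteModel bar2CallsBar{true, true, true};
static_assert(!conservativeSRetCheck(bar2CallsBar), "rejected by the current rule");
static_assert(refinedSRetCheck(bar2CallsBar), "accepted by the refined rule");
```

Under this model the forwarding case passes the refined check because the callee would leave the same pointer in RAX that the caller is required to return, so no spill and reload would be needed around the jump.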