On Feb 15, 2012, at 12:16 PM, Chris Lattner <clattner at apple.com> wrote:

> On Feb 14, 2012, at 10:30 AM, David Terei wrote:
>
>> Hmm writing a blog post about TNTC is beyond the time I have right now.
>
> Sure, understandable. I'm surprised someone else hasn't already :)
>
>> Here is some high level documentation of the layout of Heap objects in GHC:
>>
>> http://hackage.haskell.org/trac/ghc/wiki/Commentary/Rts/Storage/HeapObjects#InfoTables
>>
>> With TNTC enabled we generate code for closures of this form:
>>
>> .text
>>     .align 8
>>     .long Main_main1_srt-(Main_main1_info)+0
>>     .long 0
>>     .quad 4294967299
>>     .quad 0
>>     .quad 270582939663
>> .globl Main_main1_info
>> .type Main_main1_info, @object
>> Main_main1_info:
>> .Lc1Df:
>>     leaq -8(%rbp),%rax
>>     cmpq %r15,%rax
>>     jb .Lc1Dh
>
> Ok. I'd strongly recommend the approach of generating the table inside the prolog of the function. This means you'd get something like this:
>

This is starting to look very similar to how ARM constant islands work, without the extra ugliness from how small the ARM immediate displacements are.

-Jim

> .text
>     .align 8
> .globl Main_main1_info
> .type Main_main1_info, @object
> Main_main1_info:
> .Lc1Df:
>     jmp .Ltmp
>     .long Main_main1_srt-(Main_main1_info)+0
>     .long 0
>     .quad 4294967299
>     .quad 0
>     .quad 270582939663
> .Ltmp:
>     leaq -8(%rbp),%rax
>     cmpq %r15,%rax
>     jb .Lc1Dh
>
> Since the jmp is a fixed 2 bytes (0xEB, tablesize), all references to the table can still be done with trivial pc/RIP-relative addressing within the closure, and you just need one pointer for both the table and the closure data.
>
> If you want to get extra special and tricky, you could be even more devious by storing "Main_main1_info + 2 + table size" as the canonical pointer. If you jump to *that* when dispatching to the closure, then you completely avoid the runtime overhead of the extra unconditional jump and get exactly the same code you're getting with GHC's native code generator.
>
> To access the table in LLVM IR, you'll be generating some truly special (i.e. horrible :) IR along the lines of (e.g. to load the 4294967299 field):
>
>     load (gep (bitcast @Main_main1_info to i64*), 0, 2)
>
> The code generator probably isn't smart enough to turn that into a rip-relative memory access, but adding that should be straight-forward.
>
> The tricky bit will be figuring out how to ensure that the inline asm blob containing the table will come before the standard prolog. Perhaps this can be handled by the existing GHC calling convention, or through creative use of the naked attribute.
>
> -Chris
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
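
To make the "2 + table size" arithmetic in the quoted proposal concrete, here is a minimal sketch that recomputes it for the listing above. It assumes `.long` is 4 bytes and `.quad` is 8 bytes (as on x86-64 with GNU as) and that the assembler picks the 2-byte short form of `jmp`. The names `TABLE`, `table_size`, `encode_jmp_short` and `field_offsets` are made up for illustration; none of this is GHC or LLVM code.

```python
import struct

# Table entries from the quoted listing, as (directive, size in bytes).
# Assumption: .long = 4 bytes, .quad = 8 bytes (x86-64, GNU as).
TABLE = [(".long", 4), (".long", 4), (".quad", 8), (".quad", 8), (".quad", 8)]

JMP_SHORT_LEN = 2  # opcode 0xEB followed by one signed 8-bit displacement


def table_size(entries):
    """Total number of bytes occupied by the table."""
    return sum(size for _, size in entries)


def encode_jmp_short(displacement):
    """Encode `jmp rel8`; rel8 counts from the end of the 2-byte instruction."""
    if not -128 <= displacement <= 127:
        raise ValueError("displacement does not fit in a signed 8-bit field")
    return bytes([0xEB]) + struct.pack("b", displacement)


def field_offsets(entries, start=JMP_SHORT_LEN):
    """Byte offset of each table entry, measured from the Main_main1_info label."""
    offsets = []
    pos = start
    for directive, size in entries:
        offsets.append((directive, pos))
        pos += size
    return offsets


if __name__ == "__main__":
    size = table_size(TABLE)
    print("table size:", size)
    print("jmp bytes:", encode_jmp_short(size).hex())
    print("code entry offset (label + 2 + table size):", JMP_SHORT_LEN + size)
    for directive, offset in field_offsets(TABLE):
        print(f"{directive} at label+{offset}")
```

Under these assumptions the table is 32 bytes, the jump encodes as EB 20, and the code after `.Ltmp` starts at label+34. The `.quad 4294967299` entry sits at label+10, so in this layout the quads are not 8-byte aligned relative to the label.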
> This is starting to look very similar to how ARM constant islands work, without the extra ugliness from how small the ARM immediate displacements are.
>
> -Jim

Would there be any reason that this couldn't be seen as an opportunity to move the constant islands pass out of the ARM backend and make the target-independent constant pools (which ARM bypasses completely) more generic?

______________________________________
From: llvmdev-bounces at cs.uiuc.edu [llvmdev-bounces at cs.uiuc.edu] On Behalf Of Jim Grosbach [grosbach at apple.com]
Sent: 15 February 2012 21:59
To: Chris Lattner
Cc: Simon Marlow; Sergiu Ivanov; llvmdev at cs.uiuc.edu; cvs-ghc; Gabor Greif
Subject: Re: [LLVMdev] LLVM GHC Backend: Tables Next To Code

-- IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
On Feb 15, 2012, at 2:08 PM, James Molloy <James.Molloy at arm.com> wrote:

>> This is starting to look very similar to how ARM constant islands work, without the extra ugliness from how small the ARM immediate displacements are.
>>
>> -Jim
>
> Would there be any reason that this couldn't be seen as an opportunity to move the constant islands pass out of the ARM backend and make the target-independent constant pools (which ARM bypasses completely) more generic?
>

The ARM pass does more than just constant islands (e.g., branch relaxation, TBB/TBH formation), so it's unfortunately not very generic. It might be possible to refactor it into using target hooks and such, but I have my doubts how practical that will turn out to be. Add in that the pass is very, very fragile and I get nervous about trying to generalize it too much. Keeping the generic pass pretty brain dead and letting targets override it seems sane to me.

-Jim
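
Since branch relaxation comes up above, here is a small sketch of what the idea looks like for the x86 jmp-over-table layout quoted earlier in the thread. It is not how the ARM pass works. It only illustrates that the "fixed 2 bytes" property holds while the table fits in a signed 8-bit displacement, and that a larger table would need the 5-byte `jmp rel32` form (opcode 0xE9). The function names are hypothetical.

```python
import struct


def jmp_over(table_bytes):
    """Return the smallest x86 jmp encoding that skips `table_bytes` bytes
    placed immediately after the instruction.

    Both forms measure the displacement from the end of the jmp itself, so
    the displacement equals the table size whichever form is chosen.
    """
    if table_bytes < 0:
        raise ValueError("table size must be non-negative")
    if table_bytes <= 127:
        # Short form: 0xEB followed by a signed 8-bit displacement.
        return bytes([0xEB]) + struct.pack("b", table_bytes)
    # Near form: 0xE9 followed by a little-endian signed 32-bit displacement.
    return bytes([0xE9]) + struct.pack("<i", table_bytes)


def code_entry_offset(table_bytes):
    """Offset from the info label to the first real instruction."""
    return len(jmp_over(table_bytes)) + table_bytes


if __name__ == "__main__":
    for n in (32, 127, 128, 200):
        print(n, jmp_over(n).hex(), code_entry_offset(n))
```

A forward jump over data that directly follows it is the easy case, because the displacement does not depend on which encoding is picked. A general relaxation pass has to iterate, since growing one branch can push other targets out of range.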