Karel Gardas
2012-Jun-29  16:46 UTC
[LLVMdev] Request for merge: GHC/ARM calling convention.
Hi Renato,

On 06/25/12 12:13 AM, Renato Golin wrote:
> Hi Karel,
>
> I understand this patch has already been merged (to 3.0), so don't
> take my question as stopping the merge to head, I'm just making sure I
> got it right... The rest looks correct.
>
> + CCIfType<[v2f64], CCAssignToReg<[Q4, Q5]>>,
> + CCIfType<[f64], CCAssignToReg<[D8, D9, D10, D11]>>,
> + CCIfType<[f32], CCAssignToReg<[S16, S17, S18, S19, S20, S21, S22, S23]>>,
>
> Does this mean that for floating point support in GHC, you need VFP registers?

Yes and no. Shortly: original GHC/ARM/LLVM port was done by Stephen on ARMv5/Qemu IIRC. I've later added whole VFP support and ARMv7 support. The code in GHC is properly #ifdefed, so if there is no VFP available on pre ARMv6, then it's not used. ie. GHC STG floating points regs are then allocated in RAM instead of real hardware regs.

> I don't know much how tablegen would work in this case, but I'd expect
> it to break during codegen (with a horrid error message) if you try to
> compile that to an ARMv4-ish core.

I'm not sure I understand you right here, but if you look into ARMCallingConv.td file, you will see that exactly the same statements are used in FastCC_ARM_APCS so I think if it's going to be broken, then the GHC calling convention is not the only culprit here. Or are you talking about actual compilation for the target platform?

Anyway, I think majority of GHC/ARM related work will be done on ARMv7 if it makes you less nervous about the patch.

Thanks,
Karel
Renato Golin
2012-Jun-29  21:12 UTC
[LLVMdev] Request for merge: GHC/ARM calling convention.
On 29 June 2012 17:46, Karel Gardas <karel.gardas at centrum.cz> wrote:
> Yes and no. Shortly: original GHC/ARM/LLVM port was done by Stephen on
> ARMv5/Qemu IIRC. I've later added whole VFP support and ARMv7 support. The
> code in GHC is properly #ifdefed, so if there is no VFP available on pre
> ARMv6, then it's not used. ie. GHC STG floating points regs are then
> allocated in RAM instead of real hardware regs.

That's fine. As long as you don't try to interoperate with EABI libraries... ;)

> I'm not sure I understand you right here, but if you look into
> ARMCallingConv.td file, you will see that exactly the same statements are
> used in FastCC_ARM_APCS so I think if it's going to be broken, then the GHC
> calling convention is not the only culprit here.

APCS has been deprecated for almost a decade; I'm not sure there is anyone on the planet still using it... except maybe Nokia... ;)

You mentioned it's #ifdefed, so I guess you won't have that problem. Though it's not the best solution...

> Anyway, I think majority of GHC/ARM related work will be done on ARMv7 if it
> makes you less nervous about the patch.

I'm not nervous, and as I said before, you shouldn't hold off because of my comments. I believe it works, but it may work in a sub-optimal and maybe incompatible way with the rest of the world.

I was just making sure a standard ARM calling convention (as I believe GHC for ARM has become) could interoperate with the rest of the world. Even if not now, then in the near future, as long as you're aware that this is probably not the best option.

cheers,
--renato
Karel Gardas
2012-Jun-29  21:56 UTC
[LLVMdev] Request for merge: GHC/ARM calling convention.
On 06/29/12 11:12 PM, Renato Golin wrote:
> On 29 June 2012 17:46, Karel Gardas<karel.gardas at centrum.cz> wrote:
>> Yes and no. Shortly: original GHC/ARM/LLVM port was done by Stephen on
>> ARMv5/Qemu IIRC. I've later added whole VFP support and ARMv7 support. The
>> code in GHC is properly #ifdefed, so if there is no VFP available on pre
>> ARMv6, then it's not used. ie. GHC STG floating points regs are then
>> allocated in RAM instead of real hardware regs.
>
> That's fine. As long as you don't try to interoperate with EABI libraries... ;)

You make me worried! Let me ask, am I wrong in assuming Ubuntu 11.04/11.10/12.04 are using EABI? That's at least what I thought all the time...

Anyway, if they are EABI, then GHC/ARM needs to interoperate well with EABI libs, as Haskell code is using them too. For a registerised GHC build there are two important functions provided in GHC's RTS, written in C (or gcc inline asm, compiled by gcc and still using the platform ABI), which perform the important task of bridging the C world ABI and GHC's own ABI. They are StgRun and StgReturn. StgRun calls a Haskell (STG) function from the C world, and StgReturn returns from the Haskell world into the C world again safely. They are defined in rts/StgCRun.c, and the comment from the top of the file might be more useful here than my writing about it:

 * STG-to-C glue.
 *
 * To run an STG function from C land, call
 *
 *    rv = StgRun(f,BaseReg);
 *
 * where "f" is the STG function to call, and BaseReg is the address of the
 * RegTable for this run (we might have separate RegTables if we're running
 * multiple threads on an SMP machine).
 *
 * In the end, "f" must JMP to StgReturn (defined below), passing the
 * return-value "rv" in R1, to return to the caller of StgRun returning "rv" in
 * the whatever way C returns a value.
 *
 * NOTE: StgRun/StgReturn do *NOT* load or store Hp or any other registers
 * (other than saving the C callee-saves registers). Instead, the called
 * function "f" must do that in STG land.
 *
 * We also initially make sure that there are @RESERVED_C_STACK_BYTES@ on the
 * C-stack. This is done to reserve some space for the allocation of
 * temporaries in STG code.

If you are curious how the ARM-specific StgRun implementation looks, here it is:

#ifdef arm_HOST_ARCH

#if defined(__thumb__)
#define THUMB_FUNC ".thumb\n\t.thumb_func\n\t"
#else
#define THUMB_FUNC
#endif

StgRegTable *
StgRun(StgFunPtr f, StgRegTable *basereg) {
    StgRegTable * r;
    __asm__ volatile (
        /*
         * save callee-saves registers on behalf of the STG code.
         */
        "stmfd sp!, {r4-r10, fp, ip, lr}\n\t"
#if !defined(arm_HOST_ARCH_PRE_ARMv6)
        "vstmdb sp!, {d8-d11}\n\t"
#endif
        /*
         * allocate some space for Stg machine's temporary storage.
         * Note: RESERVED_C_STACK_BYTES has to be a round number here or
         * the assembler can't assemble it.
         */
        "sub sp, sp, %3\n\t"
        /*
         * Set BaseReg
         */
        "mov r4, %2\n\t"
        /*
         * Jump to function argument.
         */
        "bx %1\n\t"

        ".global " STG_RETURN "\n\t"
        THUMB_FUNC
        ".type " STG_RETURN ", %%function\n"
        STG_RETURN ":\n\t"
        /*
         * Free the space we allocated
         */
        "add sp, sp, %3\n\t"
        /*
         * Return the new register table, taking it from Stg's R1 (ARM's R7).
         */
        "mov %0, r7\n\t"
        /*
         * restore callee-saves registers.
         */
#if !defined(arm_HOST_ARCH_PRE_ARMv6)
        "vldmia sp!, {d8-d11}\n\t"
#endif
        "ldmfd sp!, {r4-r10, fp, ip, lr}\n\t"
      : "=r" (r)
      : "r" (f), "r" (basereg), "i" (RESERVED_C_STACK_BYTES)
#if !defined(__thumb__)
        /* In ARM mode, r11/fp is frame-pointer and so we cannot mark
           it as clobbered. If we do so, GCC complains with error. */
      : "%r4", "%r5", "%r6", "%r7", "%r8", "%r9", "%r10", "%ip", "%lr"
#else
        /* In Thumb mode r7 is frame-pointer and so we cannot mark
           it as clobbered. On the other hand we mark as clobbered also
           those regs not used in Thumb mode. Hard to judge if this is
           needed, but certainly Haskell code is using them for
           placing GHC's virtual registers there. See
           includes/stg/MachRegs.h
           Please note that Haskell code is compiled by GHC/LLVM into
           ARM code (not Thumb!), at least as of February 2012 */
      : "%r4", "%r5", "%r6", "%r8", "%r9", "%r10", "%fp", "%ip", "%lr"
#endif
    );
    return r;
}
#endif

Please see https://github.com/ghc/ghc/blob/master/rts/StgCRun.c if you are interested in seeing implementations of that function for other architectures (x86/amd64/ppc/...).

Honestly speaking, I don't know if I'm making things clearer or more confusing with this, but I hope you can somehow distill it and tell me if there is anything wrong with our ARM/GHC support.

Thanks a lot!
Karel
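
For readers without an ARM toolchain at hand, the control flow of the StgRun/StgReturn contract quoted above can be modelled in portable C. This is only a rough sketch and not GHC code: it assumes setjmp/longjmp as a stand-in for saving and restoring the C callee-saves registers, and every name prefixed with sketch_ or Sketch (including the r1_payload field) is hypothetical and invented for illustration. It does not model the register mapping, the reserved C stack space, or the VFP save/restore.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical stand-in for StgRegTable; not the real RTS definition. */
typedef struct SketchRegTable {
    int r1_payload;  /* placeholder state so the example has something to check */
} SketchRegTable;

/* Hypothetical stand-in for StgFunPtr. */
typedef void (*SketchStgFun)(SketchRegTable *base_reg);

/* Models the C context (callee-saves registers) that StgRun saves. */
static jmp_buf sketch_c_context;

/* Models STG register R1, which carries the return value "rv". */
static SketchRegTable *sketch_r1;

/* Models StgReturn: STG code finishes by jumping here with rv in R1. */
static void sketch_stg_return(SketchRegTable *rv)
{
    sketch_r1 = rv;
    longjmp(sketch_c_context, 1);  /* resume the caller of sketch_stg_run */
}

/* Models StgRun: save the C context, enter STG code, resume on return. */
static SketchRegTable *sketch_stg_run(SketchStgFun f, SketchRegTable *base_reg)
{
    assert(f != NULL && base_reg != NULL);
    if (setjmp(sketch_c_context) == 0) {
        f(base_reg);
        return NULL;  /* reached only if f broke the contract and returned */
    }
    return sketch_r1;  /* reached after sketch_stg_return jumped back */
}

/* An example "STG function": does some work, then jumps to the return stub. */
static void sketch_stg_function(SketchRegTable *base_reg)
{
    base_reg->r1_payload += 1;
    sketch_stg_return(base_reg);
}
```

Calling sketch_stg_run(sketch_stg_function, &table) hands back the address of table once the function has jumped to the return stub. The single static jmp_buf makes this model non-reentrant, which is acceptable for an illustration but is one more way it differs from the inline assembly in the message.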