Alexey Perevalov
2015-Apr-23 13:03 UTC
[LLVMdev] question about alignment of structures on the stack (arm 32)
----------------------------------------
> Date: Tue, 21 Apr 2015 09:15:02 -0700
> Subject: Re: [LLVMdev] question about alignment of structures on the stack (arm 32)
> From: t.p.northover at gmail.com
> To: alexey.perevalov at hotmail.com
> CC: llvmdev at cs.uiuc.edu
>
>> I'm using a Mach-O loader (https://github.com/LubosD/darling/) and trying to make it work on ARM.
>> The scenario is to load a Mach-O binary (e.g. compiled in Xcode) that calls functions from
>> an ELF library which implements libobjc2 and CoreFoundation.
>>
>> In Mach-O on ARM the stack is 4-byte aligned. Code produced for ELF expects 8-byte alignment.
>> So in 50% of cases, when a call is made from Mach-O to ELF, the stack pointer register holds an address that is not 8-byte aligned.
>
> Ah, that could do it. I see that LLVM does indeed make use of stack
> alignment in this case. Regardless, this approach is going to go
> really badly.
>
> By default almost all ELF platforms use an ABI called AAPCS (either
> hard or soft float). iOS uses an older ABI called APCS. You can't mix
> code from these two worlds in any kind of non-trivial case without a
> translation layer.

Do you mean a translation layer in the loader? If so, the loader could replace every call into ELF code with a call to a stub function, and the stub would adjust the stack and so on. But in that case the stub has to know the signature of the function being called, otherwise arguments on the stack could be missed. I think it's the compiler's responsibility. Or did you mean something like virtual machines?

> You've discovered one issue: AAPCS requires 8-byte alignment for sp,
> APCS only requires 4. It's the first of many without a more thorough
> approach to the interface between the two.

I have seen https://developer.apple.com/library/ios/documentation/Xcode/Conceptual/iPhoneOSABIReference/Articles/ARMv6FunctionCallingConventions.html. The document says the ARMv6 calling convention is similar to ARMv7 and refers to AAPCS, but it also describes some discrepancies:

+ The stack is 4-byte aligned at the point of function calls.
  This is where I ran into bugs due to stack alignment, but as I wrote before, I think realignment, or using add instead of orr, could solve it.
+ Large data types (larger than 4 bytes) are 4-byte aligned.
  I haven't tested this case yet, but I think it could have the same pitfalls as orr r0, r0, 4.
+ Register R7 is used as a frame pointer
  If I understood correctly, it's for debugging purposes only, but the disassembly of my CoreFoundation (ELF) shows r7 in use. The frame pointer on my system is r11.
+ Register R9 has special usage
  The document says r9 can be used since iOS 3.0, and I found it in use in my CoreFoundation. So I don't think it will be a problem.

>> I haven't yet tested __attribute__((pcs("aapcs")))/-target-abi. Maybe there is a magic pcs attribute that I could apply to the dangerous functions, but I would prefer to solve the problem in general.
>
> I don't think so; __attribute__((pcs("apcs"))) might work, if it
> existed. But it doesn't. You might find it's fairly easy to add it in
> Clang, but I worry about the assumptions being made in the backend.

I tried -mstack-alignment=8 -mstackrealign for x86_64 and found that it works, for example:

    - subq $32, %rsp
    + andq $-8, %rsp
    + subq $24, %rsp
    ...

and of course the function body was modified too, but for ARM nothing happened. I tried to understand what goes wrong in LLVM, but there are too many layers of abstraction. Maybe that code exists, but the condition in ARMBaseRegisterInfo::canRealignStack prevents it from being generated.

BTW, I built llvm/clang 3.6 (it was impossible to build the latest version from HEAD) and something changed ;)

    - str r1, [sp, #20]
    - str r2, [sp, #16]
    - add r1, sp, #16
    - orr r2, r1, #4
    + str r1, [sp, #16]
    + str r2, [sp, #12]
    + add r1, sp, #12
    + add r2, r1, #4

add instead of orr. Unfortunately, I haven't yet put clang 3.6 into my chroot to build with it (I'm not using cross compilation). But if somebody could point me to the relevant source code or name the patch, I would really appreciate it.

> Either way, I'd recommend against trying to hack just this one stack
> alignment issue.
>
> Tim.
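The orr/add difference in the listing above comes down to one bit. The following is a minimal C sketch, not code from LLVM: the function names are hypothetical and the addresses are made up. It models the address computation `orr r2, r1, #4` next to `add r2, r1, #4` and shows that the two agree only when bit 2 of the base address is clear, which holds when the base is 8-byte aligned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Models `orr r2, r1, #4`: correct only if bit 2 of base is clear. */
uint32_t field_addr_orr(uint32_t base) { return base | 4u; }

/* Models `add r2, r1, #4`: makes no assumption about alignment. */
uint32_t field_addr_add(uint32_t base) { return base + 4u; }

/* True when the orr form yields the same address as the add form. */
bool orr_is_safe(uint32_t base)
{
    return field_addr_orr(base) == field_addr_add(base);
}

/* Illustrative check with made-up stack addresses. */
void orr_demo(void)
{
    assert(orr_is_safe(0x7fff0010u));   /* 8-byte aligned base: same result */
    assert(!orr_is_safe(0x7fff0014u));  /* 4-byte aligned only: orr is a no-op */
}
```

With a base that is only 4-byte aligned, the orr form returns the base itself, so the second word would be read from or written to the same slot as the first one.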
Tim Northover
2015-Apr-23 14:09 UTC
[LLVMdev] question about alignment of structures on the stack (arm 32)
>> By default almost all ELF platforms use an ABI called AAPCS (either
>> hard or soft float). iOS uses an older ABI called APCS. You can't mix
>> code from these two worlds in any kind of non-trivial case without a
>> translation layer.
>
> Do you mean a translation layer in the loader? If so, the loader could replace every call into ELF code with a call to a stub function, and the stub would adjust the stack and so on. But in that case the stub has to know the signature of the function being called, otherwise
> arguments on the stack could be missed.

Yep, that's pretty much exactly what I had in mind. You'd probably
need at least some assembler component.

> I think it's the compiler's responsibility.

Compilers generally don't take the responsibility for making two ABIs
compatible, with certain exceptions (ironically, the main one I know
of *is* in ARM, where AAPCS and AAPCS-VFP have some accommodations).

> This is where I ran into bugs due to stack alignment, but as I wrote before, I think realignment, or using add instead of orr, could solve it.

> Large data types (larger than 4 bytes) are 4-byte aligned.

This is a big one. It means structs will be laid out differently
unless you're careful, but the most difficult aspect is that it
applies to function calls too. Consider:

    void func(int x, long long y)

iOS will pass y in registers r1 and r2. ELF code will expect it in
registers r2 and r3. Similar effects happen to arguments that get
passed on the stack.

> + Register R7 is used as a frame pointer
> If I understood correctly, it's for debugging purposes only, but the disassembly of my CoreFoundation (ELF) shows r7 in use. The frame pointer on my system is r11.
> + Register R9 has special usage
> The document says r9 can be used since iOS 3.0, and I found it in use in my CoreFoundation. So I don't think it will be a problem.

Yes, these ones are probably harmless.

There are other issues too, particularly when you get to C++ (name
mangling and exceptions spring to mind). But I expect you've got
enough to worry about for now.

> - orr r2, r1, #4
> + add r2, r1, #4
> add instead of orr. Unfortunately, I haven't yet put clang 3.6 into my chroot to build with it (I'm not using cross compilation).
> But if somebody could point me to the relevant source code or name the patch, I would really appreciate it.

I wouldn't rely on this. Trunk emits orr again, it's likely just a
random code perturbation and will bite you elsewhere without a real
solution.

Tim.
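The register difference described for `void func(int x, long long y)` can be reproduced with a small model. This is a deliberately simplified sketch and not how any compiler implements it: `assign_core_regs` is a hypothetical helper, it only handles 4-byte and 8-byte integer arguments in r0-r3, and it sends an argument entirely to the stack when it does not fit, ignoring the splitting rules real ABIs have. The only thing it varies is the alignment applied to 8-byte arguments: 8 for AAPCS, 4 for the iOS rules quoted in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* sizes[i]     : argument size in bytes (4 or 8)
 * align64      : alignment of 8-byte arguments (8 = AAPCS, 4 = iOS rules)
 * first_reg[i] : receives the first core register used, or -1 for stack */
void assign_core_regs(const int *sizes, size_t n, int align64, int *first_reg)
{
    int next = 0; /* next free core register, r0..r3 */
    for (size_t i = 0; i < n; i++) {
        int regs = sizes[i] / 4;
        /* With 8-byte alignment a 64-bit value starts in an even register. */
        if (sizes[i] == 8 && align64 == 8 && (next % 2) != 0)
            next++;
        if (next + regs <= 4) {
            first_reg[i] = next;
            next += regs;
        } else {
            first_reg[i] = -1; /* simplified: whole argument on the stack */
            next = 4;          /* no register use after the first spill */
        }
    }
}

/* Illustrative check for func(int x, long long y). */
void regs_demo(void)
{
    const int sizes[2] = { 4, 8 };
    int regs[2];
    assign_core_regs(sizes, 2, 8, regs);
    assert(regs[0] == 0 && regs[1] == 2); /* AAPCS: y in r2:r3, r1 skipped */
    assign_core_regs(sizes, 2, 4, regs);
    assert(regs[0] == 0 && regs[1] == 1); /* iOS rules: y in r1:r2 */
}
```

A stub that bridges the two conventions would have to undo exactly this kind of shift, which is why it needs the callee's signature.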
Alexey Perevalov
2015-Apr-23 16:07 UTC
[LLVMdev] question about alignment of structures on the stack (arm 32)
----------------------------------------
> Date: Thu, 23 Apr 2015 07:09:47 -0700
> Subject: Re: [LLVMdev] question about alignment of structures on the stack (arm 32)
> From: t.p.northover at gmail.com
> To: alexey.perevalov at hotmail.com
> CC: llvmdev at cs.uiuc.edu; lubos at dolezel.info
>
>>> By default almost all ELF platforms use an ABI called AAPCS (either
>>> hard or soft float). iOS uses an older ABI called APCS. You can't mix
>>> code from these two worlds in any kind of non-trivial case without a
>>> translation layer.
>>
>> Do you mean a translation layer in the loader? If so, the loader could replace every call into ELF code with a call to a stub function, and the stub would adjust the stack and so on. But in that case the stub has to know the signature of the function being called, otherwise
>> arguments on the stack could be missed.
>
> Yep, that's pretty much exactly what I had in mind. You'd probably
> need at least some assembler component.
>
>> I think it's the compiler's responsibility.
>
> Compilers generally don't take the responsibility for making two ABIs
> compatible, with certain exceptions (ironically, the main one I know
> of *is* in ARM, where AAPCS and AAPCS-VFP have some accommodations).

When I wrote about responsibility, I had in mind that the compiler knows the function signature, but the runtime/loader doesn't. For now I have no other idea than keeping the signatures of the problem functions in the loader.

>> This is where I ran into bugs due to stack alignment, but as I wrote before, I think realignment, or using add instead of orr, could solve it.
>
>> Large data types (larger than 4 bytes) are 4-byte aligned.
>
> This is a big one. It means structs will be laid out differently
> unless you're careful, but the most difficult aspect is that it
> applies to function calls too. Consider:
>
>     void func(int x, long long y)
>
> iOS will pass y in registers r1 and r2. ELF code will expect it in
> registers r2 and r3. Similar effects happen to arguments that get
> passed on the stack.

Strange, but in that simple case on ELF I got:

    mov r0, #1
    mov r1, #18
    mov r2, #0
    bl long_long_func

with the same endianness as on iOS.

>> + Register R7 is used as a frame pointer
>> If I understood correctly, it's for debugging purposes only, but the disassembly of my CoreFoundation (ELF) shows r7 in use. The frame pointer on my system is r11.
>> + Register R9 has special usage
>> The document says r9 can be used since iOS 3.0, and I found it in use in my CoreFoundation. So I don't think it will be a problem.
>
> Yes, these ones are probably harmless.
>
> There are other issues too, particularly when you get to C++ (name
> mangling and exceptions spring to mind). But I expect you've got
> enough to worry about for now.
>
>> - orr r2, r1, #4
>> + add r2, r1, #4
>> add instead of orr. Unfortunately, I haven't yet put clang 3.6 into my chroot to build with it (I'm not using cross compilation).
>> But if somebody could point me to the relevant source code or name the patch, I would really appreciate it.
>
> I wouldn't rely on this. Trunk emits orr again, it's likely just a
> random code perturbation and will bite you elsewhere without a real
> solution.

Do you mean trunk of LLVM's source code? )

> Tim.
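One way to see which rule a given compiler applies to 8-byte types is to ask it directly. The sketch below is an assumption-laden probe, not something from the thread: the struct and function names are hypothetical, and it relies on `int` being 4 bytes. It measures the in-struct alignment of `long long` on whatever target it is compiled for and compares the resulting layout of `{ int; long long; }` with what a 4-byte or 8-byte alignment rule would predict.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical probe types: the offset of y in align_probe equals the
 * alignment the compiler uses for long long inside a struct. */
struct align_probe { char c; long long y; };
struct int_then_ll { int x; long long y; };

/* Round offset up to the next multiple of align. */
size_t round_up(size_t offset, size_t align)
{
    return (offset + align - 1) / align * align;
}

/* Predicted offset of y in { int x; long long y; } for a given alignment. */
size_t offset_after_int(size_t align64)
{
    return round_up(sizeof(int), align64);
}

/* What the current target actually does. */
size_t host_align64(void) { return offsetof(struct align_probe, y); }
size_t host_offset(void)  { return offsetof(struct int_then_ll, y); }

/* Illustrative check: prediction and measurement should agree. */
void layout_demo(void)
{
    assert(offset_after_int(8) == 8); /* 8-byte rule: 4 bytes of padding */
    assert(offset_after_int(4) == 4); /* 4-byte rule: no padding */
    assert(host_offset() == offset_after_int(host_align64()));
}
```

If `host_align64()` reports 4 for a build that is supposed to follow AAPCS, the target or ABI selection is worth a second look before drawing conclusions from the generated calls.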