thr3ads.net - llvm dev - [LLVMdev] question about alignment of structures on the stack (arm 32) [Apr 2015]

Alexey Perevalov

2015-Apr-23 13:03 UTC

[LLVMdev] question about alignment of structures on the stack (arm 32)

----------------------------------------
> Date: Tue, 21 Apr 2015 09:15:02 -0700
> Subject: Re: [LLVMdev] question about alignment of structures on the stack (arm 32)
> From: t.p.northover at gmail.com
> To: alexey.perevalov at hotmail.com
> CC: llvmdev at cs.uiuc.edu
>
>> I'm using a Mach-O loader (https://github.com/LubosD/darling/) and
>> I'm trying to make it work on ARM.
>> The scenario is to load a Mach-O binary (e.g. compiled in Xcode) that
>> calls a function from an ELF library which implements libobjc2 and
>> CoreFoundation.
>>
>> In Mach-O on ARM the stack is 4-byte aligned. Code produced for ELF
>> expects 8-byte alignment.
>> So in 50% of cases, when a call is made from Mach-O to ELF, the stack
>> pointer register does not contain an 8-byte-aligned address.
>
> Ah, that could do it. I see that LLVM does indeed make use of stack
> alignment in this case. Regardless, this approach is going to go
> really badly.

>
> By default almost all ELF platforms use an ABI called AAPCS (either
> hard or soft float). iOS uses an older ABI called APCS. You can't mix
> code from these two worlds in any kind of non-trivial case without a
> translation layer.

Do you mean a translation layer in the loader? If so, the loader could replace
every call into ELF code with a call to a stub function, and the stub would
adjust the stack and so on. But in that case the stub has to know the signature
of the function being invoked, otherwise arguments on the stack could be
missed. I think that is the compiler's responsibility.
Or did you mean something like a virtual machine?

>
> You've discovered one issue: AAPCS requires 8-byte alignment for sp,
> APCS only requires 4. It's the first of many without a more thorough
> approach to the interface between the two.

I have seen
https://developer.apple.com/library/ios/documentation/Xcode/Conceptual/iPhoneOSABIReference/Articles/ARMv6FunctionCallingConventions.html.
The document says the ARMv6 calling convention is similar to ARMv7 and refers
to AAPCS, but it also describes some discrepancies, such as:

+ The stack is 4-byte aligned at the point of function calls.
  I ran into bugs here due to stack alignment, but as I wrote before, I think
  realignment, or using add instead of orr, could solve it.
+ Large data types (larger than 4 bytes) are 4-byte aligned.
  I haven't tested this case yet, but I think it could have the same pitfalls
  as with orr r0, r0, 4.
+ Register R7 is used as a frame pointer
  If I understood correctly, it's for debugging purposes only, but the
  disassembly of my CoreFoundation (ELF) shows r7 usage. The frame pointer on
  my system is r11.
+ Register R9 has special usage
  The document says r9 can be used since iOS 3.0, and I found it used in my
  CoreFoundation. So I don't think it will be a problem.


>
>> I have not yet tested __attribute__((pcs("aapcs"))) or -target-abi. Maybe
>> there is a magic pcs attribute that I could apply to the dangerous
>> functions, but I would prefer to solve the problem in general.
>
> I don't think so; __attribute__((pcs("apcs"))) might work, if it
> existed. But it doesn't. You might find it's fairly easy to add it in
> Clang, but I worry about the assumptions being made in the backend.
>
I tried -mstack-alignment=8 -mstackrealign for x86_64 and found that it works,
for example:
-    subq    $32, %rsp
+    andq    $-8, %rsp
+    subq    $24, %rsp
...
and of course the function body was modified as well.
For ARM, however, nothing happened. I tried to understand what goes wrong in
LLVM, but there are too many layers of abstraction.
Maybe that code exists, but the condition in ARMBaseRegisterInfo::canRealignStack
prevents it from being generated.
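
For readers unfamiliar with the `andq $-8, %rsp` idiom in the x86_64 diff above: it rounds the stack pointer down to an 8-byte boundary, because -8 is ~7 in two's complement. A minimal sketch in C of the same arithmetic follows. The helper name `align_down` and the sample addresses are made up for illustration and are not part of LLVM.

```c
#include <assert.h>
#include <stdint.h>

/* Round an address down to a power-of-two boundary.
   For alignment == 8 this is the same operation as `andq $-8, %rsp`. */
static uintptr_t align_down(uintptr_t address, uintptr_t alignment)
{
    /* The mask trick only works for non-zero powers of two. */
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return address & ~(alignment - 1);
}
```

With a stack pointer that is only 4-byte aligned, such as 0x1004, `align_down(0x1004, 8)` gives 0x1000. An address that is already 8-byte aligned is returned unchanged.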


BTW, I built llvm/clang 3.6 (it was impossible to build the latest version from
HEAD) and something changed ;)
-    str    r1, [sp, #20]
-    str    r2, [sp, #16]
-    add    r1, sp, #16
-    orr    r2, r1, #4
+    str    r1, [sp, #16]
+    str    r2, [sp, #12]
+    add    r1, sp, #12
+    add    r2, r1, #4
It uses add instead of orr. Unfortunately, I have not yet put clang 3.6 into my
chroot to build with it (I'm not cross-compiling).
But if somebody could point me to the relevant source code or name the patch,
I would very much appreciate it.



> Either way, I'd recommend against trying to hack just this one stack
> alignment issue.
>
> Tim.

Tim Northover

2015-Apr-23 14:09 UTC

[LLVMdev] question about alignment of structures on the stack (arm 32)

>> By default almost all ELF platforms use an ABI called AAPCS (either
>> hard or soft float). iOS uses an older ABI called APCS. You can't mix
>> code from these two worlds in any kind of non-trivial case without a
>> translation layer.
>
> Do you mean a translation layer in the loader? If so, the loader could
> replace every call into ELF code with a call to a stub function, and the
> stub would adjust the stack and so on. But in that case the stub has to know
> the signature of the function being invoked, otherwise arguments on the
> stack could be missed.

Yep, that's pretty much exactly what I had in mind. You'd probably
need at least some assembler component.

> I think that is the compiler's responsibility.

Compilers generally don't take the responsibility for making two ABIs
compatible, with certain exceptions (ironically, the main one I know
of *is* in ARM, where AAPCS and AAPCS-VFP have some accommodations).

> I ran into bugs here due to stack alignment, but as I wrote before, I think
> realignment, or using add instead of orr, could solve it.
> + Large data types (larger than 4 bytes) are 4-byte aligned.

This is a big one. It means structs will be laid out differently
unless you're careful, but the most difficult aspect is that it
applies to function calls too. Consider:

    void func(int x, long long y)

iOS will pass y in registers r1 and r2. ELF code will expect it in
registers r2 and r3. Similar effects happen to arguments that get
passed on the stack.
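
The register difference described above can be reproduced with a small model of how integer arguments are assigned to r0-r3. This is a simplified sketch, not code from LLVM or from either ABI document. It assumes only 4-byte and 8-byte integer arguments, it does not model splitting an argument between registers and the stack, and the names `enum abi` and `assign_core_registers` are made up for illustration. The one rule it encodes is that an 8-byte-aligned argument must start in an even-numbered register, whereas a 4-byte-aligned one can start anywhere.

```c
#include <stddef.h>

enum abi { ABI_AAPCS, ABI_IOS_APCS };

/* For each argument (size 4 or 8 bytes), record the first core register
   it occupies, or -1 if it does not fit entirely in r0-r3. */
static void assign_core_registers(enum abi abi, const int *arg_sizes,
                                  size_t nargs, int *first_reg)
{
    int next = 0; /* next free core register, 0..4 (4 means none left) */

    for (size_t i = 0; i < nargs; i++) {
        int words = arg_sizes[i] / 4;

        /* AAPCS: 64-bit values are 8-byte aligned, so skip to an even
           register. The iOS convention aligns them to 4 bytes: no skip. */
        if (abi == ABI_AAPCS && arg_sizes[i] == 8 && (next % 2) != 0)
            next++;

        if (next + words <= 4) {
            first_reg[i] = next;
            next += words;
        } else {
            first_reg[i] = -1; /* not modelled further in this sketch */
            next = 4;
        }
    }
}
```

For `void func(int x, long long y)` the argument sizes are {4, 8}. The model places y starting at r2 for AAPCS and at r1 for the iOS convention, which matches the r2/r3 versus r1/r2 description above.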

> + Register R7 is used as a frame pointer
> If I understood correctly, it's for debugging purposes only, but the
> disassembly of my CoreFoundation (ELF) shows r7 usage. The frame pointer on
> my system is r11.
> + Register R9 has special usage
> The document says r9 can be used since iOS 3.0, and I found it used in my
> CoreFoundation. So I don't think it will be a problem.

Yes, these ones are probably harmless.

There are other issues too, particularly when you get to C++ (name
mangling and exceptions spring to mind). But I expect you've got
enough to worry about for now.

> -    orr    r2, r1, #4
> +    add    r2, r1, #4
> It uses add instead of orr. Unfortunately, I have not yet put clang 3.6 into
> my chroot to build with it (I'm not cross-compiling).
> But if somebody could point me to the relevant source code or name the patch,
> I would very much appreciate it.

I wouldn't rely on this. Trunk emits orr again, it's likely just a
random code perturbation and will bite you elsewhere without a real
solution.
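
To see why the orr form depends on the alignment assumption: `orr r2, r1, #4` computes the same value as `add r2, r1, #4` only when bit 2 of r1 is clear, which holds when r1 is 8-byte aligned. A minimal sketch in C of that check follows. The function names are made up for illustration.

```c
#include <stdint.h>

/* Offset computed the way `orr r2, r1, #4` does it. */
static uint32_t offset_with_orr(uint32_t base) { return base | 4u; }

/* Offset computed the way `add r2, r1, #4` does it. */
static uint32_t offset_with_add(uint32_t base) { return base + 4u; }

/* Returns 1 when both forms give the same address for this base. */
static int orr_matches_add(uint32_t base)
{
    return offset_with_orr(base) == offset_with_add(base);
}
```

For an 8-byte-aligned base such as 0x1000, both forms give 0x1004. For a base that is only 4-byte aligned, such as 0x1004, orr leaves the value at 0x1004 while add gives 0x1008, so the orr version addresses the wrong word.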

Tim.

Alexey Perevalov

2015-Apr-23 16:07 UTC

[LLVMdev] question about alignment of structures on the stack (arm 32)

----------------------------------------
> Date: Thu, 23 Apr 2015 07:09:47 -0700
> Subject: Re: [LLVMdev] question about alignment of structures on the stack (arm 32)
> From: t.p.northover at gmail.com
> To: alexey.perevalov at hotmail.com
> CC: llvmdev at cs.uiuc.edu; lubos at dolezel.info
>
>>> By default almost all ELF platforms use an ABI called AAPCS (either
>>> hard or soft float). iOS uses an older ABI called APCS. You can't mix
>>> code from these two worlds in any kind of non-trivial case without a
>>> translation layer.
>>
>> Do you mean a translation layer in the loader? If so, the loader could
>> replace every call into ELF code with a call to a stub function, and the
>> stub would adjust the stack and so on. But in that case the stub has to
>> know the signature of the function being invoked, otherwise arguments on
>> the stack could be missed.
>
> Yep, that's pretty much exactly what I had in mind. You'd probably
> need at least some assembler component.
>
>> I think it's compiler responsibility.
>
> Compilers generally don't take the responsibility for making two ABIs
> compatible, with certain exceptions (ironically, the main one I know
> of *is* in ARM, where AAPCS and AAPCS-VFP have some accommodations).

When I wrote about responsibility, I had in mind that the compiler knows the
function signature, but the runtime/loader doesn't. For now I have no other
idea than to keep the signatures of the problem functions in the loader.
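
One possible shape for such a table is sketched below. It is purely illustrative and is not taken from darling or any existing loader. The struct, the one-character-per-argument encoding, and the lookup function are all assumptions, and the entry for `long_long_func` assumes it has the `(int, long long)` signature discussed in this thread.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record describing one ELF function that needs a stub. */
struct stub_signature {
    const char *symbol;    /* name of the wrapped ELF function */
    const char *arg_kinds; /* one char per argument: 'i' = 32-bit, 'l' = 64-bit */
};

/* Hypothetical table of problem functions known to the loader. */
static const struct stub_signature problem_functions[] = {
    { "long_long_func", "il" }, /* assumed signature: (int, long long) */
};

/* Look up a symbol; returns NULL when no stub information is recorded. */
static const struct stub_signature *find_signature(const char *symbol)
{
    size_t count = sizeof problem_functions / sizeof problem_functions[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(problem_functions[i].symbol, symbol) == 0)
            return &problem_functions[i];
    }
    return NULL;
}
```

A stub generator could consult `find_signature` when binding a Mach-O import to an ELF symbol. It would fall back to a direct call when the lookup returns NULL.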

>
>> I ran into bugs here due to stack alignment, but as I wrote before, I
>> think realignment, or using add instead of orr, could solve it.
>
>> Large data types (larger than 4 bytes) are 4-byte aligned.
>
> This is a big one. It means structs will be laid out differently
> unless you're careful, but the most difficult aspect is that it
> applies to function calls too. Consider:
>
> void func(int x, long long y)
>
> iOS will pass y in registers r1 and r2. ELF code will expect it in
> registers r2 and r3. Similar effects happen to arguments that get
> passed on the stack.

Strange, but in that simple case on ELF I got

    mov r0, #1
    mov r1, #18
    mov r2, #0
    bl  long_long_func

with the same endianness as on iOS.
>
>> + Register R7 is used as a frame pointer
>> If I understood correctly, it's for debugging purposes only, but the
>> disassembly of my CoreFoundation (ELF) shows r7 usage. The frame pointer
>> on my system is r11.
>> + Register R9 has special usage
>> The document says r9 can be used since iOS 3.0, and I found it used in my
>> CoreFoundation. So I don't think it will be a problem.
>
> Yes, these ones are probably harmless.
>
> There are other issues too, particularly when you get to C++ (name
> mangling and exceptions spring to mind). But I expect you've got
> enough to worry about for now.
>
>> - orr r2, r1, #4
>> + add r2, r1, #4
>> It uses add instead of orr. Unfortunately, I have not yet put clang 3.6
>> into my chroot to build with it (I'm not cross-compiling).
>> But if somebody could point me to the relevant source code or name the
>> patch, I would very much appreciate it.
>
> I wouldn't rely on this. Trunk emits orr again, it's likely just a
> random code perturbation and will bite you elsewhere without a real
> solution.
>

Do you mean the trunk of LLVM's source code? :)

> Tim.
