thr3ads.net - llvm dev - [LLVMdev] question about alignment of structures on the stack (arm 32) [Apr 2015]


Alexey Perevalov

2015-Apr-20 18:09 UTC

[LLVMdev] question about alignment of structures on the stack (arm 32)

Dear community,

I came across code generated by LLVM whose assembly instructions rely on
8-byte alignment for structures on the stack.
The relevant part of the Objective-C code is the following:
-(void)getCharacters:(unichar *)unicode {
    NSRange range;
    range.location = 0;
    range.length = [self length];
    printf("%p, %p\n", &range.location, &range.length);

Before the printf call I see the argument setup, and the most
interesting instruction is:

    orr    r3, r2, #4 ;for address of range.length

Does this mean LLVM always expects the address of "range" to be 8-byte aligned?
Is it possible to change this with a clang command-line option, e.g. to set
4-byte alignment,
so that different code is generated, e.g. add instead of orr?


Best Regards,
Alexey

Tim Northover

2015-Apr-20 18:45 UTC

On 20 April 2015 at 11:09, Alexey Perevalov
<alexey.perevalov at hotmail.com> wrote:
> And before printf call I see an argument preparation, and one of the most
> interesting instruction
>
>     orr    r3, r2, #4 ;for address of range.length

This is certainly odd, and I can't reproduce the behaviour here. Even
if the stack itself is 8-byte aligned (it's not on iOS), that struct
would usually only be 4-byte aligned. LLVM shouldn't be using "orr"
there.

Do you have a self-contained example (code, compiler version & command
line flags)?

Cheers.

Tim.

Alexey Perevalov

2015-Apr-21 15:54 UTC

Hello Tim, thanks for the response.

----------------------------------------
> Date: Mon, 20 Apr 2015 11:45:03 -0700
> Subject: Re: [LLVMdev] question about alignment of structures on the stack (arm 32)
> From: t.p.northover at gmail.com
> To: alexey.perevalov at hotmail.com
> CC: llvmdev at cs.uiuc.edu
>
> On 20 April 2015 at 11:09, Alexey Perevalov
> <alexey.perevalov at hotmail.com> wrote:
>> And before printf call I see an argument preparation, and one of the
>> most interesting instruction
>>
>> orr r3, r2, #4 ;for address of range.length
>
> This is certainly odd, and I can't reproduce the behaviour here. Even
> if the stack itself is 8-byte aligned (it's not on iOS), that struct
> would usually only be 4-byte aligned. LLVM shouldn't be using "orr"
> there.

Yes, you're right, it's odd ).

Sorry, I didn't describe my environment clearly.
I'm using a Mach-O loader (https://github.com/LubosD/darling/) and I'm trying
to make it work on ARM.
The scenario is to load a Mach-O binary (e.g. compiled in Xcode); that binary
invokes a function from an
ELF library which implements libobjc2 and CoreFoundation.

In Mach-O on ARM the stack is 4-byte aligned, while the code produced for ELF
expects 8-byte alignment.
So in 50% of cases, when a call is made from Mach-O to ELF, the stack pointer
register contains an address that is not 8-byte aligned.
Even a trivial call such as
NSLog(@"Test string") from Mach-O
leads to -[NSString getCharacters:]:
------
-(void)getCharacters:(unichar *)unicode {
   NSRange range={0,[self length]};
   [self getCharacters:unicode range:range];
}

------
Here "range" is copied by value and the second field of
"range" is evaluated incorrectly:
its address is computed as the address of the structure itself,
because of orr r3, r2, #4.

I think the minimal example is:
#include <stdio.h>

typedef struct
{
    int a;
    char b;
} MyStruct;


int main(void) {
    MyStruct mStruct = {11, 100};
    printf("%p, %p\n", &mStruct.a, &mStruct.b);
    return 0;
}
Compile it with clang:
----
clang version 3.3 (tags/RELEASE_33/final)
Target: armv7l-unknown-linux-gnueabi
Thread model: posix
-----
And we get the following assembly:
main:
    push    {r11, lr}
    mov    r11, sp
    sub    sp, sp, #24
    mov    r0, #0
    str    r0, [r11, #-4]
    add    r1, sp, #8
    movw    r2, :lower16:.Lmain.mStruct
    movt    r2, :upper16:.Lmain.mStruct
    vldr    d16, [r2]
    vstr    d16, [sp, #8]
    orr    r2, r1, #4
    movw    r3, :lower16:.L.str
    movt    r3, :upper16:.L.str
    str    r0, [sp, #4]
    mov    r0, r3
    bl    printf
    ldr    r1, [sp, #4]
    str    r0, [sp]
    mov    r0, r1
    mov    sp, r11
    pop    {r11, pc}

r2 is populated with r1 plus 4 (but the addition is optimized into orr here). I
think you know it better than me ;)
And if the address of mStruct is a multiple of 4 but not a multiple of 8, bit 2
of r1 is already set, so I get r2 equal to r1.

Since I can't modify the Mach-O binaries, I'm looking for a way to avoid orr
and use the add instruction here.
Maybe it will not solve all of my problems, due to the differences in ABI, but
I think it's the easiest way.
I found the -mstack-alignment= option and tried the value 4 there for the ELF
build, but orr is still used. BTW for x86_64 it worked, both on Linux and Mac.

Another way, I think, is to realign the stack inside every ELF function; there
could be a performance penalty. I tried -mstackrealign, but it didn't lead
to an 8-byte aligned stack, I mean sp wasn't aligned to 0x.....0/8,
nor was the address of the structure on the stack. I also tried -mstrict-align.
So I assume there should be patches for LLVM somewhere which could do it )

I have not yet tested __attribute__((pcs("aapcs")))/-target-abi; maybe
there is a magic pcs attribute that I could apply to the dangerous functions,
but I would prefer to solve the problem in general.
>
> Do you have a self-contained example (code, compiler version & command
> line flags)?
>
> Cheers.
>
> Tim.
Best regards,

Alexey
