Alexey Perevalov
2015-Apr-20 18:09 UTC
[LLVMdev] question about alignment of structures on the stack (arm 32)
Dear community,

I have come across code generated by LLVM whose assembly instructions rely on 8-byte alignment for structures on the stack. The relevant part of the Objective-C code is:

    -(void)getCharacters:(unichar *)unicode {
        NSRange range;
        range.location = 0;
        range.length = [self length];
        printf("%p, %p\n", &range.location, &range.length);

Before the printf call I see the argument setup, and the most interesting instruction there is:

    orr r3, r2, #4 ;for address of range.length

Does this mean LLVM always expects the address of "range" to be 8-byte aligned? Is it possible to change this with a clang command-line option, e.g. to request 4-byte alignment, so that a different instruction such as add is generated instead of orr?

Best Regards,
Alexey
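To make the question above concrete, here is a minimal C sketch of why replacing add with orr is only valid when bit 2 of the base address is known to be clear, which for a word-aligned address means 8-byte alignment. The function names and the sample addresses are hypothetical and chosen only for illustration; this is not code taken from LLVM.

```c
#include <assert.h>
#include <stdint.h>

/* Field address computed the way an `add` instruction would. */
uint32_t field_addr_add(uint32_t base, uint32_t off) {
    return base + off;
}

/* Field address computed the way an `orr` instruction would.
   Equals base + off only when no bit of `off` is already set in `base`. */
uint32_t field_addr_orr(uint32_t base, uint32_t off) {
    return base | off;
}

/* Returns 1 when OR can stand in for ADD for this base/offset pair. */
int orr_is_safe(uint32_t base, uint32_t off) {
    return (base & off) == 0;
}

/* Illustrative addresses only (assumed values, not from a real run). */
void demo(void) {
    /* 8-byte aligned base: bit 2 is clear, so OR and ADD agree. */
    assert(field_addr_orr(0x1008u, 4u) == field_addr_add(0x1008u, 4u));
    /* Base aligned to 4 but not 8: bit 2 is already set, OR changes nothing. */
    assert(field_addr_orr(0x100Cu, 4u) == 0x100Cu);
    assert(field_addr_add(0x100Cu, 4u) == 0x1010u);
}
```

With a base that is only 4-byte aligned, the orr form returns the base itself, so the second field would appear to live at the address of the structure.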
Tim Northover
2015-Apr-20 18:45 UTC
[LLVMdev] question about alignment of structures on the stack (arm 32)
On 20 April 2015 at 11:09, Alexey Perevalov <alexey.perevalov at hotmail.com> wrote:
> And before printf call I see an argument preparation, and one of the most interesting instruction
>
> orr r3, r2, #4 ;for address of range.length

This is certainly odd, and I can't reproduce the behaviour here. Even if the stack itself is 8-byte aligned (it's not on iOS), that struct would usually only be 4-byte aligned. LLVM shouldn't be using "orr" there.

Do you have a self-contained example (code, compiler version & command line flags)?

Cheers.

Tim.
Alexey Perevalov
2015-Apr-21 15:54 UTC
[LLVMdev] question about alignment of structures on the stack (arm 32)
Hello Tim, thanks for the response.

> On 20 April 2015 at 11:09, Alexey Perevalov <alexey.perevalov at hotmail.com> wrote:
>> And before printf call I see an argument preparation, and one of the most interesting instruction
>>
>> orr r3, r2, #4 ;for address of range.length
>
> This is certainly odd, and I can't reproduce the behaviour here. Even
> if the stack itself is 8-byte aligned (it's not on iOS), that struct
> would usually only be 4-byte aligned. LLVM shouldn't be using "orr"
> there.

Yes, you're right, it's odd ). Sorry, I didn't describe my environment clearly. I'm using a Mach-O loader (https://github.com/LubosD/darling/) and I'm trying to make it work on ARM. The scenario is to load a Mach-O binary (e.g. compiled in Xcode) that calls a function from an ELF library implementing libobjc2 and CoreFoundation. In Mach-O on ARM the stack is 4-byte aligned, while the code produced for ELF expects 8-byte alignment. So in 50% of cases, when a call is made from Mach-O to ELF, the stack pointer register does not contain an 8-byte-aligned address.

Even a trivial call to NSLog(@"Test string") from Mach-O leads to -[NSString getCharacters:]:

    ------
    -(void)getCharacters:(unichar *)unicode
    {
        NSRange range={0,[self length]};
        [self getCharacters:unicode range:range];
    }
    ------

When "range" is copied by value, the second field of "range" is evaluated incorrectly: its address is evaluated as the address of the structure itself, because of orr r3, r2, #4.

I think the minimal example is:

    #include <stdio.h>

    typedef struct {
        int a;
        char b;
    } MyStruct;

    int main(void)
    {
        MyStruct mStruct = {11, 100};
        printf("%p, %p\n", &mStruct.a, &mStruct.b);
        return 0;
    }

Compile it with clang:

    ----
    clang version 3.3 (tags/RELEASE_33/final)
    Target: armv7l-unknown-linux-gnueabi
    Thread model: posix
    -----

This gives the following assembly:

    main:
        push {r11, lr}
        mov r11, sp
        sub sp, sp, #24
        mov r0, #0
        str r0, [r11, #-4]
        add r1, sp, #8
        movw r2, :lower16:.Lmain.mStruct
        movt r2, :upper16:.Lmain.mStruct
        vldr d16, [r2]
        vstr d16, [sp, #8]
        orr r2, r1, #4
        movw r3, :lower16:.L.str
        movt r3, :upper16:.L.str
        str r0, [sp, #4]
        mov r0, r3
        bl printf
        ldr r1, [sp, #4]
        str r0, [sp]
        mov r0, r1
        mov sp, r11
        pop {r11, pc}

r2 is meant to hold r1 plus 4 (but the addition has been optimized into orr here). I think you know it better than me ;) And if the address of mStruct is a multiple of 4 but not of 8, I get r2 the same as r1.

Since I can't modify the Mach-O binaries, I'm looking for a way to avoid orr and use an add instruction here. It may not solve all of my problems, given the difference in ABI, but I think it's the easiest way. I found the -mstack-alignment= option and tried the value 4 for the ELF build, but orr is still used. BTW, for x86_64 it worked, on both Linux and Mac.

Another way, I think, is to realign the stack inside every ELF function, though there could be a performance penalty. I tried -mstackrealign, but it didn't lead to an 8-byte-aligned stack: sp wasn't aligned to 0x.....0/8, and neither was the address of the structure on the stack. I also tried -mstrict-align. So I assume there should be patches for LLVM somewhere that could do this ) I have not yet tested __attribute__((pcs("aapcs")))/-target-abi; maybe there is a magic pcs attribute that I could apply to the dangerous functions, but I would prefer to solve the problem in general.

> Do you have a self-contained example (code, compiler version & command
> line flags)?
>
> Cheers.
>
> Tim.

Best regards,
Alexey
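The realignment idea mentioned in the message above comes down to simple address arithmetic. The following C sketch shows only that arithmetic: rounding a stack pointer value down to an 8-byte boundary, since the ARM stack grows towards lower addresses. The helper names and sample values are hypothetical, and this is not a description of what -mstackrealign actually emits.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 when addr is a multiple of align (align must be a power of two). */
int is_aligned(uint32_t addr, uint32_t align) {
    return (addr & (align - 1u)) == 0;
}

/* Round addr down to the nearest multiple of align.
   Rounding down is the safe direction for a descending stack. */
uint32_t align_down(uint32_t addr, uint32_t align) {
    /* Reject zero and non-power-of-two alignments. */
    assert(align != 0 && (align & (align - 1u)) == 0);
    return addr & ~(align - 1u);
}
```

For example, a stack pointer value of 0x7ffc is 4-byte aligned but not 8-byte aligned, and align_down(0x7ffc, 8) yields 0x7ff8; an address that is already 8-byte aligned is returned unchanged.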