Weiming Zhao
2014-Mar-14 18:34 UTC
[LLVMdev] [ARM] [PIC] optimizing the loading of hidden global variable
Hi Rafael,

Yes, merging GVs prevents the linker from doing garbage collection. Should it be implemented as a peephole pass? If we do it too early, the distances between GVs are not fixed yet.

PS: Below is the GCC output with "extern" hidden:

        ldr     r2, .L2
        stmfd   sp!, {r3, lr}
        .save {r3, lr}
.LPIC0:
        add     r0, pc, r2
        bl      _Z4initPv(PLT)
        ldr     r1, .L2+4
.LPIC1:
        add     r0, pc, r1
        bl      _Z4initPv(PLT)
        ldr     r0, .L2+8
.LPIC2:
        add     r0, pc, r0
        ldmfd   sp!, {r3, lr}
        b       _Z4initPv(PLT)
.L3:
        .align  2
.L2:
        .word   g0-(.LPIC0+8)
        .word   g1-(.LPIC1+8)
        .word   g2-(.LPIC2+8)

Thanks,
Weiming

Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation

-----Original Message-----
From: Rafael Espíndola [mailto:rafael.espindola at gmail.com]
Sent: Friday, March 14, 2014 9:04 AM
To: Tim Northover
Cc: Weiming Zhao; LLVM Developers Mailing List; Jim Grosbach; Nick Kledzik
Subject: Re: [LLVMdev] [ARM] [PIC] optimizing the loading of hidden global variable

> After a very brief thought, I'd still go for GlobalMerge now, in
> conjunction with an enhanced "alias" so that you could emit something
> like:
>
> @g1 = hidden alias [100 x i32]* bitcast(i32* getelementptr([300 x
> i32]* @Merged, i32 0, i32 0) to [100 x i32]*)
>
> We certainly don't seem to handle this alias properly now though, and
> it may violate the intended uses. Rafael's doing some thinking about
> "alias" at the moment, so I've CCed him.
>
> Would that be a horrific abuse of the poor alias system?

I think it would :-)

Folding objects like this prevents the linker from deleting one of them if it is unused, for example.

I think it is just a missing optimization in the ARM backend. If it knows multiple objects are in the same DSO, it can use the address of one to find the other.
Given:

@g0 = hidden global [100 x i32] zeroinitializer, align 4
@g1 = hidden global [100 x i32] zeroinitializer, align 4
define void @foo() {
  tail call void @bar(i8* bitcast ([100 x i32]* @g0 to i8*))
  tail call void @bar(i8* bitcast ([100 x i32]* @g1 to i8*))
  ret void
}
declare void @bar(i8*)

The command "llc -mtriple=i686-pc-linux -relocation-model=pic" produces

        calll   .L0$pb
.L0$pb:
        popl    %ebx
.Ltmp3:
        addl    $_GLOBAL_OFFSET_TABLE_+(.Ltmp3-.L0$pb), %ebx
        leal    g0@GOTOFF(%ebx), %eax
        movl    %eax, (%esp)
        calll   bar@PLT
        leal    g1@GOTOFF(%ebx), %eax
        movl    %eax, (%esp)
        calll   bar@PLT

Which is ok, since the add of ebx is folded and the constant is an immediate in x86. On ARM, that is not the case. We produce

        ldr     r0, .LCPI0_0
        add     r4, pc, r0      // r4 is the equivalent of ebx in the x86 case.
        ldr     r0, .LCPI0_1    // r0 is the constant that is an immediate in x86.
        add     r0, r0, r4      // that is the add that is folded in x86
        ...
.LCPI0_0:
        .long   _GLOBAL_OFFSET_TABLE_-(.LPC0_0+8)
.LCPI0_1:
        .long   g0(GOTOFF)

For ARM, codegen already keeps track of offsets so it can implement the constant islands, so it should be able to see that the two globals are close enough that the offset between them fits an immediate.

Nick, will this work on MachO or can ld64 move _g0, _g1 and _g2 too far apart?

BTW, what will gcc produce for

void init(void *);
extern int g0[100] __attribute__((visibility("hidden")));
extern int g1[100] __attribute__((visibility("hidden")));
extern int g2[100] __attribute__((visibility("hidden")));
void foo() {
  init(&g0);
  init(&g1);
  init(&g2);
}

Cheers,
Rafael
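
[Editorial sketch] The suggestion above depends on whether the distance between two globals can be encoded as an immediate operand. As a rough illustration, and not the backend's actual logic, the Python snippet below checks whether a value fits the ARM-state (A32) data-processing immediate form, which is an 8-bit constant rotated right by an even amount. The function names are invented for this example, and the 400-byte spacing assumes g0, g1 and g2 end up laid out back to back, which nothing in the thread guarantees. Thumb-2 uses a different immediate encoding and is not covered.

```python
def rotl32(x, n):
    """Rotate a 32-bit value left by n bits."""
    n %= 32
    x &= 0xFFFFFFFF
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def fits_arm_modified_immediate(value):
    """True if value is an 8-bit constant rotated right by an even amount (A32)."""
    value &= 0xFFFFFFFF
    # Undo each possible even rotation and check whether 8 bits are enough.
    return any(rotl32(value, 2 * rot) < 256 for rot in range(16))


# Assumption: three [100 x i32] arrays placed back to back.
ARRAY_BYTES = 100 * 4
offsets = {"g1 - g0": ARRAY_BYTES, "g2 - g0": 2 * ARRAY_BYTES}

for name, off in offsets.items():
    print(name, off, fits_arm_modified_immediate(off))
```

Under that layout assumption both distances are encodable, so a single `add` from the address of g0 could in principle produce the addresses of g1 and g2.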
Rafael Espíndola
2014-Mar-14 20:48 UTC
[LLVMdev] [ARM] [PIC] optimizing the loading of hidden global variable
On 14 March 2014 14:34, Weiming Zhao <weimingz at codeaurora.org> wrote:
> Hi Rafael,
>
> Yes, merging GVs prevents the linker from doing garbage collection. Should it be implemented as a peephole pass? If we do it too early, the distances between GVs are not fixed yet.

Correct. It would be somewhere in CodeGen, I am not exactly sure where.

> PS:
> Below is the GCC output with "extern" hidden:
>         ldr     r2, .L2
>         stmfd   sp!, {r3, lr}
>         .save {r3, lr}
> .LPIC0:
>         add     r0, pc, r2
>         bl      _Z4initPv(PLT)
>         ldr     r1, .L2+4
> .LPIC1:
>         add     r0, pc, r1
>         bl      _Z4initPv(PLT)
>         ldr     r0, .L2+8
> .LPIC2:
>         add     r0, pc, r0
>         ldmfd   sp!, {r3, lr}
>         b       _Z4initPv(PLT)
> .L3:
>         .align  2
> .L2:
>         .word   g0-(.LPIC0+8)
>         .word   g1-(.LPIC1+8)
>         .word   g2-(.LPIC2+8)

That is pretty neat too.

Cheers,
Rafael
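
[Editorial sketch] The GCC sequence quoted in this message stores, for each global, the difference between the global's address and the value pc will hold at the matching `add`. In ARM state, reading pc yields the address of the current instruction plus 8, which is where the `+8` in `.word g0-(.LPIC0+8)` comes from. The worked example below uses made-up addresses purely to show the arithmetic; the helper names are hypothetical.

```python
ARM_PC_READ_BIAS = 8  # in ARM (A32) state, reading pc gives instruction address + 8


def literal_word(target_addr, add_insn_addr):
    """Value stored for `.word g - (.LPIC + 8)`, as a 32-bit word."""
    return (target_addr - (add_insn_addr + ARM_PC_READ_BIAS)) & 0xFFFFFFFF


def add_pc(add_insn_addr, word):
    """Result of `add r0, pc, rN` executed at add_insn_addr."""
    return (add_insn_addr + ARM_PC_READ_BIAS + word) & 0xFFFFFFFF


# Hypothetical addresses, chosen only for illustration.
lpic0 = 0x00008004  # address of the `add` labelled .LPIC0
g0 = 0x00010A00     # address of g0

word = literal_word(g0, lpic0)
assert add_pc(lpic0, word) == g0

# Moving code and data together by the same delta keeps the stored word valid.
delta = 0x40000000
assert add_pc(lpic0 + delta, word) == g0 + delta
print(hex(word))
```

Because the stored word is a difference of two addresses in the same image, it does not change when the image is loaded elsewhere, so this form needs one literal load and one `add` per global and no GOT base register.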
Weiming Zhao
2014-Mar-14 21:16 UTC
[LLVMdev] [ARM] [PIC] optimizing the loading of hidden global variable
I just gave GlobalMerge with aliases a try because I thought it would be easy to do. However, there is another issue with it.

Although I got aliases like:

@h0 = alias getelementptr inbounds (... @_MergedGlobals, 0, 0)
@h1 = alias getelementptr inbounds (... @_MergedGlobals, 0, 1)
@h2 = alias getelementptr inbounds (... @_MergedGlobals, 0, 2)

they cannot be lowered to correct asm. They all become aliases of _MergedGlobals:

        .globl  h0
        .set    h0, _MergedGlobals
        .globl  h1
        .set    h1, _MergedGlobals
        .globl  h2
        .set    h2, _MergedGlobals

I guess there is no support in asm for aliasing a member of a struct, right?

Thanks,
Weiming

Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
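
[Editorial sketch] On the closing question: GNU-style assemblers accept a symbol-plus-constant expression on the right-hand side of `.set`, so an alias to a member can in principle be written as the base symbol plus the member's byte offset. The Python snippet below only generates that text. The field sizes are hypothetical (the thread elides the struct type), padding is ignored, and the thread does not establish whether the LLVM version under discussion could emit this form.

```python
def alias_directives(base, fields):
    """Emit .globl/.set lines aliasing each field to base + byte offset.

    fields: list of (alias_name, size_in_bytes) in layout order.
    Assumes no padding between fields.
    """
    lines = []
    offset = 0
    for name, size in fields:
        target = base if offset == 0 else f"{base}+{offset}"
        lines.append(f"\t.globl\t{name}")
        lines.append(f"\t.set\t{name}, {target}")
        offset += size
    return lines


# Assumption: three 4-byte members merged into _MergedGlobals.
members = [("h0", 4), ("h1", 4), ("h2", 4)]
print("\n".join(alias_directives("_MergedGlobals", members)))
```

With these assumed sizes, h1 and h2 would resolve to distinct addresses instead of all three names pointing at the start of _MergedGlobals.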