weimingz at codeaurora.org
2014-Mar-14 04:45 UTC
[LLVMdev] [ARM] [PIC] optimizing the loading of hidden global variable
Hi Tim,

The global merge pass puts the GVs into a structure to guarantee that their addresses are contiguous. It works for static GVs, but for global hidden GVs it will cause name resolution to fail when linking the .o into a .so.

Any thoughts?

Thanks,
Weiming

> Hi Weiming,
>
> On 12 March 2014 17:43, Weiming Zhao <weimingz at codeaurora.org> wrote:
>> Clang will emit 1 GOT entry for each GV and 2 instructions to get the
>> address:
>>
>> GCC does this only for the first GV. The remaining GV addresses are
>> computed directly:
>
> This looks like it would be the job of lib/Transforms/GlobalMerge.cpp.
> It looks like ARM runs it in all cases, perhaps it doesn't understand
> some ELF linkage subtleties?
>
> Cheers.
>
> Tim.
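For readers following along, here is a minimal sketch of the kind of source being discussed. The original test case is not shown in the thread, so the names `g1`, `g2`, `g3`, `HIDDEN` and `sum_first_elements` are assumptions made for illustration only. Compiling something like this as position-independent code for an ARM target (for example with `-fPIC -O2`) is one way to compare how each compiler materialises the three addresses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro and variable names; not taken from the original report. */
#define HIDDEN __attribute__((visibility("hidden")))

/* Definitions (with initialisers), not mere declarations. Hidden visibility
 * means the symbols are visible to other objects in the same shared library
 * but are not exported from it. */
HIDDEN int g1[100] = {0};
HIDDEN int g2[100] = {0};
HIDDEN int g3[100] = {0};

/* Touches all three globals, so a PIC build has to obtain all three
 * addresses inside one function. */
int sum_first_elements(void) {
    g1[0] = 1;
    g2[0] = 2;
    g3[0] = 3;
    return g1[0] + g2[0] + g3[0];
}
```

Because the variables are hidden rather than static, other translation units in the same library may still refer to `g1`, `g2` and `g3` by name, which is why folding them into a single anonymous structure breaks symbol resolution at link time.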
Tim Northover
2014-Mar-14 13:57 UTC
[LLVMdev] [ARM] [PIC] optimizing the loading of hidden global variable
> It works for static GVs but for global hidden GVs, this will cause name
> resolution to fail when linking the .o into a .so

Ah, I see. I've just looked up the semantics and GlobalMerge probably won't work, I agree.

> Any thoughts?

I'm now struggling to see how GCC justifies it. What if a different translation-unit declared those variables in a different order? I also can't get the same behaviour here, do you have a more complete command-line?

Cheers.

Tim.
Tim Northover
2014-Mar-14 14:07 UTC
[LLVMdev] [ARM] [PIC] optimizing the loading of hidden global variable
>> Any thoughts?
>
> I'm now struggling to see how GCC justifies it. What if a different
> translation-unit declared those variables in a different order? I also
> can't get the same behaviour here, do you have a more complete
> command-line?

Ah, I see; the translation-unit that does the optimisation needs to have them as a definition (i.e. "= {0}") rather than a declaration for the optimisation to kick in, giving it precedence over other declarations. And the hidden visibility means they won't be R_ARM_COPYed out of their initial location.

After a very brief thought, I'd still go for GlobalMerge now, in conjunction with an enhanced "alias" so that you could emit something like:

    @g1 = hidden alias [100 x i32]* bitcast(i32* getelementptr([300 x i32]* @Merged, i32 0, i32 0) to [100 x i32]*)

We certainly don't seem to handle this alias properly now though, and it may violate the intended uses. Rafael's doing some thinking about "alias" at the moment, so I've CCed him.

Would that be a horrific abuse of the poor alias system?

Cheers.

Tim.
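The layout that the proposed alias would describe can be approximated in plain C. This is only a sketch under stated assumptions: `Merged` mirrors the `[300 x i32]` object in the IR above, while `g1`, `g2`, `g3` and `store_and_sum` are hypothetical names. C has no way to define a symbol at an offset inside another object, so ordinary constant pointers stand in for the aliases here; they are not real symbol aliases and would not solve the linking problem by themselves.

```c
#include <assert.h>
#include <stddef.h>

/* One merged object holding what used to be three 100-element arrays. */
__attribute__((visibility("hidden"))) int Merged[300] = {0};

/* Stand-ins for the proposed aliases: each one names a fixed offset
 * inside Merged. In the real proposal these would be symbols, not pointers. */
static int *const g1 = &Merged[0];
static int *const g2 = &Merged[100];
static int *const g3 = &Merged[200];

/* With a single base address known, the other two addresses are just
 * base + constant offset; no further address lookups are needed. */
int store_and_sum(void) {
    g1[0] = 1;
    g2[0] = 2;
    g3[0] = 3;
    assert(g2 - g1 == 100 && g3 - g1 == 200);
    return Merged[0] + Merged[100] + Merged[200];
}
```

The point of the real alias would be to keep `g1`, `g2` and `g3` as linkable names for other translation units while still letting the defining unit address all three from one base.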