Leonard Chan via llvm-dev
2021-Feb-02 01:57 UTC
[llvm-dev] [IR] Constant::needsRelocation() can place data into different read-only sections
Hi all,

I noticed that given the following IR:

```
declare dso_local void @func()

@glob1 = dso_local unnamed_addr constant [2 x i32] [
  i32 0,
  i32 trunc (i64 sub (i64 ptrtoint (void ()* @func to i64),
                      i64 ptrtoint ([2 x i32]* @glob1 to i64)) to i32)
]
```

Compiling this with `./bin/llc /tmp/test2.ll -o - -relocation-model=static -data-sections=1 --mtriple=x86_64` gives the following assembly:

```
        .type   glob1,@object           # @glob1
        .section        .rodata.cst8,"aM",@progbits,8
        .globl  glob1
        .p2align        2
glob1:
        .long   0                       # 0x0
        .long   func-glob1
        .size   glob1, 8
```

Despite passing `-data-sections=1`, the global does not get its own section name and is stored in `.section .rodata.cst8`. However, given a global of the same size and the same static relocation model:

```
@glob2 = dso_local unnamed_addr constant [1 x i8*] [
  i8* bitcast ([1 x i8*]* @glob2 to i8*)
]
```

the same invocation returns:

```
        .type   glob2,@object           # @glob2
        .section        .rodata.glob2,"a",@progbits
        .globl  glob2
        .p2align        3
glob2:
        .quad   glob2
        .size   glob2, 8
```

which does properly give `@glob2` its own section name (`.section .rodata.glob2`) and does not store it in a mergeable const section.

The two globals have the same size, yet they are placed in different kinds of section (`.rodata` vs `.rodata.cstN`). For `@glob1`, this is a problem if the symbol is dso_local/hidden and unused in its current DSO: even when `-data-sections` is used, the symbol will not be collected by `--gc-sections` at link time, because it lives in a shared mergeable const section rather than a section of its own.

After some digging, it seems that the reason for this difference is that `@glob2` requires relocations according to `Constant::needsRelocation()` (because its initializer contains a direct reference to a global), whereas `@glob1` does not (because it stores a constant int and a relative reference between two dso_local globals).
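
To make the observed behavior concrete, here is a toy model of the two outcomes. It is only a sketch of what the output above suggests, not LLVM's actual code: the names `Global`, `classify` and `sectionName` are hypothetical, and the set of mergeable entry sizes (4, 8, 16, 32) is an assumption about which `.rodata.cstN` sections exist.

```cpp
#include <cstdint>
#include <string>

// Toy model only: these names are hypothetical and do not mirror LLVM's API.
enum class Kind { ReadOnly, MergeableConst };

struct Global {
  std::string name;
  std::uint64_t size;    // allocation size in bytes
  bool unnamedAddr;      // address is not significant (unnamed_addr)
  bool needsRelocation;  // what Constant::needsRelocation() would report
};

// Classify a constant global, assuming a static relocation model.
Kind classify(const Global &g) {
  if (g.needsRelocation)
    return Kind::ReadOnly;  // the path @glob2 takes: never mergeable
  // Assumption: only these entry sizes have a mergeable const section.
  const bool mergeableSize =
      g.size == 4 || g.size == 8 || g.size == 16 || g.size == 32;
  return (g.unnamedAddr && mergeableSize) ? Kind::MergeableConst
                                          : Kind::ReadOnly;
}

// Pick a section name; dataSections models -data-sections=1.
std::string sectionName(const Global &g, bool dataSections) {
  if (classify(g) == Kind::MergeableConst)
    return ".rodata.cst" + std::to_string(g.size);  // flag has no effect here
  return dataSections ? ".rodata." + g.name : std::string(".rodata");
}
```

In this model, `sectionName({"glob1", 8, true, false}, true)` yields `.rodata.cst8`, while `sectionName({"glob2", 8, true, true}, true)` yields `.rodata.glob2`, matching the two assembly listings.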

The path `@glob2` takes <https://github.com/llvm/llvm-project/blob/b545667d0a4e8d3ca7d4789c3c4004b2816c1b84/llvm/lib/Target/TargetLoweringObjectFile.cpp#L285> does not allow it to be placed in a mergeable const section, whereas the path `@glob1` takes does.

It seems that there are two issues with this logic:

1. Some globals that could be placed into mergeable sections never get the chance, because `needsRelocation` returns true for them.
2. Some globals are placed in mergeable sections even when `-data-sections` is used, because `needsRelocation` returns false for them.

I wanted to know if there was perhaps some intended reason for this (or if there were any errors in my logic)? If this is just a shortcoming, a potential solution for (1) would be to teach `needsRelocation` about the target relocation model, and a potential solution for (2) would be to return a ReadOnly section kind in `TargetLoweringObjectFile::getKindForGlobal()` if `-data-sections` is passed. These might just be kludges rather than a proper fix, though. What do folks think?

Thanks,
Leonard