thr3ads.net - llvm dev - [llvm-dev] [IR] Constant::needsRelocation() can place data into different read-only sections [Feb 2021]

Leonard Chan via llvm-dev

2021-Feb-02 01:57 UTC

[llvm-dev] [IR] Constant::needsRelocation() can place data into different read-only sections

Hi all,

I noticed that given the following IR:

```
declare dso_local void @func()

@glob1 = dso_local unnamed_addr constant [2 x i32] [
  i32 0,
  i32 trunc (
      i64 sub (
        i64 ptrtoint (void ()* @func to i64),
        i64 ptrtoint ([2 x i32]* @glob1 to i64)
      ) to i32)
]
```

Compiling this with `./bin/llc /tmp/test2.ll -o - -relocation-model=static
-data-sections=1 --mtriple=x86_64` gives the following assembly:

```
   .type glob1,@object                   # @glob1
   .section .rodata.cst8,"aM",@progbits,8
   .globl glob1
   .p2align 2
glob1:
   .long 0                               # 0x0
   .long func-glob1
   .size glob1, 8
```

Despite passing `-data-sections=1`, the global does not get its own
section name and is stored in `.section .rodata.cst8`. However, consider a
global of the same size, again with the static relocation model:

```
@glob2 = dso_local unnamed_addr constant [1 x i8*] [
  i8* bitcast ([1 x i8*]* @glob2 to i8*)
]
```

Using the same invocation returns:

```
   .type glob2,@object                   # @glob2
   .section .rodata.glob2,"a",@progbits
   .globl glob2
   .p2align 3
glob2:
   .quad glob2
   .size glob2, 8
```

which does give `@glob2` its own section name (`.section .rodata.glob2`)
rather than storing it in a mergeable const section. The two globals have
the same size, yet they are placed in different kinds of section
(`.rodata` vs `.rodata.cstN`). For `@glob1`, this is a problem if the
symbol is dso_local/hidden and unused in its current DSO: even when
`-data-sections` is used, the symbol will not be collected by
`--gc-sections` at link time because it's in a mergeable const section.

After some digging, it seems the difference arises because `@glob2`
requires relocations according to `Constant::needsRelocation()` (its
initializer contains a direct reference to a global), whereas `@glob1`
does not (it stores a constant int and a relative reference between two
dso_local globals). The path `@glob2` takes
<https://github.com/llvm/llvm-project/blob/b545667d0a4e8d3ca7d4789c3c4004b2816c1b84/llvm/lib/Target/TargetLoweringObjectFile.cpp#L285>
does not allow it to be placed in a mergeable const section, whereas the
path `@glob1` takes does.
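
To make the branch described above concrete, here is a rough standalone
model of it. This is only a sketch, not LLVM's actual code: `Kind`,
`GlobalInfo`, `hasMergeableBucket`, `classifyConstant` and `sectionName`
are hypothetical names, the `needsRelocation` field merely stands in for
the result of `Constant::needsRelocation()`, and the list of sizes that
have a `.rodata.cstN` bucket is an assumption.

```cpp
#include <cstdint>
#include <string>

// Hypothetical, simplified model of the read-only section choice discussed
// in this thread. None of these names are LLVM APIs.
enum class Kind { ReadOnly, MergeableConst };

struct GlobalInfo {
  std::string name;
  std::uint64_t sizeInBytes;
  bool needsRelocation; // stand-in for Constant::needsRelocation()
};

// Assumption: only these sizes have a mergeable constant bucket.
bool hasMergeableBucket(std::uint64_t size) {
  return size == 4 || size == 8 || size == 16 || size == 32;
}

// A constant that needs no relocation and has a suitable size is treated
// as mergeable; everything else is plain read-only data.
Kind classifyConstant(const GlobalInfo &g) {
  if (!g.needsRelocation && hasMergeableBucket(g.sizeInBytes))
    return Kind::MergeableConst;
  return Kind::ReadOnly;
}

// In this model, mergeable constants ignore the data-sections setting,
// which is the behavior observed for @glob1.
std::string sectionName(const GlobalInfo &g, bool dataSections) {
  if (classifyConstant(g) == Kind::MergeableConst)
    return ".rodata.cst" + std::to_string(g.sizeInBytes);
  if (dataSections)
    return ".rodata." + g.name;
  return ".rodata";
}
```

With data sections enabled, an 8-byte global that needs no relocation
(like `@glob1`) maps to `.rodata.cst8` in this model, while an 8-byte
global that does need one (like `@glob2`) maps to `.rodata.glob2`.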

It seems that there are two issues with this logic:
1. Some globals that could be placed into mergeable sections never get
the chance if `needsRelocation` returns true.
2. Some globals are placed in mergeable sections even when
`-data-sections` is used if `needsRelocation` returns false.

I wanted to know whether there is an intended reason for this (or whether
there are errors in my logic). If this is just a shortcoming, a potential
solution for (1) would be to teach `needsRelocation` about the target
relocation model, and a potential solution for (2) would be to return a
ReadOnly section kind in `TargetLoweringObjectFile::getKindForGlobal()` if
`-data-sections` is passed. These might just be kludges rather than a
proper fix, though.
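
As a rough illustration of how the two proposed changes could combine,
here is a standalone sketch. It is one possible reading of the proposals,
not a patch against LLVM: `Placement`, `Flags`, `needsRelocationUnder`,
`sizeHasConstBucket` and `placeProposed` are hypothetical names, and the
assumption that a static relocation model lets an absolute reference to a
global count as "no relocation needed" is mine.

```cpp
#include <cstdint>

// Hypothetical model of the two proposed changes. Not LLVM code.
enum class Placement { Mergeable, NonMergeable };

struct Flags {
  bool staticRelocModel; // models -relocation-model=static
  bool dataSections;     // models -data-sections=1
};

// Proposal (1), as read here: an absolute reference to a global only
// counts as needing a relocation when the model is not static.
bool needsRelocationUnder(bool hasAbsoluteGlobalRef, const Flags &f) {
  return hasAbsoluteGlobalRef && !f.staticRelocModel;
}

// Assumption: only these sizes have a mergeable constant bucket.
bool sizeHasConstBucket(std::uint64_t size) {
  return size == 4 || size == 8 || size == 16 || size == 32;
}

Placement placeProposed(bool hasAbsoluteGlobalRef, std::uint64_t size,
                        const Flags &f) {
  // Proposal (2): with data sections requested, never pick a mergeable
  // section, so each global keeps its own section.
  if (f.dataSections)
    return Placement::NonMergeable;
  if (needsRelocationUnder(hasAbsoluteGlobalRef, f))
    return Placement::NonMergeable;
  return sizeHasConstBucket(size) ? Placement::Mergeable
                                  : Placement::NonMergeable;
}
```

Under this reading, a global like `@glob1` stays out of the mergeable
sections whenever data sections are requested, and a global like `@glob2`
becomes eligible for merging under the static model when they are not.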

What do folks think?

Thanks,
Leonard
