Fāng-ruì Sòng via llvm-dev
2021-Feb-24 08:49 UTC
[llvm-dev] __attribute__((retain)) && llvm.used/llvm.compiler.used
Currently __attribute__((used)) lowers to llvm.used.

* On Mach-O, a GlobalObject in llvm.used gets the S_ATTR_NO_DEAD_STRIP attribute, which prevents linker GC (dead stripping).
* On COFF, a non-local-linkage GlobalObject[1] in llvm.used gets the /INCLUDE: linker option (similar to ELF `ld -u`), which prevents linker GC. It should be possible to support local-linkage GlobalObjects as well, but that would require a complex COMDAT dance.
* On ELF, a global object in llvm.used can be discarded by ld.bfd/gold/ld.lld --gc-sections. (If the section name is a C identifier, __start_/__stop_ relocations from a live input section can retain the section, even if its defined symbols are not referenced.[2] I understand that some folks use `__attribute__((used, section("C_ident")))` and expect the sections to behave like GC roots. However, non-C-identifier section names are very common, too. They don't get the __start_/__stop_ linker magic and the sections can always be GCed.)

In LangRef, the description of llvm.used contains:

> If a symbol appears in the @llvm.used list, then the compiler, assembler, and **linker** are required to treat the symbol as if there is a reference to the symbol that it cannot see (which is why they have to be named). For example, if a variable has internal linkage and no references other than that from the @llvm.used list, it cannot be deleted. This is commonly used to represent references from inline asms and other things the compiler cannot “see”, and corresponds to “attribute((used))” in GNU C.

Note that the "linker" part does not match the reality on ELF targets. It does match the reality on Mach-O and partially on COFF.

llvm.compiler.used:

> The @llvm.compiler.used directive is the same as the @llvm.used directive, except that it only prevents the compiler from touching the symbol. On targets that support it, this allows an **intelligent linker to optimize references to the symbol without being impeded** as it would be by @llvm.used.

Note that this explicitly mentions linker GC, so this appears to be the closest thing to __attribute__((used)) on ELF. However, LangRef also says:

> This is a rare construct that should only be used in rare circumstances, and should not be exposed to source languages.

My goal is to implement __attribute__((retain)) (which will be in GCC 11) on ELF. GCC folks think that 'used' and 'retain' are orthogonal. (see https://reviews.llvm.org/D96838#2578127)

Shall we

1. Lift the source language restriction on llvm.compiler.used and change __attribute__((used)) to use llvm.compiler.used on ELF.
2. Or add a metadata (like https://reviews.llvm.org/D96837)?

I lean toward option 1 to leverage the existing mechanism. The downside is that clang codegen will have some target inconsistency (llvm.compiler.used on ELF while llvm.used on others).

[1]: The implementation additionally allows GlobalAlias.
[2]: See https://maskray.me/blog/2021-01-31-metadata-sections-comdat-and-shf-link-order "C identifier name sections" for details.
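
For readers unfamiliar with the `__attribute__((used, section("C_ident")))` idiom mentioned in the ELF bullet, here is a minimal sketch of it. It is an illustration, not code from the thread: the section name `my_hooks`, the `struct hook` type, the `HOOK` macro and the two accessor functions are all hypothetical. It assumes an ELF target and a GNU-compatible linker (ld.bfd, gold or ld.lld), which define `__start_<name>`/`__stop_<name>` for sections whose names are valid C identifiers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record type; the thread does not prescribe any layout. */
struct hook {
  const char *name;
  int value;
};

/* Hypothetical helper: place one record in the "my_hooks" section.
   'used' only stops the compiler from dropping the unreferenced static. */
#define HOOK(n, v)                                   \
  static const struct hook hook_##n                  \
      __attribute__((used, section("my_hooks"))) = { #n, v }

HOOK(alpha, 1);
HOOK(beta, 2);
HOOK(gamma, 4);

/* Defined by the ELF linker because "my_hooks" is a C identifier.
   A section named e.g. "my.hooks" would get no such symbols. */
extern const struct hook __start_my_hooks[];
extern const struct hook __stop_my_hooks[];

/* Number of records the linker collected into the section. */
size_t hook_count(void) {
  return (size_t)(__stop_my_hooks - __start_my_hooks);
}

/* Walk the section like an array and add up the values. */
int hook_sum(void) {
  int sum = 0;
  for (const struct hook *h = __start_my_hooks; h != __stop_my_hooks; ++h) {
    assert(h->name != NULL);
    sum += h->value;
  }
  return sum;
}
```

Here the references to `__start_my_hooks`/`__stop_my_hooks` from live code are what can keep the records alive under --gc-sections; the `used` attribute by itself does not give that guarantee on ELF, which is the gap described above.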
Fāng-ruì Sòng via llvm-dev
2021-Feb-24 09:09 UTC
[llvm-dev] __attribute__((retain)) && llvm.used/llvm.compiler.used
On 2021-02-24, Fāng-ruì Sòng wrote:
>Currently __attribute__((used)) lowers to llvm.used.
>
>* On Mach-O, a GlobalObject in llvm.used gets the S_ATTR_NO_DEAD_STRIP
>attribute, which prevents linker GC (dead stripping).
>* On COFF, a non-local-linkage GlobalObject[1] in llvm.used gets the
>/INCLUDE: linker option (similar to ELF `ld -u`), which prevents
>linker GC.
> It should be possible to work with local linkage GlobalObject's as
>well but that will require a complex COMDAT dance.
>* On ELF, a global object in llvm.used can be discarded by
>ld.bfd/gold/ld.lld --gc-sections.
> (If the section is a C identifier name, __start_/__stop_ relocations
>from a live input section can retain the section, even if its defined
>symbols are not referenced. [2] .
> I understand that some folks use `__attribute__((used,
>section("C_ident")))` and expect the sections to be similar to GC
>roots, however,
> non-C-identifier cases are very common, too. They don't get
>__start_/__stop_ linker magic and the sections can always be GCed.
> )
>
>In LangRef, the description of llvm.used contains:
>
>> If a symbol appears in the @llvm.used list, then the compiler, assembler, and **linker** are required to treat the symbol as if there is a reference to the symbol that it cannot see (which is why they have to be named). For example, if a variable has internal linkage and no references other than that from the @llvm.used list, it cannot be deleted. This is commonly used to represent references from inline asms and other things the compiler cannot “see”, and corresponds to “attribute((used))” in GNU C.
>
>Note that the "linker" part does not match the reality on ELF targets.
>It does match the reality on Mach-O and partially on COFF.
>
>llvm.compiler.used:
>
>> The @llvm.compiler.used directive is the same as the @llvm.used directive, except that it only prevents the compiler from touching the symbol. On targets that support it, this allows an **intelligent linker to optimize references to the symbol without being impeded** as it would be by @llvm.used.
>
>Note that this explicitly mentions linker GC, so this appears to be
>the closest thing to __attribute__((used)) on ELF.
>However, LangRef also says:
>
>> This is a rare construct that should only be used in rare circumstances, and should not be exposed to source languages.
>
>My goal is to implement __attribute__((retain)) (which will be in GCC
>11) on ELF. GCC folks think that 'used' and 'retain' are orthogonal.
>(see https://reviews.llvm.org/D96838#2578127)
>
>Shall we
>
>1. Lift the source language restriction on llvm.compiler.used and
>change __attribute__((used)) to use llvm.compiler.used on ELF.

It is late here and I did not think this through clearly ;-) To clarify:

1. Lift the source language restriction on llvm.compiler.used, let llvm.used use SHF_GNU_RETAIN on ELF, and change __attribute__((used)) to use llvm.compiler.used on ELF.

__attribute__((retain)) has semantics which are not described by llvm.used/llvm.compiler.used. To facilitate linker GC, __attribute__((retain)) causes the function/variable to be placed in a unique section. The separate section behavior can be undesired in some cases (e.g. poorly written Linux kernel linker scripts which expect one section per name). Because of the unique section, in the -fno-function-sections -fno-data-sections case a retained function/variable does not cause the whole .text/.data/.rodata to be retained. The test llvm/test/CodeGen/X86/elf-retain.ll in https://reviews.llvm.org/D96837 demonstrates the behavior.

So I am not sure that we should use llvm.compiler.used/llvm.used to describe __attribute__((retain)).

>2. Or add a metadata (like https://reviews.llvm.org/D96837)?
>
>I lean to option 1 to leverage the existing mechanism.
>The downside is that clang codegen will have some target inconsistency
>(llvm.compiler.used on ELF while llvm.used on others).
>
>[1]: The implementation additionally allows GlobalAlias.
>[2]: See https://maskray.me/blog/2021-01-31-metadata-sections-comdat-and-shf-link-order
>"C identifier name sections" for details.
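
To make the 'used' versus 'retain' distinction concrete, here is a rough source-level sketch. It is only an illustration under assumptions: the `KEEP` and `HAVE_RETAIN` macros, the `build.tag` section name, and all function names are hypothetical and do not come from the thread or the linked reviews. It assumes a compiler that provides `__has_attribute` and falls back to plain `used` when `retain` is unavailable.

```c
#include <string.h>

/* Hypothetical helper macros. 'used' keeps the definition through the
   compiler; 'retain', where available, additionally asks the linker to
   keep the section (SHF_GNU_RETAIN on ELF). */
#if defined(__has_attribute)
#  if __has_attribute(retain)
#    define KEEP __attribute__((used, retain))
#    define HAVE_RETAIN 1
#  endif
#endif
#ifndef KEEP
#  define KEEP __attribute__((used))
#  define HAVE_RETAIN 0
#endif

/* "build.tag" is not a C identifier, so there is no __start_/__stop_
   symbol that could keep this section alive under --gc-sections. */
KEEP static const char build_tag[]
    __attribute__((section("build.tag"))) = "demo-1.0";

/* A function that is meant to be found only by external tooling. */
KEEP static int debugger_hook(int x) { return x + 1; }

/* Accessors that exist only so the sketch can be exercised; real code
   would typically have no references to the retained definitions. */
int call_hook(int x) { return debugger_hook(x); }
size_t build_tag_length(void) { return strlen(build_tag); }
```

With `retain` honored, `debugger_hook` should land in its own retained section rather than pinning the whole of .text, which matches the -fno-function-sections case described above. Whether that actually happens depends on the compiler, assembler and linker versions in use.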