Rafael Avila de Espindola via llvm-dev
2018-Feb-16 00:45 UTC
[llvm-dev] ThinLTO and linkonce_odr + unnamed_addr
Steven Wu <stevenwu at apple.com> writes:

>> On Feb 15, 2018, at 4:16 PM, Rafael Avila de Espindola <rafael.espindola at gmail.com> wrote:
>>
>> Steven Wu <stevenwu at apple.com> writes:
>>
>>> I did a bit more digging into the auto hide problem. Here are my findings that prevent us from doing this by default in GlobalOpts:
>>>
>>> 1. When a symbol is linkonce_odr hidden unnamed_addr, it emits both the '.private_extern' and '.weak_def_can_be_hidden' asm directives on Mach-O platforms. The result is that .private_extern wins, so this is essentially linkonce_odr hidden.
>>
>> What do those directives mean? I assume .weak_def_can_be_hidden is the
>> "you can drop this from the symbol table", but .private_extern I am not
>> sure.
>
> .private_extern just indicates that the symbol has hidden visibility with a non-local linkage type.
>
>>> 2. ld64 does treat these two types of symbols differently. For example, ld64 will deduplicate all the can_be_hidden symbols to reduce code size. This can't be achieved when the symbol is private external.
>>
>> If I understand you correctly, ld64 will deduplicate
>> std::vector<int>::push_back and std::vector<unsigned>::push_back, but it
>> will not deduplicate std::vector<HiddenClassA>::push_back and
>> std::vector<HiddenClassB>::push_back. Is that correct? Do you know why
>> it has that limitation?
>
> ld64 will dedup identical atoms regardless of names, which helps in the case of templates with the same underlying types.

Even if the symbols corresponding to the atoms are hidden? If so, why is
having "linkonce_odr hidden unnamed_addr" causing problems?
To be clear, the testcase I have in mind is

#include <vector>
struct __attribute__((visibility("hidden"))) Foo {};
struct __attribute__((visibility("hidden"))) Bar {};
void f1(std::vector<Foo> V, Foo X) {
  V.push_back(X);
}
void f2(std::vector<Bar> V, Bar X) {
  V.push_back(X);
}

On ELF the symbols are hidden:

0000000000000000 119 FUNC WEAK HIDDEN 8 _ZNSt6vectorI3BarSaIS0_EE9push_backERKS0_
0000000000000000 119 FUNC WEAK HIDDEN 5 _ZNSt6vectorI3FooSaIS0_EE9push_backERKS0_

and with lld's ICF I get:

/home/espindola/inst/clang/bin/ld.lld: selected .text._ZNSt6vectorI3FooSaIS0_EE9push_backERKS0_
/home/espindola/inst/clang/bin/ld.lld: removed .text._ZNSt6vectorI3BarSaIS0_EE9push_backERKS0_

Cheers,
Rafael
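[Editor's sketch] The "selected"/"removed" pair above is what folding by content, regardless of symbol name, looks like. The following is only a toy model of that idea, not lld's or ld64's code: `Atom` and `foldIdentical` are hypothetical names, and an atom is reduced to a name plus a byte string. A real linker would also have to compare relocations and decide whether folding is safe when addresses are significant.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a linker atom/section: a symbol name plus its bytes.
struct Atom {
  std::string name;
  std::string bytes;
};

// Maps every atom name to the name of the atom kept for its contents.
// The first atom seen with a given byte string is the one that is kept;
// names play no role in deciding whether two atoms are identical.
std::map<std::string, std::string>
foldIdentical(const std::vector<Atom> &atoms) {
  std::map<std::string, std::string> keptForBytes; // bytes -> kept name
  std::map<std::string, std::string> result;       // name  -> kept name
  for (const Atom &a : atoms) {
    // emplace leaves an existing entry untouched and returns it.
    auto it = keptForBytes.emplace(a.bytes, a.name).first;
    result[a.name] = it->second;
  }
  return result;
}
```

In this model, two `push_back` instantiations with the same bytes end up mapped to whichever was seen first, while an atom with different bytes maps to itself.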
Steven Wu via llvm-dev
2018-Feb-16 00:53 UTC
[llvm-dev] ThinLTO and linkonce_odr + unnamed_addr
I explained that in the same thread to Peter.

I talked to Nick yesterday and it turns out to be an implementation choice. The overhead of deduplicating all the non-external symbols is too high, so ld64 picks a subset that can potentially be beneficial, which is the "auto hide" symbols. So this is not a correctness issue, but we might need a different heuristic for performance.

Steven
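[Editor's sketch] The heuristic described in this message can be pictured as a filter that runs before any deduplication. The model below is an assumption built only from what the thread states: `Symbol`, `isAutoHide`, and `dedupCandidates` are hypothetical names, the two flags stand for the two asm directives, and ".private_extern wins" is encoded as a simple precedence rule. It is not ld64's implementation.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model of the flags discussed in the thread; not ld64's data structures.
struct Symbol {
  std::string name;
  bool privateExtern;      // '.private_extern' was emitted
  bool weakDefCanBeHidden; // '.weak_def_can_be_hidden' was emitted
};

// Per the thread, .private_extern wins when both directives are present,
// so only can-be-hidden symbols that are not private extern count as "auto hide".
bool isAutoHide(const Symbol &s) {
  return s.weakDefCanBeHidden && !s.privateExtern;
}

// Returns the names of the symbols the heuristic would consider for
// deduplication; everything else is skipped to keep the overhead down.
std::vector<std::string> dedupCandidates(const std::vector<Symbol> &syms) {
  std::vector<std::string> out;
  for (const Symbol &s : syms)
    if (isAutoHide(s))
      out.push_back(s.name);
  return out;
}
```

Under this model, a linkonce_odr hidden unnamed_addr symbol (both flags set) drops out of the candidate list, which matches the behaviour described earlier in the thread.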
Rafael Avila de Espindola via llvm-dev
2018-Feb-16 01:06 UTC
[llvm-dev] ThinLTO and linkonce_odr + unnamed_addr
Steven Wu <stevenwu at apple.com> writes:

> I explained that in the same thread to Peter.
>
> I talked to Nick yesterday and it turns out to be an implementation choice. The overhead of deduplicating all the non-external symbols is too high, so ld64 picks a subset that can potentially be beneficial, which is the "auto hide" symbols. So this is not a correctness issue, but we might need a different heuristic for performance.

The algorithm Rui implemented in lld might be something to consider. It
handles cycles and is very fast.

Cheers,
Rafael
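[Editor's sketch] One common way to make folding handle cycles is optimistic partition refinement: start by assuming that all functions with the same bytes are equal, then keep splitting groups whose members reference functions in different groups until nothing changes. The code below is a simplified illustration of that general technique, not lld's source. `Func` and `equivalenceClasses` are hypothetical names, and a function is reduced to its bytes plus the indices of the functions it references.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model: non-relocation bytes plus indices of referenced functions.
struct Func {
  std::string bytes;
  std::vector<int> callees;
};

// Returns one class id per function; equal ids mean "could be folded together".
std::vector<int> equivalenceClasses(const std::vector<Func> &funcs) {
  const std::size_t n = funcs.size();
  std::vector<int> cls(n, 0);
  std::size_t numClasses = 0;

  // Optimistic start: group by bytes and number of references only.
  {
    std::map<std::pair<std::string, std::size_t>, int> ids;
    for (std::size_t i = 0; i < n; ++i) {
      auto key = std::make_pair(funcs[i].bytes, funcs[i].callees.size());
      cls[i] = ids.emplace(key, static_cast<int>(ids.size())).first->second;
    }
    numClasses = ids.size();
  }

  // Refine: two functions stay together only if they are in the same class
  // and their references point, position by position, into the same classes.
  for (;;) {
    std::map<std::vector<int>, int> ids;
    std::vector<int> next(n);
    for (std::size_t i = 0; i < n; ++i) {
      std::vector<int> sig{cls[i]};
      for (int c : funcs[i].callees) {
        assert(c >= 0 && static_cast<std::size_t>(c) < n);
        sig.push_back(cls[c]);
      }
      next[i] = ids.emplace(sig, static_cast<int>(ids.size())).first->second;
    }
    // Classes only ever split, so an unchanged count means a fixed point.
    const bool stable = ids.size() == numClasses;
    numClasses = ids.size();
    cls = next;
    if (stable)
      break;
  }
  return cls;
}
```

Because the starting assumption is "equal until proven different", two mutually recursive pairs with identical bytes stay in one class, whereas a pessimistic bottom-up comparison would never get started on them.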