Steven Wu via llvm-dev
2018-Feb-07 01:35 UTC
[llvm-dev] ThinLTO and linkonce_odr + unnamed_addr
Hi,

I recently found that ThinLTO doesn't handle globals that are linkonce_odr and unnamed_addr well (for Mach-O at least), because it prevents the autohide optimization at link time.

In LLVM, we tag a global linkonce_odr and unnamed_addr to tell the linker that it can hide the symbol from the symbol table if that copy is picked (aka linkonce_odr_auto_hide linkage). This is very commonly used for some kinds of tables that clang emits for C++ code, for example. However, ThinLTO promotes these symbols to weak_odr + unnamed_addr, which loses that property. As a result, it introduces unnecessary weak external symbols, and weak externals are not good for performance on Darwin platforms.

I have a few proposed solutions for this issue, but I don't know which one works best for non-Mach-O platforms and other LTO clients like lld.

1. Use llvm.compiler.used.
As far as I know, the linkage promotion is only there to keep the symbol through internalization and codegen, so adding these symbols to llvm.compiler.used should solve the issue. I was told that there were some objections to doing that in the first place. Is it because globals added to llvm.compiler.used are ignored by the optimizer, so they cannot be internalized and cannot be optimized away? This works well for the case I am looking at, because C++ vtables can't really be optimized, and for Darwin platforms, because we can rely on ld64 to do dead stripping if needed.

2. Add hidden visibility when promoting linkonce_odr + unnamed_addr.
This doesn't really preserve the link semantics, but neither does promoting linkonce_odr to weak_odr. The global will still end up in the symbol table, but at least it isn't external, so it doesn't come with a performance cost.

3. Teach the function importer that it cannot simply reference linkonce_odr + unnamed_addr symbols without importing them.
I have some thoughts about how to do this, so I can propose something if people are interested in going down this route. I expect it would at least require adding an entry in the global summary and changing the cost of importing symbols that reference linkonce_odr + unnamed_addr symbols.

4. As a temporary fix, target only the vtables for C++.
We can special-case global constants that use this linkage so that they are never promoted and their parents are never imported into other modules. The benefit of inlining global constants is very minimal, and I don't think we are doing it currently.

Let me know if any of these solutions work for other LTO clients.

Thanks

Steven
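The promotion problem and option 2 from the message above can be sketched with a small C++ model. This is only an illustration under stated assumptions: the `Global` struct and the `canAutoHide`, `promote`, and `isWeakExternal` functions are hypothetical names invented here, not LLVM or ld64 APIs, and every global in the model is assumed to be a weak definition.

```cpp
#include <string>

// Toy model only: these types are hypothetical and are not LLVM's classes.
enum class Linkage { LinkOnceODR, WeakODR };
enum class Visibility { Default, Hidden };

struct Global {
  std::string name;
  Linkage linkage;
  Visibility visibility;
  bool unnamedAddr;
};

// In this model the linker may auto-hide a symbol only if it is
// linkonce_odr + unnamed_addr.
bool canAutoHide(const Global &g) {
  return g.linkage == Linkage::LinkOnceODR && g.unnamedAddr;
}

// Promotion as described in the thread: linkonce_odr becomes weak_odr so the
// definition is kept. With `hidePromoted` (option 2), a global that was
// auto-hideable before promotion is also given hidden visibility.
Global promote(Global g, bool hidePromoted) {
  const bool wasAutoHide = canAutoHide(g);
  if (g.linkage == Linkage::LinkOnceODR)
    g.linkage = Linkage::WeakODR;
  if (hidePromoted && wasAutoHide)
    g.visibility = Visibility::Hidden;
  return g;
}

// Every global in this model is a weak definition, so it ends up as a weak
// external unless it is hidden or the linker can auto-hide it.
bool isWeakExternal(const Global &g) {
  return g.visibility == Visibility::Default && !canAutoHide(g);
}
```

For a global such as `Global{"_ZTV1A", Linkage::LinkOnceODR, Visibility::Default, true}`, `isWeakExternal` is false before promotion, true after `promote(g, false)`, and false again after `promote(g, true)`, which is the effect option 2 is after.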
Teresa Johnson via llvm-dev
2018-Feb-07 17:34 UTC
[llvm-dev] ThinLTO and linkonce_odr + unnamed_addr
Hi Steven,

I'd prefer not to inhibit importing. I am also concerned about putting these symbols in llvm.compiler.used (I don't recall the earlier discussion around this, but it seems like it could affect optimization, as you mention).

What are the downsides of #2 (adding hidden visibility)? We already do this when promoting internal linkage to external due to importing. I'm not an expert on how this would affect link semantics.

Thanks,
Teresa
Steven Wu via llvm-dev
2018-Feb-07 17:46 UTC
[llvm-dev] ThinLTO and linkonce_odr + unnamed_addr
> On Feb 7, 2018, at 9:34 AM, Teresa Johnson <tejohnson at google.com> wrote:
>
> What are the downsides of #2 (adding visibility hidden)? We already do this when promoting internal linkage to external due to importing. I'm not an expert on how this would affect link semantics.

For Mach-O, this should be a straight-up improvement. It eliminates the weak external, which is a big win, but it doesn't solve the other side effects of this promotion, i.e., the promoted symbol can no longer be dropped by the compiler or linker if it is unused.

We also need to make sure the linker does the right thing when it comes to merging visibility. When the linker sees two weak symbols, one hidden and one default, I think the correct semantics is to take the default visibility. The same goes for unnamed_addr; to quote the Language Reference: "Note that a constant with significant address can be merged with a unnamed_addr constant, the result being a constant whose address is significant."

I did a quick experiment yesterday. ld64 behaves correctly for visibility but llvm-link does not. I can fix that.

If the other targets can all agree on this behavior, fix #2 should not have any downside compared to the current implementation. We just need to find some other way to improve code size.

Steven
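The merging rule proposed in the message above can be written down as a short C++ sketch. The `WeakDef` struct and `mergeWeakDefs` function are hypothetical names used only for illustration; this is not how ld64 or llvm-link represent symbols, only a model of the rule that the less restrictive attribute wins.

```cpp
// Toy model only: hypothetical types, not LLVM's or ld64's data structures.
enum class Visibility { Default, Hidden };

struct WeakDef {
  Visibility visibility;
  bool unnamedAddr;  // true if the address is not significant
};

// Merge two weak definitions of the same symbol using the rule from the
// message above: the less restrictive attribute wins on each axis.
WeakDef mergeWeakDefs(const WeakDef &a, const WeakDef &b) {
  WeakDef out;
  // Default visibility wins over hidden.
  out.visibility = (a.visibility == Visibility::Default ||
                    b.visibility == Visibility::Default)
                       ? Visibility::Default
                       : Visibility::Hidden;
  // The address stays insignificant only if both copies say so.
  out.unnamedAddr = a.unnamedAddr && b.unnamedAddr;
  return out;
}
```

Under this rule, merging a hidden unnamed_addr copy with a default-visibility copy whose address is significant yields a default-visibility symbol with a significant address, so a ThinLTO-promoted hidden copy would not hide a symbol that some other module still expects to be visible.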
Reid Kleckner via llvm-dev
2018-Feb-07 18:29 UTC
[llvm-dev] ThinLTO and linkonce_odr + unnamed_addr
There should be no semantic difference between linkonce_odr and weak_odr, except that weak_odr is non-discardable. Why doesn't the autohide optimization work just as well on weak_odr + unnamed_addr as on linkonce_odr + unnamed_addr?
Mehdi AMINI via llvm-dev
2018-Feb-07 18:58 UTC
[llvm-dev] ThinLTO and linkonce_odr + unnamed_addr
Hi,

My understanding is also that there should be no semantic difference between linkonce_odr and weak_odr other than the discardable aspect. So I don't understand why llvm.compiler.used would be a problem here. It seems to me that linkonce_odr + llvm.compiler.used is exactly like weak_odr, isn't it?

Cheers,

-- 
Mehdi
Steven Wu via llvm-dev
2018-Feb-07 19:11 UTC
[llvm-dev] ThinLTO and linkonce_odr + unnamed_addr
That is a good question and I don't know. The optimization is defined in include/llvm/Analysis/ObjectUtils.h. If I enable it for weak_odr + unnamed_addr, no tests fail, so I guess it is a safe optimization? :)

It is probably because the autohide optimization is targeted at C++ templates and inline functions, which we know have linkonce_odr linkage; that suggests whoever uses such a symbol should have their own copy. Because a linkonce_odr symbol is safe to drop, it is safe to assume that nothing else relies on the symbol being available from the current linkage unit, so it is safe to hide it from the symbol table.

weak_odr is often used to force the compiler and linker to provide the implementation of a template instantiation that is not available in the header. I don't think such symbols are safe to drop in all cases.

Steven

> On Feb 7, 2018, at 10:29 AM, Reid Kleckner <rnk at google.com> wrote:
>
> There should be no semantic difference between linkonce_odr and weak_odr, except that weak_odr is non-discardable. Why doesn't the autohide optimization work just as well on weak_odr + unnamed_addr as linkonce_odr + unnamed_addr?
Reid Kleckner via llvm-dev
2018-Feb-07 19:29 UTC
[llvm-dev] ThinLTO and linkonce_odr + unnamed_addr
I agree with Teresa; we should probably do #2 to preserve behavior for now.
Rafael Avila de Espindola via llvm-dev
2018-Feb-09 19:44 UTC
[llvm-dev] ThinLTO and linkonce_odr + unnamed_addr
Reid Kleckner <rnk at google.com> writes:

> There should be no semantic difference between linkonce_odr and weak_odr,
> except that weak_odr is non-discardable. Why doesn't the autohide
> optimization work just as well on weak_odr + unnamed_addr as linkonce_odr +
> unnamed_addr?

Because we can hide a symbol only when we could have dropped it. A user of a library can have an undefined symbol that refers to a weak_odr in that library, but not to a linkonce_odr in that library.

Cheers,
Rafael
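The point above, that a symbol may be hidden only if it could have been dropped, can be illustrated with a toy C++ model of a library's exported symbol table. Everything here is an assumption made for illustration: `Def`, `exportedSymbols`, `resolves`, and the `hideWeakODR` switch are hypothetical names and do not correspond to any real linker interface.

```cpp
#include <set>
#include <string>
#include <vector>

// Toy model only: hypothetical types, not a real linker's data structures.
enum class Linkage { LinkOnceODR, WeakODR };

struct Def {
  std::string name;
  Linkage linkage;
  bool unnamedAddr;
};

// Build the exported symbol table of a library. With `hideWeakODR` set, the
// auto-hide rule is (unsafely) extended from linkonce_odr to weak_odr.
std::set<std::string> exportedSymbols(const std::vector<Def> &defs,
                                      bool hideWeakODR) {
  std::set<std::string> table;
  for (const Def &d : defs) {
    const bool hideable =
        d.unnamedAddr &&
        (d.linkage == Linkage::LinkOnceODR || hideWeakODR);
    if (!hideable)
      table.insert(d.name);
  }
  return table;
}

// A client's undefined reference resolves only if the library exports it.
bool resolves(const std::set<std::string> &table, const std::string &name) {
  return table.count(name) != 0;
}
```

If a library defines a weak_odr + unnamed_addr symbol that a client references without having its own copy, the reference resolves with `hideWeakODR` off and fails with it on. A linkonce_odr symbol never has such outside users, because the library was always free to drop it.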