David Blaikie via llvm-dev
2021-Mar-23 01:36 UTC
[llvm-dev] [RFC] Annotating global functions and variables to prevent ICF during linking
ICF: Identical Code Folding

The linker deduplicates functions by collapsing identical functions together. With icf=safe, the linker looks at the .llvm_addrsig (address-significance) section in the object file, and any functions listed in that section are not treated as collapsible (e.g. because they need to meet C++'s "distinct functions have distinct addresses" guarantee).

On Mon, Mar 22, 2021 at 6:16 PM Philip Reames via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Can you define ICF please? And give a bit of context?
>
> Philip
> On 3/22/21 5:27 PM, Zequan Wu via llvm-dev wrote:
>
> Hi all,
>
> Background:
> It has long been difficult to debug binaries linked with ICF. Programmers
> have no control over which sections ICF may fold and which it may not.
> The existing address-significance table has no effect on code sections
> in the `all` ICF mode in both ld.lld and lld-link. Switching to safe ICF
> would keep code sections unique, but at the cost of an uncontrolled
> increase in binary size. So, it would be good if programmers could
> selectively disable ICF in source code by annotating global
> functions/variables with an attribute, to improve the debugging
> experience while keeping control over the binary size increase.
>
> My plan is to add a new section table (`.no_icf`) to object files.
> Sections of all symbols in the table should not be folded by ICF,
> including in the `all` mode. Symbols can only be added to the table by
> annotating global functions/variables with a new attribute (`no_icf`)
> in source code.
>
> What do you think about this approach?
>
> Thanks,
> Zequan
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
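To make the "distinct functions have distinct addresses" point concrete, here is a minimal C++ sketch. The names `first_answer`, `second_answer`, `have_distinct_addresses` and `distinct_address_count` are made up for illustration and do not come from the thread. The two functions have identical bodies, so a folding linker could in principle merge them, but the program also observes their addresses, which is the situation the address-significance information is meant to protect.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

// Two functions with identical bodies: natural candidates for folding.
int first_answer() { return 42; }
int second_answer() { return 42; }

using AnswerFn = int (*)();

// Comparing the addresses makes them observable to the program, which is
// what "address-significant" means in this context.
bool have_distinct_addresses(AnswerFn a, AnswerFn b) { return a != b; }

// Counts how many different addresses the two functions occupy, the way a
// toy symbolizer keyed by address would see them.
std::size_t distinct_address_count() {
  assert(first_answer() == second_answer());  // same behavior either way
  std::set<std::uintptr_t> addresses;
  addresses.insert(reinterpret_cast<std::uintptr_t>(&first_answer));
  addresses.insert(reinterpret_cast<std::uintptr_t>(&second_answer));
  return addresses.size();
}
```

In a build without ICF, `distinct_address_count()` is expected to return 2. With a folding mode that ignores address significance, the two functions may end up sharing one address, and a count of 1 becomes possible.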
Fangrui Song via llvm-dev
2021-Mar-23 05:18 UTC
[llvm-dev] [RFC] Annotating global functions and variables to prevent ICF during linking
On 2021-03-22, David Blaikie via llvm-dev wrote:
>ICF: Identical Code Folding
>
>The linker deduplicates functions by collapsing identical functions
>together. With icf=safe, the linker looks at the .llvm_addrsig
>(address-significance) section in the object file, and any functions
>listed in that section are not treated as collapsible (e.g. because they
>need to meet C++'s "distinct functions have distinct addresses" guarantee).

The name originated from MSVC link.exe, where ICF stands for "identical COMDAT folding". gold named it "identical code folding", which makes some sense because gold does not fold read-only data. In LLD, the name is not accurate for two reasons: (1) the feature can apply to read-only data as well; (2) the folding is by section, not by function.

We define two sections as identical if they have identical content and their outgoing relocation sets cannot be distinguished: they need to have the same number of relocations, at the same relative locations, with indistinguishable referenced symbols.

Then, ld.lld --icf={safe,all} works like this: for a set of identical sections, the linker picks one representative and drops the rest, then redirects references to the representative.

Note: this can easily confuse debuggers/symbolizers/profilers.

lld-link /opt:icf is different from ld.lld --icf, but I haven't looked into it closely.
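The definition and the "pick one representative" step above can be sketched roughly as follows. This is a simplified model under stated assumptions, not LLD's implementation: `Section`, `Relocation`, `keep_unique` and `fold_identical` are hypothetical names, referenced symbols are compared by name only (a real linker must also treat symbols that point into already-folded sections as indistinguishable), and candidates are grouped with an ordered map instead of pairwise comparisons.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical, simplified model of an input section's outgoing relocation.
struct Relocation {
  std::size_t offset;  // location of the relocation within the section
  std::string target;  // referenced symbol, compared by name for simplicity
  bool operator<(const Relocation &other) const {
    return std::tie(offset, target) < std::tie(other.offset, other.target);
  }
};

struct Section {
  std::string name;
  std::vector<std::uint8_t> content;
  std::vector<Relocation> relocs;
  bool keep_unique = false;  // e.g. address-significant in a "safe" mode
};

// Returns, for each section index, the index of its representative.
// A section that is its own representative is kept; the others are dropped
// and references to them would be redirected to the representative.
std::vector<std::size_t> fold_identical(const std::vector<Section> &sections) {
  using Key = std::pair<std::vector<std::uint8_t>, std::vector<Relocation>>;
  std::map<Key, std::size_t> representative;
  std::vector<std::size_t> result(sections.size());
  for (std::size_t i = 0; i < sections.size(); ++i) {
    const Section &s = sections[i];
    if (s.keep_unique) {  // never folded, never a folding target
      result[i] = i;
      continue;
    }
    // The first section seen with a given (content, relocations) key wins.
    auto it = representative.emplace(Key{s.content, s.relocs}, i).first;
    assert(it->second <= i);
    result[i] = it->second;
  }
  return result;
}
```

For example, three sections with the same bytes fold into one only if their relocations also match; a section whose relocation targets a different symbol, or one marked `keep_unique`, stays separate.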
I find that the feature's savings are small given its downsides. One of them is increased link time: LLD's current implementation is inferior in that it performs a quadratic number of comparisons within an equality class.

These are the size differences for the 'lld' executable:

  % size lld.{none,safe,all}
      text    data    bss       dec     hex filename
  96821040 7210504 550810 104582354 63bccd2 lld.none
  95217624 7167656 550810 102936090 622ae1a lld.safe
  94038808 7167144 550810 101756762 610af5a lld.all
  % size gold.{none,safe,all}
      text    data    bss       dec     hex filename
  96857302 7174792 550825 104582919 63bcf07 gold.none
  94469390 7174792 550825 102195007 6175f3f gold.safe
  94184430 7174792 550825 101910047 613061f gold.all

Note that the --icf=all result caps the potential saving of the proposed annotation. Actually, with some large internal targets I get even smaller savings.

ld.lld --icf=safe is safer than gold --icf=safe but probably misses some opportunities. It may be that the clang codegen/optimizer fails to mark some cases as {,local_}unnamed_addr.

I know Chromium and the Windows world can be different :) But I'd still want to get some numbers first.

Last, I have seen that Chromium has some code like this:
https://source.chromium.org/chromium/chromium/src/+/master:skia/ext/SkMemory_new_handler.cpp

  void sk_abort_no_print() {
    // Linker's ICF feature may merge this function with other functions with
    // the same definition (e.g. any function whose sole job is to call abort())
    // and it may confuse the crash report processing system.
    // http://crbug.com/860850
    static int static_variable_to_make_this_function_unique = 0x736b;  // "sk"
    base::debug::Alias(&static_variable_to_make_this_function_unique);
    abort();
  }

If we want an approach that works with link.exe, I don't know what we can do... If there is no desire for link.exe compatibility, I can see that having a proper way of marking the function can be useful... but in any case, if an attribute is used, it should probably affect unnamed_addr directly instead of being called *icf*.
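As a rough worked example of the "caps the potential saving" remark, the `size` figures quoted earlier in this message for the ld.lld-linked binaries can be turned into absolute and relative savings. The helper names (`SizeRow`, `total`, `saved_bytes`, `saved_percent`) are made up for this sketch; only the numbers come from the message.

```cpp
#include <cassert>
#include <cstdint>

// One row of `size` output: text, data and bss in bytes.
struct SizeRow {
  const char *name;
  std::int64_t text, data, bss;
};

// Sum of the three columns, i.e. the "dec" column of `size`.
constexpr std::int64_t total(const SizeRow &row) {
  return row.text + row.data + row.bss;
}

// Bytes saved by `folded` relative to `base`.
constexpr std::int64_t saved_bytes(const SizeRow &base, const SizeRow &folded) {
  return total(base) - total(folded);
}

// The same saving as a percentage of the unfolded binary.
inline double saved_percent(const SizeRow &base, const SizeRow &folded) {
  return 100.0 * static_cast<double>(saved_bytes(base, folded)) /
         static_cast<double>(total(base));
}

// Figures copied from the `size lld.{none,safe,all}` output in the message.
constexpr SizeRow lld_none{"lld.none", 96821040, 7210504, 550810};
constexpr SizeRow lld_safe{"lld.safe", 95217624, 7167656, 550810};
constexpr SizeRow lld_all{"lld.all", 94038808, 7167144, 550810};

// Sanity check against the "dec" column reported by `size`.
static_assert(total(lld_none) == 104582354, "dec column for lld.none");
static_assert(total(lld_all) == 101756762, "dec column for lld.all");
```

With these inputs, --icf=all saves 2825592 bytes (roughly 2.7% of the unfolded binary) and --icf=safe saves 1646264 bytes (roughly 1.6%). Since any annotation that only disables folding selectively can at best reach the --icf=all figure, the gap between the two modes, about 1.2 MB here, is the most it could recover for this binary.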