George Rimar via llvm-dev
2017-Dec-04 15:11 UTC
[llvm-dev] [RFC] - Deduplication of debug information in linkers (LLD).
Hi all!

We have an issue with LLD: "relocation R_X86_64_32 out of range" (PR31109), which occurs while resolving relocations in debug sections. It seems to happen because the .debug_info section can sometimes be too large, so a 32-bit relocation is not enough to represent the value. One possible solution is to deduplicate debug information to reduce the .debug_info size. The rest of this mail describes the experiments I did, the results I obtained, and some questions and suggestions.

I was investigating the idea of deduplicating debug type information. The idea is described on p276 of the DWARF4 specification (http://www.dwarfstd.org/doc/DWARF4.pdf). It suggests splitting type information out of .debug_info and emitting multiple .debug_types sections using COMDATs. Both clang and gcc, as I tested, implement the -fdebug-types-section flag for that:

    -fdebug-types-section, -fno-debug-types-section
        Place debug types in their own section (ELF Only)

gcc's description is here: https://gcc.gnu.org/onlinedocs/gcc-6.4.0/gcc/Debugging-Options.html#Debugging-Options.

This flag is disabled by default. I compared clang binaries to see the difference with and without the linker-side optimisation:

1) Clang built with -g has a size of 1.7 GB; the .debug_info section size is 894.5 MB.
2) Clang built with -g -fdebug-types-section has a size of 1.0 GB; the .debug_types size is 26.267 MB and the .debug_info size is 227.7 MB.

The difference is huge, and I believe it shows (though for most readers here it was probably already obvious) that the optimization can be useful. Still, -fdebug-types-section is disabled by default. It looks like it was initially disabled because not all DWARF consumers were aware of the .debug_types section.

Now, in 2017, the situation is different. I think most DWARF consumers know about .debug_types, but:

1) The DWARF5 specification explicitly eliminates the .debug_types section introduced in DWARF4: p8, "1.4 Changes from Version 4 to Version 5", http://dwarfstd.org/doc/DWARF5.pdf
2) Instead of emitting multiple .debug_types sections, it suggests emitting multiple .debug_info COMDAT sections (p375, p376).

It seems there is currently no way to make clang emit multiple .debug_info sections with type information as DWARF5 suggests. I tried the command line below:

    -g -fdebug-types-section -gdwarf-5

It still emits .debug_types, and there does not seem to be a flag for emitting multiple .debug_info sections. Looking at the whole LLVM code (lib/mc, lib/CodeGen), it seems to be always assumed that .debug_info is a unique section in an object. (I am also not sure why clang emits .debug_types when the -gdwarf-5 flag is set, as this section is incompatible with v5; it is probably a bug.)

So my questions are the following:

1) Do we want to try to implement the multiple .debug_info approach? It seems it can be very useful sometimes.
2) For now, in LLD, maybe we want to extend our error message from "relocation X out of range" to something suggesting the use of -fdebug-types-section (only for relocations in debug sections)?
3) Why is -fdebug-types-section disabled by default?

Best regards,
George | Developer | Access Softek, Inc
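For readers unfamiliar with the error, here is a minimal sketch of the range check behind it. It assumes, as the x86-64 psABI describes, that an R_X86_64_32 field must hold a value that zero-extends to the original 64-bit value, so anything at or above 4 GiB cannot be represented. The function name and the sample offsets are made up for illustration; this is not LLD's code.

```python
# Illustrative only: models the range check for a 32-bit unsigned relocation.
# fits_r_x86_64_32 and the sample offsets are hypothetical, not LLD's API.

U32_MAX = 2**32 - 1  # largest value an R_X86_64_32 field can hold


def fits_r_x86_64_32(value: int) -> bool:
    """Return True if value zero-extends from 32 bits without loss."""
    return 0 <= value <= U32_MAX


# An offset of 3 GiB into an output debug section still fits...
small_offset = 3 * 1024**3
# ...but an offset of 5 GiB does not, which is the situation that produces
# "relocation R_X86_64_32 out of range".
large_offset = 5 * 1024**3

print(fits_r_x86_64_32(small_offset))
print(fits_r_x86_64_32(large_offset))
```

Shrinking .debug_info, as discussed above, keeps the offsets the relocations must encode below that limit.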
UE US via llvm-dev
2017-Dec-04 21:20 UTC
[llvm-dev] [RFC] - Deduplication of debug information in linkers (LLD).
Funny, I just filed a bug on that last night. Your solutions look like they'll help me extensively, as cutting the size in half will prevent my 80GB make install issues.

https://bugs.llvm.org/show_bug.cgi?id=35512

I'll leave the bug open for tracking purposes.

GNOMETOYS
Rui Ueyama via llvm-dev
2017-Dec-05 05:06 UTC
[llvm-dev] [RFC] - Deduplication of debug information in linkers (LLD).
Thank you George for writing this up!

On Mon, Dec 4, 2017 at 7:11 AM, George Rimar <grimar at accesssoftek.com> wrote:

> 2) For now, in LLD, maybe we want to extend our error message from
> "relocation X out of range" to something suggesting the use of
> -fdebug-types-section (only for relocations in debug sections)?

What we ideally should do is print a hint message suggesting a flag that forces the compiler to emit DWARF64 debug info (I don't know the flag name) and -fdebug-types-section, along with a brief message describing why we can't satisfy the user request (e.g. relocation X is too large for 32-bit DWARF info). That said, it looks like LLVM's DWARF64 support is still incomplete, so it may make sense to print a hint message as you suggested.
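The hint discussed in this message could look roughly like the sketch below. It only appends the suggestion when the relocation lives in a debug section, as proposed. The function name, the ".debug_" prefix test, and the exact wording of the hint are assumptions for illustration; LLD's real diagnostics code is C++ and is not reproduced here.

```python
# Hypothetical sketch of the extended diagnostic; not LLD's actual code.
# The hint text and the ".debug_" prefix test are assumptions.

HINT = "; consider recompiling with -fdebug-types-section to reduce debug info size"


def range_error(reloc_name: str, section_name: str) -> str:
    """Build the out-of-range message, adding a hint for debug sections only."""
    msg = f"relocation {reloc_name} out of range"
    if section_name.startswith(".debug_"):
        msg += HINT
    return msg


print(range_error("R_X86_64_32", ".debug_info"))
print(range_error("R_X86_64_32", ".text"))
```

Keeping the hint conditional on the section name avoids suggesting a debug-info flag for out-of-range relocations in ordinary code or data sections.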
George Rimar via llvm-dev
2017-Dec-05 14:39 UTC
[llvm-dev] [RFC] - Deduplication of debug information in linkers (LLD).
> Funny, I just filed a bug on that last night. Your solutions look like
> they'll help me extensively, as cutting the size in half will prevent my
> 80GB make install issues.
>
> https://bugs.llvm.org/show_bug.cgi?id=35512
>
> I'll leave the bug open for tracking purposes.
>
> GNOMETOYS

-fdebug-types-section makes objects larger, but even so my whole LLVM build folder is 31.5GB when built with -fdebug-types-section and 37.3GB without. Glad it can be helpful for you :)

George.
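To put the sizes reported in this thread side by side, here is a small arithmetic sketch. The inputs are the figures quoted in the messages (build folder 37.3GB vs 31.5GB; linked clang 1.7 GB vs 1.0 GB; .debug_info 894.5 MB vs .debug_info 227.7 MB plus .debug_types 26.267 MB). It assumes the "Mb" in the original mail means megabytes; the helper name is made up.

```python
# Arithmetic on the sizes quoted in the thread; reduction_percent is a
# hypothetical helper. Assumes "Mb" in the original mail means megabytes.

def reduction_percent(before: float, after: float) -> float:
    """Relative size reduction, in percent of the original size."""
    return (before - after) / before * 100.0


build_folder = reduction_percent(37.3, 31.5)             # GB, whole build tree
linked_clang = reduction_percent(1.7, 1.0)               # GB, linked binary
debug_sections = reduction_percent(894.5, 227.7 + 26.267)  # MB, type-bearing sections

print(f"build folder:   {build_folder:.1f}% smaller")
print(f"linked clang:   {linked_clang:.1f}% smaller")
print(f"debug sections: {debug_sections:.1f}% smaller")
```

The build folder shrinks far less than the linked binary, which is consistent with the remark that the flag makes individual objects larger: the saving comes from the linker discarding duplicate type units, not from the compiler output.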
David Blaikie via llvm-dev
2017-Dec-05 20:13 UTC
[llvm-dev] [RFC] - Deduplication of debug information in linkers (LLD).
If you're interested in things you can do in the linker for this - you might consider something more aggressive: fully DWARF-aware deduplication.

This could hopefully be done by reusing some of the code in the dsymutil implementation in LLVM.

This would be much more effective (and without the possible context-sensitive tradeoffs) than using type units. Though it'd possibly have a big tradeoff in link time and/or linker memory usage (I'm not sure how much dsymutil needs/uses of either).

It doesn't seem especially important to implement the DWARF5 types -> debug_info thing for this situation; the type units as they are (in debug_types) offer the same size benefits here. But sure, if anyone wanted to implement it at some point, that'd be fine.

I think Paul covered some of the reasons type units might not be a reasonable default.

One additional reason is that if you use Split DWARF (another great way to massively reduce the amount of debug info going to the linker), type units are mostly /just/ overhead in the .dwo files: since the debug info is not linked, there's no opportunity to remove the duplication anyway (unless you're making a DWP - like a dsym file).
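For contrast with fully DWARF-aware deduplication, the section-level deduplication that type units rely on can be sketched as a "first copy wins" rule keyed by the type signature. This is a simplified model only: the class, the field names, and the sample data are hypothetical, and a real linker applies the rule to COMDAT section groups rather than to records like these.

```python
# Simplified model of COMDAT-style deduplication of type units.
# TypeUnitSection, its fields, and the sample inputs are hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class TypeUnitSection:
    signature: int    # 64-bit type signature, used as the group key
    object_file: str  # object file that contributed this copy
    size: int         # section size in bytes


def dedup_by_signature(sections):
    """Keep the first section seen for each signature, drop later copies."""
    kept = {}
    for sec in sections:
        kept.setdefault(sec.signature, sec)
    return list(kept.values())


inputs = [
    TypeUnitSection(0x1111, "a.o", 400),
    TypeUnitSection(0x2222, "a.o", 250),
    TypeUnitSection(0x1111, "b.o", 400),  # same type emitted again in b.o
]

output = dedup_by_signature(inputs)
print(sum(s.size for s in inputs), "->", sum(s.size for s in output))
```

The linker never looks inside the sections here; it only compares keys. That is why this approach is cheap at link time, and also why it cannot remove duplication that is not packaged as a separate type unit.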
George Rimar via llvm-dev
2017-Dec-06 11:15 UTC
[llvm-dev] [RFC] - Deduplication of debug information in linkers (LLD).
> If you're interested in things you can do in the linker for this - you
> might consider something more aggressive: fully DWARF-aware deduplication.
>
> This could hopefully be done by reusing some of the code in the dsymutil
> implementation in LLVM.
>
> This would be much more effective (and without the possible
> context-sensitive tradeoffs) than using type units. Though it'd possibly
> have a big tradeoff in link time and/or linker memory usage (I'm not sure
> how much dsymutil needs/uses of either).

+ Rui.

I think the current direction of LLD development is to avoid teaching the linker about things it naturally should not be aware of. Ideally it should work with sections as opaque pieces and should not know about their content. That is not always possible: for example, we have to look inside .eh_frame to deduplicate FDEs. But that is probably what we would want to avoid in general.

> It doesn't seem especially important to implement the DWARF5 types ->
> debug_info thing for this situation; the type units as they are (in
> debug_types) offer the same size benefits here. But sure, if anyone wanted
> to implement it at some point, that'd be fine.

But there is no .debug_types in DWARF5, so it is a deprecated approach as far as I understand.

> I think Paul covered some of the reasons type units might not be a
> reasonable default.
>
> One additional reason is that if you use Split DWARF (another great way to
> massively reduce the amount of debug info going to the linker), type units
> are mostly /just/ overhead in the .dwo files: since the debug info is not
> linked, there's no opportunity to remove the duplication anyway (unless
> you're making a DWP - like a dsym file).

Yeah. It looks like -gsplit-dwarf and -fdebug-types-section are harmful together. It is probably worth restricting their combined use or emitting a warning (both clang and gcc silently allow the combination, and the output has the size penalty you describe).

But then, does it make sense to emit multiple .debug_info sections with -gsplit-dwarf, so that objects contain a skeleton .debug_info plus .debug_info sections with type units as described in DWARF5, and the linker can deduplicate types at the section level as expected?

George.
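The warning suggested in this message could be sketched as a check over the driver arguments. This is purely hypothetical: per the message, neither clang nor gcc emits such a warning, and the function names and warning text below are invented. It assumes only that the last of -fdebug-types-section / -fno-debug-types-section on the command line takes effect.

```python
# Hypothetical driver-side check; neither clang nor gcc emits this warning.
# Function names and the warning text are invented for illustration.

def types_section_enabled(args):
    """Last of -fdebug-types-section / -fno-debug-types-section wins."""
    enabled = False
    for arg in args:
        if arg == "-fdebug-types-section":
            enabled = True
        elif arg == "-fno-debug-types-section":
            enabled = False
    return enabled


def check_debug_flags(args):
    """Return a list of warnings for flag combinations that cost size."""
    warnings = []
    if "-gsplit-dwarf" in args and types_section_enabled(args):
        warnings.append(
            "-fdebug-types-section with -gsplit-dwarf adds type unit "
            "overhead to .dwo files without enabling deduplication"
        )
    return warnings


print(check_debug_flags(["-g", "-gsplit-dwarf", "-fdebug-types-section"]))
print(check_debug_flags(["-g", "-fdebug-types-section"]))
```

A warning rather than a hard error would keep the combination usable for people who later build a DWP, where the duplication can still be removed.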