Vedant Kumar via llvm-dev
2020-Jan-10 20:57 UTC
[llvm-dev] Increasing address pool reuse/reducing .o file size in DWARFv5
I don't totally follow the proposed encoding change & would appreciate a small example.

Is the idea to replace e.g. an 'AT_low_pc (<direct address>) + relocation for <direct address>' with an 'AT_low_pc (<indirection into a pool of addresses> + offset)', s.t. the cost of a relocation for the address is paid down the more it's used?

How do you figure the offset out?

thanks,
vedant

> On Jan 8, 2020, at 1:33 PM, David Blaikie via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Sounds good all round - I'll commit these two modes, and maybe even the third (given Sony's interest & possible interest in changing their consumer to handle it) of a custom form to eek out the last few bytes from the more direct addr+offset encoding.
>
> I'll follow up here with flag names and revision numbers once they're in.
>
> On Wed, Jan 8, 2020 at 1:26 PM Robinson, Paul <paul.robinson at sony.com> wrote:
> > On some previous occasion that introduced additional indirection (don't remember the details) my debugger people groused about the additional performance cost of chasing down data in a different object-file section. So we (Sony) might be happier with low_pc as expressions, than with a ranges-always solution.
> >
> > But hard to say without data, and getting both modes in at least as a temporary thing sounds like a good plan.
> > --paulr
> >
> > > -----Original Message-----
> > > From: aprantl at apple.com <aprantl at apple.com>
> > > Sent: Wednesday, January 8, 2020 1:49 PM
> > > To: David Blaikie <dblaikie at gmail.com>
> > > Cc: llvm-dev <llvm-dev at lists.llvm.org>; Jonas Devlieghere <jdevlieghere at apple.com>; Robinson, Paul <paul.robinson at sony.com>; Eric Christopher <echristo at gmail.com>; Frederic Riss <friss at apple.com>
> > > Subject: Re: Increasing address pool reuse/reducing .o file size in DWARFv5
> > >
> > > I think this sounds like a good plan for Linux. I would like to see the numbers for Darwin (= non-split DWARF) to decide whether we should just make that the default. Eric's suggestion of having this committed as an option first seems like a good step in that direction. If it is an advantage across the board we can remove the option and just make this the default behavior.
> > >
> > > thanks,
> > > adrian
> > >
> > > > On Dec 30, 2019, at 12:08 PM, David Blaikie <dblaikie at gmail.com> wrote:
> > > >
> > > > tl;dr: in DWARFv5, using DW_AT_ranges even when the range is contiguous reduces linked, uncompressed debug_addr size for optimized builds by 93% and reduces total .o file size (with compression and split) by 15%. It does grow .dwo file size a bit - DWARFv5, no compression, not split shows the net effect if all bytes are equal: -O3 clang binary grows by 0.4%, -O0 clang binary shrinks by 0.1%
> > > >
> > > > Should we enable this strategy by default for DWARFv5, for DWARFv5+Split DWARF, or not by default at all/only under a flag?
> > > >
> > > > So, I've brought this up a few times before - that DWARFv5 does a pretty good job of reducing relocations (& reducing .o file size with Split DWARF) by allowing many uses of addresses to include some kind of address+offset (debug_rnglists and loclists allowing "base_address" then offset_pairs (an improvement over similar functionality in DWARFv4 because the offset pairs can be uleb encoded - so they can be quite compact))
> > > >
> > > > But one place that DWARFv5 misses to reduce relocations further is direct addresses from debug_info, such as DW_AT_low_pc.
> > > >
> > > > For a while I've wondered if we could use an extension form for addr+offset, and I prototyped this without an extension attribute, but instead using exprloc. This has slightly higher overhead to express the... expression. (it's 9 bytes in total, could be as few as 5 with a custom form)
> > > >
> > > > But I had another idea that's more instantly deployable: Why not use DW_AT_ranges even when the range is contiguous? That way the low_pc that previously couldn't use an existing address pool entry + offset, could use the rnglist support for base address.
> > > >
> > > > The only unnecessary address pool entries that remain that I've found are DW_AT_low_pc for DW_TAG_labels - but there's only a handful of those in most code. So the "ranges everywhere" strategy gets the addresses for optimized clang down from 4758 (v4 address pool used 9923 addresses... ) to 342, with about ~4 "extra" addresses for DW_TAG_labels.
> > > >
> > > > This could also be a bit less costly if DWARFv5 rnglists didn't use a separate offset table (instead encoding the offsets directly in debug_info, rather than using indexes)
> > > >
> > > > I have patches for both the addr+offset exprloc and for the ranges-always, both with -mllvm flags - do people think they're both worth committing for experimentation? Neither? Default on in some cases (like Split DWARF)?
> > > >
> > > > Thanks,
> > > > - Dave
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
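Since a small example was asked for, here is a toy model in Python of the "ranges everywhere" idea from the quoted Dec 30 message. It is only a sketch under stated assumptions: the `Func` record, the section names and the function names are invented for illustration, the base address of each section is assumed to be the lowest function start in it, and the code counts address pool entries rather than emitting real DWARF.

```python
from typing import NamedTuple


class Func(NamedTuple):
    name: str
    section: str  # section (or MachO subsection) that holds the code
    start: int    # offset of the function inside its section
    size: int     # length of the function in bytes


def pool_with_low_pc(funcs):
    """One pool entry per distinct function start (low_pc as a pool index)."""
    return sorted({(f.section, f.start) for f in funcs})


def pool_with_ranges_always(funcs):
    """One pool entry per section; each function is (base index, start, end)."""
    bases = {}
    for f in funcs:
        bases[f.section] = min(bases.get(f.section, f.start), f.start)
    pool = sorted(bases.items())
    index = {entry: i for i, entry in enumerate(pool)}
    rnglists = {}
    for f in funcs:
        base = bases[f.section]
        rnglists[f.name] = (
            index[(f.section, base)],   # which pool entry is the base address
            f.start - base,             # start offset from the base
            f.start + f.size - base,    # end offset from the base
        )
    return pool, rnglists


# Hypothetical layout: three functions share .text, one lives elsewhere.
FUNCS = [
    Func("f1", ".text", 0, 32),
    Func("f2", ".text", 32, 48),
    Func("f3", ".text", 80, 16),
    Func("g", ".text.startup", 0, 24),
]

if __name__ == "__main__":
    print("low_pc pool entries:", len(pool_with_low_pc(FUNCS)))
    pool, rnglists = pool_with_ranges_always(FUNCS)
    print("ranges-always pool entries:", len(pool))
    for name, entry in rnglists.items():
        print(name, entry)
```

In this model every pool entry stands for one relocation in the .o file, so the saving grows with the number of functions that share a section.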
David Blaikie via llvm-dev
2020-Jan-12 19:43 UTC
[llvm-dev] Increasing address pool reuse/reducing .o file size in DWARFv5
On Fri, Jan 10, 2020 at 12:57 PM Vedant Kumar <vedant_kumar at apple.com> wrote:

> I don't totally follow the proposed encoding change & would appreciate a small example.
>
> Is the idea to replace e.g. an 'AT_low_pc (<direct address>) + relocation for <direct address>' with an 'AT_low_pc (<indirection into a pool of addresses> + offset)',

With Split DWARF or with DWARFv5 in LLVM at the moment, all addresses are indirected already. So it's:

Replace "AT_low_pc (<indirection into a pool of addresses>)" with an "AT_low_pc (<indirection into a pool of addresses> + offset)".

> s.t. the cost of a relocation for the address is paid down the more it's used?

Right - specifically to reduce the pool of addresses down to, ideally, one address per section/indivisible chunk of machine code (per subsection in MachO, for instance), whereas currently there are many addresses per section.

> How do you figure the offset out?

Label difference - same as is done for DW_AT_high_pc today in DWARFv4 and DWARFv5 in LLVM. high_pc currently uses the low_pc address as the thing to be relative to; in this proposed situation, we'd use a symbol that's in the first bit of debug info in the section (or subsection in MachO). So the low_pc of the subprogram/function, for instance, or if there are two functions in the same section with debug info for both, the low_pc of the first of those functions, etc...

> thanks,
> vedant
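The "label difference" answer above can be sketched in a few lines of Python. This is an illustration under assumptions, not LLVM's implementation: the label table and the names `func_a`, `func_b` and `func_c` are hypothetical, and the only point is that two labels in the same section differ by a constant that is known at assembly time, so the offset needs no relocation and is small when ULEB128 encoded.

```python
def uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("ULEB128 cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


# Hypothetical labels: name -> (section, offset inside that section).
LABELS = {
    "func_a": (".text", 0x00),       # first function with debug info in .text
    "func_b": (".text", 0x40),
    "func_c": (".text.cold", 0x00),
}


def label_difference(target: str, base: str, labels=LABELS) -> int:
    """Return target - base; only a constant when both share a section."""
    target_section, target_offset = labels[target]
    base_section, base_offset = labels[base]
    if target_section != base_section:
        raise ValueError("labels are in different sections; not a constant")
    return target_offset - base_offset


if __name__ == "__main__":
    offset = label_difference("func_b", "func_a")
    print(hex(offset), uleb128(offset).hex())
```

Here `func_a` plays the role of the one pool entry for `.text`, and `func_b` is described as that entry plus a one-byte offset.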
Vedant Kumar via llvm-dev
2020-Jan-13 17:03 UTC
[llvm-dev] Increasing address pool reuse/reducing .o file size in DWARFv5
I think I get it now, thanks for explaining!

> On Jan 12, 2020, at 11:44 AM, David Blaikie via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> > On Fri, Jan 10, 2020 at 12:57 PM Vedant Kumar <vedant_kumar at apple.com> wrote:
> > I don't totally follow the proposed encoding change & would appreciate a small example.
> >
> > Is the idea to replace e.g. an 'AT_low_pc (<direct address>) + relocation for <direct address>' with an 'AT_low_pc (<indirection into a pool of addresses> + offset)',
>
> With Split DWARF or with DWARFv5 in LLVM at the moment, all addresses are indirected already. So it's:
>
> Replace "AT_low_pc (<indirection into a pool of addresses>)" with an "AT_low_pc (<indirection into a pool of addresses> + offset)".
>
> > s.t. the cost of a relocation for the address is paid down the more it's used?
>
> Right - specifically to reduce the pool of addresses down to, ideally, one address per section/indivisible chunk of machine code (per subsection in MachO, for instance) (whereas currently there are many addresses per section)
>
> > How do you figure the offset out?
>
> Label difference - same as is done for DW_AT_high_pc today in DWARFv4 and DWARFv5 in LLVM. high_pc currently uses the low_pc addresse to be relative to, in this proposed situation, we'd use a symbol that's in the first bit of debug info in the section (or subsection in MachO). So the low_pc of the subprogram/function, for instance, or if there are two functions in the same section with debug info for both, the low_pc of the first of those functions, etc...

If the label difference in a low_pc attribute is relative to the start of a section, could a linker orderfile pass break the DWARF unless it updates the offset? Ditto, I suppose, for an intra-function offset when something like Propeller is used to reorder basic blocks (I’m thinking of AT_call_return_pc now).

Apologies if this has been answered elsewhere; I suppose there must be a solution for this for AT_high_pc to work.
vedant
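The reordering question can be framed with a small Python sketch. It does not describe any particular linker, and the section names, addresses and the `record` layout are hypothetical. It only shows the general property the question hinges on: if a chunk of code is moved as a unit and the single base address is relocated, offsets stored relative to that base stay valid.

```python
def resolve(layout, section, offset):
    """Final address = where the section was placed + offset inside it."""
    return layout[section] + offset


def function_range(layout, record):
    """Resolve a (base section, start offset, end offset) record."""
    base = resolve(layout, record["base_section"], 0)  # the relocated entry
    return base + record["start_off"], base + record["end_off"]


# Hypothetical debug-info record written by the compiler: offsets are
# relative to the start of the section that holds the function.
RECORD = {"base_section": ".text.g", "start_off": 0x10, "end_off": 0x30}

# Two hypothetical link layouts: the second swaps the order of the sections.
LAYOUT_DEFAULT = {".text.f": 0x1000, ".text.g": 0x2000}
LAYOUT_REORDERED = {".text.g": 0x1000, ".text.f": 0x1800}

if __name__ == "__main__":
    for name, layout in [("default", LAYOUT_DEFAULT),
                         ("reordered", LAYOUT_REORDERED)]:
        low, high = function_range(layout, RECORD)
        print(name, hex(low), hex(high), "size", hex(high - low))
```

The sketch says nothing about tools that rearrange code inside such a chunk after the offsets have been computed; that case is not covered by this argument, and it is the case the question above is asking about.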