Eric Christopher via llvm-dev
2016-Feb-06 01:53 UTC
[llvm-dev] Reducing DWARF emitter memory consumption
On Fri, Feb 5, 2016 at 5:51 PM Mehdi Amini via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> > On Feb 5, 2016, at 5:40 PM, Peter Collingbourne <peter at pcc.me.uk> wrote:
> >
> > On Fri, Feb 05, 2016 at 04:58:45PM -0800, Mehdi Amini wrote:
> >>
> >>> On Feb 5, 2016, at 3:17 PM, Peter Collingbourne via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> >>>
> >>> Hi all,
> >>>
> >>> We have profiled [1] the memory usage in LLVM when LTO'ing Chromium, and
> >>> we've found that one of the top consumers of memory is the DWARF emitter in
> >>> lib/CodeGen/AsmPrinter/Dwarf*.
> >>
> >> I'm staring at the profile attached to the post #15 on the link you posted, can you confirm that the Dwarf emitter accounts for 6.7%+15.6%=22.3% of the total allocated memory?
> >> If I understand correctly the numbers, this does not tell anything about how much the Dwarf emitter accounts on the *peak memory* usage (could be more, could be nothing...).
> >
> > I think these nodes represent allocations from the DWARF emitter:
> >
> > DwarfDebug::DwarfDebug     9.5%
> > DwarfDebug::endFunction   15.6%
> > DIEValueList::addValue     9.1%
> > total                     34.2%
> >
> > I believe they are totals, but my reading of the code is that the DWARF
> > emitter does not deallocate its memory until the end of code generation,
>
> That's sad :(
>
> > so total ~= peak in this case.
>
> Assuming the peak occurs during CodeGen (which is what I see on my profile), that sounds pretty reasonable!
>
> Thanks for the information (and the work!).
>
> Another question I have, is how worse the split codegen make the situation? Naively there will be a lot of redundancy in the split modules, for ThinLTO Teresa has to proceed with care to limit the amount of duplication.

Hmm? Can you reword this slightly? I'm not sure what you're asking here.

-eric

> Mehdi
>
> > I am not surprised by these figures -- see e.g. DIEValueList::Node which in
> > the worst case can use up to 24 bytes on a 1-byte DWARF attribute record.
> >
> > Ivan was the person who collected the numbers, he may be able to comment more.
> >
> > Thanks,
> > --
> > Peter
Mehdi Amini via llvm-dev
2016-Feb-06 01:56 UTC
[llvm-dev] Reducing DWARF emitter memory consumption
> On Feb 5, 2016, at 5:53 PM, Eric Christopher <echristo at gmail.com> wrote:
>
> On Fri, Feb 5, 2016 at 5:51 PM Mehdi Amini via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> > Another question I have, is how worse the split codegen make the situation? Naively there will be a lot of redundancy in the split modules, for ThinLTO Teresa has to proceed with care to limit the amount of duplication.
>
> Hmm? Can you reword this slightly? I'm not sure what you're asking here.

The parallel split codegen will take the big LTO module with all the debug info and produce multiple modules.
When splitting into multiple modules, you may have functions from the same DICompileUnit ending up in multiple modules. All the retained types would be pulled in.
(This is assuming you are already taking care of not pulling in the DICompileUnit when no function referencing it is in the split module.)
Then each thread would do redundant work processing this type hierarchy (and other debug info).

For ThinLTO, Teresa is taking care (review waiting here: http://reviews.llvm.org/D16440 ) to try to import as little as possible, and to turn type definitions into declarations when possible.

--
Mehdi
David Blaikie via llvm-dev
2016-Feb-06 02:02 UTC
[llvm-dev] Reducing DWARF emitter memory consumption
On Fri, Feb 5, 2016 at 5:56 PM, Mehdi Amini via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> For ThinLTO, Teresa is taking care (review waiting here:
> http://reviews.llvm.org/D16440 ) to try to import as little as possible,
> and turn type definition into declaration when possible.

Right - I don't think we'd ever need to import a definition - just rely on the fact that we will produce a type definition somewhere in the output. (This may present problems for LLDB - it's certainly had issues with type declarations appearing where it would expect a definition, eg: a type that inherits from a declaration instead of a definition. Not sure if that problem extends to the case of by-value function parameters.)

So the impact of that cross-module importing should be pretty low for ThinLTO. But any work Peter does should benefit ThinLTO equally, since it still has to emit the types, build all the DIEs, etc.

- Dave
Eric Christopher via llvm-dev
2016-Feb-06 02:16 UTC
[llvm-dev] Reducing DWARF emitter memory consumption
On Fri, Feb 5, 2016 at 5:56 PM Mehdi Amini <mehdi.amini at apple.com> wrote:

> The parallel split codegen will take the big LTO module with all the debug
> info and produce multiple modules.
> When splitting in multiple modules, you may have functions from the same
> DICompileUnit ending up in multiple modules. All the retained types would
> be pulled in.
> (this is assuming you are already taking care of not pulling the
> DICompileUnit when no functions referencing it is in the split module).
> Then each thread would do redundant work processing this type hierarchy
> (and other debug info).

Right. I think, in general, that the code generation profile we're seeing here is going to be the same no matter the compilation mode, but merely compounded depending on how much debug info there is ultimately in the IR/etc.

-eric