Manman Ren
2013-Nov-12 20:42 UTC
[LLVMdev] Debug info: type uniquing for C++ and the status on building clang with "-flto -g"
Hi All,

Type uniquing for C++ is in. Some data for Xalan with -flto -g:
  9.9MB raw dwarf size, peak memory usage at 2.8GB
  The raw dwarf size was 58MB, memory usage was 7GB back in May, 2013.
  Other efforts at size reduction helped, and type uniquing improved on top of those.

Data on building clang with "-flto -g" after type uniquing:
  3.4GB MDNodes after parsing all bc files, 7GB MDNodes after linking all bc files
  4.6GB DIEs
  4G MCContext
--> The memory usage is still too big.

So how to reduce the memory footprint at MDNode level:
1> Combine integers into MDString and further combining MDStrings (see PR17891)
   A partial implementation on the important debug info nodes can reduce the MDNodes from 7GB to 5.7GB
2> Release MDNodes that are only used by source modules (I will send out a proposal)
   An estimation based on partial implementation: this will reduce MDNodes from 5.7GB to 3.9GB

Thanks,
Manman
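Item 1> above can be illustrated with a toy model. This is not LLVM code and not the PR17891 implementation: the field names and the NUL-delimited encoding are assumptions made up for illustration. It only shows how several integer operands, each taking its own slot in a node, might be folded into a single string operand and recovered later.

```python
# Hypothetical integer fields of a debug-info record (names are assumptions).
FIELDS = ("line", "size_in_bits", "align_in_bits", "offset_in_bits", "flags")


def as_separate_operands(tag, values):
    """One operand slot for the tag and one per integer field."""
    return [tag] + [values[name] for name in FIELDS]


def as_packed_operand(tag, values):
    """Fold the tag and every integer field into one delimited string operand."""
    return ["\x00".join([tag] + [str(values[name]) for name in FIELDS])]


def unpack(operands):
    """Recover the tag and the integer fields from the packed form."""
    tag, *raw = operands[0].split("\x00")
    return tag, {name: int(text) for name, text in zip(FIELDS, raw)}


member = {"line": 42, "size_in_bits": 32, "align_in_bits": 32,
          "offset_in_bits": 64, "flags": 0}
separate = as_separate_operands("member", member)
packed = as_packed_operand("member", member)
```

In this model the record goes from six operand slots to one. Whether that translates into a real saving depends on how the string itself is stored and shared, which the sketch does not measure.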
David Blaikie
2013-Nov-12 21:01 UTC
[LLVMdev] Debug info: type uniquing for C++ and the status on building clang with "-flto -g"
Hi Manman,

Thanks for sending this summary and progress plans - it's great to see the impact your changes have had and ideas for future direction.

> Type uniquing for C++ is in. Some data for Xalan with -flto -g:
> 9.9MB raw dwarf size, peak memory usage at 2.8GB
> The raw dwarf size was 58MB, memory usage was 7GB back in May, 2013.
> Other efforts at size reduction helped, and type uniquing improved on top
> of those.
>
> Data on building clang with "-flto -g" after type uniquing:
> 3.4GB MDNodes after parsing all bc files, 7GB MDNodes after linking all
> bc files

What's the change between parsing and linking?

> 4.6GB DIEs

It seems like the DIEs are a substantial (more than the pre-linked, but post-parsed BC files) part of the footprint. I think it might be important to do the CU-at-a-time work sooner rather than later as I'm concerned about the design impact it will have on existing and future work (it's already going to substantially change the cross-CU-DIE references, potentially changing the cost/benefit of that feature since we cannot inject DIEs from later CUs into prior ones).

> 4G MCContext

What's the data in the MCContext that's relevant to debug info?

> --> The memory usage is still too big.

Do we have an idea of what size is "small enough"? It would be useful to have a goal.

> So how to reduce the memory footprint at MDNode level:
> 1> Combine integers into MDString and further combining MDStrings (see
> PR17891)
>    A partial implementation on the important debug info nodes can
> reduce the MDNodes from 7GB to 5.7GB

I think this'll be an interesting, and potentially valuable, change even in non-LTO cases, but not necessarily where I would start just now.

> 2> Release MDNodes that are only used by source modules (I will send out
> a proposal)
>    An estimation based on partial implementation: this will reduce
> MDNodes from 5.7GB to 3.9GB

I'll keep an eye out for your proposal, as I can't quite picture what you've got in mind from this brief description.

- David
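The trade-off behind the CU-at-a-time remark can be sketched with a small model. Everything here is an assumption for illustration: the function names, the DIE counts, and the idea that an "injection" is a request, made while processing one CU, to add a DIE to another CU. It is not how LLVM's DWARF emitter is written.

```python
def peak_all_at_once(die_counts):
    """Every CU's DIEs are resident before anything is emitted: peak is the sum."""
    return sum(die_counts)


def peak_cu_at_a_time(die_counts):
    """Each CU's DIEs are built, emitted, and released in turn: peak is the largest CU."""
    return max(die_counts, default=0)


def split_injections(cu_order, injections):
    """Classify (processing_cu, target_cu) injection requests.

    A request is allowed only if the target CU has not been emitted yet, that
    is, the target is the CU being processed or a later one.
    """
    position = {cu: index for index, cu in enumerate(cu_order)}
    allowed = [(src, dst) for src, dst in injections if position[dst] >= position[src]]
    rejected = [(src, dst) for src, dst in injections if position[dst] < position[src]]
    return allowed, rejected


die_counts = [5, 9, 3]
allowed, rejected = split_injections(
    ["a", "b", "c"], [("b", "a"), ("a", "b"), ("c", "c")]
)
```

The model shows both sides of the concern: resident DIEs drop from the sum of all CUs to the largest single CU, while any request targeting an already-emitted CU (here `("b", "a")`) can no longer be honored.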
Manman Ren
2013-Nov-12 22:08 UTC
[LLVMdev] Debug info: type uniquing for C++ and the status on building clang with "-flto -g"
On Tue, Nov 12, 2013 at 1:01 PM, David Blaikie <dblaikie at gmail.com> wrote:

> Hi Manman,
>
> Thanks for sending this summary and progress plans - it's great to see the
> impact your changes have had and ideas for future direction.
>
>> Type uniquing for C++ is in. Some data for Xalan with -flto -g:
>> 9.9MB raw dwarf size, peak memory usage at 2.8GB
>> The raw dwarf size was 58MB, memory usage was 7GB back in May, 2013.
>> Other efforts at size reduction helped, and type uniquing improved on top
>> of those.
>>
>> Data on building clang with "-flto -g" after type uniquing:
>> 3.4GB MDNodes after parsing all bc files, 7GB MDNodes after linking all
>> bc files
>
> What's the change between parsing and linking?

Parsing means reading in all bc files to source modules. Linking means linking in the source modules to the destination module. Extra MDNodes can be generated for the destination module.

>> 4.6GB DIEs
>
> It seems like the DIEs are a substantial (more than the pre-linked, but
> post-parsed BC files) part of the footprint. I think it might be important
> to do the CU-at-a-time work sooner rather than later as I'm concerned about
> the design impact it will have on existing and future work (it's already
> going to substantially change the cross-CU-DIE references, potentially
> changing the cost/benefit of that feature since we cannot inject DIEs from
> later CUs into prior ones).
>
>> 4G MCContext
>
> What's the data in the MCContext that's relevant to debug info?

One data point on "Xalan": without -g, MCContext allocates 45MB, with -g, MCContext allocates 286MB.

>> --> The memory usage is still too big.
>
> Do we have an idea of what size is "small enough"? It would be useful to
> have a goal.
>
>> So how to reduce the memory footprint at MDNode level:
>> 1> Combine integers into MDString and further combining MDStrings (see
>> PR17891)
>>    A partial implementation on the important debug info nodes can
>> reduce the MDNodes from 7GB to 5.7GB
>
> I think this'll be an interesting, and potentially valuable, change even
> in non-LTO cases, but not necessarily where I would start just now.
>
>> 2> Release MDNodes that are only used by source modules (I will send
>> out a proposal)
>>    An estimation based on partial implementation: this will reduce
>> MDNodes from 5.7GB to 3.9GB
>
> I'll keep an eye out for your proposal, as I can't quite picture what
> you've got in mind from this brief description.

Yes, I plan to send out the proposal today or tomorrow.

Manman

> - David
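The parsing/linking distinction and the idea of releasing source-module nodes can be modeled roughly as follows. This is a sketch under stated assumptions, not the proposal itself: each module is a plain dict from a node id to a payload, ids shared between modules stand in for uniqued nodes, and "releasing" simply means the source module's nodes stop counting as live once that module has been linked.

```python
def link(source_modules, release_sources):
    """Link source modules into one destination and report peak live nodes.

    All source modules are assumed to be parsed up front. The destination
    receives its own entry for every distinct node id. With
    release_sources=True, a source module's nodes are dropped right after
    that module has been linked.
    """
    live_source = sum(len(module) for module in source_modules)
    destination = {}
    peak = live_source  # state after parsing, before any linking
    for module in source_modules:
        for node_id, payload in module.items():
            destination.setdefault(node_id, payload)
        # Source and destination copies coexist while this module is linked.
        peak = max(peak, live_source + len(destination))
        if release_sources:
            live_source -= len(module)
    return destination, peak


modules = [{"A": 1, "B": 2}, {"B": 2, "C": 3}, {"D": 4}]
kept, peak_kept = link(modules, release_sources=False)
freed, peak_freed = link(modules, release_sources=True)
```

Both runs produce the same destination; only the peak differs. The toy numbers say nothing about the real 5.7GB to 3.9GB estimate, which comes from the partial implementation mentioned in the thread.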
Sean Silva
2013-Nov-13 02:54 UTC
[LLVMdev] Debug info: type uniquing for C++ and the status on building clang with "-flto -g"
On Tue, Nov 12, 2013 at 3:42 PM, Manman Ren <manman.ren at gmail.com> wrote:

> Hi All,
>
> Type uniquing for C++ is in. Some data for Xalan with -flto -g:
> 9.9MB raw dwarf size, peak memory usage at 2.8GB
> The raw dwarf size was 58MB, memory usage was 7GB back in May, 2013.
> Other efforts at size reduction helped, and type uniquing improved on top
> of those.
>
> Data on building clang with "-flto -g" after type uniquing:
> 3.4GB MDNodes after parsing all bc files, 7GB MDNodes after linking all
> bc files
> 4.6GB DIEs
> 4G MCContext
> --> The memory usage is still too big.

What fraction of the memory space occupied by MDNodes is just pointers? (IIRC our whole scheme for metadata is the epitome of "sea of linked nodes").

Do you have any statistics of how often the flexibility offered by links is used? (e.g. how often the links are changed). If huge swaths of these nodes are read-mostly, then it may be much more efficient to use a representation where the links are implicit.

More generally, can you gather some statistics about the relative distribution of different operations on MDNodes that we do? (what is the most called method? are there a couple of methods that account for >90% of calls? How often do we mutate this data? etc.)

-- Sean Silva

> So how to reduce the memory footprint at MDNode level:
> 1> Combine integers into MDString and further combining MDStrings (see
> PR17891)
>    A partial implementation on the important debug info nodes can
> reduce the MDNodes from 7GB to 5.7GB
> 2> Release MDNodes that are only used by source modules (I will send out
> a proposal)
>    An estimation based on partial implementation: this will reduce
> MDNodes from 5.7GB to 3.9GB
>
> Thanks,
> Manman
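One possible reading of "a representation where the links are implicit" is a flat, pre-order layout in which each record stores only a label and a child count, so parent/child relationships follow from position rather than from stored pointers. The sketch below assumes tree-shaped, read-mostly data and uses made-up labels; shared or cyclic references, which real metadata has, would still need explicit links.

```python
def flatten(tree):
    """Serialize a (label, [children]) tree into pre-order (label, child_count) records."""
    label, children = tree
    records = [(label, len(children))]
    for child in children:
        records.extend(flatten(child))
    return records


def rebuild(records, index=0):
    """Rebuild the nested tree from flat records; returns (tree, next_index)."""
    label, child_count = records[index]
    index += 1
    children = []
    for _ in range(child_count):
        child, index = rebuild(records, index)
        children.append(child)
    return (label, children), index


tree = ("cu", [("class", [("member", []), ("method", [])]), ("func", [])])
flat = flatten(tree)
restored, consumed = rebuild(flat)
```

The flat form carries no per-child pointers at all, at the cost of making insertion in the middle expensive, which is why the mutation statistics asked for above would matter before choosing such a layout.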