Diego Novillo via llvm-dev
2015-Dec-09 15:52 UTC
[llvm-dev] Memory utilization problems in profile reader
I've been experimenting with profiled bootstraps using sample profiles. Initially, I made stage2 build stage3 while running under Perf. This produced a 20 GB profile which took too long to convert to LLVM, and used ~30 GB of RAM. So, I decided that this was not going to be very useful for general usage.

I then changed the bootstrap to instead run each individual compile under Perf. This produced ~2,200 profiles, each of which took up to 1 minute to convert, and then they all have to be merged into a single profile. I didn't like that either.

Since all compiles are more or less the same in terms of what the compiler does, I decided to take the 10 biggest profiles and merge those. That seemed to work. This resulted in a 21 MB profile that I could use as input to -fprofile-sample-use.

I started stage 3 of the bootstrap and left it to work. I noticed it was slow, so I thought "we'll need to speed things up". The build never finished. Instead, ninja crashed my machine.

It turns out that each clang invocation was growing to 4 GB of RSS. All that memory is being allocated by the profile reader ( https://drive.google.com/file/d/0B9lq1VKvmXKFQVp1cGtZM2RSdWc/view?usp=sharing ).

So, heads up, we need to trim it down. Perhaps by loading only one function profile at a time, using it and actively discarding it. Or simply by being better at flushing the reader data structures as they're used during annotation. I'll be sending patches about this in the coming days.

It's likely that the sample reader is doing something silly here. Duncan, Justin, do you have memories of issues like this one with instrumentation? I'll be trying a similar experiment with it after I'm done with the biggest issues in the sampler.

Thanks. Diego.
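Editorial sketch of the "load one function profile at a time, use it and discard it" idea from the message above. This is not the LLVM sample reader. The `FunctionProfile` type, the one-line-per-function text format, and the names `parseRecord`, `loadAll` and `streamEach` are all made up for illustration. It only contrasts an eager reader, which keeps every record alive, with a streaming one, which keeps a single record alive.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record: the samples collected for one function.
struct FunctionProfile {
  std::string Name;
  std::vector<uint64_t> Samples;
};

// Parse one line of an invented format: "<name> <count> <count> ...".
// Returns false for a blank line.
bool parseRecord(const std::string &Line, FunctionProfile &Out) {
  std::istringstream SS(Line);
  Out.Samples.clear();
  if (!(SS >> Out.Name))
    return false;
  uint64_t V;
  while (SS >> V)
    Out.Samples.push_back(V);
  return true;
}

// Eager reader: every record stays in memory until the map is destroyed,
// so memory grows with the number of functions in the profile.
std::map<std::string, FunctionProfile> loadAll(std::istream &In) {
  std::map<std::string, FunctionProfile> All;
  std::string Line;
  FunctionProfile Rec;
  while (std::getline(In, Line))
    if (parseRecord(Line, Rec))
      All[Rec.Name] = Rec;
  return All;
}

// Streaming reader: one record is parsed, handed to the caller, and then
// overwritten by the next one. Returns the number of records visited.
std::size_t
streamEach(std::istream &In,
           const std::function<void(const FunctionProfile &)> &Use) {
  assert(Use && "callback must be callable");
  std::size_t Count = 0;
  std::string Line;
  FunctionProfile Rec; // reused for every record
  while (std::getline(In, Line)) {
    if (!parseRecord(Line, Rec))
      continue;
    Use(Rec);
    ++Count;
  }
  return Count;
}
```

The streaming form only works if the consumer can process functions in file order. A consumer that needs lookup by name would need an index over the file as well, which this sketch does not attempt.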
Xinliang David Li via llvm-dev
2015-Dec-09 20:14 UTC
[llvm-dev] Memory utilization problems in profile reader
Can you extract the relevant part of the heap profile data? How large is the sample profile data fed to the compiler?

The indexed format profile size for clang is <100MB. The InstrProfRecord for each function is read, used and discarded one at a time, so there should not be a problem as described.

David
Diego Novillo via llvm-dev
2015-Dec-10 20:56 UTC
[llvm-dev] Memory utilization problems in profile reader
On Wed, Dec 9, 2015 at 3:14 PM, Xinliang David Li <xinliangli at gmail.com> wrote:

> Can you extract the relevant part of the heap profile data?

It's all profile data, actually. The heap utilization is massively dominated by the profile reader.

> How large is the sample profile data fed to the compiler?

For this run, the input file was 21 MB.

> The indexed format profile size for clang is <100MB. The InstrProfRecord
> for each function is read, used and discarded one at a time, so there
> should not be problem as described.

Good.
Sean Silva via llvm-dev
2015-Dec-12 00:48 UTC
[llvm-dev] Memory utilization problems in profile reader
On Wed, Dec 9, 2015 at 12:14 PM, Xinliang David Li via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> Can you extract the relevant part of the heap profile data? How large is
> the sample profile data fed to the compiler?
>
> The indexed format profile size for clang is <100MB. The InstrProfRecord
> for each function is read, used and discarded one at a time, so there
> should not be problem as described.

If I'm reading the code right, we are also doing O(keys of the hash table) memory allocation in the indexed reader here: http://llvm.org/docs/doxygen/html/classllvm_1_1InstrProfReaderIndex.html#acc49fd2c0a8c8dfc3e29b01e09869af7 ? That seems unnecessary. (It seems to be used for value profiling stuff for some reason?)

-- Sean Silva