Bob Wilson
2014-Mar-13 15:51 UTC
[LLVMdev] RFC: Binary format for instrumentation based profiling data
On Mar 13, 2014, at 5:48 AM, Diego Novillo <dnovillo at google.com> wrote:

> On Wed, Mar 12, 2014 at 9:09 PM, Justin Bogner <mail at justinbogner.com> wrote:
>
>> Functions are represented by strings, determined by the part of the
>> frontend that both generates and uses this data. In our case, these are
>> generally whatever clang thinks of as the function's name, with minor
>> details added to disambiguate names that aren't necessarily unique.
>
> Why not just use mangled names here? No need to add minor details, nor
> ad-hoc disambiguation. If you're already using mangled names, then I'm
> not sure why we need disambiguating details.

We need to prepend the file name to distinguish functions with local linkage. Also, Objective-C methods do not get mangled, so we don’t want to say that we just use the mangled names.

>> The counter data is simply an array of unsigned 64 bit values. Given an
>> offset found in the index, a sequence follows:
>>
>> <function hash> <number of counters> <counters...>
>>
>> This is all of the data needed for a given function.
>
> How are counters represented? Are these line numbers together with the
> counter? Basic blocks? Edges?

There are no line numbers, basic blocks, or edges. It is just a sequence of counters that the front-end knows how to map to the code (the same as with our current textual file format).

> I wonder if it would make sense to use the existing gcov format for
> this. OTOH, we could provide a converter in the Profile library.

This is pretty clearly a different format from gcov, and I don’t see how we could convert between them. But, I agree that it would be nice to collect the code for handling different kinds of profile formats in one library, even if those file formats are not interchangeable.
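As a rough illustration of the per-function record quoted above (`<function hash> <number of counters> <counters...>`, all unsigned 64-bit values), here is a minimal Python sketch. The little-endian byte order, the byte-based offset, and the helper names `write_function_record` and `read_function_record` are assumptions made for the example; the thread does not specify them.

```python
import struct

U64_SIZE = 8  # every field in the record is an unsigned 64-bit value


def write_function_record(func_hash, counters):
    """Serialize <function hash> <number of counters> <counters...>.

    Little-endian encoding is an assumption of this sketch.
    """
    return struct.pack(f"<QQ{len(counters)}Q", func_hash, len(counters), *counters)


def read_function_record(data, offset):
    """Read one record starting at a byte offset (as an index might supply)."""
    func_hash, num_counters = struct.unpack_from("<QQ", data, offset)
    counters = struct.unpack_from(f"<{num_counters}Q", data, offset + 2 * U64_SIZE)
    return func_hash, list(counters)


if __name__ == "__main__":
    # Eight bytes of padding stand in for whatever precedes the record.
    blob = b"\x00" * 8 + write_function_record(0xABCD, [5, 0, 12])
    print(read_function_record(blob, 8))
```

Note that the record carries no line numbers, basic blocks, or edges: the reader gets back only a hash and a flat list of counts, and mapping those counts onto code is left to the front-end.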
Justin Bogner
2014-Mar-13 16:48 UTC
[LLVMdev] RFC: Binary format for instrumentation based profiling data
Bob Wilson <bob.wilson at apple.com> writes:

> On Mar 13, 2014, at 5:48 AM, Diego Novillo <dnovillo at google.com> wrote:
>
>> On Wed, Mar 12, 2014 at 9:09 PM, Justin Bogner <mail at justinbogner.com> wrote:
>>
>>> Functions are represented by strings, determined by the part of the
>>> frontend that both generates and uses this data. In our case, these are
>>> generally whatever clang thinks of as the function's name, with minor
>>> details added to disambiguate names that aren't necessarily unique.
>>
>> Why not just use mangled names here? No need to add minor details, nor
>> ad-hoc disambiguation. If you're already using mangled names, then I'm
>> not sure why we need disambiguating details.
>
> We need to prepend the file name to distinguish functions with local
> linkage. Also, Objective-C methods do not get mangled, so we don’t
> want to say that we just use the mangled names.

Notably, we do use mangled names where we can, and we only add anything in cases where it isn't enough.

>>> The counter data is simply an array of unsigned 64 bit values. Given an
>>> offset found in the index, a sequence follows:
>>>
>>> <function hash> <number of counters> <counters...>
>>>
>>> This is all of the data needed for a given function.
>>
>> How are counters represented? Are these line numbers together with the
>> counter? Basic blocks? Edges?
>
> There are no line numbers, basic blocks, or edges. It is just a
> sequence of counters that the front-end knows how to map to the code
> (the same as with our current textual file format).
>
>> I wonder if it would make sense to use the existing gcov format for
>> this. OTOH, we could provide a converter in the Profile library.
>
> This is pretty clearly a different format from gcov, and I don’t see
> how we could convert between them. But, I agree that it would be nice
> to collect the code for handling different kinds of profile formats in
> one library, even if those file formats are not interchangeable.
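The naming rule discussed here (use the mangled name where it is enough, and prepend the file name for functions with local linkage) can be sketched as follows. The function `profile_function_name`, its parameters, and the `:` separator are hypothetical choices for illustration only; the thread does not state the exact string format.

```python
def profile_function_name(mangled_name, source_file, has_local_linkage):
    """Return the string used to key a function's profile data.

    Assumption: a ':' separator joins the file name and the function name.
    """
    if has_local_linkage:
        # Two static functions with the same name in different files would
        # otherwise collide, so the file name is prepended.
        return f"{source_file}:{mangled_name}"
    # Externally visible functions: the mangled name is already unique.
    return mangled_name


if __name__ == "__main__":
    print(profile_function_name("helper", "a.c", True))
    print(profile_function_name("helper", "b.c", True))
    print(profile_function_name("_Z3foov", "a.cpp", False))
```

Objective-C methods, which are not mangled, would need their own case; this sketch leaves them out.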
Diego Novillo
2014-Mar-13 21:14 UTC
[LLVMdev] RFC: Binary format for instrumentation based profiling data
On Thu, Mar 13, 2014 at 11:51 AM, Bob Wilson <bob.wilson at apple.com> wrote:

> On Mar 13, 2014, at 5:48 AM, Diego Novillo <dnovillo at google.com> wrote:
>>
>> How are counters represented? Are these line numbers together with the
>> counter? Basic blocks? Edges?
>
> There are no line numbers, basic blocks, or edges. It is just a sequence of counters that the front-end knows how to map to the code (the same as with our current textual file format).

Sorry, you lost me. How exactly does the FE map them to the code? In the sample profiler, each instrumented line consists of a line offset, a discriminator (to distinguish distinct control flow paths on the same line) and the counter. We match them by computing the absolute line number from the offset and assign the counter to the corresponding basic block.

I think we should be able to use the same pass in lib/Transforms/Scalar/SampleProfile.cpp to read profiles generated from instrumentation. The information is basically the same, so a bit of generalization of that code should be all we need to pass those counters down into the analysis module.

Diego.
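For comparison, the sample-profile matching described in this message (line offset, discriminator, counter, resolved to an absolute line) might look roughly like the sketch below. It assumes the offset is relative to the function's first source line; `SampleEntry` and `resolve_samples` are hypothetical names, not taken from SampleProfile.cpp.

```python
from typing import NamedTuple


class SampleEntry(NamedTuple):
    line_offset: int    # assumed to be relative to the function's first line
    discriminator: int  # separates control flow paths on the same line
    count: int


def resolve_samples(function_start_line, entries):
    """Map (absolute line, discriminator) to its sample count."""
    return {
        (function_start_line + e.line_offset, e.discriminator): e.count
        for e in entries
    }


if __name__ == "__main__":
    entries = [SampleEntry(2, 0, 50), SampleEntry(2, 1, 7), SampleEntry(5, 0, 3)]
    print(resolve_samples(100, entries))
```

A consumer would then attach each count to the basic block that contains the matching line and discriminator.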
Xinliang David Li
2014-Mar-13 21:37 UTC
[LLVMdev] RFC: Binary format for instrumentation based profiling data
On Thu, Mar 13, 2014 at 2:14 PM, Diego Novillo <dnovillo at google.com> wrote:

> On Thu, Mar 13, 2014 at 11:51 AM, Bob Wilson <bob.wilson at apple.com> wrote:
>>
>> On Mar 13, 2014, at 5:48 AM, Diego Novillo <dnovillo at google.com> wrote:
>>>
>>> How are counters represented? Are these line numbers together with the
>>> counter? Basic blocks? Edges?
>>
>> There are no line numbers, basic blocks, or edges. It is just a sequence
>> of counters that the front-end knows how to map to the code (the same as
>> with our current textual file format).
>
> Sorry, you lost me. How exactly does the FE map them to the code? In
> the sample profiler, each instrumented line consists of a line offset,
> a discriminator (to distinguish distinct control flow paths on the
> same line) and the counter. We match them by computing the absolute
> line number from the offset and assign the counter to the
> corresponding basic block.

For GCC, it is CFG based matching -- it requires exact match of sources between instrumentation and annotation (the counters are laid out in some CFG order).

David

> I think we should be able to use the same pass in
> lib/Transforms/Scalar/SampleProfile.cpp to read profiles generated
> from instrumentation. The information is basically the same, so a bit
> of generalization of that code should be all we need to pass those
> counters down into the analysis module.
>
> Diego.
Bob Wilson
2014-Mar-13 22:57 UTC
[LLVMdev] RFC: Binary format for instrumentation based profiling data
On Mar 13, 2014, at 2:14 PM, Diego Novillo <dnovillo at google.com> wrote:

> On Thu, Mar 13, 2014 at 11:51 AM, Bob Wilson <bob.wilson at apple.com> wrote:
>>
>> On Mar 13, 2014, at 5:48 AM, Diego Novillo <dnovillo at google.com> wrote:
>>>
>>> How are counters represented? Are these line numbers together with the
>>> counter? Basic blocks? Edges?
>>
>> There are no line numbers, basic blocks, or edges. It is just a sequence of counters that the front-end knows how to map to the code (the same as with our current textual file format).
>
> Sorry, you lost me. How exactly does the FE map them to the code? In
> the sample profiler, each instrumented line consists of a line offset,
> a discriminator (to distinguish distinct control flow paths on the
> same line) and the counter. We match them by computing the absolute
> line number from the offset and assign the counter to the
> corresponding basic block.

This is a proposal for the instrumentation-based approach that I talked about at the dev meeting. I don’t see how it can share a file format with the sample profiler, since the content is fundamentally different.

> I think we should be able to use the same pass in
> lib/Transforms/Scalar/SampleProfile.cpp to read profiles generated
> from instrumentation. The information is basically the same, so a bit
> of generalization of that code should be all we need to pass those
> counters down into the analysis module.

?? The information is completely different.