Vedant Kumar via llvm-dev

2017-Oct-24 20:24 UTC

[llvm-dev] Code coverage BoF - notes and updates

Hello,

Our goals for the code coverage BoF (10/19) were to find areas where we can
improve the coverage tooling, and to learn more about how coverage is used.
I'd like to thank all of the attendees for their input and for making the
BoF productive. Special thanks to Mandeep Grang, who volunteered as a mic runner
at the last minute.

In this email I'll share my (rough) notes and outline some future plans.
Please feel free to ask for clarifications or to add your own notes.

Here are the slides from the BoF:
https://docs.google.com/presentation/d/e/2PACX-1vS-rV02j1zhPq9Y6AtcUkbZW2c7Q5YYuQ6FPxN-aYiKwrw6c8DU3zW_RYeJlWPMZ5-S6hgz_CIcL8Gd/pub?start=false&loop=false&delayms=3000&slide=id.p

1. The header problem

Coverage instrumentation overhead is roughly quadratic in the number of
translation units in a project. The problem is that coverage mappings for
template instantiations and static inline functions from headers are pulled into
every TU. This bloats the profile metadata sections (which can slow down profile
I/O), results in large binaries, and causes long link times (or link failures).

We could solve this problem by maintaining an external coverage database and
discarding duplicate coverage mappings from the DB. Another idea is to emit
coverage mappings to a side file and unique them when generating coverage
reports. Both ideas require changes to the build workflow.

A third option is to emit named coverage mappings with linkonce_odr linkage (for
languages with an ODR). This would be a format-breaking change but it
wouldn't affect the build workflow. My plan is to try and evaluate this idea
in the coming week.
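
To make the uniquing idea concrete, here is a toy Python sketch of discarding duplicate coverage mappings across translation units. The record layout (function name, structural hash, opaque mapping bytes) and the helper names are assumptions made for illustration; this is not the real coverage mapping format.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class MappingRecord(NamedTuple):
    """Hypothetical stand-in for one function's coverage mapping."""
    function_name: str    # mangled name, e.g. an inline function from a header
    structural_hash: int  # hash of the function's structure
    mapping: bytes        # encoded region data (opaque in this sketch)


def unique_mappings(per_tu: Iterable[List[MappingRecord]]) -> List[MappingRecord]:
    """Keep one record per (name, hash) pair, in first-seen order."""
    seen: Dict[Tuple[str, int], MappingRecord] = {}
    for records in per_tu:
        for rec in records:
            seen.setdefault((rec.function_name, rec.structural_hash), rec)
    return list(seen.values())


def total_bytes(records: Iterable[MappingRecord]) -> int:
    """Sum the size of the encoded mappings."""
    return sum(len(r.mapping) for r in records)


# Three TUs that all pull in the same header-defined inline function,
# plus one function of their own.
header_fn = MappingRecord("_Z6helperv", 0x1234, b"\x01" * 64)
tus = [
    [header_fn, MappingRecord("_Z2f%dv" % i, i, b"\x02" * 16)]
    for i in range(3)
]
before = total_bytes(r for tu in tus for r in tu)
after = total_bytes(unique_mappings(tus))
print(before, after)
```

In this toy input the header function's mapping is stored once instead of three times; the per-TU functions are untouched.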

2. HTML report quality

There seems to be widespread interest in improving the quality of coverage
reports. We need volunteers to work on this and would love your help! Here are
some desired features:

* Search and filtering for coverage summaries
* Collapsing parts of a coverage summary by subdirectory
* Automatically generating a top 10 list of code regions which need better
  coverage
* Searching via complex queries (e.g., 'give me uncovered regions in covered
  lines', or 'give me uncovered regions after a call')
* Generating coverage deltas between two profiles, and identifying coverage
  regressions in a patch/commit
* Simplified tracking of coverage trends over time

There is some consensus that this functionality should not be built on top of
the existing llvm-cov C++ codebase. It might be better to develop these features
in a language more amenable to rapid prototyping and interoperation with popular
web application frameworks (perhaps Python). To facilitate this, llvm-cov gained
support for exporting all of its data to JSON (see CoverageExporterJson.cpp
<https://github.com/llvm-mirror/llvm/blob/master/tools/llvm-cov/CoverageExporterJson.cpp>).
If you are interested in working on these features, I would be happy to work
with you on design issues and on code review.
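
As a starting point, here is a small Python sketch that ranks files by their number of uncovered regions, a coarser variant of the "top 10" idea above. It assumes the exported JSON has a top-level "data" array whose entries contain "files", each with a "filename" and a "summary" holding "regions" counts ("count" and "covered"); please check these field names against CoverageExporterJson.cpp before relying on them. The sample input is hand-written, not real llvm-cov output.

```python
import json
from typing import List, Tuple


def worst_covered_files(export_json: str, top_n: int = 10) -> List[Tuple[str, int, int]]:
    """Return (filename, uncovered_regions, total_regions), worst first."""
    doc = json.loads(export_json)
    rows = []
    for export in doc.get("data", []):
        for f in export.get("files", []):
            regions = f["summary"]["regions"]  # assumed field names
            uncovered = regions["count"] - regions["covered"]
            rows.append((f["filename"], uncovered, regions["count"]))
    rows.sort(key=lambda r: r[1], reverse=True)
    return rows[:top_n]


# Hand-written sample in the assumed layout.
sample = json.dumps({
    "data": [{
        "files": [
            {"filename": "a.cpp", "summary": {"regions": {"count": 10, "covered": 9}}},
            {"filename": "b.cpp", "summary": {"regions": {"count": 20, "covered": 5}}},
            {"filename": "c.h", "summary": {"regions": {"count": 4, "covered": 4}}},
        ]
    }]
})
report = worst_covered_files(sample, top_n=2)
for name, uncovered, total in report:
    print(f"{name}: {uncovered}/{total} regions uncovered")
```

Running the same function over two exports and subtracting the per-file numbers would be one simple way to approach the coverage delta feature.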

3. Optimizing profile counter placement

From Eli's notes:

> I remember we also spent some time discussing the counter intrinsics, and
> whether we could produce a different set of intrinsics in the frontend, and
> produce the counters later in the pipeline to avoid duplicate counters. I
> didn't completely follow that discussion; I haven't spent much time
> looking at the counter intrinsics or how they're lowered.

Just to recap: the frontend emits calls to the llvm.instrprof_increment
intrinsic to implement counter updates. Each increment intrinsic is passed a
function name and a counter index (there's a mapping between AST nodes and
counter indices). The intrinsics are lowered in the InstrProfiling pass. During
lowering, an array of uint64_t counters is created for each function, and the
intrinsic calls are replaced by a load-add-store pattern.
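
The effect of the lowering can be modeled in a few lines of Python. This is only a toy model of the behavior described above (one array of 64-bit counters per function, each increment becoming a load, an add, and a store); the class and method names are made up for illustration and do not correspond to anything in the InstrProfiling pass.

```python
from typing import Dict, List

UINT64_MASK = (1 << 64) - 1


class LoweredCounters:
    """Toy model: one array of uint64 counters per function."""

    def __init__(self) -> None:
        self.arrays: Dict[str, List[int]] = {}

    def declare(self, function_name: str, num_counters: int) -> None:
        """Create the zero-initialized counter array for a function."""
        self.arrays[function_name] = [0] * num_counters

    def increment(self, function_name: str, index: int) -> None:
        """What one increment intrinsic turns into after lowering."""
        counters = self.arrays[function_name]
        value = counters[index]             # load
        value = (value + 1) & UINT64_MASK   # add (wraps like uint64_t)
        counters[index] = value             # store


prof = LoweredCounters()
prof.declare("foo", 2)     # counter 0: function entry, counter 1: loop body
prof.increment("foo", 0)   # foo is entered once
for _ in range(3):         # its loop body runs three times
    prof.increment("foo", 1)
print(prof.arrays["foo"])
```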

Frontend counter updates can look highly redundant because of inlining. It's
common to see single basic blocks with tens of distinct counter updates, most of
which are redundant. One potential solution is to create a minimal set of
profile counter updates after the inliner runs, and to map these counters back
to AST nodes (https://bugs.llvm.org/show_bug.cgi?id=33500). This is the most promising
approach we know of to cut down on counter updates, but I don't have a
precise idea of how it would work. Here's a rough sketch of a solution:

* Have the frontend emit 'virtual' llvm.instrprof_increment intrinsics.
  These will eventually be discarded during lowering.
* Run an early inlining step, then run the IR PGO pass.
* In the lowering step, emit a section into the object which describes how to
  map the real counter updates to the virtual ones. I don't have a clear idea
  of how to build or encode this mapping.
* Teach llvm-profdata how to reconstruct an indexed profile which the frontend
  can understand (i.e., map the real counters back to the virtual ones).
  llvm-profdata would need to inspect the mapping section in the binary to
  accomplish this.
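
One purely hypothetical way to think about the reconstruction step: describe every virtual counter as a signed sum of real counters, and have the tool evaluate those sums. The encoding below (a dict from virtual index to a list of (real index, sign) terms) is an assumption invented for this sketch; as noted above, how the mapping would really be built or encoded is an open question.

```python
from typing import Dict, List, Tuple

# virtual counter index -> list of (real counter index, sign) terms
CounterMap = Dict[int, List[Tuple[int, int]]]


def reconstruct_virtual(real_counts: List[int], mapping: CounterMap) -> List[int]:
    """Evaluate each virtual counter from the real counters."""
    virtual = [0] * (max(mapping) + 1)
    for v_index, terms in mapping.items():
        virtual[v_index] = sum(sign * real_counts[r] for r, sign in terms)
    return virtual


# Example: a function with an if/else whose 'then' block calls an inlined helper.
#   virtual 0: function entry        -> real 0
#   virtual 1: 'then' block          -> real 1
#   virtual 2: 'else' block          -> real 0 - real 1 (no real counter needed)
#   virtual 3: inlined helper entry  -> real 1 (same count as the 'then' block)
mapping: CounterMap = {
    0: [(0, +1)],
    1: [(1, +1)],
    2: [(0, +1), (1, -1)],
    3: [(1, +1)],
}
real_counts = [10, 4]   # only two real counters are updated at run time
virtual_counts = reconstruct_virtual(real_counts, mapping)
print(virtual_counts)
```

Here four frontend-visible counters are recovered from two counters that actually exist in the binary.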

4. Optimizing profile counter updates

We had a few different suggestions to speed up profile counter updates:

* Make function counter arrays linkonce_odr when possible. This is similar to
  the solution from the first section ("The header problem"). I'll
  try to evaluate this idea in the coming week.
* Enable register promotion for counter updates which occur within loops. David
  Li has already done the work to enable this for IR PGO.
* Investigate the number of relocations emitted for counter updates. It might be
  cheaper to load the address of the function counter array once and index into
  it, instead of indexing into the global on each update.
* Use 32-bit counters. This would cut the size of the counters section in half
  and speed up profile I/O.
* Use 1-bit counters. This could be useful for those who are only interested in
  binary coverage. IMO there are other ideas we should try before compromising on
  report accuracy.
* Use saturating counters. IMO this isn't likely to be a win in common
  cases, but could increase compile time and code size.
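
The trade-offs between the counter representations can be illustrated with a short Python sketch. The function names are made up for this example, and the size calculation assumes a densely packed counters section, which is a simplification.

```python
def bump_wrapping(value: int, bits: int) -> int:
    """Increment that wraps at 2**bits, like an unsigned integer."""
    return (value + 1) & ((1 << bits) - 1)


def bump_saturating(value: int, bits: int) -> int:
    """Increment that sticks at the maximum representable value."""
    limit = (1 << bits) - 1
    return value if value == limit else value + 1


def bump_one_bit(value: int) -> int:
    """1-bit 'was this region ever executed' flag; the count is lost."""
    return 1


def section_bytes(num_counters: int, bits: int) -> int:
    """Size of a densely packed counters section, rounded up to whole bytes."""
    return (num_counters * bits + 7) // 8


n = 1000
sizes = {bits: section_bytes(n, bits) for bits in (64, 32, 1)}
print(sizes)

max32 = (1 << 32) - 1
print(bump_wrapping(max32, 32), bump_saturating(max32, 32))
```

A 32-bit wrapping counter silently returns to zero on overflow, while the saturating variant needs an extra comparison on every update.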

5. Using coverage interactively while hacking on llvm

During the BoF I mentioned that it can be really useful to see coverage
reporting interactively, as you're working on a patch. Here's a hacky
way to do this:

* Build your code as you normally would (say, "ninja opt")
* Change the files you're interested in
* cd to your build directory and export
  CCC_OVERRIDE_OPTIONS="+-fcoverage-mapping +-fprofile-instr-generate=/tmp/opt_%m.profraw"
* Rebuild ("ninja opt" again). This will enable coverage
  instrumentation, but only for the files you've affected with your changes.
* Run a one-liner to generate a coverage report
  (http://clang.llvm.org/docs/SourceBasedCodeCoverage.html#creating-coverage-reports)

I like this approach because it means I don't have to maintain a separate,
coverage-enabled build tree. It's an easy way to check that your patches
have decent test coverage. If I want to disable coverage reporting I just need
to unset CCC_OVERRIDE_OPTIONS and recompile.
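
For the last step, a small Python helper can assemble the two commands described in the linked documentation (merge the raw profiles with llvm-profdata, then render them with llvm-cov show). This sketch only builds and prints the command lines; the binary path "bin/opt" and the merged profile path "/tmp/opt.profdata" are assumptions, and the raw profiles only exist after the instrumented binary has been run (for example by running the relevant tests).

```python
import glob
from typing import List


def report_commands(binary: str, raw_profiles: List[str],
                    merged: str = "/tmp/opt.profdata") -> List[List[str]]:
    """Build argv lists for merging raw profiles and showing coverage."""
    merge = ["llvm-profdata", "merge", "-sparse", *raw_profiles, "-o", merged]
    show = ["llvm-cov", "show", binary, f"-instr-profile={merged}"]
    return [merge, show]


# Pick up whatever raw profiles the instrumented 'opt' has written so far.
profiles = sorted(glob.glob("/tmp/opt_*.profraw"))
for cmd in report_commands("bin/opt", profiles):
    print(" ".join(cmd))
```

To actually run the commands, each list could be passed to subprocess.run.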

6. C APIs for libCoverage

We didn't get a chance to discuss this in detail during the BoF, but I would
like to upstream some C APIs to surface functionality from libCoverage. This
will make it easier for IDEs and editors to display coverage information
"in-line", right next to source code. Here's what that might look
like:

https://developer.apple.com/library/content/documentation/DeveloperTools/Conceptual/testing_with_xcode/chapters/07-code_coverage.html

If anyone has concerns about adding in these APIs, please let me know!

7. Making use of debug info

From Eli's notes:

> It seemed like we got a lot of questions related to why we aren't using
> debug info. :) It might be possible to come up with some sort of hybrid which
> trades off runtime overhead for lower resolution, without completely throwing
> away regions like gcov does. But it would be a big project, and the end result
> would still have a lot of the same problems as actual gcov in terms of the
> optimizer destroying necessary info.

To add to this: I think there are a lot of unanswered questions here. It's
unclear how clang would decide to use debug info instead of regions, or how the
different types of coverage counters would interact. I'm not very optimistic
about this.

thanks,
vedant

Dean Michael Berris via llvm-dev

2017-Oct-24 20:53 UTC

[llvm-dev] Code coverage BoF - notes and updates

Thanks for the summary Vedant!

I'm sorry I missed this BoF session.

Others have mentioned the possibility of maybe using XRay for some of this
information (function-level coverage, maybe having more intrinsics for marking
branch/basic-block level instrumentation). Was this explored in the BoF? Is
there interest in potentially exploring this particular space?

Cheers
-- Dean


Vedant Kumar via llvm-dev

2017-Oct-24 21:06 UTC

[llvm-dev] Code coverage BoF - notes and updates

Hi Dean,

We didn't discuss using XRay instrumentation during the BoF but it is an
interesting idea (by the way, thanks for your talk about XRay internals!). XRay
provides the advantage of being able to turn profiling on and off, but I'm
not sure how the resulting data could be used.

The code coverage feature is highly dependent on the frontend's profile
counter placement. The mapping between counters and parts of the AST is used to
gather accurate information about regions within a line. For example, the
coverage tool can show you that the l.h.s of "true || false" is
evaluated once, and the r.h.s isn't evaluated. This works with arbitrarily
nested short-circuit operators.
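
The short-circuit example can be mimicked in Python, whose "or" and "and" operators also skip operands that are not needed. The region() helper below is a made-up stand-in for a frontend-placed counter; it is not how clang instruments code, just an illustration of why per-region counts within a single line are informative.

```python
from collections import Counter

region_counts = Counter()


def region(name: str, value: bool) -> bool:
    """Record one execution of a source region, then yield its value."""
    region_counts[name] += 1
    return value


# "true || false": the l.h.s. runs once, the r.h.s. is never evaluated.
result = region("lhs", True) or region("rhs", False)
print(result, region_counts["lhs"], region_counts["rhs"])

# Nested operators: once "a" is false, neither "b" nor "c" is evaluated.
nested = region("a", False) and (region("b", True) or region("c", True))
print(nested, region_counts["a"], region_counts["b"], region_counts["c"])
```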

It might be possible to use XRay instrumentation to gather profile data, but I
think it will be challenging to precisely map that data back to the AST nodes
the frontend knows about. The problem is similar to the one I've outlined in
section 3 ("Optimizing profile counter placement"). The idea there is
to map a minimal set of counters placed by IR PGO back to AST nodes: the one
sketch of a solution I have still depends on running the frontend counter
placement pass to achieve this.

What are your thoughts on this?

thanks,
vedant

Alex L via llvm-dev

2017-Oct-24 23:19 UTC

[llvm-dev] Code coverage BoF - notes and updates

On 24 October 2017 at 13:24, Vedant Kumar via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Hello,
>
> Our goals for the code coverage BoF (10/19) were to find areas where we
> can improve the coverage tooling, and to learn more about how coverage is
> used. I'd like to thank all of the attendees for their input and for making
> the BoF productive. Special thanks to Mandeep Grang, who volunteered as a
> mic runner at the last minute.
>
> In this email I'll share my (rough) notes and outline some future plans.
> Please feel free to ask for clarifications or to add your own notes.
>
> Here are the slides from the BoF:
> https://docs.google.com/presentation/d/e/2PACX-1vS-rV02j1zhPq9Y6AtcUkbZW2c7Q5YYuQ6FPxN-aYiKwrw6c8DU3zW_RYeJlWPMZ5-S6hgz_CIcL8Gd/pub?start=false&loop=false&delayms=3000&slide=id.p
>
> 1. The header problem
>
> Coverage instrumentation overhead is roughly quadratic in the number of
> translation units in a project. The problem is that coverage mappings for
> template instantiations and static inline functions from headers are pulled
> into every TU. This bloats the profile metadata sections (which can slow
> down profile I/O), results in large binaries, and causes long link times
> (or link failures).
>
> We could solve this problem by maintaining an external coverage database
> and discarding duplicate coverage mappings from the DB. Another idea is to
> emit coverage mappings to a side file and unique them when generating
> coverage reports. Both ideas require changes to the build workflow.
>
> A third option is to emit named coverage mappings with linkonce_odr
> linkage (for languages with an ODR). This would be a format-breaking change
> but it wouldn't affect the build workflow. My plan is to try and evaluate
> this idea in the coming week.
>
> 2. HTML report quality
>
> There seems to be widespread interest in improving the quality of coverage
> reports. We need volunteers to work on this and would love your help! Here
> are some desired features:
>
> * Search and filtering for coverage summaries
> * Collapsing parts of a coverage summary by subdirectory
> * Automatically generating a top 10 list of code regions which need better
> coverage
> * Searching via complex queries (e.g: 'give me uncovered regions in
> covered lines', or 'give me uncovered regions after a call')
> * Generating coverage deltas between two profiles, and identifying
> coverage regressions in a patch/commit
> * Simplified tracking of coverage trends over time
>
> There is some consensus that this functionality should not be built on top
> of the existing llvm-cov C++ codebase. It might be better to develop these
> features in a language more amenable to rapid prototyping and
> interoperation with popular web application frameworks (perhaps Python). To
> facilitate this, llvm-cov gained support for exporting all of its data to
> JSON (see CoverageExporterJson.cpp
>
<https://github.com/llvm-mirror/llvm/blob/master/tools/llvm-cov/CoverageExporterJson.cpp>
> ). If you are interested in working on these features, I would be happy
> to work with you on design issues and on code review.
>
> 3. Optimizing profile counter placement
>
> From Eli's notes:
>
> > I remember we also spent some time discussing the counter intrinsics, and
> > whether we could produce a different set of intrinsics in the frontend, and
> > produce the counters later in the pipeline to avoid duplicate counters. I
> > didn't completely follow that discussion; I haven't spent much time looking
> > at the counter intrinsics or how they're lowered.
>
> Just to recap: the frontend emits calls to the llvm.instrprof_increment
> intrinsic to implement counter updates. Each increment intrinsic is passed
> a function name and a counter index (there's a mapping between AST nodes
> and counter indices). The intrinsics are lowered in the InstrProfiling
> pass. During lowering, an array of uint64_t counters is created for each
> function, and the intrinsic calls are replaced by a load-add-store pattern.
>
> Frontend counter updates can look highly redundant because of inlining.
> It's common to see single basic blocks with tens of distinct counter
> updates, most of which are redundant. One potential solution is to create a
> minimal set of profile counter updates after the inliner runs, and to map
> these counters back to AST nodes
> (https://bugs.llvm.org/show_bug.cgi?id=33500). This is the most promising
> approach we know of to cut down on counter updates, but I don't have a
> precise idea of how it would work. Here's a rough sketch of a solution:
>
> * Have the frontend emit 'virtual' llvm.instrprof_increment intrinsics.
> These will eventually be discarded during lowering.
> * Run an early inlining step, then run the IR PGO pass.
> * In the lowering step, emit a section into the object which describes how
> to map the real counter updates to the virtual ones. I don't have a clear
> idea of how to build or encode this mapping.
> * Teach llvm-profdata how to reconstruct an indexed profile which the
> frontend can understand (i.e map the real counters back to the virtual
> ones). llvm-profdata would need to inspect the mapping section in the
> binary to accomplish this.
>
It might be possible to avoid changing llvm-profdata by teaching
compiler-rt how to propagate the counter values from a subset of emitted
counters to all "virtual" counters before the counter values are written
out by compiler-rt to disk.
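
To make the propagation idea concrete, here is a minimal sketch in Python
(the real logic would live in compiler-rt's C runtime). Everything about the
mapping is an assumption made for illustration: the thread does not settle how
the real-to-virtual mapping would be built or encoded, so this sketch simply
supposes that each virtual counter is the sum of one or more real counters.
The names propagate_counters and virtual_to_real are hypothetical.

```python
from typing import Dict, List, Sequence


def propagate_counters(
    real: Sequence[int],
    virtual_to_real: Dict[int, List[int]],
    num_virtual: int,
) -> List[int]:
    """Fill in every virtual counter from the smaller set of real counters.

    Assumed encoding (not from the thread): virtual_to_real maps a virtual
    counter index to the real counter indices whose sum gives its value.
    Virtual counters with no entry are left at zero.
    """
    virtual = [0] * num_virtual
    for virtual_index, sources in virtual_to_real.items():
        virtual[virtual_index] = sum(real[r] for r in sources)
    return virtual


# Two real counters survive after inlining; four virtual counters are what
# the frontend originally asked for.
real_counters = [7, 3]
mapping = {0: [0], 1: [0], 2: [1], 3: [0, 1]}
virtual_counters = propagate_counters(real_counters, mapping, 4)
print(virtual_counters)  # [7, 7, 3, 10]
```

With this shape, the profile written to disk would still contain one value per
virtual counter, so downstream tools would see the layout they already expect.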


Moshtaghi, Alireza via llvm-dev

2017-Oct-25 21:45 UTC

[llvm-dev] Code coverage BoF - notes and updates

Hi,

I’m interested in implementing the solution for the header problem described in
(1.): emitting coverage mappings to a side file and uniquing them when
generating coverage reports. As you mentioned, this would require modifying the
build workflow. Can you explain how you suggest changing it? As an experiment
(just to see if it is possible), I tried objcopy-ing the profile data sections
from the “.o” files and relinking them, but the raw profile output was not
written.

I’m open to trying any suggestion.

Thanks
A
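
For reference, the "unique them when generating coverage reports" step could
look roughly like the Python sketch below. It is only an illustration under
stated assumptions: the side-file layout is hypothetical (each translation
unit contributes a list of records), and keying on the function name plus a
hash is an assumption, not the actual coverage mapping format.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class MappingRecord(NamedTuple):
    """Hypothetical side-file record for one function's coverage mapping."""
    name: str
    func_hash: int
    mapping: bytes


def unique_records(
    side_files: Iterable[List[MappingRecord]],
) -> List[MappingRecord]:
    """Keep the first record seen for each (name, hash) pair."""
    seen: Dict[Tuple[str, int], MappingRecord] = {}
    for records in side_files:
        for record in records:
            seen.setdefault((record.name, record.func_hash), record)
    return list(seen.values())


# The same header function shows up in both translation units.
header_fn = MappingRecord("_ZN3VecIiE4pushEi", 0x1234, b"header-mapping")
tu_a = [header_fn, MappingRecord("main", 0x1, b"main-mapping")]
tu_b = [header_fn, MappingRecord("helper", 0x2, b"helper-mapping")]

merged = unique_records([tu_a, tu_b])
print([record.name for record in merged])
```

The build workflow change is then the part that collects one side file per
object and hands the whole set to the report generator.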


Reid Kleckner via llvm-dev

2017-Oct-25 22:21 UTC

[llvm-dev] Code coverage BoF - notes and updates

On Wed, Oct 25, 2017 at 2:45 PM, Moshtaghi, Alireza via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Hi,
>
> I’m interested in implementing the solution for the header problem
> described in (1.): emitting coverage mappings to a side file and uniquing
> them when generating coverage reports. As you mentioned, this would require
> modifying the build workflow. Can you explain how you suggest changing it?
> As an experiment (just to see if it is possible), I tried objcopy-ing the
> profile data sections from the “.o” files and relinking them, but the raw
> profile output was not written.
>
> I’m open to trying any suggestion.
>
Isn't that exactly what emitting the coverage data as linkonce_odr
accomplishes?

Vedant Kumar via llvm-dev

2017-Oct-30 20:41 UTC

[llvm-dev] Code coverage BoF - notes and updates

> On Oct 24, 2017, at 1:24 PM, Vedant Kumar <vsk at apple.com> wrote:
> 
> 1. The header problem
>
> Coverage instrumentation overhead is roughly quadratic in the number of
> translation units in a project. The problem is that coverage mappings for
> template instantiations and static inline functions from headers are pulled
> into every TU. This bloats the profile metadata sections (which can slow
> down profile I/O), results in large binaries, and causes long link times
> (or link failures).
>
> We could solve this problem by maintaining an external coverage database
> and discarding duplicate coverage mappings from the DB. Another idea is to
> emit coverage mappings to a side file and unique them when generating
> coverage reports. Both ideas require changes to the build workflow.
>
> A third option is to emit named coverage mappings with linkonce_odr
> linkage (for languages with an ODR). This would be a format-breaking change
> but it wouldn't affect the build workflow. My plan is to try and evaluate
> this idea in the coming week.

Following up on this thread:

I found that marking coverage mappings, function records, and names of functions
from headers as linkonce_odr results in decent binary size savings. I tested
this idea out and reported my results here:
https://bugs.llvm.org/show_bug.cgi?id=34533. I think this is the solution we
should go with, but am curious to know what others think.

thanks,
vedant
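
As a rough illustration of why linkonce_odr helps here, the Python sketch
below models a link step in which the linker keeps a single copy of each
linkonce_odr symbol by name, while symbols with any other linkage are kept
once per object. This is a toy model, not the linker's algorithm; the Symbol
type, the symbol name, and the byte sizes are all invented for the example and
are not measurements from the bug report.

```python
from typing import List, NamedTuple


class Symbol(NamedTuple):
    name: str
    linkage: str  # "linkonce_odr" or "private" in this toy model
    size: int     # bytes; made-up numbers


def linked_size(objects: List[List[Symbol]]) -> int:
    """Total bytes kept after merging duplicate linkonce_odr symbols."""
    total = 0
    kept = set()
    for obj in objects:
        for sym in obj:
            if sym.linkage == "linkonce_odr":
                if sym.name in kept:
                    continue  # duplicate definition is discarded
                kept.add(sym.name)
            total += sym.size
    return total


def make_objects(header_linkage: str) -> List[List[Symbol]]:
    """Three TUs, each carrying the same header mapping plus its own."""
    return [
        [
            Symbol("__covrec_header_fn", header_linkage, 100),
            Symbol(f"__covrec_tu{i}", "private", 40),
        ]
        for i in range(3)
    ]


before = linked_size(make_objects("private"))
after = linked_size(make_objects("linkonce_odr"))
print(before, after)  # 420 220
```

The saving grows with the number of translation units that include the same
header, which is the case the header problem is about.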

Bob Wilson via llvm-dev

2017-Oct-30 22:24 UTC

[llvm-dev] Code coverage BoF - notes and updates

> On Oct 30, 2017, at 1:41 PM, Vedant Kumar <vsk at apple.com> wrote:
> 
> Following up on this thread:
> 
> I found that marking coverage mappings, function records, and names of
> functions from headers as linkonce_odr results in decent binary size
> savings. I tested this idea out and reported my results here:
> https://bugs.llvm.org/show_bug.cgi?id=34533. I think this is the solution
> we should go with, but am curious to know what others think.
>
> thanks,
> vedant

This seems like a good step forward. An external coverage database might still
be a good idea, but these solutions are complementary.
> 
>> 
>> 2. HTML report quality
>> 
>> There seems to be widespread interest in improving the quality of
coverage reports. We need volunteers to work on this and would love your help!
Here are some desired features:
>> 
>> * Search and filtering for coverage summaries
>> * Collapsing parts of a coverage summary by subdirectory
>> * Automatically generating a top 10 list of code regions which need
better coverage
>> * Searching via complex queries (e.g: 'give me uncovered regions in
covered lines', or 'give me uncovered regions after a call')
>> * Generating coverage deltas between two profiles, and identifying
coverage regressions in a patch/commit
>> * Simplified tracking of coverage trends over time
>> 
>> There is some consensus that this functionality should not be built on
top of the existing llvm-cov C++ codebase. It might be better to develop these
features in a language more amenable to rapid prototyping and interoperation
with popular web application frameworks (perhaps Python). To facilitate this,
llvm-cov gained support for exporting all of its data to JSON (see
CoverageExporterJson.cpp). If you are interested in working on these features, I
would be happy to work with you on design issues and on code review.
>> 
>> 3. Optimizing profile counter placement
>> 
>> From Eli's notes:
>> 
>>> I remember we also spent some time discussing the counter
intrinsics, and whether we could produce a different set of intrinsics in the
frontend, and produce the counters later in the pipeline to avoid duplicate
counters.   I didn't completely follow that discussion; I haven't spent
much time looking at the counter intrinsics or how they're lowered.
>> 
>> Just to recap: the frontend emits calls to the llvm.instrprof_increment
intrinsic to implement counter updates. Each increment intrinsic is passed a
function name and a counter index (there's a mapping between AST nodes and
counter indices). The intrinsics are lowered in the InstrProfiling pass. During
lowering, an array of uint64_t counters is created for each function, and the
intrinsic calls are replaced by a load-add-store pattern.
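
To make the recap concrete, here is a toy Python model of what the lowered code does at run time. It is not how the InstrProfiling pass is implemented; the class and function names are hypothetical, and the instrumented function is invented for illustration.

```python
class FunctionCounters:
    """Toy stand-in for the per-function array of uint64_t counters."""

    MASK = (1 << 64) - 1  # emulate 64-bit unsigned wraparound

    def __init__(self, name, num_counters):
        self.name = name
        self.counters = [0] * num_counters

    def increment(self, index):
        # What one lowered increment amounts to: load, add, store.
        value = self.counters[index]        # load
        value = (value + 1) & self.MASK     # add
        self.counters[index] = value        # store


def instrumented_abs(x, prof):
    """A function as it might look after instrumentation (hypothetical)."""
    prof.increment(0)        # counter 0: function entry
    if x < 0:
        prof.increment(1)    # counter 1: the 'then' region
        return -x
    return x
```

Each counter index corresponds to one source region, which is how counts can later be mapped back to AST nodes.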
>> 
>> Frontend counter updates can look highly redundant because of inlining.
It's common to see single basic blocks with tens of distinct counter
updates, most of which are redundant. One potential solution is to create a
minimal set of profile counter updates after the inliner runs, and to map these
counters back to AST nodes (https://bugs.llvm.org/show_bug.cgi?id=33500
<https://bugs.llvm.org/show_bug.cgi?id=33500>). This is the most promising
approach we know of to cut down on counter updates, but I don't have a
precise idea of how it would work. Here's a rough sketch of a solution:
>> 
>> * Have the frontend emit 'virtual' llvm.instrprof_increment
intrinsics. These will eventually be discarded during lowering.
>> * Run an early inlining step, then run the IR PGO pass.
>> * In the lowering step, emit a section into the object which describes
how to map the real counter updates to the virtual ones. I don't have a
clear idea of how to build or encode this mapping.
>> * Teach llvm-profdata how to reconstruct an indexed profile which the
frontend can understand (i.e., map the real counters back to the virtual ones).
llvm-profdata would need to inspect the mapping section in the binary to
accomplish this.
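
The encoding of the mapping is an open question, so the following Python sketch is only a toy model of the idea, not the design discussed in the bug report. It assumes that all virtual counters updated in the same basic block share one real counter, and that a virtual counter's value is the sum of the real counters of every block it appears in (for example, a callee's entry counter after the callee has been inlined at several sites). All names are hypothetical.

```python
from collections import defaultdict


def assign_real_counters(blocks):
    """Give each basic block one real counter.

    blocks maps a block name to the list of virtual counter ids updated in
    that block. Returns a dict from virtual id to the list of real counter
    indices whose values must be summed to recover it.
    """
    mapping = defaultdict(list)
    for real_index, block in enumerate(sorted(blocks)):
        for virtual in blocks[block]:
            mapping[virtual].append(real_index)
    return dict(mapping)


def reconstruct(mapping, real_counts):
    """Recover virtual counter values from the real counter values."""
    return {
        virtual: sum(real_counts[i] for i in indices)
        for virtual, indices in mapping.items()
    }
```

In this model, a caller with two blocks that each contain an inlined copy of the callee's entry needs two real counters instead of four virtual ones, and the callee's entry count is recovered as the sum of both.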
>> 
>> 4. Optimizing profile counter updates
>> 
>> We had a few different suggestions to speed up profile counter updates:
>> 
>> * Make function counter arrays linkonce_odr when possible. This is
similar to the solution from the first section ("The header problem").
I'll try to evaluate this idea in the coming week.
>> * Enable register promotion for counter updates which occur within
loops. David Li has already done the work to enable this for IR PGO.
>> * Investigate the number of relocations emitted for counter updates. It
might be cheaper to load the address of the function counter array once and
index into it, instead of indexing into the global on each update.
>> * Use 32-bit counters. This would cut the size of the counters section
in half and speed up profile I/O.
>> * Use 1-bit counters. This could be useful for those who are only
interested in binary coverage. IMO there are other ideas we should try before
compromising on report accuracy.
>> * Use saturating counters. IMO this isn't likely to be a win in
common cases, and it could increase compile time and code size.
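
The size and accuracy trade-offs among the counter-width suggestions above can be illustrated with some simple arithmetic. This Python sketch assumes counters are densely packed in their section (a real 1-bit scheme might well use a byte per counter instead); the helper names are hypothetical.

```python
def counters_section_bytes(num_counters, bits):
    """Size in bytes of a counters section, assuming dense packing."""
    return (num_counters * bits + 7) // 8


def wrapping_increment(value, bits):
    """Increment that wraps to zero on overflow, like a plain unsigned add."""
    return (value + 1) & ((1 << bits) - 1)


def saturating_increment(value, bits):
    """Increment that sticks at the maximum value instead of wrapping."""
    limit = (1 << bits) - 1
    return value if value >= limit else value + 1
```

For 1000 counters this gives 8000 bytes at 64 bits, 4000 bytes at 32 bits, and 125 bytes at 1 bit. The saturating variant needs a compare on every update, which is where the extra code size would come from.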
>> 
>> 5. Using coverage interactively while hacking on llvm
>> 
>> During the BoF I mentioned that it can be really useful to see coverage
reporting interactively, as you're working on a patch. Here's a hacky
way to do this:
>> 
>> * Build your code as you normally would (say, "ninja opt")
>> * Change the files you're interested in
>> * cd to your build directory and export
CCC_OVERRIDE_OPTIONS="+-fcoverage-mapping
+-fprofile-instr-generate=/tmp/opt_%m.profraw"
>> * Rebuild ("ninja opt" again). This will enable coverage
instrumentation, but only for the files you've affected with your changes.
>> * Run a one-liner to generate a coverage report
(http://clang.llvm.org/docs/SourceBasedCodeCoverage.html#creating-coverage-reports)
>> 
>> I like this approach because it means I don't have to maintain a
separate, coverage-enabled build tree. It's an easy way to check that your
patches have decent test coverage. If I want to disable coverage reporting I
just need to unset CCC_OVERRIDE_OPTIONS and recompile.
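
The steps above could be scripted roughly as follows. This Python sketch only assembles the environment and the report commands; the llvm-profdata and llvm-cov invocations follow the linked documentation, while the /tmp/opt.profdata output path, the bin/opt binary location, and the function names are assumptions made for illustration.

```python
import glob
import os
import subprocess

# Value taken from the steps above; "+" appends the flag to each clang command.
OVERRIDES = "+-fcoverage-mapping +-fprofile-instr-generate=/tmp/opt_%m.profraw"


def coverage_env(base_env=None):
    """Return a copy of the environment with CCC_OVERRIDE_OPTIONS set."""
    env = dict(os.environ if base_env is None else base_env)
    env["CCC_OVERRIDE_OPTIONS"] = OVERRIDES
    return env


def report_commands(binary, raw_profiles, profdata="/tmp/opt.profdata"):
    """Build the merge and show commands for a coverage report."""
    merge = ["llvm-profdata", "merge", "-sparse", *raw_profiles, "-o", profdata]
    show = ["llvm-cov", "show", binary, "-instr-profile=" + profdata]
    return [merge, show]


def rebuild_and_report(build_dir):
    """Rebuild opt with coverage enabled for changed files, then report."""
    subprocess.run(["ninja", "opt"], cwd=build_dir, env=coverage_env(), check=True)
    # Run opt (or the relevant tests) here so that .profraw files get written.
    raws = glob.glob("/tmp/opt_*.profraw")
    for cmd in report_commands(os.path.join(build_dir, "bin", "opt"), raws):
        subprocess.run(cmd, check=True)
```

Nothing runs on import; rebuild_and_report() has to be called explicitly with the path to the build directory.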
>> 
>> 6. C APIs for libCoverage
>> 
>> We didn't get a chance to discuss this in detail during the BoF,
but I would like to upstream some C APIs to surface functionality from
libCoverage. This will make it easier for IDEs and editors to display coverage
information "in-line", right next to source code. Here's what that
might look like:
>> 
>>
https://developer.apple.com/library/content/documentation/DeveloperTools/Conceptual/testing_with_xcode/chapters/07-code_coverage.html
>> 
>> If anyone has concerns about adding in these APIs, please let me know!
>> 
>> 7. Making use of debug info
>> 
>> From Eli's notes:
>> 
>>> It seemed like we got a lot of questions related to why we
aren't using debug info. :) It might be possible to come up with some sort
of hybrid which trades off runtime overhead for lower resolution, without
completely throwing away regions like gcov does. But it would be a big project,
and the end result would still have a lot of the same problems as actual gcov in
terms of the optimizer destroying necessary info.
>> 
>> To add to this: I think there are a lot of unanswered questions here.
It's unclear how clang would decide to use debug info instead of regions, or
how the different types of coverage counters would interact. I'm not very
optimistic about this.
>> 
>> thanks,
>> vedant