Friedman, Eli via llvm-dev

2017-Jun-17 01:08 UTC

[llvm-dev] My experience using -DLLVM_BUILD_INSTRUMENTED_COVERAGE to generate coverage

I've started looking at the state of code coverage recently; we figured 
LLVM itself would be a good test to figure out how mature it is, so I 
gave it a shot.  My experience:

1. You have to specify -DLLVM_USE_LINKER=gold (or maybe lld works; I 
didn't try).  If you link with binutils ld, the program will generate 
broken profile information.  Apparently, the linked binary is missing 
the __llvm_prf_names section.  This took me half a day to figure out.  
This issue isn't documented anywhere, and the only error message I got 
was "Assertion `!Key.empty()' failed." from llvm-cov.

2. The generated binaries are big and slow.  Comparing to a build 
without coverage, llc becomes 8x larger overall (text section becomes 
roughly 2x larger).  And check-llvm-codegen-arm goes from 3 seconds to 
250 seconds.

3. The generated profile information takes up a lot of space: llc 
generates a 90MB profraw file.

4. When prepare-code-coverage-artifact.py invokes llvm-profdata for the 
profiles generated by "make check", it takes 50GB of memory to process
about 1.5GB of profiles.  Is it supposed to use that much?

5. Using prepare-code-coverage-artifact.py generates "warning: 229 
functions have mismatched data".  I'm not sure what's causing this... I 
guess it has something to do with merging the profile data for multiple 
binaries?  The error message is not very helpful.

5. The HTML output highlights the semicolon after a break or return 
statement in some switch statements in red.  (For example, 
LowerADDC_ADDE_SUBC_SUBE in ARMISelLowering.cpp.)  Not really important, 
but annoying.

6. On the bright side, when it works, the generated coverage information 
is precise and easy to read.
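
The symptom in item 1 (a linked binary without the __llvm_prf_names section) can be checked for directly before running llvm-cov. The following is a minimal sketch, assuming the section listing comes from something like `readelf -S -W <binary>`; the sample listings and the helper names `section_names` and `has_profile_names` are hypothetical and only illustrate the check.

```python
import re


def section_names(readelf_output):
    """Return the section names found in `readelf -S -W`-style text."""
    names = set()
    for line in readelf_output.splitlines():
        # Section rows start with an index in brackets, e.g. "  [27] .text ...".
        match = re.match(r"\s*\[\s*\d+\]\s+(\S+)", line)
        if match:
            names.add(match.group(1))
    return names


def has_profile_names(readelf_output):
    """True if the __llvm_prf_names section is present in the listing."""
    return "__llvm_prf_names" in section_names(readelf_output)


# Hypothetical listings, trimmed to the columns that matter here.
GOOD = """\
  [13] .text             PROGBITS  0000000000401000 001000 0a2b3c 00  AX  0  0 16
  [26] __llvm_prf_cnts   PROGBITS  0000000000b10000 710000 004000 00  WA  0  0  8
  [27] __llvm_prf_names  PROGBITS  0000000000b14000 714000 001200 00   A  0  0  1
"""
BAD = """\
  [13] .text             PROGBITS  0000000000401000 001000 0a2b3c 00  AX  0  0 16
  [26] __llvm_prf_cnts   PROGBITS  0000000000b10000 710000 004000 00  WA  0  0  8
"""

print(has_profile_names(GOOD))
print(has_profile_names(BAD))
```

To run this against a real binary, the listing could be captured with `subprocess.run(["readelf", "-S", "-W", path], capture_output=True, text=True).stdout` and passed to `has_profile_names`.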

-Eli

-- 
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux
Foundation Collaborative Project

Vedant Kumar via llvm-dev

2017-Jun-18 22:51 UTC

Hi Eli,

Thanks for sharing your experience. I'd very much like to fix the problems
you encountered.

> On Jun 16, 2017, at 6:08 PM, Friedman, Eli <efriedma at codeaurora.org> wrote:
>
> I've started looking at the state of code coverage recently; we figured
> LLVM itself would be a good test to figure out how mature it is, so I gave
> it a shot.

You may already be aware of this, but for readers who are not, there is a public
bot which produces coverage reports for llvm roughly twice a day. You can find
it by visiting llvm.org <http://llvm.org/> and clicking on the
"llvm-cov" link within the "Useful Links" box (in the
"Dev. Resources" section). Coverage is gathered by running
check-{llvm,clang,polly,lld} and the 'nightly' test suite.

> My experience:
>
> 1. You have to specify -DLLVM_USE_LINKER=gold (or maybe lld works; I
> didn't try).  If you link with binutils ld, the program will generate
> broken profile information.  Apparently, the linked binary is missing the
> __llvm_prf_names section.  This took me half a day to figure out.  This
> issue isn't documented anywhere, and the only error message I got was
> "Assertion `!Key.empty()' failed." from llvm-cov.

I expect llvm-cov to print out "Failed to load coverage:
<reason>" in this situation. There was some work done to tighten up
error reporting in ProfileData and its clients in r270020. If your host
toolchain does have these changes, please file a bug, and I'll have it
fixed.

I was not aware of the issue with the binutils linker. We do have some
end-to-end, runtime tests in compiler-rt which use this linker, so this type of
failure is surprising. I've CC'd David Li, who has some experience
working with this linker, in case he has any insight about the issue.

If you are using a relatively up-to-date host toolchain, I'll add a note to
our docs suggesting that users use gold when compiling with coverage enabled.
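
For readers following along, the configuration discussed so far amounts to two CMake options. The following is a minimal sketch that only assembles the configure command; the source directory `../llvm` and the helper name `coverage_configure_command` are hypothetical, and the two `-D` options are the ones named in this thread.

```python
def coverage_configure_command(source_dir, linker="gold"):
    """Build a cmake command line for a coverage-instrumented LLVM build."""
    return [
        "cmake",
        source_dir,
        # Enables coverage instrumentation of the LLVM build itself.
        "-DLLVM_BUILD_INSTRUMENTED_COVERAGE=On",
        # gold is what the original poster used; lld was not tried.
        f"-DLLVM_USE_LINKER={linker}",
    ]


# "../llvm" is a placeholder for wherever the LLVM sources live.
print(" ".join(coverage_configure_command("../llvm")))
```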

> 2. The generated binaries are big and slow.  Comparing to a build without
> coverage, llc becomes 8x larger overall (text section becomes roughly 2x
> larger).  And check-llvm-codegen-arm goes from 3 seconds to 250 seconds.

The binary size increase comes from coverage mapping data, counter increment
instrumentation, and profiling metadata.

The coverage mapping section is highly compressible, but exploiting the
compressibility has proven to be tricky. I filed: llvm.org/PR33499
<http://llvm.org/PR33499>.

Coverage makes use of frontend-based instrumentation, which is much less
efficient than the IR-based kind. If we can find a way to map counters inserted
by IR PGO to AST nodes, we could improve the situation. I filed:
llvm.org/PR33500 <http://llvm.org/PR33500>.

We can reduce testing time by *not* instrumenting basic tools like count, not,
FileCheck etc. I filed: llvm.org/PR33501 <http://llvm.org/PR33501>.

> 3. The generated profile information takes up a lot of space: llc generates
> a 90MB profraw file.

I don't have any ideas about how to fix this. You can decrease the space
overhead for raw profiles by altering LLVM_PROFILE_MERGE_POOL_SIZE from 4 to a
lower value.

> 4. When prepare-code-coverage-artifact.py invokes llvm-profdata for the
> profiles generated by "make check", it takes 50GB of memory to process
> about 1.5GB of profiles.  Is it supposed to use that much?

By default, llvm-profdata uses hardware_concurrency() to determine the number of
threads to use to merge profiles. You can change the default by passing
-j/--num-threads to llvm-profdata. I'm open to changing the 'prep'
script to use -j4 or something like that.
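
As an illustration of capping the thread count, here is a minimal sketch that assembles an `llvm-profdata merge` invocation with an explicit `--num-threads` value. The helper name `profdata_merge_command` and the file names are hypothetical; the command is only built, not run.

```python
def profdata_merge_command(profraw_files, output, num_threads=4):
    """Build an llvm-profdata merge command with a fixed thread count."""
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    return [
        "llvm-profdata",
        "merge",
        # Overrides the default, which is derived from hardware_concurrency().
        f"--num-threads={num_threads}",
        "-o",
        output,
        *profraw_files,
    ]


# Placeholder file names for illustration only.
command = profdata_merge_command(["a.profraw", "b.profraw"], "merged.profdata")
print(" ".join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` once the input files exist.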

> 5. Using prepare-code-coverage-artifact.py generates "warning: 229
> functions have mismatched data".  I'm not sure what's causing this... I
> guess it has something to do with merging the profile data for multiple
> binaries?  The error message is not very helpful.

This is unexpected. I'll try to reproduce this, and I'll fix the
diagnostic along the way. I filed: llvm.org/PR33502
<http://llvm.org/PR33502>.

> 5. The HTML output highlights the semicolon after a break or return
> statement in some switch statements in red.  (For example,
> LowerADDC_ADDE_SUBC_SUBE in ARMISelLowering.cpp.)  Not really important,
> but annoying.

I'm sure I'm sitting on a bug report about this already, but
unfortunately haven't had the time to get around to it.

> 6. On the bright side, when it works, the generated coverage information
> is precise and easy to read.

Good to hear.

vedant

Xinliang David Li via llvm-dev

2017-Jun-19 05:07 UTC

On Fri, Jun 16, 2017 at 6:08 PM, Friedman, Eli via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> I've started looking at the state of code coverage recently; we figured
> LLVM itself would be a good test to figure out how mature it is, so I gave
> it a shot.  My experience:
>
> 1. You have to specify -DLLVM_USE_LINKER=gold (or maybe lld works; I
> didn't try).  If you link with binutils ld, the program will generate
> broken profile information.  Apparently, the linked binary is missing the
> __llvm_prf_names section.  This took me half a day to figure out.  This
> issue isn't documented anywhere, and the only error message I got was
> "Assertion `!Key.empty()' failed." from llvm-cov.
>
>
I believe the gnu-ld bug is
https://sourceware.org/bugzilla/show_bug.cgi?id=19161 which is fixed in
version 2.26.
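
A quick way to tell whether a system linker predates that fix is to compare its reported version against 2.26. The following is a minimal sketch, assuming the first line of `ld --version` looks like one of the common GNU formats shown in the comments; the helper names are hypothetical, and the 2.26 threshold is taken from the statement above rather than verified independently.

```python
import re

# Version in which the bug is believed to be fixed, per the message above.
FIXED_IN = (2, 26)


def parse_binutils_version(version_output):
    """Extract (major, minor) from the first line of `ld --version` output.

    Handles lines such as:
      GNU ld (GNU Binutils for Ubuntu) 2.24
      GNU ld version 2.30-93.el8
    Returns None if no version is recognised.
    """
    lines = version_output.splitlines()
    if not lines:
        return None
    # The version follows either a closing parenthesis or the word "version".
    match = re.search(r"(?:\)|version)\s+(\d+)\.(\d+)", lines[0])
    if not match:
        return None
    return int(match.group(1)), int(match.group(2))


def has_prf_names_fix(version_output):
    """True if the reported version is at least FIXED_IN."""
    version = parse_binutils_version(version_output)
    return version is not None and version >= FIXED_IN


print(has_prf_names_fix("GNU ld (GNU Binutils for Ubuntu) 2.24"))
print(has_prf_names_fix("GNU ld (GNU Binutils for Ubuntu) 2.26.1"))
```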


> 2. The generated binaries are big and slow.  Comparing to a build without
> coverage, llc becomes 8x larger overall (text section becomes roughly 2x
> larger).  And check-llvm-codegen-arm goes from 3 seconds to 250 seconds.
>
Over the last couple of years, the instrumentation and coverage data overhead
has been reduced greatly.  FE-based instrumentation in general has larger
overhead than IR-based instrumentation, but the coverage testing currently
only works with FE instrumentation.

>
> 3. The generated profile information takes up a lot of space: llc
> generates a 90MB profraw file.
>
This looks like it is in the normal range for raw profile size.


David

>
> 4. When prepare-code-coverage-artifact.py invokes llvm-profdata for the
> profiles generated by "make check", it takes 50GB of memory to process
> about 1.5GB of profiles.  Is it supposed to use that much?
>
> 5. Using prepare-code-coverage-artifact.py generates "warning: 229
> functions have mismatched data".  I'm not sure what's causing this... I
> guess it has something to do with merging the profile data for multiple
> binaries?  The error message is not very helpful.
>
> 5. The HTML output highlights the semicolon after a break or return
> statement in some switch statements in red.  (For example,
> LowerADDC_ADDE_SUBC_SUBE in ARMISelLowering.cpp.)  Not really important,
> but annoying.
>
> 6. On the bright side, when it works, the generated coverage information
> is precise and easy to read.

Friedman, Eli via llvm-dev

2017-Jun-19 23:32 UTC

On 6/18/2017 3:51 PM, Vedant Kumar wrote:
>> My experience:
>>
>> 1. You have to specify -DLLVM_USE_LINKER=gold (or maybe lld works; I
>> didn't try).  If you link with binutils ld, the program will generate
>> broken profile information.  Apparently, the linked binary is missing
>> the __llvm_prf_names section.  This took me half a day to figure out.
>> This issue isn't documented anywhere, and the only error message I
>> got was "Assertion `!Key.empty()' failed." from llvm-cov.
>
> I expect llvm-cov to print out "Failed to load coverage: <reason>" in
> this situation. There was some work done to tighten up error reporting
> in ProfileData and its clients in r270020. If your host toolchain does
> have these changes, please file a bug, and I'll have it fixed.

Host toolchain is trunk clang... but using system binutils (which is 
2.24 on my Ubuntu 14.04 system... and apparently that's too old per 
David Li's response).  Anyway, filed 
https://bugs.llvm.org/show_bug.cgi?id=33517 .
>
>> 2. The generated binaries are big and slow.  Comparing to a build 
>> without coverage, llc becomes 8x larger overall (text section becomes 
>> roughly 2x larger).  And check-llvm-codegen-arm goes from 3 seconds 
>> to 250 seconds.
>
> The binary size increase comes from coverage mapping data, counter 
> increment instrumentation, and profiling metadata.
>
> The coverage mapping section is highly compressible, but exploiting 
> the compressibility has proven to be tricky. I filed: llvm.org/PR33499 
> <http://llvm.org/PR33499>.

If I'm cross-compiling for a target where the space matters, can I get rid
of the data for the copy on the device using "strip -R __llvm_covmap" or
something like that, then use llvm-cov on the original?

> Coverage makes use of frontend-based instrumentation, which is much 
> less efficient than the IR-based kind. If we can find a way to map 
> counters inserted by IR PGO to AST nodes, we could improve the 
> situation. I filed: llvm.org/PR33500 <http://llvm.org/PR33500>.

This would be nice... but I assume it's hard. :)

>
> We can reduce testing time by *not* instrumented basic tools like 
> count, not, FileCheck etc. I filed: llvm.org/PR33501 
> <http://llvm.org/PR33501>.
>
>> 3. The generated profile information takes up a lot of space: llc 
>> generates a 90MB profraw file.
>
> I don't have any ideas about how to fix this. You can decrease the 
> space overhead for raw profiles by altering 
> LLVM_PROFILE_MERGE_POOL_SIZE from 4 to a lower value.

Disk space is cheap, but the I/O takes a long time.  I guess it's
specifically bad for LLVM's "make check", maybe not so bad for other cases.

>> 4. When prepare-code-coverage-artifact.py invokes llvm-profdata for
>> the profiles generated by "make check", it takes 50GB of memory to
>> process about 1.5GB of profiles.  Is it supposed to use that much?
>
> By default, llvm-profdata uses hardware_concurrency() to determine the 
> number of threads to use to merge profiles. You can change the default 
> by passing -j/--num-threads to llvm-profdata. I'm open to changing the 
> 'prep' script to use -j4 or something like that.
>
Oh, so it's using a couple gigabytes per thread multiplied by 24 cores?  
Okay, now I'm not so worried. :)
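
The back-of-the-envelope arithmetic here can be written out as a short sketch. It assumes, without confirmation from this thread, that peak memory scales linearly with the number of merge threads; the 50GB and 24-core figures are the ones quoted above, and the function names are hypothetical.

```python
def per_thread_gb(observed_peak_gb, threads):
    """Average memory per merge thread, assuming an even split."""
    return observed_peak_gb / threads


def estimated_peak_gb(per_thread, threads):
    """Estimated peak memory for a given thread count (linear assumption)."""
    return per_thread * threads


# Figures from the thread: about 50GB peak on a 24-core machine.
observed = per_thread_gb(50, 24)
print(f"per thread: {observed:.2f} GB")

# What the same workload might need with -j4, under the linear assumption.
print(f"with 4 threads: {estimated_peak_gb(observed, 4):.1f} GB")
```

Under that assumption the per-thread figure comes out at roughly 2GB, which matches the "couple gigabytes per thread" reading.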

-Eli

-- 
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux
Foundation Collaborative Project
