thr3ads.net - llvm dev - [llvm-dev] My experience using -DLLVM_BUILD_INSTRUMENTED

Friedman, Eli via llvm-dev

2017-Jun-19 23:32 UTC

[llvm-dev] My experience using -DLLVM_BUILD_INSTRUMENTED_COVERAGE to generate coverage

On 6/18/2017 3:51 PM, Vedant Kumar wrote:
>> My experience:
>>
>> 1. You have to specify -DLLVM_USE_LINKER=gold (or maybe lld works; I
>> didn't try).  If you link with binutils ld, the program will generate
>> broken profile information.  Apparently, the linked binary is missing
>> the __llvm_prf_names section.  This took me half a day to figure out.
>> This issue isn't documented anywhere, and the only error message I
>> got was "Assertion `!Key.empty()' failed." from llvm-cov.
>
> I expect llvm-cov to print out "Failed to load coverage: <reason>" in
> this situation. There was some work done to tighten up error reporting
> in ProfileData and its clients in r270020. If your host toolchain does
> have these changes, please file a bug, and I'll have it fixed.

Host toolchain is trunk clang... but using system binutils (which is
2.24 on my Ubuntu 14.04 system... and apparently that's too old per
David Li's response).  Anyway, filed
https://bugs.llvm.org/show_bug.cgi?id=33517 .
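
[Editor's sketch] The workaround in point 1 can be written down as a small Python helper. This is only an illustration under stated assumptions: the function names `coverage_cmake_args` and `has_section` are hypothetical, and `has_section` assumes you pass it the text of a section listing such as the output of `readelf -S -W <binary>`. The two CMake options and the section name are the ones mentioned in the thread.

```python
def coverage_cmake_args(source_dir, linker="gold"):
    """Build a CMake command line for a coverage-instrumented LLVM build.

    Hypothetical helper: only the two -D options come from the thread.
    """
    return [
        "cmake",
        "-DLLVM_BUILD_INSTRUMENTED_COVERAGE=On",
        # The thread reports that an old binutils ld (2.24) produced a
        # binary without __llvm_prf_names; gold was reported to work.
        f"-DLLVM_USE_LINKER={linker}",
        source_dir,
    ]


def has_section(section_listing, name="__llvm_prf_names"):
    """Return True if `name` appears as a whole word in a section listing.

    Assumes `section_listing` is text like the output of `readelf -S -W`.
    """
    return any(name in line.split() for line in section_listing.splitlines())


if __name__ == "__main__":
    print(" ".join(coverage_cmake_args("../llvm")))
```

Running `has_section` on the listing of a freshly linked tool is one way to notice the missing section before llvm-cov fails with an assertion.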
>
>> 2. The generated binaries are big and slow.  Comparing to a build 
>> without coverage, llc becomes 8x larger overall (text section becomes 
>> roughly 2x larger).  And check-llvm-codegen-arm goes from 3 seconds 
>> to 250 seconds.
>
> The binary size increase comes from coverage mapping data, counter 
> increment instrumentation, and profiling metadata.
>
> The coverage mapping section is highly compressible, but exploiting
> the compressibility has proven to be tricky. I filed: llvm.org/PR33499.

If I'm cross-compiling for a target where the space matters, can I get rid
of the data for the copy on the device using "strip -R __llvm_covmap" or
something like that, then use llvm-cov on the original?

> Coverage makes use of frontend-based instrumentation, which is much
> less efficient than the IR-based kind. If we can find a way to map
> counters inserted by IR PGO to AST nodes, we could improve the
> situation. I filed: llvm.org/PR33500.

This would be nice... but I assume it's hard. :)

> We can reduce testing time by *not* instrumenting basic tools like
> count, not, FileCheck etc. I filed: llvm.org/PR33501.
>
>> 3. The generated profile information takes up a lot of space: llc 
>> generates a 90MB profraw file.
>
> I don't have any ideas about how to fix this. You can decrease the 
> space overhead for raw profiles by altering 
> LLVM_PROFILE_MERGE_POOL_SIZE from 4 to a lower value.
Disk space is cheap, but the I/O takes a long time.  I guess it's
specifically bad for LLVM's "make check", maybe not so bad for other
cases.

>> 4. When prepare-code-coverage-artifact.py invokes llvm-profdata for
>> the profiles generated by "make check", it takes 50GB of memory to
>> process about 1.5GB of profiles.  Is it supposed to use that much?
>
> By default, llvm-profdata uses hardware_concurrency() to determine the 
> number of threads to use to merge profiles. You can change the default 
> by passing -j/--num-threads to llvm-profdata. I'm open to changing the 
> 'prep' script to use -j4 or something like that.
>
Oh, so it's using a couple gigabytes per thread multiplied by 24 cores?  
Okay, now I'm not so worried. :)
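
[Editor's sketch] The arithmetic behind that reading, plus a merge invocation with a capped thread count, might look like the following. It assumes memory use scales linearly with the number of merge threads, which is Eli's interpretation and not a measured fact. The helper names are hypothetical; `--num-threads` is the long form of the `-j` option named above.

```python
def per_thread_memory_gb(total_gb, threads):
    """Average memory per merge thread, assuming an even split."""
    return total_gb / threads


def estimated_peak_gb(per_thread_gb, threads):
    """Rough peak memory if usage scales linearly with threads (assumption)."""
    return per_thread_gb * threads


def merge_command(profraw_files, output, threads=4):
    """Build an llvm-profdata merge command with an explicit thread count."""
    return [
        "llvm-profdata",
        "merge",
        f"--num-threads={threads}",
        "-o",
        output,
        *profraw_files,
    ]


if __name__ == "__main__":
    # Figures reported in the thread: about 50GB across 24 cores.
    per_thread = per_thread_memory_gb(50, 24)
    print(f"per thread: {per_thread:.2f} GB")
    print(f"estimate at 4 threads: {estimated_peak_gb(per_thread, 4):.2f} GB")
    print(" ".join(merge_command(["llc.profraw"], "merged.profdata")))
```

Under the linear-scaling assumption, capping the merge at four threads would bring the peak from roughly 50GB down to under 10GB.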

-Eli

-- 
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux
Foundation Collaborative Project


Vedant Kumar via llvm-dev

2017-Jun-20 02:29 UTC


> On Jun 19, 2017, at 4:32 PM, Friedman, Eli <efriedma at codeaurora.org> wrote:
> 
> On 6/18/2017 3:51 PM, Vedant Kumar wrote:
>>> My experience:
>>> 
>>> 1. You have to specify -DLLVM_USE_LINKER=gold (or maybe lld works; I
>>> didn't try).  If you link with binutils ld, the program will generate
>>> broken profile information.  Apparently, the linked binary is missing
>>> the __llvm_prf_names section.  This took me half a day to figure out.
>>> This issue isn't documented anywhere, and the only error message I got
>>> was "Assertion `!Key.empty()' failed." from llvm-cov.
>>
>> I expect llvm-cov to print out "Failed to load coverage: <reason>" in
>> this situation. There was some work done to tighten up error reporting
>> in ProfileData and its clients in r270020. If your host toolchain does
>> have these changes, please file a bug, and I'll have it fixed.
>
> Host toolchain is trunk clang... but using system binutils (which is 2.24
> on my Ubuntu 14.04 system... and apparently that's too old per David
> Li's response).  Anyway, filed
> https://bugs.llvm.org/show_bug.cgi?id=33517 .

I've updated the clang docs re: 'Source based code coverage' to reflect
this issue. I've also tightened up our error reporting a bit so we fail
earlier with something better than an assertion message (r305765, r305767).

>>> 2. The generated binaries are big and slow.  Comparing to a build
>>> without coverage, llc becomes 8x larger overall (text section becomes
>>> roughly 2x larger).  And check-llvm-codegen-arm goes from 3 seconds to
>>> 250 seconds.
>>
>> The binary size increase comes from coverage mapping data, counter
>> increment instrumentation, and profiling metadata.
>>
>> The coverage mapping section is highly compressible, but exploiting the
>> compressibility has proven to be tricky. I filed: llvm.org/PR33499.
>
> If I'm cross-compiling for a target where the space matters, can I get rid
> of the data for the copy on the device using "strip -R __llvm_covmap"
> or something like that, then use llvm-cov on the original?

I haven't tried this but I expect it to work. Instrumented programs
don't reference the __llvm_covmap section.
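
[Editor's sketch] A minimal version of that workflow, untested like the suggestion itself: keep the original binary for llvm-cov and strip only the copy that goes to the device. The helper names are hypothetical, and a cross-compiling setup would likely need a target-specific strip tool, which is why `strip_tool` is a parameter (an assumption, not something stated in the thread).

```python
import shutil
import subprocess


def strip_covmap_command(binary, section="__llvm_covmap", strip_tool="strip"):
    """Build the strip command that removes the coverage mapping section."""
    return [strip_tool, "-R", section, binary]


def make_device_copy(original, device_copy, strip_tool="strip"):
    """Copy `original` and strip the coverage mapping from the copy only.

    The untouched original is what llvm-cov would later be pointed at.
    """
    shutil.copy2(original, device_copy)
    subprocess.run(
        strip_covmap_command(device_copy, strip_tool=strip_tool), check=True
    )
```

The counters and profile metadata stay in the stripped copy, so it should still write a raw profile when run; only the mapping data needed at report time is removed.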
>> Coverage makes use of frontend-based instrumentation, which is much
>> less efficient than the IR-based kind. If we can find a way to map
>> counters inserted by IR PGO to AST nodes, we could improve the
>> situation. I filed: llvm.org/PR33500.
>
> This would be nice... but I assume it's hard. :)

It seems like it is. At a high level, you'd need some way to associate the
counters placed by IR PGO instrumentation with the counters that clang
expects to see while walking an AST. I don't have a concrete design for
this in mind.

>> We can reduce testing time by *not* instrumenting basic tools like
>> count, not, FileCheck etc. I filed: llvm.org/PR33501.
>>
>>> 3. The generated profile information takes up a lot of space: llc
>>> generates a 90MB profraw file.
>>
>> I don't have any ideas about how to fix this. You can decrease the
>> space overhead for raw profiles by altering LLVM_PROFILE_MERGE_POOL_SIZE
>> from 4 to a lower value.
>
> Disk space is cheap, but the I/O takes a long time.  I guess it's
> specifically bad for LLVM's "make check", maybe not so bad for
> other cases.

You can speed up "make check" a bit by using non-instrumented versions
of count, not, FileCheck, etc.

vedant
>>> 4. When prepare-code-coverage-artifact.py invokes llvm-profdata for
>>> the profiles generated by "make check", it takes 50GB of memory to
>>> process about 1.5GB of profiles.  Is it supposed to use that much?
>>
>> By default, llvm-profdata uses hardware_concurrency() to determine the
>> number of threads to use to merge profiles. You can change the default
>> by passing -j/--num-threads to llvm-profdata. I'm open to changing the
>> 'prep' script to use -j4 or something like that.
>>
>
> Oh, so it's using a couple gigabytes per thread multiplied by 24 cores?
> Okay, now I'm not so worried. :)
> -Eli
> --
> Employee of Qualcomm Innovation Center, Inc.
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux
> Foundation Collaborative Project

Vedant Kumar via llvm-dev

2017-Jun-20 02:36 UTC


> On Jun 19, 2017, at 7:29 PM, Vedant Kumar <vsk at apple.com> wrote:
> 
> 
>> On Jun 19, 2017, at 4:32 PM, Friedman, Eli <efriedma at codeaurora.org> wrote:
>> 
>> On 6/18/2017 3:51 PM, Vedant Kumar wrote:
>>>> My experience:
>>>> 
>>>> 1. You have to specify -DLLVM_USE_LINKER=gold (or maybe lld works; I
>>>> didn't try).  If you link with binutils ld, the program will generate
>>>> broken profile information.  Apparently, the linked binary is missing
>>>> the __llvm_prf_names section.  This took me half a day to figure out.
>>>> This issue isn't documented anywhere, and the only error message I got
>>>> was "Assertion `!Key.empty()' failed." from llvm-cov.
>>>
>>> I expect llvm-cov to print out "Failed to load coverage: <reason>" in
>>> this situation. There was some work done to tighten up error reporting
>>> in ProfileData and its clients in r270020. If your host toolchain does
>>> have these changes, please file a bug, and I'll have it fixed.
>>
>> Host toolchain is trunk clang... but using system binutils (which is
>> 2.24 on my Ubuntu 14.04 system... and apparently that's too old per
>> David Li's response).  Anyway, filed
>> https://bugs.llvm.org/show_bug.cgi?id=33517 .
>
> I've updated the clang docs re: 'Source based code coverage' to reflect
> this issue. I've also tightened up our error reporting a bit so we fail
> earlier with something better than an assertion message (r305765,
> r305767).
> 
>>>> 2. The generated binaries are big and slow.  Comparing to a build
>>>> without coverage, llc becomes 8x larger overall (text section becomes
>>>> roughly 2x larger).  And check-llvm-codegen-arm goes from 3 seconds
>>>> to 250 seconds.
>>>
>>> The binary size increase comes from coverage mapping data, counter
>>> increment instrumentation, and profiling metadata.
>>>
>>> The coverage mapping section is highly compressible, but exploiting
>>> the compressibility has proven to be tricky. I filed: llvm.org/PR33499.
>>
>> If I'm cross-compiling for a target where the space matters, can I get
>> rid of the data for the copy on the device using
>> "strip -R __llvm_covmap" or something like that, then use llvm-cov on
>> the original?
>
> I haven't tried this but I expect it to work. Instrumented programs
> don't reference the __llvm_covmap section.
> 
>>> Coverage makes use of frontend-based instrumentation, which is much
>>> less efficient than the IR-based kind. If we can find a way to map
>>> counters inserted by IR PGO to AST nodes, we could improve the
>>> situation. I filed: llvm.org/PR33500.
>>
>> This would be nice... but I assume it's hard. :)
>
> It seems like it is. At a high level, you'd need some way to associate
> the counters placed by IR PGO instrumentation with the counters that
> clang expects to see while walking an AST. I don't have a concrete
> design for this in mind.
>
>>> We can reduce testing time by *not* instrumenting basic tools like
>>> count, not, FileCheck etc. I filed: llvm.org/PR33501.
>>>
>>>> 3. The generated profile information takes up a lot of space: llc
>>>> generates a 90MB profraw file.
>>>
>>> I don't have any ideas about how to fix this. You can decrease the
>>> space overhead for raw profiles by altering
>>> LLVM_PROFILE_MERGE_POOL_SIZE from 4 to a lower value.
>>
>> Disk space is cheap, but the I/O takes a long time.  I guess it's
>> specifically bad for LLVM's "make check", maybe not so bad for other
>> cases.
>
> You can speed up "make check" a bit by using non-instrumented versions
> of count, not, FileCheck, etc.

Ah, sorry for mentioning this twice.

On another note, I'm looking into the "N mismatched functions" warnings
issue, and suspect that it happens when there are conflicting definitions
of the same function in different binaries. The issue doesn't seem to
occur when using profiles from just one binary to generate a report for
that binary. I'll dig into this a bit more and update PR33502.

vedant
> 
> vedant
> 
>>>> 4. When prepare-code-coverage-artifact.py invokes llvm-profdata for
>>>> the profiles generated by "make check", it takes 50GB of memory to
>>>> process about 1.5GB of profiles.  Is it supposed to use that much?
>>>
>>> By default, llvm-profdata uses hardware_concurrency() to determine
>>> the number of threads to use to merge profiles. You can change the
>>> default by passing -j/--num-threads to llvm-profdata. I'm open to
>>> changing the 'prep' script to use -j4 or something like that.
>>>
>>
>> Oh, so it's using a couple gigabytes per thread multiplied by 24
>> cores?  Okay, now I'm not so worried. :)
>> -Eli
>> --
>> Employee of Qualcomm Innovation Center, Inc.
>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a
>> Linux Foundation Collaborative Project
> 

Friedman, Eli via llvm-dev

2017-Jun-27 02:24 UTC


On 6/19/2017 7:29 PM, Vedant Kumar wrote:
>
>>> We can reduce testing time by *not* instrumenting basic tools like
>>> count, not, FileCheck etc. I filed: llvm.org/PR33501.
>>>
>>>> 3. The generated profile information takes up a lot of space: llc
>>>> generates a 90MB profraw file.
>>>
>>> I don't have any ideas about how to fix this. You can decrease the
>>> space overhead for raw profiles by altering
>>> LLVM_PROFILE_MERGE_POOL_SIZE from 4 to a lower value.
>>
>> Disk space is cheap, but the I/O takes a long time.  I guess it's
>> specifically bad for LLVM's "make check", maybe not so bad for other
>> cases.
>
> You can speed up "make check" a bit by using non-instrumented versions
> of count, not, FileCheck, etc.

I tried looking into this a bit more.  It looks like the profile data
file generated by llc contains approximately 5MB of counters
(__llvm_prf_cnts), 10MB of "data" (__llvm_prf_data), and 70MB of
__llvm_prf_names.  __llvm_prf_data and __llvm_prf_names contain data which
can be read from the original binary, as far as I can tell. The 80MB of
data wouldn't be a big deal if it were just sitting on disk... but we
also erase the whole file and rewrite it from scratch after we merge
profile counters.

Can we do better here?
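
[Editor's sketch] To make the proportions concrete, the approximate section sizes quoted above can be tallied in a few lines of Python. The sizes are the rough figures from this message; the function name and dictionary are hypothetical, and treating __llvm_prf_data and __llvm_prf_names as recoverable from the binary follows the "as far as I can tell" caveat above.

```python
# Approximate section sizes in MB, as reported for llc's raw profile.
SECTION_MB = {
    "__llvm_prf_cnts": 5,
    "__llvm_prf_data": 10,
    "__llvm_prf_names": 70,
}


def redundant_fraction(sizes, redundant=("__llvm_prf_data", "__llvm_prf_names")):
    """Share of the listed bytes that could, per the thread, be re-read
    from the original binary instead of being rewritten on every merge."""
    total = sum(sizes.values())
    return sum(sizes[name] for name in redundant) / total


if __name__ == "__main__":
    print(f"counters only: {SECTION_MB['__llvm_prf_cnts']} MB")
    print(f"redundant share: {redundant_fraction(SECTION_MB):.1%}")
```

By these figures only the 5MB of counters changes from run to run, while the remaining 80MB is rewritten each time the file is merged.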

-Eli

-- 
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux
Foundation Collaborative Project
