Sean Silva via llvm-dev
2017-Jul-01 05:04 UTC
[llvm-dev] [RFC] Placing profile name data, and coverage data, outside of object files
On Fri, Jun 30, 2017 at 5:54 PM, via llvm-dev <llvm-dev at lists.llvm.org> wrote:> Problem > ------- > > Instrumentation for PGO and frontend-based coverage places a large amount > of > data in object files, even though the majority of this data is not needed > at > run-time. All the data is needlessly duplicated while generating archives, > and > again while linking. PGO name data is written out into raw profiles by > instrumented programs, slowing down the training and code coverage > workflows. > > Here are some numbers from a coverage + RA build of ToT clang: > > * Size of the build directory: 4.3 GB > > * Wall time needed to run "clang -help" with an SSD: 0.5 seconds > > * Size of the clang binary: 725.24 MB > > * Space wasted on duplicate name/coverage data (*.o + *.a): 923.49 MB > - Size contributed by __llvm_covmap sections: 1.02 GB > \_ Just within clang: 340.48 MB >We live with this duplication for debug info. In some sense, if the overhead is small compared to debug info, should we even bother (i.e., we assume that users accommodate debug builds, so that is a reasonable bound on the tolerable build directory size). (I don't know the numbers; this seems pretty large so maybe it is significant compared to debug info; just saying that looking at absolute numbers is misleading here; numbers compared to debug info are a closer measure to the user's perceptions) In fact, one overall architectural observation I have is that the most complicated part of all this is simply establishing the workflow to plumb together data emitted per-TU to a tool that needs that information to do some post-processing step on the results of running the binary. That sounds a lot like the role of debug info. In fact, having a debugger open a core file is precisely equivalent to what llvm-profdata needs to do in this regard AFAICT. So it would be best if possible to piggyback on all the effort that has gone into plumbing that data to make debug info work. For example, I know that on Darwin there's a fair amount of system-level integration to make split dwarf "just work" while keeping debug info out of final binaries. If there is a not-too-hacky way to piggyback on debug info, that's likely to be a really slick solution. For example, debug info could in principle (if it doesn't already) contain information about the name of each counter in the counter array, so in principle it would be a complete enough description to identify each counter. I'm not very familiar with DWARF, but I'm imagining something like reserving an LLVM vendor-specific DWARF opcode/attribute/whatever and then stick a blob of data in there. Presumably we have code somewhere in LLDB that is "here's a binary, find debug info for it", and in principle we could factor out that code and lift it into an LLVM library (libFindDebugInfo) that llvm-profdata could use.> - Size contributed by __llvm_prf_names sections: 327.46 MB > \_ Just within clang: 106.76 MB > > => Space wasted within the clang binary: 447.24 MB > > Running an instrumented clang binary triggers a 143MB raw profile write > which > is slow even with an SSD. This problem is particularly bad for > frontend-based > coverage because it generates a lot of extra name data: however, the > situation > can also be improved for PGO instrumentation. > > Proposal > -------- > > Place PGO name data and coverage data outside of object files. This would > eliminate data duplication in *.a/*.o files, shrink binaries, shrink raw > profiles, and speed up instrumented programs. 
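The payloads whose duplication is measured above are ordinary sections (__llvm_covmap and __llvm_prf_names) in every object file and archive member, so the per-file contribution can be checked directly. Below is a rough measurement sketch using LLVM's libObject as it looked around this time (SectionRef::getName taking an out-parameter); the helper name profileMetadataBytes is invented for illustration and is not part of the proposal.

#include "llvm/Object/ObjectFile.h"
#include "llvm/Support/Error.h"

using namespace llvm;
using namespace llvm::object;

// Sum the bytes that __llvm_covmap and __llvm_prf_names contribute to one
// object file (archive members would be iterated separately).
static uint64_t profileMetadataBytes(StringRef Path) {
  auto BinOrErr = ObjectFile::createObjectFile(Path);
  if (!BinOrErr) {
    consumeError(BinOrErr.takeError());
    return 0;
  }
  ObjectFile &Obj = *BinOrErr->getBinary();
  uint64_t Total = 0;
  for (const SectionRef &Sec : Obj.sections()) {
    StringRef Name;
    if (Sec.getName(Name)) // error_code-returning API of this era
      continue;
    // Section names differ per object format (leading "__" or "."), so
    // match on the suffix.
    if (Name.endswith("llvm_covmap") || Name.endswith("llvm_prf_names"))
      Total += Sec.getSize();
  }
  return Total;
}

Summing this over the *.o files, *.a members, and the linked clang binary yields the kind of breakdown quoted above.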
> > In more detail: > > 1. The frontends get a new `-fprofile-metadata-dir=<path>` option. This > lets > users specify where llvm will store profile metadata. If the metadata > starts to > take up too much space, there's just one directory to clean. > > 2. The frontends continue emitting PGO name data and coverage data in the > same > llvm::Module. So does LLVM's IR-based PGO implementation. No change here. > > 3. If the InstrProf lowering pass sees that a metadata directory is > available, > it constructs a new module, copies the name/coverage data into it, hashes > the > module, and attempts to write that module to: > > <metadata-dir>/<module-hash>.bc (the metadata module) > > If this write operation fails, it scraps the new module: it keeps all the > metadata in the original module, and there are no changes from the current > process. I.e with this proposal we preserve backwards compatibility. >Based at my experience with Clang's implicit modules, I'm *extremely* wary of anything that might cause the compiler to emit a file that the build system cannot guess the name of. In fact, having the compiler emit a file that is not explicitly listed on the command line is basically just as bad in practice (in terms of feasibility of informing the build system about it). As a simple example, ninja simply cannot represent a dependency of this type, so if you delete a <metadata-dir>/<module-hash>.bc it won't know things need to be rebuilt (and it won't know how to clean it, etc.). So I would really strongly recommend against doing this. Again, these problems of system integration (in particular build system integration) are nasty, and if you can bypass this and piggyback on debug info then everything will "just work" because the folks that care about making sure that debugging "just works" already did the work for you. It might be more work in the short term to do the debug info approach (if it is feasible at all), but I can tell you based on the experience with implicit modules (and I'm sure you have some experience of your own) that there's just going to be a neverending tail of hitches and ways that things don't work (or work poorly) due to not having the build system / overall system integration right, so it will be worth it in the long run. -- Sean Silva> > 4. Once the metadata module is written, the name/coverage data are entirely > stripped out of the original module. They are replaced by a path to the > metadata module: > > @__llvm_profiling_metadata = "<metadata-dir>/<module-hash>.bc", > section "__llvm_prf_link" > > This allows incremental builds to work properly, which is an important use > case > for code coverage users. When an object is rebuilt, it gets a fresh link > to a > fresh profiling metadata file. Although stale files can accumulate in the > metadata directory, the stale files cannot ever be used. > > In an IDE like Xcode, since there's just one target binary per scheme, it's > possible to clean the metadata directory by removing the modules which > aren't > referenced by the target binary. > > 5. The raw profile format is updated so that links to metadata files are > written > out in each profile. This makes it possible for all existing llvm-profdata > and > llvm-cov commands to work, seamlessly. > > The indexed profile format will *not* be updated: i.e, it will contain a > full > symbol table, and no links. This simplifies the coverage mapping reader, > because > a full symbol table is guaranteed to exist before any function records are > parsed. 
It also reduces the amount of coding, and makes it easier to > preserve > backwards compatibility :). > > 6. The raw profile reader will learn how to read links, open up the > metadata > modules it finds links to, and collect name data from those modules. > > 7. The coverage reader will learn how to read the __llvm_prf_link section, > open > up metadata modules, and lazily read coverage mapping data. > > Alternate Solutions > ------------------- > > 1. Instead of copying name data into an external metadata module, just > copy the > coverage mapping data. > > I've actually prototyped this. This might be a good way to split up > patches, > although I don't see why we wouldn't want to tackle the name data problem > eventually. > > 2. Instead of emitting links to external metadata modules, modify llvm-cov > and > llvm-profdata so that they require a path to the metadata directory. > > The issue with this is that it's way too easy to read stale metadata. It's > also > less user-friendly, which hurts adoption. > > 3. Use something other than llvm bitcode for the metadata module format. > > Since we're mostly writing large binary blobs (compressed name data or > pre-encoded source range mapping info), using bitcode shouldn't be too > slow, and > we're not likely to get better compression with a different format. > > Bitcode is also convenient, and is nice for backwards compatibility. > > ------------------------------------------------------------ > -------------------- > > If you've made it this far, thanks for taking a look! I'd appreciate any > feedback. > > vedant > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170630/941a9b5b/attachment-0001.html>
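Steps 3 and 4 of the proposal are the core of the writer-side change. As a minimal sketch of what the lowering could do (not the actual patch), the fragment below serializes a metadata-only module, names it after a content hash, and records the path in a __llvm_prf_link global in the original module. The function name externalizeProfileMetadata is invented for illustration, WriteBitcodeToFile took a Module* in the LLVM releases of this period, and the cloning of the name/coverage globals into the metadata module is assumed to have already happened.

#include "llvm/ADT/SmallString.h"
#include "llvm/Bitcode/BitcodeWriter.h"
#include "llvm/IR/Constants.h"
#include "llvm/IR/GlobalVariable.h"
#include "llvm/IR/Module.h"
#include "llvm/Support/FileSystem.h"
#include "llvm/Support/MD5.h"
#include "llvm/Support/Path.h"
#include "llvm/Support/raw_ostream.h"

using namespace llvm;

// Sketch of steps 3-4: write <metadata-dir>/<module-hash>.bc and leave a
// link behind.  Returns false if the write fails, in which case the caller
// keeps all metadata in the original module (the backwards-compatible path).
static bool externalizeProfileMetadata(Module &M, Module &MetadataM,
                                       StringRef MetadataDir) {
  // Serialize the metadata-only module to memory so it can be hashed.
  SmallString<0> Bitcode;
  raw_svector_ostream BCStream(Bitcode);
  WriteBitcodeToFile(&MetadataM, BCStream);

  // Compute <module-hash> from the serialized bytes.
  MD5 Hasher;
  Hasher.update(Bitcode.str());
  MD5::MD5Result Result;
  Hasher.final(Result);
  SmallString<32> Hash;
  MD5::stringifyResult(Result, Hash);

  // <metadata-dir>/<module-hash>.bc
  SmallString<128> MetadataPath(MetadataDir);
  sys::path::append(MetadataPath, Hash);
  MetadataPath += ".bc";

  std::error_code EC;
  raw_fd_ostream Out(MetadataPath, EC, sys::fs::F_None);
  if (EC)
    return false;
  Out.write(Bitcode.data(), Bitcode.size());

  // Step 4: the stripped data is replaced by a path in __llvm_prf_link.
  Constant *PathInit = ConstantDataArray::getString(M.getContext(),
                                                    MetadataPath.str(),
                                                    /*AddNull=*/true);
  auto *Link = new GlobalVariable(M, PathInit->getType(), /*isConstant=*/true,
                                  GlobalValue::PrivateLinkage, PathInit,
                                  "__llvm_profiling_metadata");
  Link->setSection("__llvm_prf_link");
  return true;
}

On failure nothing in the original module changes, which is the fallback the proposal describes.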
Friedman, Eli via llvm-dev
2017-Jul-03 18:44 UTC
[llvm-dev] [RFC] Placing profile name data, and coverage data, outside of object files
On 7/3/2017 11:19 AM, Mehdi AMINI via llvm-dev wrote:> > > 2017-06-30 22:04 GMT-07:00 Sean Silva via llvm-dev > <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>>: > > > > On Fri, Jun 30, 2017 at 5:54 PM, via llvm-dev > <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: > > Problem > ------- > > Instrumentation for PGO and frontend-based coverage places a > large amount of > data in object files, even though the majority of this data is > not needed at > run-time. All the data is needlessly duplicated while > generating archives, and > again while linking. PGO name data is written out into raw > profiles by > instrumented programs, slowing down the training and code > coverage workflows. > > Here are some numbers from a coverage + RA build of ToT clang: > > * Size of the build directory: 4.3 GB > > * Wall time needed to run "clang -help" with an SSD: 0.5 seconds > > * Size of the clang binary: 725.24 MB > > * Space wasted on duplicate name/coverage data (*.o + *.a): > 923.49 MB > - Size contributed by __llvm_covmap sections: 1.02 GB > \_ Just within clang: 340.48 MB > > > We live with this duplication for debug info. In some sense, if > the overhead is small compared to debug info, should we even > bother (i.e., we assume that users accommodate debug builds, so > that is a reasonable bound on the tolerable build directory size). > (I don't know the numbers; this seems pretty large so maybe it is > significant compared to debug info; just saying that looking at > absolute numbers is misleading here; numbers compared to debug > info are a closer measure to the user's perceptions) > > > From a build directory point of view, I agree. However when deploying > on embedded device with "limited" space/memory you can strip the debug > info and keep them locally because they're not needed on the device > for running (or remote-debugging), is it the case with the profile infos?__llvm_prf_names and __llvm_prf_data can't be stripped at the moment: in-process profile writing code copies them into the profile file. Changing that is part of this proposal (but it could be fixed with a narrower change). -Eli -- Employee of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170703/02ca0bd4/attachment.html>
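To make the point above concrete: the reason these sections cannot simply be stripped today is that the profile runtime walks them at write time and copies the name payload into every raw profile. The sketch below assumes the begin/end symbols declared in compiler-rt's InstrProfiling.h (__llvm_profile_begin_names / __llvm_profile_end_names) and merely reports how many bytes each raw profile write will carry; it must be built with -fprofile-instr-generate so the runtime is linked in.

#include <cstdio>

// Declarations mirroring compiler-rt's InstrProfiling.h (assumed to match
// the runtime shipped with the toolchain in use).
extern "C" const char *__llvm_profile_begin_names(void);
extern "C" const char *__llvm_profile_end_names(void);

int main() {
  const char *Begin = __llvm_profile_begin_names();
  const char *End = __llvm_profile_end_names();
  // Every raw profile this process writes will carry this many bytes of
  // __llvm_prf_names data, independent of how much of the program ran.
  std::printf("name payload per raw profile: %zu bytes\n",
              static_cast<size_t>(End - Begin));
  return 0;
}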
Sean Silva via llvm-dev
2017-Jul-04 20:03 UTC
[llvm-dev] [RFC] Placing profile name data, and coverage data, outside of object files
On Mon, Jul 3, 2017 at 11:29 PM, Vedant Kumar <vsk at apple.com> wrote:> > On Jun 30, 2017, at 10:04 PM, Sean Silva <chisophugis at gmail.com> wrote: > > > > On Fri, Jun 30, 2017 at 5:54 PM, via llvm-dev <llvm-dev at lists.llvm.org> > wrote: > >> Problem >> ------- >> >> Instrumentation for PGO and frontend-based coverage places a large amount >> of >> data in object files, even though the majority of this data is not needed >> at >> run-time. All the data is needlessly duplicated while generating >> archives, and >> again while linking. PGO name data is written out into raw profiles by >> instrumented programs, slowing down the training and code coverage >> workflows. >> >> Here are some numbers from a coverage + RA build of ToT clang: >> >> * Size of the build directory: 4.3 GB >> >> * Wall time needed to run "clang -help" with an SSD: 0.5 seconds >> >> * Size of the clang binary: 725.24 MB >> >> * Space wasted on duplicate name/coverage data (*.o + *.a): 923.49 MB >> - Size contributed by __llvm_covmap sections: 1.02 GB >> \_ Just within clang: 340.48 MB >> > > We live with this duplication for debug info. In some sense, if the > overhead is small compared to debug info, should we even bother (i.e., we > assume that users accommodate debug builds, so that is a reasonable bound > on the tolerable build directory size). (I don't know the numbers; this > seems pretty large so maybe it is significant compared to debug info; just > saying that looking at absolute numbers is misleading here; numbers > compared to debug info are a closer measure to the user's perceptions) > > > The size of a RelWithDebInfo build directory for the same checkout is 9 GB > (I'm still just building clang, this time without instrumentation). >So it sounds like the 4.3GB build directory you quoted in the OP is substantially less, so your comment below doesn't make sense. Or was 4.3GB just the build directory needed for building nothing but the clang binary and its dependencies? Can you get an apples to apples number (it sounds like it must have been much more due if the coverage bots had to be turned off, but it would be useful to get some breakdown and an apples-to-apples number)> We (more or less) get away with this because the debug info isn't copied > into the final binary [1]. We're not getting away with this with coverage. > E.g we usually store bot artifacts for a while, but we had to shut this > functionality off almost immediately for our coverage bots because the > uploads were horrific. > > In fact, one overall architectural observation I have is that the most > complicated part of all this is simply establishing the workflow to plumb > together data emitted per-TU to a tool that needs that information to do > some post-processing step on the results of running the binary. That sounds > a lot like the role of debug info. In fact, having a debugger open a core > file is precisely equivalent to what llvm-profdata needs to do in this > regard AFAICT. > > So it would be best if possible to piggyback on all the effort that has > gone into plumbing that data to make debug info work. For example, I know > that on Darwin there's a fair amount of system-level integration to make > split dwarf "just work" while keeping debug info out of final binaries. > > If there is a not-too-hacky way to piggyback on debug info, that's likely > to be a really slick solution. 
For example, debug info could in principle > (if it doesn't already) contain information about the name of each counter > in the counter array, so in principle it would be a complete enough > description to identify each counter. > > > We don't emit debug info for this currently. Is there a reason to? >Probably not. My suspicion is that the most feasible solution would be one where we just store a blob of opaque coverage data in the debug info section. In theory (but probably not in practice) we could lower the coverage mapping data to some form of debug info (that's what it really is, after all; it's basically a very precise sort of debug info that is allowed to impede optimizations to remain precise), but the effort needed to harmonize the needs of actual debug info with "debug info lowered from coverage data" would probably be too messy, if it was possible at all. -- Sean Silva> > I'm not very familiar with DWARF, but I'm imagining something like > reserving an LLVM vendor-specific DWARF opcode/attribute/whatever and then > stick a blob of data in there. Presumably we have code somewhere in LLDB > that is "here's a binary, find debug info for it", and in principle we > could factor out that code and lift it into an LLVM library > (libFindDebugInfo) that llvm-profdata could use. > > > This could work for the coverage/name data. There are some really nice > pieces of Darwin integration (e.g search-with-Spotlight, findDsymForUUID). > I'll look into this. > > - Size contributed by __llvm_prf_names sections: 327.46 MB >> \_ Just within clang: 106.76 MB >> >> => Space wasted within the clang binary: 447.24 MB >> >> Running an instrumented clang binary triggers a 143MB raw profile write >> which >> is slow even with an SSD. This problem is particularly bad for >> frontend-based >> coverage because it generates a lot of extra name data: however, the >> situation >> can also be improved for PGO instrumentation. >> >> Proposal >> -------- >> >> Place PGO name data and coverage data outside of object files. This would >> eliminate data duplication in *.a/*.o files, shrink binaries, shrink raw >> profiles, and speed up instrumented programs. >> >> In more detail: >> >> 1. The frontends get a new `-fprofile-metadata-dir=<path>` option. This >> lets >> users specify where llvm will store profile metadata. If the metadata >> starts to >> take up too much space, there's just one directory to clean. >> >> 2. The frontends continue emitting PGO name data and coverage data in the >> same >> llvm::Module. So does LLVM's IR-based PGO implementation. No change here. >> >> 3. If the InstrProf lowering pass sees that a metadata directory is >> available, >> it constructs a new module, copies the name/coverage data into it, hashes >> the >> module, and attempts to write that module to: >> >> <metadata-dir>/<module-hash>.bc (the metadata module) >> >> If this write operation fails, it scraps the new module: it keeps all the >> metadata in the original module, and there are no changes from the current >> process. I.e with this proposal we preserve backwards compatibility. >> > > Based at my experience with Clang's implicit modules, I'm *extremely* wary > of anything that might cause the compiler to emit a file that the build > system cannot guess the name of. In fact, having the compiler emit a file > that is not explicitly listed on the command line is basically just as bad > in practice (in terms of feasibility of informing the build system about > it). 
> > As a simple example, ninja simply cannot represent a dependency of this > type, so if you delete a <metadata-dir>/<module-hash>.bc it won't know > things need to be rebuilt (and it won't know how to clean it, etc.). > > So I would really strongly recommend against doing this. > > > Again, these problems of system integration (in particular build system > integration) are nasty, and if you can bypass this and piggyback on debug > info then everything will "just work" because the folks that care about > making sure that debugging "just works" already did the work for you. > It might be more work in the short term to do the debug info approach (if > it is feasible at all), but I can tell you based on the experience with > implicit modules (and I'm sure you have some experience of your own) that > there's just going to be a neverending tail of hitches and ways that things > don't work (or work poorly) due to not having the build system / overall > system integration right, so it will be worth it in the long run. > > > Thanks, this makes a lot of sense. The build system should keep track of > where to externalize profile metadata (regardless of whether or not it > piggybacks on debug info). In addition to the advantages you've listed, > this would make testing easier. > > vedant > > [1] ld64: > 2561 if ( strcmp(sect->segname(), "__DWARF") == 0 ) { > > > 2562 // note that .o file has dwarf > > > 2563 _file->_debugInfoKind = ld::relocatable::File::kDebugInfoDwarf; > > > 2564 // save off iteresting dwarf sections > > > ... > > 2571 else if ( strcmp(sect->sectname(), "__debug_str") == 0 ) > > > 2572 _file->_dwarfDebugStringSect = sect; > > > 2573 // linker does not propagate dwarf sections to output file > > > 2574 continue; > > > >> >> 4. Once the metadata module is written, the name/coverage data are >> entirely >> stripped out of the original module. They are replaced by a path to the >> metadata module: >> >> @__llvm_profiling_metadata = "<metadata-dir>/<module-hash>.bc", >> section "__llvm_prf_link" >> >> This allows incremental builds to work properly, which is an important >> use case >> for code coverage users. When an object is rebuilt, it gets a fresh link >> to a >> fresh profiling metadata file. Although stale files can accumulate in the >> metadata directory, the stale files cannot ever be used. >> >> In an IDE like Xcode, since there's just one target binary per scheme, >> it's >> possible to clean the metadata directory by removing the modules which >> aren't >> referenced by the target binary. >> >> 5. The raw profile format is updated so that links to metadata files are >> written >> out in each profile. This makes it possible for all existing >> llvm-profdata and >> llvm-cov commands to work, seamlessly. >> >> The indexed profile format will *not* be updated: i.e, it will contain a >> full >> symbol table, and no links. This simplifies the coverage mapping reader, >> because >> a full symbol table is guaranteed to exist before any function records are >> parsed. It also reduces the amount of coding, and makes it easier to >> preserve >> backwards compatibility :). >> >> 6. The raw profile reader will learn how to read links, open up the >> metadata >> modules it finds links to, and collect name data from those modules. >> >> 7. The coverage reader will learn how to read the __llvm_prf_link >> section, open >> up metadata modules, and lazily read coverage mapping data. >> >> Alternate Solutions >> ------------------- >> >> 1. 
Instead of copying name data into an external metadata module, just >> copy the >> coverage mapping data. >> >> I've actually prototyped this. This might be a good way to split up >> patches, >> although I don't see why we wouldn't want to tackle the name data problem >> eventually. >> >> 2. Instead of emitting links to external metadata modules, modify >> llvm-cov and >> llvm-profdata so that they require a path to the metadata directory. >> >> The issue with this is that it's way too easy to read stale metadata. >> It's also >> less user-friendly, which hurts adoption. >> >> 3. Use something other than llvm bitcode for the metadata module format. >> >> Since we're mostly writing large binary blobs (compressed name data or >> pre-encoded source range mapping info), using bitcode shouldn't be too >> slow, and >> we're not likely to get better compression with a different format. >> >> Bitcode is also convenient, and is nice for backwards compatibility. >> >> ------------------------------------------------------------ >> -------------------- >> >> If you've made it this far, thanks for taking a look! I'd appreciate any >> feedback. >> >> vedant >> >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170704/2f10bc48/attachment-0001.html>
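One concrete (hypothetical) reading of the "blob of opaque coverage data that travels with the debug info" idea, given the ld64 behaviour quoted in the footnote above: if the coverage mapping global were placed in a section of the __DWARF segment, ld64 would keep it in the object files but not propagate it into the final binary, and tools could then locate it through the same dSYM/UUID machinery a debugger uses. The snippet below only illustrates that idea and is not what LLVM does; the global name __llvm_coverage_mapping is, to my knowledge, the one coverage lowering emits, while the __DWARF,__llvm_covmap segment/section pair is an assumption.

#include "llvm/IR/GlobalVariable.h"
#include "llvm/IR/Module.h"

using namespace llvm;

// Hypothetical: re-home the coverage mapping blob into the __DWARF segment
// so that (per the ld64 logic quoted above) the linker leaves it out of the
// output file, the same way DWARF stays behind in .o files on Darwin.
static void stashCovMapWithDebugInfo(Module &M) {
  if (GlobalVariable *CovMap = M.getNamedGlobal("__llvm_coverage_mapping"))
    CovMap->setSection("__DWARF,__llvm_covmap");
}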
David Blaikie via llvm-dev
2017-Jul-05 15:07 UTC
[llvm-dev] [RFC] Placing profile name data, and coverage data, outside of object files
Could someone summarize the % size costs in object and executables, release versus unoptimized and debug V no-debug builds? (maybe that's too much of a hassle, but thought it might provide some clarity about the tradeoffs, pain points, etc) Also, added dberris here, since if I recall correctly, the XRay work has some similar aspects - where certain mapping structures are kept in the binary and consulted when interpreting XRay traces. In that case it may also be useful to avoid putting those structures into the final binary in some cases for the same sort of size tradeoff reasons. & then even more worth looking at a generalized solution for these sort of things. - Dave On Wed, Jul 5, 2017 at 7:58 AM Vedant Kumar via llvm-dev < llvm-dev at lists.llvm.org> wrote:> On Jun 30, 2017, at 10:04 PM, Sean Silva <chisophugis at gmail.com> wrote: > > > > On Fri, Jun 30, 2017 at 5:54 PM, via llvm-dev <llvm-dev at lists.llvm.org> > wrote: > >> Problem >> ------- >> >> Instrumentation for PGO and frontend-based coverage places a large amount >> of >> data in object files, even though the majority of this data is not needed >> at >> run-time. All the data is needlessly duplicated while generating >> archives, and >> again while linking. PGO name data is written out into raw profiles by >> instrumented programs, slowing down the training and code coverage >> workflows. >> >> Here are some numbers from a coverage + RA build of ToT clang: >> >> * Size of the build directory: 4.3 GB >> >> * Wall time needed to run "clang -help" with an SSD: 0.5 seconds >> >> * Size of the clang binary: 725.24 MB >> >> * Space wasted on duplicate name/coverage data (*.o + *.a): 923.49 MB >> - Size contributed by __llvm_covmap sections: 1.02 GB >> \_ Just within clang: 340.48 MB >> > > We live with this duplication for debug info. In some sense, if the > overhead is small compared to debug info, should we even bother (i.e., we > assume that users accommodate debug builds, so that is a reasonable bound > on the tolerable build directory size). (I don't know the numbers; this > seems pretty large so maybe it is significant compared to debug info; just > saying that looking at absolute numbers is misleading here; numbers > compared to debug info are a closer measure to the user's perceptions) > > > The size of a RelWithDebInfo build directory for the same checkout is 9 GB > (I'm still just building clang, this time without instrumentation). We > (more or less) get away with this because the debug info isn't copied into > the final binary [1]. We're not getting away with this with coverage. E.g > we usually store bot artifacts for a while, but we had to shut this > functionality off almost immediately for our coverage bots because the > uploads were horrific. > > In fact, one overall architectural observation I have is that the most > complicated part of all this is simply establishing the workflow to plumb > together data emitted per-TU to a tool that needs that information to do > some post-processing step on the results of running the binary. That sounds > a lot like the role of debug info. In fact, having a debugger open a core > file is precisely equivalent to what llvm-profdata needs to do in this > regard AFAICT. > > So it would be best if possible to piggyback on all the effort that has > gone into plumbing that data to make debug info work. For example, I know > that on Darwin there's a fair amount of system-level integration to make > split dwarf "just work" while keeping debug info out of final binaries. 
> > If there is a not-too-hacky way to piggyback on debug info, that's likely > to be a really slick solution. For example, debug info could in principle > (if it doesn't already) contain information about the name of each counter > in the counter array, so in principle it would be a complete enough > description to identify each counter. > > > We don't emit debug info for this currently. Is there a reason to? > > I'm not very familiar with DWARF, but I'm imagining something like > reserving an LLVM vendor-specific DWARF opcode/attribute/whatever and then > stick a blob of data in there. Presumably we have code somewhere in LLDB > that is "here's a binary, find debug info for it", and in principle we > could factor out that code and lift it into an LLVM library > (libFindDebugInfo) that llvm-profdata could use. > > > This could work for the coverage/name data. There are some really nice > pieces of Darwin integration (e.g search-with-Spotlight, findDsymForUUID). > I'll look into this. > > - Size contributed by __llvm_prf_names sections: 327.46 MB >> \_ Just within clang: 106.76 MB >> >> => Space wasted within the clang binary: 447.24 MB >> >> Running an instrumented clang binary triggers a 143MB raw profile write >> which >> is slow even with an SSD. This problem is particularly bad for >> frontend-based >> coverage because it generates a lot of extra name data: however, the >> situation >> can also be improved for PGO instrumentation. >> >> Proposal >> -------- >> >> Place PGO name data and coverage data outside of object files. This would >> eliminate data duplication in *.a/*.o files, shrink binaries, shrink raw >> profiles, and speed up instrumented programs. >> >> In more detail: >> >> 1. The frontends get a new `-fprofile-metadata-dir=<path>` option. This >> lets >> users specify where llvm will store profile metadata. If the metadata >> starts to >> take up too much space, there's just one directory to clean. >> >> 2. The frontends continue emitting PGO name data and coverage data in the >> same >> llvm::Module. So does LLVM's IR-based PGO implementation. No change here. >> >> 3. If the InstrProf lowering pass sees that a metadata directory is >> available, >> it constructs a new module, copies the name/coverage data into it, hashes >> the >> module, and attempts to write that module to: >> >> <metadata-dir>/<module-hash>.bc (the metadata module) >> >> If this write operation fails, it scraps the new module: it keeps all the >> metadata in the original module, and there are no changes from the current >> process. I.e with this proposal we preserve backwards compatibility. >> > > Based at my experience with Clang's implicit modules, I'm *extremely* wary > of anything that might cause the compiler to emit a file that the build > system cannot guess the name of. In fact, having the compiler emit a file > that is not explicitly listed on the command line is basically just as bad > in practice (in terms of feasibility of informing the build system about > it). > > As a simple example, ninja simply cannot represent a dependency of this > type, so if you delete a <metadata-dir>/<module-hash>.bc it won't know > things need to be rebuilt (and it won't know how to clean it, etc.). > > So I would really strongly recommend against doing this. 
> > > Again, these problems of system integration (in particular build system > integration) are nasty, and if you can bypass this and piggyback on debug > info then everything will "just work" because the folks that care about > making sure that debugging "just works" already did the work for you. > It might be more work in the short term to do the debug info approach (if > it is feasible at all), but I can tell you based on the experience with > implicit modules (and I'm sure you have some experience of your own) that > there's just going to be a neverending tail of hitches and ways that things > don't work (or work poorly) due to not having the build system / overall > system integration right, so it will be worth it in the long run. > > > Thanks, this makes a lot of sense. The build system should keep track of > where to externalize profile metadata (regardless of whether or not it > piggybacks on debug info). In addition to the advantages you've listed, > this would make testing easier. > > vedant > > [1] ld64: > 2561 if ( strcmp(sect->segname(), "__DWARF") == 0 ) { > > > 2562 // note that .o file has dwarf > > > 2563 _file->_debugInfoKind > ld::relocatable::File::kDebugInfoDwarf; > > > 2564 // save off iteresting dwarf sections > > > ... > > 2571 else if ( strcmp(sect->sectname(), "__debug_str") == 0 ) > > > 2572 _file->_dwarfDebugStringSect = sect; > > > 2573 // linker does not propagate dwarf sections to output file > > > 2574 continue; > > > >> >> 4. Once the metadata module is written, the name/coverage data are >> entirely >> stripped out of the original module. They are replaced by a path to the >> metadata module: >> >> @__llvm_profiling_metadata = "<metadata-dir>/<module-hash>.bc", >> section "__llvm_prf_link" >> >> This allows incremental builds to work properly, which is an important >> use case >> for code coverage users. When an object is rebuilt, it gets a fresh link >> to a >> fresh profiling metadata file. Although stale files can accumulate in the >> metadata directory, the stale files cannot ever be used. >> >> In an IDE like Xcode, since there's just one target binary per scheme, >> it's >> possible to clean the metadata directory by removing the modules which >> aren't >> referenced by the target binary. >> >> 5. The raw profile format is updated so that links to metadata files are >> written >> out in each profile. This makes it possible for all existing >> llvm-profdata and >> llvm-cov commands to work, seamlessly. >> >> The indexed profile format will *not* be updated: i.e, it will contain a >> full >> symbol table, and no links. This simplifies the coverage mapping reader, >> because >> a full symbol table is guaranteed to exist before any function records are >> parsed. It also reduces the amount of coding, and makes it easier to >> preserve >> backwards compatibility :). >> >> 6. The raw profile reader will learn how to read links, open up the >> metadata >> modules it finds links to, and collect name data from those modules. >> >> 7. The coverage reader will learn how to read the __llvm_prf_link >> section, open >> up metadata modules, and lazily read coverage mapping data. >> >> Alternate Solutions >> ------------------- >> >> 1. Instead of copying name data into an external metadata module, just >> copy the >> coverage mapping data. >> >> I've actually prototyped this. This might be a good way to split up >> patches, >> although I don't see why we wouldn't want to tackle the name data problem >> eventually. >> >> 2. 
Instead of emitting links to external metadata modules, modify >> llvm-cov and >> llvm-profdata so that they require a path to the metadata directory. >> >> The issue with this is that it's way too easy to read stale metadata. >> It's also >> less user-friendly, which hurts adoption. >> >> 3. Use something other than llvm bitcode for the metadata module format. >> >> Since we're mostly writing large binary blobs (compressed name data or >> pre-encoded source range mapping info), using bitcode shouldn't be too >> slow, and >> we're not likely to get better compression with a different format. >> >> Bitcode is also convenient, and is nice for backwards compatibility. >> >> >> -------------------------------------------------------------------------------- >> >> If you've made it this far, thanks for taking a look! I'd appreciate any >> feedback. >> >> vedant >> >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170705/c207f088/attachment-0001.html>
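Steps 6 and 7 of the proposal (quoted again above) put the burden of following links on the readers. Below is a minimal consumer-side sketch, assuming the link has already been recovered as a path string from a raw profile or from the binary's __llvm_prf_link section: open the referenced bitcode file and pull out its name data. The helper name readLinkedNameData is invented, and __llvm_prf_nm is, as far as I know, the global the instrumentation lowering uses for compressed name data.

#include "llvm/Bitcode/BitcodeReader.h"
#include "llvm/IR/Constants.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include "llvm/Support/Error.h"
#include "llvm/Support/MemoryBuffer.h"
#include <string>

using namespace llvm;

// Lazily open one metadata module named by a link and return its raw
// (possibly compressed) name data; failures collapse to an empty result.
static std::string readLinkedNameData(StringRef MetadataPath,
                                      LLVMContext &Ctx) {
  auto BufOrErr = MemoryBuffer::getFile(MetadataPath);
  if (!BufOrErr)
    return "";
  Expected<std::unique_ptr<Module>> MOrErr =
      parseBitcodeFile((*BufOrErr)->getMemBufferRef(), Ctx);
  if (!MOrErr) {
    consumeError(MOrErr.takeError());
    return "";
  }
  GlobalVariable *Names = (*MOrErr)->getNamedGlobal("__llvm_prf_nm");
  if (!Names || !Names->hasInitializer())
    return "";
  auto *Init = dyn_cast<ConstantDataArray>(Names->getInitializer());
  return Init ? Init->getRawDataValues().str() : std::string();
}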
Mehdi AMINI via llvm-dev
2017-Jul-05 19:03 UTC
[llvm-dev] [RFC] Placing profile name data, and coverage data, outside of object files
2017-07-03 23:29 GMT-07:00 Vedant Kumar via llvm-dev < llvm-dev at lists.llvm.org>:> > On Jun 30, 2017, at 10:04 PM, Sean Silva <chisophugis at gmail.com> wrote: > > > > On Fri, Jun 30, 2017 at 5:54 PM, via llvm-dev <llvm-dev at lists.llvm.org> > wrote: > >> Problem >> ------- >> >> Instrumentation for PGO and frontend-based coverage places a large amount >> of >> data in object files, even though the majority of this data is not needed >> at >> run-time. All the data is needlessly duplicated while generating >> archives, and >> again while linking. PGO name data is written out into raw profiles by >> instrumented programs, slowing down the training and code coverage >> workflows. >> >> Here are some numbers from a coverage + RA build of ToT clang: >> >> * Size of the build directory: 4.3 GB >> >> * Wall time needed to run "clang -help" with an SSD: 0.5 seconds >> >> * Size of the clang binary: 725.24 MB >> >> * Space wasted on duplicate name/coverage data (*.o + *.a): 923.49 MB >> - Size contributed by __llvm_covmap sections: 1.02 GB >> \_ Just within clang: 340.48 MB >> > > We live with this duplication for debug info. In some sense, if the > overhead is small compared to debug info, should we even bother (i.e., we > assume that users accommodate debug builds, so that is a reasonable bound > on the tolerable build directory size). (I don't know the numbers; this > seems pretty large so maybe it is significant compared to debug info; just > saying that looking at absolute numbers is misleading here; numbers > compared to debug info are a closer measure to the user's perceptions) > > > The size of a RelWithDebInfo build directory for the same checkout is 9 GB > (I'm still just building clang, this time without instrumentation). We > (more or less) get away with this because the debug info isn't copied into > the final binary [1]. >This is the case on Darwin but not on Linux I believe (without debug-fission which is still quite "rare" I believe) -- Mehdi> We're not getting away with this with coverage. E.g we usually store bot > artifacts for a while, but we had to shut this functionality off almost > immediately for our coverage bots because the uploads were horrific. > > In fact, one overall architectural observation I have is that the most > complicated part of all this is simply establishing the workflow to plumb > together data emitted per-TU to a tool that needs that information to do > some post-processing step on the results of running the binary. That sounds > a lot like the role of debug info. In fact, having a debugger open a core > file is precisely equivalent to what llvm-profdata needs to do in this > regard AFAICT. > > So it would be best if possible to piggyback on all the effort that has > gone into plumbing that data to make debug info work. For example, I know > that on Darwin there's a fair amount of system-level integration to make > split dwarf "just work" while keeping debug info out of final binaries. > > If there is a not-too-hacky way to piggyback on debug info, that's likely > to be a really slick solution. For example, debug info could in principle > (if it doesn't already) contain information about the name of each counter > in the counter array, so in principle it would be a complete enough > description to identify each counter. > > > We don't emit debug info for this currently. Is there a reason to? 
> > I'm not very familiar with DWARF, but I'm imagining something like > reserving an LLVM vendor-specific DWARF opcode/attribute/whatever and then > stick a blob of data in there. Presumably we have code somewhere in LLDB > that is "here's a binary, find debug info for it", and in principle we > could factor out that code and lift it into an LLVM library > (libFindDebugInfo) that llvm-profdata could use. > > > This could work for the coverage/name data. There are some really nice > pieces of Darwin integration (e.g search-with-Spotlight, findDsymForUUID). > I'll look into this. > > - Size contributed by __llvm_prf_names sections: 327.46 MB >> \_ Just within clang: 106.76 MB >> >> => Space wasted within the clang binary: 447.24 MB >> >> Running an instrumented clang binary triggers a 143MB raw profile write >> which >> is slow even with an SSD. This problem is particularly bad for >> frontend-based >> coverage because it generates a lot of extra name data: however, the >> situation >> can also be improved for PGO instrumentation. >> >> Proposal >> -------- >> >> Place PGO name data and coverage data outside of object files. This would >> eliminate data duplication in *.a/*.o files, shrink binaries, shrink raw >> profiles, and speed up instrumented programs. >> >> In more detail: >> >> 1. The frontends get a new `-fprofile-metadata-dir=<path>` option. This >> lets >> users specify where llvm will store profile metadata. If the metadata >> starts to >> take up too much space, there's just one directory to clean. >> >> 2. The frontends continue emitting PGO name data and coverage data in the >> same >> llvm::Module. So does LLVM's IR-based PGO implementation. No change here. >> >> 3. If the InstrProf lowering pass sees that a metadata directory is >> available, >> it constructs a new module, copies the name/coverage data into it, hashes >> the >> module, and attempts to write that module to: >> >> <metadata-dir>/<module-hash>.bc (the metadata module) >> >> If this write operation fails, it scraps the new module: it keeps all the >> metadata in the original module, and there are no changes from the current >> process. I.e with this proposal we preserve backwards compatibility. >> > > Based at my experience with Clang's implicit modules, I'm *extremely* wary > of anything that might cause the compiler to emit a file that the build > system cannot guess the name of. In fact, having the compiler emit a file > that is not explicitly listed on the command line is basically just as bad > in practice (in terms of feasibility of informing the build system about > it). > > As a simple example, ninja simply cannot represent a dependency of this > type, so if you delete a <metadata-dir>/<module-hash>.bc it won't know > things need to be rebuilt (and it won't know how to clean it, etc.). > > So I would really strongly recommend against doing this. > > > Again, these problems of system integration (in particular build system > integration) are nasty, and if you can bypass this and piggyback on debug > info then everything will "just work" because the folks that care about > making sure that debugging "just works" already did the work for you. 
> It might be more work in the short term to do the debug info approach (if > it is feasible at all), but I can tell you based on the experience with > implicit modules (and I'm sure you have some experience of your own) that > there's just going to be a neverending tail of hitches and ways that things > don't work (or work poorly) due to not having the build system / overall > system integration right, so it will be worth it in the long run. > > > Thanks, this makes a lot of sense. The build system should keep track of > where to externalize profile metadata (regardless of whether or not it > piggybacks on debug info). In addition to the advantages you've listed, > this would make testing easier. > > vedant > > [1] ld64: > 2561 if ( strcmp(sect->segname(), "__DWARF") == 0 ) { > > > 2562 // note that .o file has dwarf > > > 2563 _file->_debugInfoKind = ld::relocatable::File::kDebugInfoDwarf; > > > 2564 // save off iteresting dwarf sections > > > ... > > 2571 else if ( strcmp(sect->sectname(), "__debug_str") == 0 ) > > > 2572 _file->_dwarfDebugStringSect = sect; > > > 2573 // linker does not propagate dwarf sections to output file > > > 2574 continue; > > > >> >> 4. Once the metadata module is written, the name/coverage data are >> entirely >> stripped out of the original module. They are replaced by a path to the >> metadata module: >> >> @__llvm_profiling_metadata = "<metadata-dir>/<module-hash>.bc", >> section "__llvm_prf_link" >> >> This allows incremental builds to work properly, which is an important >> use case >> for code coverage users. When an object is rebuilt, it gets a fresh link >> to a >> fresh profiling metadata file. Although stale files can accumulate in the >> metadata directory, the stale files cannot ever be used. >> >> In an IDE like Xcode, since there's just one target binary per scheme, >> it's >> possible to clean the metadata directory by removing the modules which >> aren't >> referenced by the target binary. >> >> 5. The raw profile format is updated so that links to metadata files are >> written >> out in each profile. This makes it possible for all existing >> llvm-profdata and >> llvm-cov commands to work, seamlessly. >> >> The indexed profile format will *not* be updated: i.e, it will contain a >> full >> symbol table, and no links. This simplifies the coverage mapping reader, >> because >> a full symbol table is guaranteed to exist before any function records are >> parsed. It also reduces the amount of coding, and makes it easier to >> preserve >> backwards compatibility :). >> >> 6. The raw profile reader will learn how to read links, open up the >> metadata >> modules it finds links to, and collect name data from those modules. >> >> 7. The coverage reader will learn how to read the __llvm_prf_link >> section, open >> up metadata modules, and lazily read coverage mapping data. >> >> Alternate Solutions >> ------------------- >> >> 1. Instead of copying name data into an external metadata module, just >> copy the >> coverage mapping data. >> >> I've actually prototyped this. This might be a good way to split up >> patches, >> although I don't see why we wouldn't want to tackle the name data problem >> eventually. >> >> 2. Instead of emitting links to external metadata modules, modify >> llvm-cov and >> llvm-profdata so that they require a path to the metadata directory. >> >> The issue with this is that it's way too easy to read stale metadata. >> It's also >> less user-friendly, which hurts adoption. >> >> 3. 
Use something other than llvm bitcode for the metadata module format. >> >> Since we're mostly writing large binary blobs (compressed name data or >> pre-encoded source range mapping info), using bitcode shouldn't be too >> slow, and >> we're not likely to get better compression with a different format. >> >> Bitcode is also convenient, and is nice for backwards compatibility. >> >> ------------------------------------------------------------ >> -------------------- >> >> If you've made it this far, thanks for taking a look! I'd appreciate any >> feedback. >> >> vedant