thr3ads.net - llvm dev - [llvm-dev] Using source-based code coverage on baremetal [Sep 2017]

Martin J. O'Riordan via llvm-dev

2017-Sep-11 09:37 UTC

[llvm-dev] Using source-based code coverage on baremetal

I think that this proposal would be very useful, and I will describe our
experiences of trying to do this for our embedded bare-metal target.

Recently we implemented support for just the '-fprofile-instr-generate'
option and the 'compiler-rt/lib/profile' sources, and added the
following to our LD scripts:

      /* Append the LLVM profiling sections */
      . = ALIGN(4);
      PROVIDE(__start___llvm_prf_cnts = .);
      *(__llvm_prf_cnts)
      PROVIDE(__stop___llvm_prf_cnts = .);

      . = ALIGN(4);
      PROVIDE(__start___llvm_prf_data = .);
      *(__llvm_prf_data)
      PROVIDE(__stop___llvm_prf_data = .);

      . = ALIGN(4);
      PROVIDE(__start___llvm_prf_names = .);
      *(__llvm_prf_names)
      PROVIDE(__stop___llvm_prf_names = .);

      . = ALIGN(4);
      PROVIDE(__start___llvm_prf_vnds = .);
      *(__llvm_prf_vnds)
      PROVIDE(__stop___llvm_prf_vnds = .);

This removed the need for the '.ctors' model for registering functions
(which also reduces the run-time cost).  After adding our triple to
'lib/Transforms/Instrumentation/InstrProfiling.cpp', our target can use the
model described in 'InstrProfilingPlatformLinux.c' instead of
'InstrProfilingPlatformOther.c'.
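
To illustrate what the linker-defined bounds provide: once the start and stop
symbols exist, runtime code can treat a profiling section as a plain array,
with no registration step.  The following is only a minimal sketch, assuming
64-bit counters; the function names are hypothetical and are not part of
compiler-rt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * On the target, the bounds would come from the linker script, e.g.
 *   extern uint64_t __start___llvm_prf_cnts[];
 *   extern uint64_t __stop___llvm_prf_cnts[];
 * They are passed as parameters here so the sketch also builds on a host.
 */

/* Number of 64-bit counters between the two section bounds. */
size_t prf_counter_count(const uint64_t *begin, const uint64_t *end) {
    assert(begin <= end);
    return (size_t)(end - begin);
}

/* Sum of all counters in [begin, end); a cheap sanity check after a run. */
uint64_t prf_counter_total(const uint64_t *begin, const uint64_t *end) {
    assert(begin <= end);
    uint64_t total = 0;
    for (const uint64_t *p = begin; p != end; ++p)
        total += *p;
    return total;
}
```

On a target linked with the script above, the call would be something like
prf_counter_count(__start___llvm_prf_cnts, __stop___llvm_prf_cnts).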

We use Newlib for our LibC so we have a reasonably complete ISO C library, but
we do not have a file-system so the FILE based I/O cannot work.  And as there is
no environment, dependence on environment variables is also meaningless.  I
won't even bother discussing memory-mapped files ;-)

We also have to ensure that the basic instrumentation initialisation normally
handled by 'RegisterRuntime Registration' is performed before the program is
allowed to execute, and that the data is subsequently dumped (taken off-chip)
after execution.  This is done with a bit of smoke and mirrors, as many programs
in the embedded environment do not have support for running the '.ctors'
functions before execution and the 'atexit' functions after it (especially C
programs).

But the Compiler-RT profile library also integrates the automatic merging and
collation of data from multiple runs within the library implementation itself,
and this is a really significant problem for bare-metal systems with no OS and
no file-system.  It does this using "patterns" in the file name (derived from
the environment), and the data collation is performed by the system being
profiled.

I think that to better facilitate bare-metal systems, this process of collating
the results of multiple runs would be best provided by a separate stand-alone
utility on the host system that would perform this logic offline rather than
having it integrated online as it is currently defined.

Our implementation can now gather data for a single run
('default.profraw') but does not (yet) have the capability of collating
the results from more than one profiling run.

We have not yet started supporting the other instrumentation such as coverage
and ubsan, but hope to do so now that we have figured out how to do basic
profiling for PGO.

	MartinO

-----Original Message-----
From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Jonathan
Roelofs via llvm-dev
Sent: 06 September 2017 22:27
To: Friedman, Eli <efriedma at codeaurora.org>; Vedant Kumar <vsk at
apple.com>; weimingz at codeaurora.org; llvm-dev <llvm-dev at
lists.llvm.org>
Subject: Re: [llvm-dev] Using source-based code coverage on baremetal

On 9/5/17 7:55 PM, Friedman, Eli via llvm-dev wrote:
> 
> Areas that required LLVM changes:
> 
> 1. The copy of libclangrt_profile.a for the target.  Given that we 
> already were using builtins from compiler-rt, the primary changes 
> required are enabling the profile library and excluding a bunch of 
> files from the build (since baremetal doesn't have a filesystem, 
> system calls, etc.).  I'll look into posting patches when I have time, 
> but it might take me a little while for me to figure out how to 
> cleanly modify the build, and verify everything actually works on 
> trunk.  It looks like there's a CMake variable 
> COMPILER_RT_BAREMETAL_BUILD which is supposed to be turned on for
> this sort of environment?

Yes, that's exactly what that variable is for.

See also: clang/cmake/caches/BaremetalARM.cmake. I haven't taught this how
to do the rest of the runtime bits (unwinder/libcxxabi/libcxx), but plan to at
some point.
> 
> 2. Changing the compiler and compiler-rt to use __start and __stop
> symbols to find the sections, rather than .init code.  This isn't
> strictly necessary, but our linker supports __start and __stop, and
> this was easier than changing the baremetal image to handle a .init
> section.
> See needsRuntimeRegistrationOfSectionRange in
> lib/Transforms/Instrumentation/InstrProfiling.cpp; we currently only
> whitelist a few platforms.  Not sure what would be appropriate here;
> maybe we could assume any *-none-* triple supports __start and __stop
> symbols?  Or maybe control it with a flag somehow? Or something else
> I'm not thinking of?

A flag for this sounds great.

Jon

--
Jon Roelofs
jonathan at codesourcery.com
CodeSourcery / Mentor Embedded / Siemens
_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Friedman, Eli via llvm-dev

2017-Sep-11 19:55 UTC

[llvm-dev] Using source-based code coverage on baremetal

On 9/11/2017 2:37 AM, Martin J. O'Riordan wrote:
> We also have to ensure that the basic instrumentation initialisation
> normally handled by 'RegisterRuntime Registration' is performed before the
> program is allowed to execute, and that the data is subsequently dumped
> (taken off-chip) after execution.  This is done with a bit of smoke and
> mirrors, as many programs in the embedded environment do not have support
> for running the '.ctors' functions before execution and the 'atexit'
> functions after it (especially C programs).

I don't think RegisterRuntime provides any relevant functionality unless
you're writing to a file (I didn't even realize that existed before now).

We've been modifying the source code to write out profile data because
our images never actually "exit".  Maybe there's something more clever
we can do in some cases.

> But the Compiler-RT profile library also integrates the automatic merging
> and collation of data from multiple runs within the library implementation
> itself, and this is a really significant problem for bare-metal systems
> with no OS and no file-system.  It does this using "patterns" in the file
> name (derived from the environment), and the data collation is performed
> by the system being profiled.
>
> I think that to better facilitate bare-metal systems, this process of
> collating the results of multiple runs would be best provided by a separate
> stand-alone utility on the host system that would perform this logic offline
> rather than having it integrated online as it is currently defined.

The logic for merging raw profiles already exists in llvm-profdata (see
https://llvm.org/docs/CommandGuide/llvm-profdata.html#profdata-merge ).
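
For context, a host-side merge is typically invoked along the lines of
'llvm-profdata merge -o merged.profdata run1.profraw run2.profraw' (the file
names here are placeholders).  Conceptually, merging two runs of the same
binary amounts to adding the counters of matching functions.  The sketch below
only illustrates that idea with a saturating element-wise sum; it is not
llvm-profdata's implementation, and the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Add the counters of one run (src) into an accumulator (dst), element by
 * element, clamping at UINT64_MAX instead of wrapping around.
 * Returns how many counters hit the clamp.
 */
size_t merge_counters(uint64_t *dst, const uint64_t *src, size_t n) {
    assert(dst != NULL && src != NULL);
    size_t saturated = 0;
    for (size_t i = 0; i < n; ++i) {
        if (dst[i] > UINT64_MAX - src[i]) {
            dst[i] = UINT64_MAX;
            ++saturated;
        } else {
            dst[i] += src[i];
        }
    }
    return saturated;
}
```

Both arrays are assumed to describe the same functions in the same order,
which holds only when the runs come from the same build.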

-Eli

-- 
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux
Foundation Collaborative Project

ORiordan, Martin via llvm-dev

2017-Sep-12 10:14 UTC

[llvm-dev] Using source-based code coverage on baremetal

Hi Eli,

What we have done is insert calls to:

   lprofSetupValueProfiler()
   __llvm_profile_initialize_file()

before execution, and:

   __llvm_profile_write_file()

after execution to ensure that the functionality of 'RegisterRuntime' and
'atexit' is preserved.  I must admit I did not drill deeper to see if this was
necessary at all; we are still at the prototype stage of this work, although at
an advanced stage (it does actually work ;-).  I will have to experiment to see
if these calls are needed, and if not, simplify our solution.  We do however
"fake" some file functionality to get the profiling working, but a non-file
based approach would be far better.

Rather than inserting code to call these, we use 'objcopy' to rename the
program's entry-point, and provide a shim with the original name which calls
the profile initialisation functions before calling the user's original,
renamed entry-point.  Another shim handles termination in the same way.  This
is what I meant by "smoke and mirrors", but it is effective and does not
require special alteration to the program source.
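
The shape of such a shim can be sketched roughly as follows.  This is only an
illustration under stated assumptions: the hook table, the function names and
the stand-in stubs are hypothetical, and the renaming itself could be done
with something like objcopy's '--redefine-sym' option (the exact option used
is not stated here).  On a real target the hooks would wrap the compiler-rt
calls listed above.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*entry_fn)(void);

/* Hypothetical hook table; on a real target these would wrap the
 * compiler-rt calls named in this thread. */
struct profile_hooks {
    void (*init)(void); /* run before the program starts */
    int (*dump)(void);  /* run after it returns; 0 means success */
};

/* The shim body: initialise, run the renamed entry-point, then dump. */
int run_with_profiling(const struct profile_hooks *hooks, entry_fn user_entry) {
    assert(hooks != NULL && user_entry != NULL);
    if (hooks->init)
        hooks->init();
    int rc = user_entry();
    if (hooks->dump && hooks->dump() != 0 && rc == 0)
        rc = -1; /* surface a failed dump if the program itself succeeded */
    return rc;
}

/* Host-side stand-ins that record the call order. */
static char shim_trace[8];
static size_t shim_trace_len;

static void shim_note(char c) {
    if (shim_trace_len + 1 < sizeof shim_trace) {
        shim_trace[shim_trace_len++] = c;
        shim_trace[shim_trace_len] = '\0';
    }
}

static void stub_init(void) { shim_note('I'); }
static int stub_dump(void) { shim_note('D'); return 0; }
static int renamed_entry(void) { shim_note('M'); return 7; }
```

A shim named 'main' would then simply return
run_with_profiling(&hooks, renamed_entry), where 'renamed_entry' stands for
the symbol the original entry-point was renamed to.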

Needless to say, a proper bona-fide mechanism for doing this in the embedded
space would be much preferred.

The essential elements (as I see it) of the various compiler-rt instrumentation
support are:

1.  A simple hook to ensure that the data structures
    involved in the instrumentation are properly
    initialised
2.  Another hook to allow the system to offload the
    instrumentation data
3.  Offline tools/utilities on the host system to
    aggregate and collate the data produced by
    multiple runs - this could possibly be achieved
    by extending the functionality of 'llvm-profdata'.
    I didn't realise that 'llvm-profdata' already had
    support for merging multiple data sets; I will
    have to learn how to do that - thanks for the tip

We still haven't experimented with incrementally offloading the
instrumentation data during execution, and that would be a very neat capability.
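
As a rough idea of what a file-free offload hook (item 2 above) might look
like, the sketch below pushes a block of memory through a caller-supplied
transport in fixed-size chunks.  Everything named here is hypothetical: the
callback type, the function, and the in-memory sink that stands in for a UART
or debug channel.  The bytes themselves could come from compiler-rt's
in-memory interface ('__llvm_profile_get_size_for_buffer' and
'__llvm_profile_write_buffer'), but that part is left out so the example
builds on a host.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical transport callback: returns 0 on success, non-zero on failure. */
typedef int (*offload_emit_fn)(const void *chunk, size_t len, void *ctx);

/* Send `size` bytes starting at `data` in pieces of at most chunk_size bytes. */
int offload_region(const void *data, size_t size, size_t chunk_size,
                   offload_emit_fn emit, void *ctx) {
    assert(data != NULL && emit != NULL && chunk_size > 0);
    const unsigned char *p = data;
    while (size > 0) {
        size_t n = size < chunk_size ? size : chunk_size;
        if (emit(p, n, ctx) != 0)
            return -1; /* stop at the first transport error */
        p += n;
        size -= n;
    }
    return 0;
}

/* Stand-in transport for host testing: appends chunks to a fixed buffer. */
struct mem_sink {
    unsigned char buf[64];
    size_t len;
    size_t calls;
};

int mem_sink_write(const void *chunk, size_t len, void *ctx) {
    struct mem_sink *s = ctx;
    if (len > sizeof s->buf - s->len)
        return 1; /* sink full */
    memcpy(s->buf + s->len, chunk, len);
    s->len += len;
    s->calls++;
    return 0;
}
```

Because the transport is just a callback, the same routine could be called
once after execution or periodically during it.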

Thanks,

    MartinO

