thr3ads.net - llvm dev - [llvm-dev] Adding support for LLVM Branch Condition Coverage [Jan 2020]

Phipps, Alan via llvm-dev

2020-Jan-24 00:09 UTC

[llvm-dev] Adding support for LLVM Branch Condition Coverage

Vedant Kumar asked me to post my design thoughts concerning branch coverage at
llvm-dev since there is general interest.

My team at Texas Instruments is developing an embedded ARM C/C++ compiler with
LLVM.  I would like to enhance LLVM's code coverage capability with branch
condition coverage (for C/C++), similar to GCC/GCOV support for branch coverage.
This is useful for TI, and I think this will be a useful feature enhancement to
LLVM that I can upstream.

In a nutshell, the functionality boils down to tracking how many times a
generated "branch" instruction (based on a source code condition) is taken or
not taken (i.e. how often the condition evaluates to "True" and to "False").
This applies to decision points in control flow (if, for, while, ...) as well as
individual conditions on logical operators ("&&", "||") in Boolean expressions.

In sketching out a design, I am proposing changes in three primary areas:


1.)    Add a new CounterMappingRegion kind for branch conditions

a.      This new region kind would track two counters, one for the "True"
branch taken count of a branch condition, and one for the "False" branch taken
count.

        i.     Alternatively, I could use two separate CounterMappingRegions to
        track individual counters since this is how the class was originally
        written to be used.  However, using a single region kind to represent a
        single branch condition that ties all of the pertinent counter
        information together seems like a cleaner design.

        ii.    Just as for all counters, the two branch condition counters can
        represent a reference to an instrumentation counter or to a counter
        expression.  The two counters are encoded along with the MappingRegions
        and distinguished based on the region kind.

        iii.   All other CounterMappingRegion kinds simply ignore the second
        counter; nothing changes in how they're encoded, which preserves format
        backward compatibility.

b.      I think this change also requires an adjustment to the class
SourceMappingRegion to support branch conditions that can be generated into
CounterMappingRegion instances.
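
To make the two-counter idea in item 1 concrete, here is a minimal C++ sketch. It does not use the real LLVM classes: SketchCounter, SketchRegionKind, SketchMappingRegion, makeBranchRegion and countersToEncode are hypothetical names made up for illustration, and the actual field layout and encoding may well differ.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for illustration only; these are not the
// LLVM classes, and the real field layout and encoding may differ.
struct SketchCounter {
  enum Kind { Zero, CounterRef, Expression };
  Kind kind;    // what the counter refers to
  unsigned id;  // index of an instrumentation counter or counter expression
};

enum class SketchRegionKind { Code, Expansion, Skipped, Gap, Branch };

struct SketchMappingRegion {
  SketchRegionKind kind;
  SketchCounter count;       // execution count; the "True" count for Branch
  SketchCounter falseCount;  // the "False" count; ignored unless kind is Branch
  unsigned lineStart, columnStart, lineEnd, columnEnd;
};

// Build a branch region covering one condition, tying both counters together.
SketchMappingRegion makeBranchRegion(SketchCounter trueCount,
                                     SketchCounter falseCount,
                                     unsigned lineStart, unsigned columnStart,
                                     unsigned lineEnd, unsigned columnEnd) {
  assert(lineStart <= lineEnd && "region must not end before it starts");
  return {SketchRegionKind::Branch, trueCount, falseCount,
          lineStart, columnStart, lineEnd, columnEnd};
}

// Only Branch regions carry a second counter; every other kind keeps encoding
// a single counter, so nothing changes for the existing kinds.
unsigned countersToEncode(const SketchMappingRegion &region) {
  return region.kind == SketchRegionKind::Branch ? 2u : 1u;
}
```

Either counter may refer to an instrumentation counter or to a counter expression, which is why the sketch keeps both as the same SketchCounter type.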



2.)    Counter Instrumentation

a.      We can reuse most of the existing profile instrumentation counters that
are emitted as part of profiling/coverage to calculate branch condition counts
(True/False).

        i.     This assumption leverages the fact that logical operators in C
        are "short-circuit" operators.  For example, the "False-taken" count
        for the left-hand-side condition in a logical-or expression (e.g.
        condition "C1" in "C1 || C2") can be derived from the execution count
        we already track for the right-hand-side (condition "C2" in
        "C1 || C2").

b.      There is a case that will require instrumenting a new counter: the
right-hand-side condition of a logical operator that isn't part of a
control-flow statement (e.g. condition "C2" in "x = C1 || C2;").  A new counter
is needed there in order to properly derive that condition's "true" count and
"false" count.

c.      I'll avoid going too deep into detail here, but my goal is to ensure
we reuse existing profile counters as much as possible.
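
The derivation in item 2 can be sketched in a few lines of C++. This is only an illustration under stated assumptions: BranchCounts and the three helper functions are hypothetical names, exprCount stands for the number of times the whole logical expression is evaluated, rhsCount for the existing right-hand-side execution count, and rhsFalseCount for the extra counter mentioned in 2.b (counting "false" rather than "true" there is an arbitrary choice for the sketch).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: how often one condition was true and false.
struct BranchCounts {
  std::uint64_t trueCount;
  std::uint64_t falseCount;
};

// Left-hand side of "C1 || C2": C2 is evaluated only when C1 is false, so the
// existing right-hand-side count is exactly C1's "False" count.
BranchCounts lhsOfLogicalOr(std::uint64_t exprCount, std::uint64_t rhsCount) {
  assert(rhsCount <= exprCount);
  return {exprCount - rhsCount, rhsCount};
}

// Left-hand side of "C1 && C2": C2 is evaluated only when C1 is true, so the
// existing right-hand-side count is exactly C1's "True" count.
BranchCounts lhsOfLogicalAnd(std::uint64_t exprCount, std::uint64_t rhsCount) {
  assert(rhsCount <= exprCount);
  return {rhsCount, exprCount - rhsCount};
}

// Right-hand side outside control flow (e.g. "x = C1 || C2;"): the existing
// counters only say how often C2 ran, so one extra counter (assumed here to
// count C2 == false) is needed to split that into true and false.
BranchCounts rhsWithExtraCounter(std::uint64_t rhsCount,
                                 std::uint64_t rhsFalseCount) {
  assert(rhsFalseCount <= rhsCount);
  return {rhsCount - rhsFalseCount, rhsFalseCount};
}
```

For a statement executed twice whose right-hand side ran once and was never false, lhsOfLogicalOr(2, 1) gives True 1 / False 1 and rhsWithExtraCounter(1, 0) gives True 1 / False 0.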



3.)    Visualization using llvm-cov

a.      The notion of CoverageSegment needs to be extended to carry the branch
condition data represented by the new CounterMappingRegion kind above.  llvm-cov
can then treat such a segment distinctly when displaying True/False counts for
each branch condition, as well as when tracking total missed branches.

b.      We can also add a BranchCoverageInfo class to track branch coverage
data, similar to LineCoverageInfo and RegionCoverageInfo.

c.      The text output could look something like GCOV's, but with the extra
detail we have available (I prototyped this using logical-or):


    9|       |int main(int argc, char *argv[])
   10|      3|{
   11|      3|    if (argc == 1)
Branch (11:9): [True: 1, False: 2]
   12|      1|    {
   13|      1|        return 0;
   14|      1|    }
. . .

   23|      2|    if (a == 0 || b == 2 || b == 34 || a == b)
Branch (23:9):  [True: 1, False: 1]
Branch (23:19): [True: 1, False: 0]
Branch (23:29): [True: 0, False: 0]
Branch (23:40): [True: 0, False: 0]
. . .

   31|      2|    b = a || c;
Branch (31:9):  [True: 1, False: 1]
Branch (31:14): [True: 1, False: 0]


d.      I thought about extending the "region-count" caret markers in the text
display, but it could get messy.  For the HTML output, we can get a bit more
fancy.

e.      Branch miss percentages/totals will be added to the coverage report.
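
As a rough sketch of how the branch totals in 3.e might be computed from per-condition counts, the C++ below treats each condition as contributing two branches (its "True" outcome and its "False" outcome) and counts a branch as covered when its count is non-zero.  That accounting rule, and the names BranchCounts, BranchSummary and summarizeBranches, are assumptions for illustration; the proposal itself does not spell out the formula.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper type: how often one condition was true and false.
struct BranchCounts {
  std::uint64_t trueCount;
  std::uint64_t falseCount;
};

// Hypothetical summary, loosely in the spirit of a per-file coverage total.
struct BranchSummary {
  unsigned total;    // two branches per condition
  unsigned covered;  // branches taken at least once
  unsigned missed() const { return total - covered; }
  // Reporting 100% when there are no branches is an arbitrary choice here.
  double percentCovered() const {
    return total == 0 ? 100.0 : 100.0 * covered / total;
  }
};

BranchSummary summarizeBranches(const std::vector<BranchCounts> &conditions) {
  BranchSummary summary{0, 0};
  for (const BranchCounts &c : conditions) {
    summary.total += 2;
    summary.covered += (c.trueCount > 0) + (c.falseCount > 0);
  }
  assert(summary.covered <= summary.total);
  return summary;
}
```

Fed the seven conditions from the sample output in 3.c, this rule yields 8 covered branches out of 14, i.e. 6 missed.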



Additional Notes

-        I'm aware that constant condition folding in
CodeGenFunction::EmitBranchOnBoolExpr() needs to be taken into account.  Is
there anything else related to branch optimization that I ought to be aware of?





Please let me know if these design thoughts look reasonable and if this would be
useful.  The goal is to start full implementation soon and upstream in a few
months.



Thanks!

Alan Phipps

Texas Instruments, Inc.

Finkel, Hal J. via llvm-dev

2020-Jan-24 17:01 UTC

[llvm-dev] Adding support for LLVM Branch Condition Coverage

Thanks, Alan. This certainly seems useful. Can you please provide a quick
overview on how this relates to our other infrastructure for coverage, for
profiling, and what's used for fuzz testing?

 -Hal

_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org>
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

Phipps, Alan via llvm-dev

2020-Jan-24 18:56 UTC

[llvm-dev] Adding support for LLVM Branch Condition Coverage

+ Vedant

Hi Hal, thanks.

I apologize if my answers aren't as thorough as you would like; what I'm
proposing is simply an extension to the existing infrastructure, so it would be
enabled automatically as part of code coverage.  Mapping of branch regions would
be done in CoverageMappingGen and instrumented using the same profiling
instrumentation mechanism under CodeGenPGO::mapRegionCounters() and around
CodeGenFunction::EmitBranchOnBoolExpr().  In fact, as I mentioned in my original
email, we'd largely be reusing the same profiling counters (with at least one
exception, the case I described there).  The existing functionality of coverage
and profiling would still work exactly as it has.  Further, I can add a switch
to llvm-cov to enable/disable branch coverage visualization and whether it's
included in the coverage report.

With respect to fuzzing, to be sure I don't misunderstand you, are you
referring to testing the branch coverage capability itself using fuzzing, or are
you referring to the leveraging of coverage by a fuzzer itself (i.e.
coverage-guided fuzzing)?  For the latter, I could look into libFuzzer and see
how this might impact it.  For the former, I haven't thought much about
using fuzzing to test coverage although I am certainly open to suggestions.

-Alan

Vedant Kumar via llvm-dev

2020-Jan-24 19:35 UTC

[llvm-dev] Adding support for LLVM Branch Condition Coverage

I've heard interest expressed in a branch condition coverage feature both at
the 2017 coverage BoF, and from internal users at Apple. I'm excited to see
this go forward.

> On Jan 23, 2020, at 4:09 PM, Phipps, Alan via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 
> Vedant Kumar asked me to post my design thoughts concerning branch coverage
at llvm-dev since there is general interest.
>  
> My team at Texas Instruments is developing an embedded ARM C/C++ compiler
with LLVM.  I would like to enhance LLVM’s code coverage capability with branch
condition coverage (for C/C++), similar to GCC/GCOV support for branch coverage.
This is useful for TI, and I think this will be a useful feature enhancement to
LLVM that I can upstream.
>  
> In a nutshell, the functionality boils down to tracking how many times a
generated “branch” instruction (based on a source code condition) is taken or
not taken (i.e. evaluated into “True” and “False”).  This applies to decision
points in control flow (if, for, while, …) as well as individual conditions on
logical operators (“&&”, “||”) in Boolean expressions.
>  
> In sketching out a design, there are three primary areas in the design that
I am proposing:
>  
> 1.)    Add a new CounterMappingRegion kind for branch conditions
> a.      This new region kind would track two counters, one for the “True”
> branch taken count of a branch condition, and one for the “False” branch taken
> count.
>         i.     Alternatively, I could use two separate CounterMappingRegions
>         to track individual counters since this is how the class was
>         originally written to be used.  However, using a single region kind to
>         represent a single branch condition that ties all of the pertinent
>         counter information together seems like a cleaner design.
>         ii.    Just as for all counters, the two branch condition counters can
>         represent a reference to an instrumentation counter or to a counter
>         expression.  The two counters are encoded along with the
>         MappingRegions and distinguished based on the region kind.
>         iii.   All other CounterMappingRegion kinds simply ignore the second
>         counter; nothing changes in how they’re encoded, which preserves
>         format backward compatibility.
> b.      I think this change also requires an adjustment to the class
> SourceMappingRegion to support branch conditions that can be generated into
> CounterMappingRegion instances.

This would take an additional counter to represent the true/false paths. Is an
additional source location also needed to record where the condition is
evaluated (say, the precise location of the `||` operator)?

> 2.)    Counter Instrumentation
> a.      We can reuse most of the existing profile instrumentation counters
that are emitted as part of profiling/coverage to calculate branch condition
counts (True/False).
>                                                     i.     This assumption
leverages the fact that logical operators in C are “short-circuit” operators. 
For example, the “False-taken” count for the left-hand-side condition in a
logical-or expression (e.g. condition “C1” in “C1 || C2”) can be derived from
the execution count we already track for the right-hand-side (condition “C2” in
“C1 || C2”).
> b.      There does exist a case when evaluating the right-hand-side
> condition of a logical operator that isn’t part of a control-flow statement
> (e.g. condition “C2” in “x = C1 || C2;”) that will require instrumenting a new
> counter in order to properly derive that condition’s “true” count and “false”
> count.

Ah, right. The current counter instrumentation only tracks how often C2 is
evaluated, but not in a way that discriminates between C2 being true/false. But
what makes conditions that aren't a part of a control flow statement
special?

> c.      I’ll avoid going too deep into detail here, but my goal is to
ensure we reuse existing profile counters as much as possible.
>  
> 3.)    Visualization using llvm-cov
> a.      The notion of CoverageSegment needs to be extended to comprehend
the branch condition data represented by a CounterMappingRegion above.  But then
llvm-cov can treat the segment distinctly when displaying True/False counts for
each branch condition as well as tracking total missed branches.
> b.      We can also add a BranchCoverageInfo class to track branch coverage
data, similar to LineCoverageInfo and RegionCoverageInfo.
> c.      The text output could look something like GCOV but with more detail
that we know (I prototyped this using logical-or):
>  
>     9|       |int main(int argc, char *argv[])
>    10|      3|{
>    11|      3|    if (argc == 1)
> Branch (11:9): [True: 1, False: 2]
>    12|      1|    {
>    13|      1|        return 0;
>    14|      1|    }
> . . .
>  
>   23|      2|    if (a == 0 || b == 2 || b == 34 || a == b)
> Branch (23:9):  [True: 1, False: 1]
> Branch (23:19): [True: 1, False: 0]
> Branch (23:29): [True: 0, False: 0]
> Branch (23:40): [True: 0, False: 0]
> . . .
>  
>   31|      2|    b = a || c;
> Branch (31:9):  [True: 1, False: 1]
> Branch (31:14): [True: 1, False: 0]

This looks reasonable to me.

>  
> d.      I thought about extending the “region-count” caret markers in the
> text display, but it could get messy.  For the HTML output, we can get a bit
> more fancy.

Yes. E.g. I think it'd be nice to add a tooltip over conditions and
conditional operators, to show how often they are true/false.

> e.      Branch miss percentages/totals will be added to the coverage
report.
>  
> Additional Notes
> -        I’m aware that constant condition folding in
CodeGenFunction::EmitBranchOnBoolExpr() needs to be taken into account.  Is
there anything else related to branch optimization that I ought to be aware of?
>  
>  
> Please let me know if these design thoughts look reasonable and if this
> would be useful.  The goal is to start full implementation soon and upstream in
> a few months.

+1 on my part.

vedant

>  
> Thanks!
> Alan Phipps
> Texas Instruments, Inc.

Phipps, Alan via llvm-dev

2020-Jan-24 20:26 UTC

[llvm-dev] [EXTERNAL] Re: Adding support for LLVM Branch Condition Coverage

Thanks, Vedant!

I’ll respond to your questions inline below.

-Alan

From: vsk at apple.com [mailto:vsk at apple.com] On Behalf Of Vedant Kumar
Sent: Friday, January 24, 2020 1:36 PM
To: Phipps, Alan
Cc: llvm-dev at lists.llvm.org
Subject: [EXTERNAL] Re: [llvm-dev] Adding support for LLVM Branch Condition
Coverage

I've heard interest expressed in a branch condition coverage feature both at
the 2017 coverage BoF, and from internal users at Apple. I'm excited to see
this go forward.

On Jan 23, 2020, at 4:09 PM, Phipps, Alan via llvm-dev <llvm-dev at
lists.llvm.org<mailto:llvm-dev at lists.llvm.org>> wrote:

Vedant Kumar asked me to post my design thoughts concerning branch coverage at
llvm-dev since there is general interest.

My team at Texas Instruments is developing an embedded ARM C/C++ compiler with
LLVM.  I would like to enhance LLVM’s code coverage capability with branch
condition coverage (for C/C++), similar to GCC/GCOV support for branch coverage.
This is useful for TI, and I think this will be a useful feature enhancement to
LLVM that I can upstream.

In a nutshell, the functionality boils down to tracking how many times a
generated “branch” instruction (based on a source code condition) is taken or
not taken (i.e. evaluated into “True” and “False”).  This applies to decision
points in control flow (if, for, while, …) as well as individual conditions on
logical operators (“&&”, “||”) in Boolean expressions.

In sketching out a design, there are three primary areas in the design that I am
proposing:

1.)    Add a new CounterMappingRegion kind for branch conditions
a.      This new region kind would track two counters, one for the “True” branch
taken count of a branch condition, and one for the “False” branch taken count.
        i.     Alternatively, I could use two separate CounterMappingRegions to
        track individual counters since this is how the class was originally
        written to be used.  However, using a single region kind to represent a
        single branch condition that ties all of the pertinent counter
        information together seems like a cleaner design.
        ii.    Just as for all counters, the two branch condition counters can
        represent a reference to an instrumentation counter or to a counter
        expression.  The two counters are encoded along with the MappingRegions
        and distinguished based on the region kind.
        iii.   All other CounterMappingRegion kinds simply ignore the second
        counter; nothing changes in how they’re encoded, which preserves format
        backward compatibility.
b.      I think this change also requires an adjustment to the class
SourceMappingRegion to support branch conditions that can be generated into
CounterMappingRegion instances.

This would take an additional counter to represent the true/false paths. Is an
additional source location also needed to record where the condition is
evaluated (say, the precise location of the `||` operator)?

[AP] I don’t think an additional source location is needed – if I’m
understanding your question; the source location would simply record the
location of the condition itself. One CounterMappingRegion (with two counters)
for the left-hand-side, and one (two counters) for the right-hand-side.

2.)    Counter Instrumentation
a.      We can reuse most of the existing profile instrumentation counters that
are emitted as part of profiling/coverage to calculate branch condition counts
(True/False).
                                                    i.     This assumption
leverages the fact that logical operators in C are “short-circuit” operators. 
For example, the “False-taken” count for the left-hand-side condition in a
logical-or expression (e.g. condition “C1” in “C1 || C2”) can be derived from
the execution count we already track for the right-hand-side (condition “C2” in
“C1 || C2”).
b.      There does exist a case when evaluating the right-hand-side condition of
a logical operator that isn’t part of a control-flow statement (e.g. condition
“C2” in “x = C1 || C2;”) that will require instrumenting a new counter in order
to properly derive that condition’s “true” count and “false” count.

Ah, right. The current counter instrumentation only tracks how often C2 is
evaluated, but not in a way that discriminates between C2 being true/false. But
what makes conditions that aren't a part of a control flow statement
special?

[AP] E.g. if we have an if-stmt (“if (C1 || C2) { … }”), there is an associated
“ThenCount” counter we can leverage that reflects the number of times the
if-stmt decision evaluated to true.  We could (potentially) leverage that
counter to calculate how many times C2 evaluates to true (for “||”, by
subtracting how many times we know C1 to be true).  Without that additional
counter, we effectively need to add it.  An important point, though, is that
even for control-flow, the expressions can get complicated (e.g. “if ( (C1 ||
C2) && (C3 || C4) ) {…}”), and so instrumenting a counter for the
right-hand-side of these expressions might always be useful to keep the design
sane and maintainable  (admittedly, that’s how I prototyped it).
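
The subtraction described in the [AP] reply above can be sketched as follows for the simple case "if (C1 || C2) { … }".  This is a hedged illustration, not the prototype's code: BranchCounts and rhsOfLogicalOrInIf are hypothetical names, exprCount stands for how often the if-condition is evaluated, rhsCount for the existing execution count of C2, and thenCount for the existing "ThenCount" of the if-stmt.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: how often one condition was true and false.
struct BranchCounts {
  std::uint64_t trueCount;
  std::uint64_t falseCount;
};

// For "if (C1 || C2) { ... }" with no extra counter:
//   C1 true  = exprCount - rhsCount   (C2 only runs when C1 is false)
//   C2 true  = thenCount - C1 true    (the decision is true iff C1 or C2 is)
//   C2 false = rhsCount  - C2 true
BranchCounts rhsOfLogicalOrInIf(std::uint64_t exprCount,
                                std::uint64_t rhsCount,
                                std::uint64_t thenCount) {
  assert(rhsCount <= exprCount);
  const std::uint64_t lhsTrue = exprCount - rhsCount;
  assert(lhsTrue <= thenCount);
  const std::uint64_t rhsTrue = thenCount - lhsTrue;
  assert(rhsTrue <= rhsCount);
  return {rhsTrue, rhsCount - rhsTrue};
}
```

With the condition evaluated 10 times, C2 evaluated 6 times, and the then-branch taken 7 times, this gives C2 True 3 / False 3.  The arithmetic only holds for this flat two-operand shape; for nested expressions such as the "(C1 || C2) && (C3 || C4)" example, a dedicated right-hand-side counter avoids having to untangle it.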

c.      I’ll avoid going too deep into detail here, but my goal is to ensure we
reuse existing profile counters as much as possible.

3.)    Visualization using llvm-cov
a.      The notion of CoverageSegment needs to be extended to comprehend the
branch condition data represented by a CounterMappingRegion above.  But then
llvm-cov can treat the segment distinctly when displaying True/False counts for
each branch condition as well as tracking total missed branches.
b.      We can also add a BranchCoverageInfo class to track branch coverage
data, similar to LineCoverageInfo and RegionCoverageInfo.
c.      The text output could look something like GCOV but with more detail that
we know (I prototyped this using logical-or):

    9|       |int main(int argc, char *argv[])
   10|      3|{
   11|      3|    if (argc == 1)
Branch (11:9): [True: 1, False: 2]
   12|      1|    {
   13|      1|        return 0;
   14|      1|    }
. . .

  23|      2|    if (a == 0 || b == 2 || b == 34 || a == b)
Branch (23:9):  [True: 1, False: 1]
Branch (23:19): [True: 1, False: 0]
Branch (23:29): [True: 0, False: 0]
Branch (23:40): [True: 0, False: 0]
. . .

  31|      2|    b = a || c;
Branch (31:9):  [True: 1, False: 1]
Branch (31:14): [True: 1, False: 0]

This looks reasonable to me.

d.      I thought about extending the “region-count” caret markers in the text
display, but it could get messy.  For the HTML output, we can get a bit more
fancy.

Yes. E.g. I think it'd be nice to add a tooltip over conditions and
conditional operators, to show how often they are true/false.

[AP] Yes, that is exactly what I was thinking.

e.      Branch miss percentages/totals will be added to the coverage report.

Additional Notes
-        I’m aware that constant condition folding in
CodeGenFunction::EmitBranchOnBoolExpr() needs to be taken into account.  Is
there anything else related to branch optimization that I ought to be aware of?

Please let me know if these design thoughts look reasonable and if this would be
useful.  The goal is to start full implementation soon and upstream in a few
months.

+ 1 on my part.

vedant

Thanks!
Alan Phipps
Texas Instruments, Inc.
