Hi,

next coverage measurement published, please see
http://www.opensfs.org/foswiki/bin/view/Lustre/CodeCoverage20120915

Entrance page: http://www.opensfs.org/foswiki/bin/view/Lustre/CodeCoverage

Thanks,
Roman
Dilger, Andreas
2012-Sep-29 12:24 UTC
[Lustre-discuss] [Discuss] coverage measurement at 2012 09 15
Hi Roman,

The coverage data is interesting. It would be even more useful to be able to compare it to the previous code coverage run, if they used the same method for measuring coverage (the new report states that the method has changed and reduced coverage).

Are the percentages of code coverage getting better or worse? Are there particular areas of the code that have poor coverage that could benefit from some focused attention with new tests?

I can definitely imagine that many error handling code paths (e.g. checking for allocation failures) would not be exercised without specific changes (see e.g. my unlanded patch to fix the OBD_ALLOC() failure injection code). Running a test with periodic random allocation failures enabled and fixing the resulting bugs would improve coverage, though not in a systematic way that could be measured/repeated. Still, this would find a class of hard-to-find bugs.

Similarly, running racer for extended periods is a good form of coverage generation, even if not systematic/repeatable. I think the racer code could be improved/extended by adding racer scripts that are Lustre-specific or exercise new functionality (e.g. "lfs setstripe", setfattr, getfattr, setfacl, getfacl). Running multiple racer instances on multiple clients/mounts and throwing recovery into the mix would definitely find new bugs.

In general, having the code coverage is a good starting point, but it isn't necessarily useful if nothing is done to improve the coverage of the tests as a result.

Cheers, Andreas
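[Racer's scripts live in lustre/tests/racer/ and are tiny shell loops, each applying one random operation to a random file in a shared directory. Below is a hypothetical sketch of the kind of Lustre-specific addition Andreas describes; the script name and the DIR/MAX handling are assumed from racer's conventions, not taken from the tree.]

    #!/bin/bash
    # Hypothetical racer-style script (e.g. file_setstripe_xattr.sh):
    # loop forever applying random striping and xattr/ACL operations
    # to random files under $DIR, in the spirit of
    # lustre/tests/racer/*.sh. DIR and MAX are assumed to be exported
    # by the racer harness; defaults here are illustrative.
    DIR=${DIR:-/mnt/lustre/racer}
    MAX=${MAX:-20}

    while /bin/true; do
        file=$DIR/$((RANDOM % MAX))
        case $((RANDOM % 4)) in
        0) lfs setstripe -c $((RANDOM % 4)) $file ;;  # random stripe count, 0 = default
        1) setfattr -n user.racer -v $RANDOM $file ;; # set a user xattr
        2) getfattr -d $file ;;                       # dump user xattrs
        3) setfacl -m u:nobody:rw $file ;;            # add an ACL entry
        esac 2> /dev/null                             # failures are expected and fine
    done

[The harness runs all such scripts in parallel against the same small namespace, so the value comes from operations colliding with each other, and with recovery if mixed in, rather than from any single call succeeding.]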
Cory Spitz
2012-Oct-02 20:19 UTC
[Lustre-discuss] [Discuss] coverage measurement at 2012 09 15
Hi,

> Are the percentages of code coverage getting better or worse?

I don't know exactly, but based on the information that Robert Read shared at LUG '09, sanity was netting "60-70% coverage of core Lustre modules" (http://wiki.lustre.org/images/4/4f/RobertReadTalk1.pdf).

> I can definitely imagine that many error handling code paths (e.g. checking for allocation failures) would not be exercised without specific changes (see e.g. my unlanded patch to fix the OBD_ALLOC() failure injection code).

Cray has started looking at testing with forced memory allocation failures from the Linux fault injection framework (http://www.kernel.org/doc/Documentation/fault-injection/fault-injection.txt). As we make progress we'll open tickets and push patches. I expect to find problems ;)

Andreas, were you talking about http://review.whamcloud.com/#change,3037? If not, what ticket were you referring to?

Thanks,
-Cory
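[For reference, the debugfs knobs documented in the fault-injection text Cory links are enough for a first experiment. A minimal sketch, assuming a kernel built with CONFIG_FAULT_INJECTION_DEBUG_FS and CONFIG_FAILSLAB; the rates follow the document's own example and would likely need tuning for Lustre.]

    #!/bin/bash
    # Turn on random slab allocation failures via the fault-injection
    # debugfs interface.
    F=/sys/kernel/debug/failslab
    echo 10  > $F/probability      # fail 10% of eligible allocations...
    echo 100 > $F/interval         # ...but at most one failure per 100 calls
    echo -1  > $F/times            # no limit on the total number of failures
    echo 0   > $F/space            # no byte budget before failing starts
    echo 2   > $F/verbose          # log a stack trace for each injected failure
    echo 0   > $F/ignore-gfp-wait  # also fail allocations that may sleep

    # ... run the Lustre test suite, then check dmesg for error-path fallout ...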
Christopher J. Morrone
2012-Oct-02 20:33 UTC
[Lustre-discuss] [Discuss] coverage measurement at 2012 09 15
On 10/02/2012 01:19 PM, Cory Spitz wrote:
> Cray has started looking at testing with forced memory allocation failures
> from the Linux fault injection framework
> (http://www.kernel.org/doc/Documentation/fault-injection/fault-injection.txt).
> As we make progress we'll open tickets and push patches. I expect to
> find problems ;)

Cool!
Andreas Dilger
2012-Oct-02 22:44 UTC
[Lustre-discuss] [Discuss] coverage measurement at 2012 09 15
On 2012-10-02, at 2:19 PM, Cory Spitz wrote:

>> Are the percentages of code coverage getting better or worse?
>
> I don't know exactly, but based on the information that Robert Read
> shared at LUG '09, sanity was netting "60-70% coverage of core Lustre
> modules" (http://wiki.lustre.org/images/4/4f/RobertReadTalk1.pdf).

I was wondering that also, but according to the original URL from Roman, the mechanism for measuring code coverage was changed in the recent runs, so I don't know if it is possible to do head-to-head comparisons.

>> I can definitely imagine that many error handling code paths (e.g. checking for allocation failures) would not be exercised without specific changes (see e.g. my unlanded patch to fix the OBD_ALLOC() failure injection code).
>
> Cray has started looking at testing with forced memory allocation failures
> from the Linux fault injection framework
> (http://www.kernel.org/doc/Documentation/fault-injection/fault-injection.txt).

I've seen this, but hadn't actually had time to look into it. I'm happy to see you taking the initiative to try out this new avenue for testing.

Another related (though different) set of tests would be to run on a client or server booted with a smaller amount of RAM (say 512MB-1GB) and see what problems appear. I suspect there are a lot of hash tables, constants, and such that do not properly scale with RAM size.

> As we make progress we'll open tickets and push patches. I expect to
> find problems ;)

Yes, no doubt. It is probably worthwhile to check the CEA Coverity patches before submitting anything new, in case those failures are already fixed there. It is probably also worthwhile to submit a patch that removes the equivalent fault-injection code from the Lustre code paths, since it is pure runtime overhead for every memory allocation at this point.

> Andreas, were you talking about http://review.whamcloud.com/#change,3037? If not, what ticket were you referring to?

Yes, that was it. This patch has a few minor fixes that I found in my testing, and fixes the error messages, but there is no point in fixing the fault injection code anymore.

Cheers, Andreas
--
Andreas Dilger
Whamcloud, Inc.
Principal Lustre Engineer
http://www.whamcloud.com/
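[For the low-RAM experiment Andreas suggests, the usual lever is the mem= kernel boot parameter rather than physically removing DIMMs. A sketch for a RHEL-family node; grubby and the 768M figure are assumptions, and on other distros the grub config would be edited by hand.]

    # Cap usable RAM on a test node at 768MB via the kernel command line.
    grubby --update-kernel=ALL --args="mem=768M"   # RHEL-family boot-config helper
    reboot

    # After the node comes back, confirm the cap took effect:
    grep MemTotal /proc/meminfo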
Sébastien Buisson
2012-Oct-03 06:50 UTC
[Lustre-discuss] [Discuss] coverage measurement at 2012 09 15
On 03/10/2012 00:44, Andreas Dilger wrote:
> Yes, no doubt. It is probably worthwhile to check the CEA Coverity patches before submitting anything new, in case those failures are already fixed there.

I am sure every one of you would have made the correction already, but just to make it clear: it is Bull that is carrying out the work with Coverity :)

Cheers,
Sébastien.
Roman Grigoryev
2012-Oct-03 09:35 UTC
[Lustre-discuss] [Discuss] coverage measurement at 2012 09 15
Hi Andreas,

On 09/29/2012 04:24 PM, Dilger, Andreas wrote:
> Hi Roman,
> The coverage data is interesting. It would be even more useful to be able
> to compare it to the previous code coverage run, if they used the same
> method for measuring coverage (the new report states that the method has
> changed and reduced coverage).

On the page http://www.opensfs.org/foswiki/bin/view/Lustre/CodeCoverage I am trying to maintain something like a history. The page is maintained manually. In my measurements I mostly observe fairly small coverage changes from lustre/tests code updates; most of the coverage differences come from improvements I have made to the collecting process. The next steps I want to take are removing the testing binaries (for example lustre/tests) from my report and including more suites in the runs.

For publishing regular measurements we (Xyratex, and maybe others) also need to solve some technical issues:
- where/how to deploy results?
- how to make history diagrams?
- whether to publish the raw coverage results (and if yes, where?)

Internally, Jenkins and HTTP sharing serve us for these tasks.

> Are the percentages of code coverage getting better or worse? Are there
> particular areas of the code that have poor coverage that could benefit
> from some focused attention with new tests?

The last question can be answered (more or less precisely) by looking at the current coverage report.

> I can definitely imagine that many error handling code paths (e.g.
> checking for allocation failures) would not be exercised without specific
> changes (see e.g. my unlanded patch to fix the OBD_ALLOC() failure
> injection code).

I absolutely agree that some paths cannot be executed in a regular environment. Often this is error or constraint processing code (call it "error-processing" code). I think a metric like coverage of the "non-error-processing" code could be interesting and useful, and could be interpreted as coverage of the "often-used" or "positive" code. In terms of quality, I give higher priority to an undetected bug in "non-error-processing" code than to the same bug in "error-processing" code.

Maybe it is a good idea to somehow mark "error-processing" or "hard-to-execute" code and produce a report with this code excluded. In more modern languages this code often sits in the "catch" block of an exception handler, and such blocks can be tested via unit tests. That raises a question: where should this code be tested, in unit or functional tests? Testing it in unit tests is often simpler.

> Running a test with periodic random allocation failures enabled and fixing
> the resulting bugs would improve coverage, though not in a systematic way
> that could be measured/repeated. Still, this would find a class of
> hard-to-find bugs.
>
> Similarly, running racer for extended periods is a good form of coverage
> generation, even if not systematic/repeatable. I think the racer code
> could be improved/extended by adding racer scripts that are
> Lustre-specific or exercise new functionality (e.g. "lfs setstripe",
> setfattr, getfattr, setfacl, getfacl). Running multiple racer instances on
> multiple clients/mounts and throwing recovery into the mix would
> definitely find new bugs.

Here a tricky point emerges. We could have a test which generates coverage that is not regularly repeatable. Should we include such a test in the regular report? I think not, because we want repeatable results in order to evaluate coverage continuously. But I think such a test's coverage could occasionally be evaluated separately; we could then build some prediction of its coverage and include that in the full report.

Thanks,
Roman
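[The published pages don't spell out the collection toolchain, but for a gcov-instrumented build the workflow Roman describes (capture per run, then strip lustre/tests from the report) maps onto stock lcov commands. A sketch under that assumption; the file names and the suite invocation are illustrative.]

    #!/bin/bash
    # One coverage run with lcov against a gcov-instrumented
    # kernel/Lustre build; without --directory, lcov operates on
    # kernel gcov data from debugfs.
    lcov --zerocounters                      # reset counters before the run

    # ... run one test suite against the instrumented system here ...

    lcov --capture --output-file run.info    # harvest gcov data for this run

    # Drop the test harness itself from the report, as proposed above:
    lcov --remove run.info 'lustre/tests/*' --output-file filtered.info

    # Render the browsable HTML report:
    genhtml filtered.info --output-directory coverage-report/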
Dilger, Andreas
2012-Oct-03 17:51 UTC
[Lustre-discuss] [Discuss] coverage measurement at 2012 09 15
On Oct 3, 2012, at 12:50 AM, Sébastien Buisson wrote:
> On 03/10/2012 00:44, Andreas Dilger wrote:
>> Yes, no doubt. It is probably worthwhile to check the CEA Coverity patches before submitting anything new, in case those failures are already fixed there.
>
> I am sure every one of you would have made the correction already, but just to make it clear: it is Bull that is carrying out the work with Coverity :)

My bad, too many patches flying around lately. I definitely think that tools like Coverity and other static code analysis, and fault injection, are important to improving Lustre code quality, and I want to make sure that credit goes where it is due. Thanks for correcting my mistake.

Cheers, Andreas
--
Andreas Dilger
Lustre Software Architect
Intel Corporation