Hi,

next coverage measurement published, please see
http://www.opensfs.org/foswiki/bin/view/Lustre/CodeCoverage20120915

Entrance page: http://www.opensfs.org/foswiki/bin/view/Lustre/CodeCoverage

Thanks,
Roman
Dilger, Andreas
2012-Sep-29 12:24 UTC
[Lustre-discuss] [Discuss] coverage measurement at 2012 09 15
Hi Roman,

The coverage data is interesting. It would be even more useful to be able to compare it to the previous code coverage run, if they used the same method for measuring coverage (the new report states that the method has changed and reduced coverage).

Are the percentages of code coverage getting better or worse? Are there particular areas of the code that have poor coverage that could benefit from some focused attention with new tests?

I can definitely imagine that many error handling code paths (e.g. checking for allocation failures) would not be exercised without specific changes (see e.g. my unlanded patch to fix the OBD_ALLOC() failure injection code). Running a test with periodic random allocation failures enabled and fixing the resulting bugs would improve coverage, though not in a systematic way that could be measured/repeated. Still, this would find a class of hard-to-find bugs.

Similarly, running racer for extended periods is a good form of coverage generation, even if not systematic/repeatable. I think the racer code could be improved/extended by adding racer scripts that are Lustre-specific or exercise new functionality (e.g. "lfs setstripe", setfattr, getfattr, setfacl, getfacl). Running multiple racer instances on multiple clients/mounts and throwing recovery into the mix would definitely find new bugs.

In general, having the code coverage is a good starting point, but it isn't necessarily useful if nothing is done to improve the coverage of the tests as a result.

Cheers, Andreas
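[Racer's scripts live in lustre/tests/racer/ and are tiny shell loops, each applying one random operation to a random file in a shared directory. Below is a hypothetical sketch of the kind of Lustre-specific addition Andreas describes; the script name and the DIR/MAX handling are assumed from racer's conventions, not taken from the tree.]

    #!/bin/bash
    # Hypothetical racer-style script (e.g. file_setstripe_xattr.sh):
    # loop forever applying random striping and xattr/ACL operations
    # to random files under $DIR, in the spirit of
    # lustre/tests/racer/*.sh. DIR and MAX are assumed to be exported
    # by the racer harness; defaults here are illustrative.
    DIR=${DIR:-/mnt/lustre/racer}
    MAX=${MAX:-20}

    while /bin/true; do
        file=$DIR/$((RANDOM % MAX))
        case $((RANDOM % 4)) in
        0) lfs setstripe -c $((RANDOM % 4)) $file ;;  # random stripe count, 0 = default
        1) setfattr -n user.racer -v $RANDOM $file ;; # set a user xattr
        2) getfattr -d $file ;;                       # dump user xattrs
        3) setfacl -m u:nobody:rw $file ;;            # add an ACL entry
        esac 2> /dev/null                             # failures are expected and fine
    done

[The harness runs all such scripts in parallel against the same small namespace, so the value comes from operations colliding with each other, and with recovery if mixed in, rather than from any single call succeeding.]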
Cory Spitz
2012-Oct-02 20:19 UTC
[Lustre-discuss] [Discuss] coverage measurement at 2012 09 15
Hi,

> Are the percentages of code coverage getting better or worse?

I don't know exactly, but based on the information that Robert Read shared at LUG '09, sanity was netting "60-70% coverage of core Lustre modules" (http://wiki.lustre.org/images/4/4f/RobertReadTalk1.pdf).

> I can definitely imagine that many error handling code paths (e.g. checking for allocation failures) would not be exercised without specific changes (see e.g. my unlanded patch to fix the OBD_ALLOC() failure injection code).

Cray has started looking at testing with forced memory allocation failures from the Linux fault injection framework (http://www.kernel.org/doc/Documentation/fault-injection/fault-injection.txt). As we make progress we'll open tickets and push patches. I expect to find problems ;)

Andreas, were you talking about http://review.whamcloud.com/#change,3037? If not, what ticket were you referring to?

Thanks,
-Cory
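[For reference, the debugfs knobs documented in the fault-injection text Cory links are enough for a first experiment. A minimal sketch, assuming a kernel built with CONFIG_FAULT_INJECTION_DEBUG_FS and CONFIG_FAILSLAB; the rates follow the document's own example and would likely need tuning for Lustre.]

    #!/bin/bash
    # Turn on random slab allocation failures via the fault-injection
    # debugfs interface.
    F=/sys/kernel/debug/failslab
    echo 10  > $F/probability      # fail 10% of eligible allocations...
    echo 100 > $F/interval         # ...but at most one failure per 100 calls
    echo -1  > $F/times            # no limit on the total number of failures
    echo 0   > $F/space            # no byte budget before failing starts
    echo 2   > $F/verbose          # log a stack trace for each injected failure
    echo 0   > $F/ignore-gfp-wait  # also fail allocations that may sleep

    # ... run the Lustre test suite, then check dmesg for error-path fallout ...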
Christopher J. Morrone
2012-Oct-02 20:33 UTC
[Lustre-discuss] [Discuss] coverage measurement at 2012 09 15
On 10/02/2012 01:19 PM, Cory Spitz wrote:
> Cray has started looking at testing with forced memory allocation failures
> from the Linux fault injection framework
> (http://www.kernel.org/doc/Documentation/fault-injection/fault-injection.txt).
> As we make progress we'll open tickets and push patches. I expect to
> find problems ;)

Cool!
Andreas Dilger
2012-Oct-02 22:44 UTC
[Lustre-discuss] [Discuss] coverage measurement at 2012 09 15
On 2012-10-02, at 2:19 PM, Cory Spitz wrote:

>> Are the percentages of code coverage getting better or worse?
>
> I don't know exactly, but based on the information that Robert Read
> shared at LUG '09, sanity was netting "60-70% coverage of core Lustre
> modules" (http://wiki.lustre.org/images/4/4f/RobertReadTalk1.pdf).

I was wondering that also, but according to the original URL from Roman, the mechanism for measuring code coverage was changed in the recent runs, so I don't know if it is possible to do head-to-head comparisons.

>> I can definitely imagine that many error handling code paths (e.g. checking for allocation failures) would not be exercised without specific changes (see e.g. my unlanded patch to fix the OBD_ALLOC() failure injection code).
>
> Cray has started looking at testing with forced memory allocation failures
> from the Linux fault injection framework
> (http://www.kernel.org/doc/Documentation/fault-injection/fault-injection.txt).

I've seen this, but hadn't actually had time to look into it. I'm happy to see you taking the initiative to try out this new avenue for testing.

Another related (though different) set of tests would be to run on a client or server booted with a smaller amount of RAM (say 512MB-1GB) and see what problems appear. I suspect there are a lot of hash tables, constants, and such that do not properly scale with RAM size.

> As we make progress we'll open tickets and push patches. I expect to
> find problems ;)

Yes, no doubt. It is probably worthwhile to check the CEA Coverity patches before submitting anything new, in case those failures are already fixed there. It is probably also worthwhile to submit a patch that removes the equivalent fault-injection code from the Lustre code paths, since it is pure runtime overhead for every memory allocation at this point.

> Andreas, were you talking about http://review.whamcloud.com/#change,3037? If not, what ticket were you referring to?

Yes, that was it. This patch has a few minor fixes that I found in my testing, and fixes the error messages, but there is no point in fixing the fault injection code anymore.

Cheers, Andreas
--
Andreas Dilger
Whamcloud, Inc.
Principal Lustre Engineer
http://www.whamcloud.com/
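[For the low-RAM experiment Andreas suggests, the usual lever is the mem= kernel boot parameter rather than physically removing DIMMs. A sketch for a RHEL-family node; grubby and the 768M figure are assumptions, and on other distros the grub config would be edited by hand.]

    # Cap usable RAM on a test node at 768MB via the kernel command line.
    grubby --update-kernel=ALL --args="mem=768M"   # RHEL-family boot-config helper
    reboot

    # After the node comes back, confirm the cap took effect:
    grep MemTotal /proc/meminfo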
Sébastien Buisson
2012-Oct-03 06:50 UTC
[Lustre-discuss] [Discuss] coverage measurement at 2012 09 15
On 03/10/2012 00:44, Andreas Dilger wrote:
> Yes, no doubt. It is probably worthwhile to check the CEA Coverity patches before submitting anything new, in case those failures are already fixed there.

I am sure every one of you would have made the correction already, but just to make it clear: it is Bull that is carrying out the work with Coverity :)

Cheers,
Sébastien.
Roman Grigoryev
2012-Oct-03 09:35 UTC
[Lustre-discuss] [Discuss] coverage measurement at 2012 09 15
Hi Andreas,

On 09/29/2012 04:24 PM, Dilger, Andreas wrote:
> Hi Roman,
> The coverage data is interesting. It would be even more useful to be able
> to compare it to the previous code coverage run, if they used the same
> method for measuring coverage (the new report states that the method has
> changed and reduced coverage).

On the page http://www.opensfs.org/foswiki/bin/view/Lustre/CodeCoverage I am trying to maintain something like a history. The page is maintained manually. In my measurements I mostly observe fairly small coverage changes from lustre/tests code updates; most of the coverage differences come from improvements I have made to the collecting process. The next steps I want to take are removing the testing binaries (for example lustre/tests) from my report and including more suites in the runs.

For publishing regular measurements we (Xyratex, and maybe others) also need to solve some technical issues:
- where/how to deploy results?
- how to make history diagrams?
- whether to publish the raw coverage results (and if yes, where?)

Internally, Jenkins and HTTP sharing serve us for these tasks.

> Are the percentages of code coverage getting better or worse? Are there
> particular areas of the code that have poor coverage that could benefit
> from some focused attention with new tests?

The last question can be answered (more or less precisely) by looking at the current coverage report.

> I can definitely imagine that many error handling code paths (e.g.
> checking for allocation failures) would not be exercised without specific
> changes (see e.g. my unlanded patch to fix the OBD_ALLOC() failure
> injection code).

I absolutely agree that some paths cannot be executed in a regular environment. Often this is error or constraint processing code (call it "error-processing" code). I think a metric like coverage of the "non-error-processing" code could be interesting and useful, and could be interpreted as coverage of the "often-used" or "positive" code. In terms of quality, I give higher priority to an undetected bug in "non-error-processing" code than to the same bug in "error-processing" code.

Maybe it is a good idea to somehow mark "error-processing" or "hard-to-execute" code and produce a report with this code excluded. In more modern languages this code often sits in the "catch" block of an exception handler, and such blocks can be tested via unit tests. That raises a question: where should this code be tested, in unit or functional tests? Testing it in unit tests is often simpler.

> Running a test with periodic random allocation failures enabled and fixing
> the resulting bugs would improve coverage, though not in a systematic way
> that could be measured/repeated. Still, this would find a class of
> hard-to-find bugs.
>
> Similarly, running racer for extended periods is a good form of coverage
> generation, even if not systematic/repeatable. I think the racer code
> could be improved/extended by adding racer scripts that are
> Lustre-specific or exercise new functionality (e.g. "lfs setstripe",
> setfattr, getfattr, setfacl, getfacl). Running multiple racer instances on
> multiple clients/mounts and throwing recovery into the mix would
> definitely find new bugs.

Here a tricky point emerges. We could have a test which generates coverage that is not regularly repeatable. Should we include such a test in the regular report? I think not, because we want repeatable results in order to evaluate coverage continuously. But I think such a test's coverage could occasionally be evaluated separately; we could then build some prediction of its coverage and include that in the full report.

Thanks,
Roman
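[The published pages don't spell out the collection toolchain, but for a gcov-instrumented build the workflow Roman describes (capture per run, then strip lustre/tests from the report) maps onto stock lcov commands. A sketch under that assumption; the file names and the suite invocation are illustrative.]

    #!/bin/bash
    # One coverage run with lcov against a gcov-instrumented
    # kernel/Lustre build; without --directory, lcov operates on
    # kernel gcov data from debugfs.
    lcov --zerocounters                      # reset counters before the run

    # ... run one test suite against the instrumented system here ...

    lcov --capture --output-file run.info    # harvest gcov data for this run

    # Drop the test harness itself from the report, as proposed above:
    lcov --remove run.info 'lustre/tests/*' --output-file filtered.info

    # Render the browsable HTML report:
    genhtml filtered.info --output-directory coverage-report/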
Dilger, Andreas
2012-Oct-03 17:51 UTC
[Lustre-discuss] [Discuss] coverage measurement at 2012 09 15
On Oct 3, 2012, at 12:50 AM, Sébastien Buisson wrote:
> On 03/10/2012 00:44, Andreas Dilger wrote:
>> Yes, no doubt. It is probably worthwhile to check the CEA Coverity patches before submitting anything new, in case those failures are already fixed there.
>
> I am sure every one of you would have made the correction already, but just to make it clear: it is Bull that is carrying out the work with Coverity :)

My bad, too many patches flying around lately. I definitely think that tools like Coverity and other static code analysis, and fault injection, are important to improving Lustre code quality, and I want to make sure that credit goes where it is due. Thanks for correcting my mistake.

Cheers, Andreas
--
Andreas Dilger
Lustre Software Architect
Intel Corporation