Harp, Thom via llvm-dev
2018-Feb-01 20:03 UTC
[llvm-dev] Customizing SBCC for lcov workflows
I’m working to implement Source Based Code Coverage in a workflow that uses lcov for report generation. We’ve customized our llvm-cov to add a command to convert the SBCC counter data to lcov’s ‘.info’ format. The problem is that the region-based counter definitions in SBCC can span source code regions that contain blank lines (or lines with only comments). Converting this to lcov’s line-based format skews our line and hit counts wildly compared to reports we generate from gcov data. E.g., a ‘for’ loop might span 10 lines in the source code, but if 7 of those are comments or blank lines, gcov counts only the 3 lines that are executable while SBCC counts all 10 because at least one counter will span the whole block. llvm-cov has no visibility into the source code so we can’t reliably clean this up from there.

Instead I’m thinking of modifying the coverage mapping generator (clang/lib/CodeGen/CoverageMappingGen.cpp) to (optionally) record the line numbers for source lines that have real code on them and emit this information alongside the coverage-mapping data in the object files. In llvm-cov I can then use the extra info so SBCC counts blank lines the same way gcov does. I’m writing to solicit feedback on this idea.

To identify the interesting line numbers, I record the line numbers for statements that have no children while the coverage mapping generator walks the AST. These statements are leaf nodes and correspond to single, concrete items in the source code (effectively filtering out statements that can span multiple lines of code). Writing the collected line numbers to the __llvm_covmap section is easy enough. I’m still working through the coverage mapping reader code so I can take advantage of the new information in llvm-cov while generating the lcov .info files.

I’d like to eventually push this change upstream (I can’t be the only one wanting to use lcov and SBCC together), so I’m seeking feedback before I invest too much time in my current direction.
Is my approach sound? Is there a better way to get source code information from llvm-cov?

Thanks.
-thom
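The line-count skew described in this message can be modeled outside of LLVM. The following is a minimal sketch in Python, not the llvm-cov implementation: `Region`, `line_counts`, and `to_lcov` are hypothetical names invented for illustration, and the set of code-bearing lines is assumed to be supplied by some other step (for example, the leaf-statement line numbers proposed above). Only the lcov tracefile record names (`SF`, `DA`, `LF`, `LH`, `end_of_record`) are taken from lcov itself.

```python
from typing import Dict, Iterable, NamedTuple, Optional, Set


class Region(NamedTuple):
    """Hypothetical stand-in for a coverage region: inclusive line span plus counter."""
    first_line: int
    last_line: int
    count: int


def line_counts(regions: Iterable[Region]) -> Dict[int, int]:
    """Expand regions to per-line counts; narrower (nested) regions win."""
    counts: Dict[int, int] = {}
    # Visit wide regions first so a nested region overwrites its parent.
    by_width = sorted(regions, key=lambda r: r.last_line - r.first_line, reverse=True)
    for r in by_width:
        for line in range(r.first_line, r.last_line + 1):
            counts[line] = r.count
    return counts


def to_lcov(path: str, regions: Iterable[Region],
            code_lines: Optional[Set[int]] = None) -> str:
    """Emit one lcov record; if code_lines is given, drop all other lines."""
    counts = line_counts(regions)
    if code_lines is not None:
        counts = {ln: c for ln, c in counts.items() if ln in code_lines}
    out = [f"SF:{path}"]
    out += [f"DA:{ln},{c}" for ln, c in sorted(counts.items())]
    out.append(f"LF:{len(counts)}")                               # lines found
    out.append(f"LH:{sum(1 for c in counts.values() if c > 0)}")  # lines hit
    out.append("end_of_record")
    return "\n".join(out) + "\n"


# A loop region spanning 10 lines, of which only lines 1, 5 and 10 hold code.
loop = [Region(1, 10, 5)]
print(to_lcov("loop.c", loop))                        # reports all 10 lines
print(to_lcov("loop.c", loop, code_lines={1, 5, 10})) # reports only 3 lines
```

Without the filter the record reports 10 lines found; with it, 3, which matches the gcov-style count in the ‘for’ loop example.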
Vedant Kumar via llvm-dev
2018-Feb-02 21:50 UTC
[llvm-dev] Customizing SBCC for lcov workflows
Hi Thom,

> On Feb 1, 2018, at 12:03 PM, Harp, Thom via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> I’m working to implement Source Based Code Coverage in a workflow that uses lcov for report generation. We’ve customized our llvm-cov to add a command to convert the SBCC counter data to lcov’s ‘.info’ format. The problem is that the region-based counter definitions in SBCC can span source code regions that can contain blank lines (or lines with only comments). Converting this to lcov’s line-based format skews our line and hit counts wildly compared to reports we generate from gcov data. E.g., a ‘for’ loop might span 10 lines in the source code, but if 7 of those are comments or blank lines, gcov counts only the 3 lines that are executable while SBCC counts all 10 because at least one counter will span the whole block. llvm-cov has no visibility into the source code so we can’t reliably clean this up from there.
>
> Instead I’m thinking of modifying the coverage mapping generator (clang/lib/CodeGen/CoverageMappingGen.cpp) to (optionally) record the line numbers for source lines that have real code on them and emit this information alongside the coverage-mapping data in the object files. In llvm-cov I can then use the extra info so SBCC counts blank lines the same way gcov does. I’m writing to solicit feedback on this idea.
>
> To identify the interesting line numbers, I record the line numbers for statements that have no children while the coverage mapping generator walks the AST. These statements are leaf nodes and correspond to single, concrete items in the source code (effectively filtering out statements that can span multiple lines of code). Writing the collected line numbers to the __llvm_covmap section is easy enough. I’m still working through the coverage mapping reader code so I can take advantage of the new information in llvm-cov while generating the lcov .info files.

This sounds like a reasonable plan -- I don't think it needs to be an opt-in behavior.
Once you've built up the list of source ranges that should be skipped you can feed it into gatherSkippedRegions(). I anticipate that most of the work will be in updating test cases in clang and compiler-rt.

> I’d like to eventually push this change upstream (I can’t be the only one wanting to use lcov and SBCC together), so I’m seeking feedback before I invest too much time in my current direction. Is my approach sound?

That sounds great. My advice is to upstream your changes in small self-contained pieces. Concretely, it'd be nice to have the .info generation before any clang changes. Feel free to add me as a reviewer on Phab (reviews.llvm.org <http://reviews.llvm.org/>).

best,
vedant

> Is there a better way to get source code information from llvm-cov?
>
> Thanks.
> -thom
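The reply above talks about building a list of source ranges to skip, whereas the original proposal collects the lines that do contain code. One way to get from the second to the first is to take the complement within a region. The sketch below is a Python model of that step only; `skipped_line_ranges` is a hypothetical helper, and it makes no claim about the actual signature or input format of gatherSkippedRegions() in clang.

```python
from typing import Iterable, List, Tuple


def skipped_line_ranges(first_line: int, last_line: int,
                        code_lines: Iterable[int]) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) line ranges inside the span that hold no code."""
    code = set(code_lines)
    ranges: List[Tuple[int, int]] = []
    start = None  # first line of the run of non-code lines currently open, if any
    for line in range(first_line, last_line + 1):
        if line in code:
            if start is not None:
                # A code line closes the open run of skippable lines.
                ranges.append((start, line - 1))
                start = None
        elif start is None:
            start = line
    if start is not None:
        # The span ended while a run was still open.
        ranges.append((start, last_line))
    return ranges


# The 10-line loop from the thread, with code only on lines 1, 5 and 10.
print(skipped_line_ranges(1, 10, {1, 5, 10}))  # [(2, 4), (6, 9)]
```

Working in ranges rather than individual lines keeps the list short when comments or blank lines come in long runs.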