Hi

Not sure if this is a clang or llvm related question, so I'm sending to both
mailing lists.

Anyway, I have a few questions regarding the size and execution time of
instrumented code:

We are trying to run code coverage on memory-limited hardware and are
investigating both approaches: generating gcov output using --coverage, and
llvm's own way using the -fprofile-instr-generate -fcoverage-mapping clang
flags. In my questions I refer to the two methods as llgcov-way and
llcovprof-way respectively.

Q1- How come the size of instrumented code in llgcov-way turns out bigger
than in llcovprof-way? I would imagine the other way around, because the
mapping information would go to *.gcno files.

Q2- The instrumented executable size of both llgcov-way and llcovprof-way
(with and without optimization) is bloated 2x to 10x, or in some cases 50x,
depending on the program. Here is the output of the size command for the
variations on a simple test program that I wrote:

   text    data     bss     dec     hex filename
   5625     700    1696    8021    1f55 simpletest.-O0-g.llcovprof.
  12838     776    1808   15422    3c3e simpletest.-O0-g.llgcov.
   1481     492    1616    3589     e05 simpletest.-O0.none
   5337     700    1696    7733    1e35 simpletest.-O1-g.llcovprof.
  12246     776    1792   14814    39de simpletest.-O1-g.llgcov.
   1345     492    1616    3453     d7d simpletest.-O1.none

I was wondering if there is any suggestion for reducing the size, either
through more optimization or by compromising some feature.

Q3- In llcovprof-way, since the runtime profile data is collected in a single
file, the file system will serialize multi-threaded writes, hence increasing
the execution time. Is there a way to avoid this?

Thanks
Ali
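For reference, the growth factors implied by the size output in Q2 can be worked out directly. The following is only a small sketch: the numbers are copied from the table in the message, while the dictionary layout and the helper names (SIZES, total, growth) are made up for illustration.

```python
# Section sizes in bytes, copied from the `size` output quoted in Q2.
# Keys are (optimization level, instrumentation method); the layout is
# an illustrative choice, not something from the original message.
SIZES = {
    ("O0", "none"):      {"text": 1481,  "data": 492, "bss": 1616},
    ("O0", "llcovprof"): {"text": 5625,  "data": 700, "bss": 1696},
    ("O0", "llgcov"):    {"text": 12838, "data": 776, "bss": 1808},
    ("O1", "none"):      {"text": 1345,  "data": 492, "bss": 1616},
    ("O1", "llcovprof"): {"text": 5337,  "data": 700, "bss": 1696},
    ("O1", "llgcov"):    {"text": 12246, "data": 776, "bss": 1792},
}


def total(row):
    """Sum of text, data and bss (the 'dec' column of `size`)."""
    return row["text"] + row["data"] + row["bss"]


def growth(opt, method, column="text"):
    """Instrumented size divided by the uninstrumented size at the same -O level."""
    base = SIZES[(opt, "none")]
    inst = SIZES[(opt, method)]
    if column == "dec":
        return total(inst) / total(base)
    return inst[column] / base[column]


if __name__ == "__main__":
    for opt in ("O0", "O1"):
        for method in ("llcovprof", "llgcov"):
            print(f"-{opt} {method}: "
                  f"text x{growth(opt, method):.2f}, "
                  f"total x{growth(opt, method, 'dec'):.2f}")
```

For this test program the text section grows by a bit under 4x with llcovprof-way and by roughly 9x with llgcov-way, which is where most of the overall increase comes from.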
"Moshtaghi, Alireza" <Alireza.Moshtaghi at netapp.com> writes:> Hi > Not sure if this is a clang or llvm related question so I’m sending to both > mailing lists. > Anyways, I have few questions regarding size and execution time of > instrumented code: > We are trying to run code coverage on memory limited hardware and > investigating both (generating gcov output using —coverage and the llvm’s own > way using -fprofile-instr-generate -fcoverage-mapping clang flags) In my > questions I refer to the two methods as llgcov-way and llcovprof-way > respectively. > > Q1- How come size of instrumented code in llgcov-way turns bigger than > llcovprof-way? I would imagine the other way because mapping information would > go to *.gcno files.I haven't measured, but I can make a guess. The gcov way isn't very smart about minimizing the number of counters needed, so it tends to grow the text section of the instrumented binary by quite a bit. More counters also means that the counters get in the way of optimizations more, so it's harder to reduce the binary size.> Q2- The instrumented executable size of both llgcov-way and llcovprof-way > (with and without optimization) is bloated 2x to 10x or in some cases 50x > depending on the program. Here is output of size command for the variations on > a simple test program that I wrote: > text data bss dec hex filename > 5625 700 1696 8021 1f55 simpletest.-O0-g.llcovprof. > 12838 776 1808 15422 3c3e simpletest.-O0-g.llgcov. > 1481 492 1616 3589 e05 simpletest.-O0.none > 5337 700 1696 7733 1e35 simpletest.-O1-g.llcovprof. > 12246 776 1792 14814 39de simpletest.-O1-g.llgcov. 
> 1345 492 1616 3453 d7d simpletest.-O1.none > I was wondering if there is any suggestion for reducing the size either > through more optimization or by compromising some feature.The coverage data that the instrprof coverage adds to the binary isn't actually necessary for runtime, so you could probably get away with building the binary twice (once with -fcoverage-mapping, once with only -fprofile-instr-generate) and using one for execution and the other for data collection. The binaries will still be larger, due to the profiling itself, but it might help. Let me know if this works - it might be worth adding an option to emit the coverage data into a separate file if this is valuable.> Q3- in llcovprof-way since the runtime profile data is collected in a single > file, the file system will serialize multi threaded writes, hence increasing > the execution time. Is there a way to avoid this?There isn't anything in place to help with this right now.
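As a rough illustration of the two-build suggestion in the answer to Q2, the commands involved might be assembled as below. This is a sketch under assumptions, not a tested recipe: the file names (simpletest.c, simpletest.run, simpletest.map, simpletest.profdata) and the helper functions are hypothetical, and it assumes the raw profile is written under the default name default.profraw. Only the command lines are built here; nothing is executed.

```python
import shlex


def build_commands(source="simpletest.c"):
    """Return the two clang invocations for the two-build approach."""
    # Binary that runs on the memory-limited target: counters only,
    # without the coverage mapping data.
    run_build = ["clang", "-fprofile-instr-generate",
                 source, "-o", "simpletest.run"]
    # Binary kept on the host: same flags plus the coverage mapping,
    # used only when generating the report.
    map_build = ["clang", "-fprofile-instr-generate", "-fcoverage-mapping",
                 source, "-o", "simpletest.map"]
    return run_build, map_build


def report_commands(raw_profile="default.profraw"):
    """Return the commands that turn a raw profile into a coverage report."""
    # Convert the raw profile written by the target binary into indexed form.
    merge = ["llvm-profdata", "merge", "-o", "simpletest.profdata", raw_profile]
    # Read the mapping from the host-side binary and the counts from the profile.
    show = ["llvm-cov", "show", "./simpletest.map",
            "-instr-profile=simpletest.profdata"]
    return merge, show


if __name__ == "__main__":
    for cmd in (*build_commands(), *report_commands()):
        print(shlex.join(cmd))
```

Both builds should presumably use otherwise identical sources and options, so that the counters recorded by the first binary line up with the mapping stored in the second.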
Hi Justin,

Thanks for your response.

Your suggestion works, though, as you also suggested, it would be of great
value if an option were added to put the coverage data into a separate file.
Our build takes some of the hardware limitations into account and would not
build a full object file with both instructions and mapping data if they get
too big. In addition, building two large systems which do the same thing
would be a big waste.

I would think that if an option is added to the compiler, it would also
require an absolute file path, to which the compiler would append the
coverage information from each new compilation unit, right? Or are you
thinking of another way?

Hypothetically, if the coverage info is saved in ELF format, llvm-cov could
read it without modification, right?

Nevertheless, it might be a useful thing for others as well, so would it be
possible for you to add it to the tree? Or, if you give me some pointers, I
can implement it and send the patch.

An alternative would probably be tweaking the linker to create two files
instead of one, which would be kind of a hack IMHO.

Thanks
Ali

On 4/9/15, 2:51 PM, "Justin Bogner" <mail at justinbogner.com> wrote:

>>Q2- The instrumented executable size of both llgcov-way and llcovprof-way
>>(with and without optimization) is bloated 2x to 10x or in some cases 50x
>>depending on the program. Here is output of size command for the
>>variations on a simple test program that I wrote:
>>
>>   text    data     bss     dec     hex filename
>>   5625     700    1696    8021    1f55 simpletest.-O0-g.llcovprof.
>>  12838     776    1808   15422    3c3e simpletest.-O0-g.llgcov.
>>   1481     492    1616    3589     e05 simpletest.-O0.none
>>   5337     700    1696    7733    1e35 simpletest.-O1-g.llcovprof.
>>  12246     776    1792   14814    39de simpletest.-O1-g.llgcov.
>>   1345     492    1616    3453     d7d simpletest.-O1.none
>>
>>I was wondering if there is any suggestion for reducing the size either
>>through more optimization or by compromising some feature.
>
>The coverage data that the instrprof coverage adds to the binary isn't
>actually necessary for runtime, so you could probably get away with
>building the binary twice (once with -fcoverage-mapping, once with only
>-fprofile-instr-generate) and using one for execution and the other for
>data collection. The binaries will still be larger, due to the profiling
>itself, but it might help.
>
>Let me know if this works - it might be worth adding an option to emit
>the coverage data into a separate file if this is valuable.