Linhai Song via llvm-dev
2018-Jan-26  06:04 UTC
[llvm-dev] count how many basic block executed
Hello everyone,
I am writing a pass that instruments a program to count how many basic blocks
are executed. My current approach is to add a local counter to each
function, increment it by 1 in each basic block, and save its value to a
global counter. The current runtime overhead is around 25%.
Is there any way to lower the overhead, such as keeping the local counter
in a register or applying the path profiling algorithm?
Thanks a lot!
Best,
                         Linhai
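For readers following the thread, the scheme described above can be pictured with a hand-instrumented C++ function. This is only a sketch of what the instrumented code would behave like, not the poster's actual pass: the function abs_diff, the global g_blocks_executed, and the four-block layout (entry, then, else, exit) are all assumptions made for illustration.

```cpp
#include <cstdint>

// Hypothetical global counter that an instrumentation pass would emit.
static std::uint64_t g_blocks_executed = 0;

// Hand-instrumented stand-in for a function with four basic blocks:
// entry, then, else, exit. Only three of them run on any one call.
int abs_diff(int a, int b) {
    std::uint64_t local = 0;     // per-function local counter
    ++local;                     // entry block
    int r;
    if (a > b) {
        ++local;                 // "then" block
        r = a - b;
    } else {
        ++local;                 // "else" block
        r = b - a;
    }
    ++local;                     // exit block
    g_blocks_executed += local;  // single flush to the global counter
    return r;
}
```

Each call adds 3 to the global counter, with one memory update on the global per call rather than one per block.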
John Criswell via llvm-dev
2018-Jan-27  21:11 UTC
[llvm-dev] count how many basic block executed
On 1/26/18 1:04 AM, Linhai Song via llvm-dev wrote:
> Hello everyone,
>
> I am writing a pass to instrument program and count how many basic
> block executed. What I have tried is to instrument a local counter
> inside each function, add 1 to the local counter inside each basic
> block, and save the counter value to a global counter. The current
> runtime overhead is around 25%. Is there any way I can try to lower
> the overhead? Like keeping the local counter inside a register or
> applying the path profiling algorithm?

By "local counter," I assume you mean that you created an alloca
instruction that allocates memory and that you increment the value in this
alloca'ed memory using a load, add, and store instruction. Is that correct?

If so, have you tried using the mem2reg pass to convert the local counter
into an SSA virtual register? That may speed it up a bit. After that, other
LLVM optimizations may be able to remove redundant instructions or combine
additions.

If that isn't enough, then you'll probably need to make your
instrumentation smarter. LLVM has passes that you can use to locate loops; if
the loop has the right structure, you can increment the count at the end of
the loop. Likewise, if you can find control-equivalent basic blocks, you only
need to increment the counter in one of them.

Regards,
John Criswell

--
John Criswell
Assistant Professor
Department of Computer Science, University of Rochester
http://www.cs.rochester.edu/u/criswell
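The last suggestion (batching the count at the end of a loop, and counting control-equivalent blocks together) can be sketched in plain C++. This is a hand-written illustration, not LLVM API code: the functions sum_naive and sum_batched, the two global counters, and the assumed block layout (entry, loop header, loop body, exit) are hypothetical. "Right structure" is taken here to mean a loop whose trip count is known once the loop exits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical global counters, one per instrumentation strategy.
static std::uint64_t g_count_naive = 0;
static std::uint64_t g_count_batched = 0;

// Naive instrumentation: one increment in every executed block.
long sum_naive(const int* data, int n) {
    assert(n >= 0);
    std::uint64_t local = 0;
    ++local;                     // entry block
    long sum = 0;
    for (int i = 0;; ++i) {
        ++local;                 // loop header, runs n + 1 times
        if (i >= n) break;
        ++local;                 // loop body, runs n times
        sum += data[i];
    }
    ++local;                     // exit block
    g_count_naive += local;
    return sum;
}

// Batched instrumentation: the loop itself is left untouched.
// Entry and exit are control equivalent (each runs exactly once per call),
// so they are counted together; header and body counts are derived from
// the trip count after the loop has finished.
long sum_batched(const int* data, int n) {
    assert(n >= 0);
    long sum = 0;
    for (int i = 0; i < n; ++i) {
        sum += data[i];
    }
    const std::uint64_t trips = static_cast<std::uint64_t>(n);
    g_count_batched += 2 + (trips + 1) + trips;  // entry+exit, header, body
    return sum;
}
```

Both versions report the same block count for the same input, but the batched one does a single counter update per call, which leaves the loop body free of instrumentation.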
Linhai Song via llvm-dev
2018-Jan-28  05:06 UTC
[llvm-dev] count how many basic block executed
Hi John,
Thanks a lot for the reply! I tried the mem2reg pass and also implemented the
algorithm proposed in "Efficiently Counting Program Events with Support for
On-line Queries" to place the local counter updates more carefully. If I build
the executable with -O0, the overhead is 20% - 30%. But if I build it with -O2,
the overhead is more than 3X. I suspect that instrumenting the counter
disables some optimizations. Are there any other suggestions I could try?
Thanks a lot!
Best,
                         Linhai