thr3ads.net - llvm dev - [LLVMdev] Profiling in LLVM Patch Followup 4 [Jul 2009]

If this information is useful, please help other people find it:
Share via:

Andreas Neustifter

2009-Jul-01 09:36 UTC

[LLVMdev] Profiling in LLVM Patch

Hi Daniel,

Daniel Dunbar wrote:> Hi Andreas,
> 
> First, thanks again for undertaking this work and submitting it back. There
is a
> lot of good stuff here and it would be great to see it get back into the
tree.Thanks for taking the time to review this, I know its a huge patch. I still have
a few questions on how you would like this patch to be re-factored and split up.
> [...]
> 
> 1. The intent of ProfileInfo is that it is the public interface used by the
rest
> of LLVM for all "profiling" related analyses. As such, we want it
to have a
> thin, well-defined interface which can be implemented by multiple analysis
> providers, just like we have with the current alias analysis interface.
Your
> current patch has a large number of changes to ProfileInfo which should
instead
> be inside a subclass of ProfileInfo, where most of the implementation
details
> are private.
> 
> I think the first task should be to separate out the changes which are
> appropriate for the ProfileInfo class itself and apply those to the tree. I
> believe that the important changes which are worth reflecting in the API
are
> changed the weights to return a double instead of an int, and defining -1
as a
> sentinel, "don't know", value.So you want the ProfileInfo to be untouched except for the double weights and
the MissingValue? Then mark all methods virtual and do a subclass that has this
sort of heavy implementation that I am using? Do you have any name suggestions
for this subclass?
One thing I would like to patch in the current ProfileInfo implementation is the
consistent use of the (0,entry) edge which is necessary to profile functions
with only one block (namely the entry block).
> This is the most important thing to separate out, and to do early, because
it is
> the main interface that is shared with LLVM. It is also important for my
GSoC
> student, who was going to work on static profiling for LLVM and will depend
on
> any changes to this interface.Yes, I thought so, I will try to do this quick.
> The current ProfileInfo functions should also become virtual, and all other
> implementation details should be left to the subclass.
> 
> Finally, I don't think the ignoreMissing parameter to getExecutionCount
makes
> sense, instead the provider should always return -1 if it doesn't have
the
> value, and clients should "do the right thing.Maybe, its of course always possibile to have custom wrappers that perform this
sort of task if necessary...
> 2. We shouldn't worry too much about preserving the functionality of
the current
> llvm-prof tool, which uses the ProfileInfoLoader directly. llvm-prof should
be
> rewritten to just use the public API provided by ProfileInfo. This should
be
> done early so there are less dependencies on the ProfileInfoLoader
interface.Okay, thats no problem. I guess it makes sense to rewrite llvm-prof but I was
not sure if this sort of changes would be appreciated.
> In fact, it might even make sense to just get rid of llvm-prof and instead
> implement a module level pass which consumes profiling information and
prints it
> out as part of the pass. This would allow us to debug the preservation of
> profiling information once we start modifying optimization passes to update
it.Good point.
> 3. The static profiling estimator should be implemented as just another
> ProfileInfo implementation. I believe this can be a separate patch.
Similarly
> the profile verifier is just another consumer of ProfileInfo and can be a
> separate patch.The ProfileEstimator is already a subclass of ProfileInfo and implements that
interface. Yes, these two are easily split into separate patches.
> 4. Finally, the new optimal edge instrumentation & ProfileInfo
implementation
> can be brought in.
> 
> Does this sound like a reasonable plan? Although it is a bit of work, I
don't
> think it will be that difficult and I am happy to help out with splitting
up the
> patch / refactoring the existing code.Yes, sounds good. I will try to split it up myself first and ask when there are
decisions to be made I'm not sure about.
> In addition, I have a number of "nitpicks" mostly related to LLVM
style:Well, they were to be expected :-) One never gets those right the first time
contributing to a new project...
>  [...]
>> +  /// getExitEdges - Return all pairs of
(_inside_block_,_outside_block_).
>> +  /// (Modelled after getExitingBlocks().)
>> +  typedef std::pair<BlockT*,BlockT*> Edge;
>> +  void getExitEdges(SmallVectorImpl<Edge> &ExitEdges) const
{
>> +    // Sort the blocks vector so that we can use binary search to do
quick
>> +    // lookups.
>> +    SmallVector<BlockT*, 128> LoopBBs(block_begin(),
block_end());
>> +    std::sort(LoopBBs.begin(), LoopBBs.end());
> 
> Would it be better to use a DenseSet for this lookup?Well, its the same that is used in getExitBlocks() where I got the
implementation from. I really don't know but I think the current
implementation is okay from what I gather form the LLVM Programmer's Manual.
>> --- llvm-van/include/llvm/Analysis/MaximumSpanningTree.h     
1970-01-01 01:00:00.000000000 +0100
>> +++ llvm-c7/include/llvm/Analysis/MaximumSpanningTree.h      
2009-06-26 16:47:01.000000000 +0200
> ...
>> +#include "llvm/Support/Streams.h"
> 
> Please don't use Support/Streams.h in a header, it pollutes any
translation unit
> which imports this header.Okay.
>> +    static void print(MaxSpanTree MST) {
>> +      cerr<<"{";
>> +      for ( MaxSpanTree::iterator ei = MST.begin(), ee = MST.end();
>> +            ei!=ee; ++ei ) {
>> +       
cerr<<"("<<((*ei).first?(*ei).first->getName():"0")<<",";
>> +        cerr<<(*ei).second->getName()<<")";
>> +      }
>> +      cerr<<"}\n";
>> +    }
>> +  };
> 
> To be more in line with LLVM I would recommend making this a nullary const
> method on the MST, and rename it to dump. Also, please use
Support/raw_ostream.h
> for output instead of C++ iostreams.Okay.
> 
>> --- llvm-van/include/llvm/Analysis/Passes.h   2009-06-29
13:49:13.000000000 +0200
>> +++ llvm-c7/include/llvm/Analysis/Passes.h    2009-06-26
16:48:02.000000000 +0200
>>    // createProfileLoaderPass - This pass loads information from a
profile dump
>> -  // file.
>> +  // file. Since its possible to use the ProfileInfoLoaderPass not
only from
>> +  // opt but also e.g. from llvm-prof and with different profile info
filenames
>> +  // a creator is provided where the toolname and profile info
filename can be
>> +  // provided. (The toolname is used to create proper error messages.)
>>    //
>>    ModulePass *createProfileLoaderPass();
>> +  ModulePass *createProfileLoaderPass(const std::string &, const
std::string &);
> 
> Please name the arguments so it is clear which is the toolname and which is
the
> profile info name (bonus points for switching to doxygen comments which
> reference the parameters).I guess I will go for the bonus points then.
>> --- llvm-van/include/llvm/Analysis/ProfileInfo.h      2009-05-12
10:37:52.000000000 +0200
>> +++ llvm-c7/include/llvm/Analysis/ProfileInfo.h       2009-06-26
16:47:01.000000000 +0200
> 
> Keep in mind that most comments here are assuming this functionality is
moved to
> a subclass specific to optimal edge profiling.
> 
>>    protected:
>> -    // EdgeCounts - Count the number of times a transition between two
blocks is
>> -    // executed.  As a special case, we also hold an edge from the
null
>> -    // BasicBlock to the entry block to indicate how many times the
function was
>> -    // entered.
>> -    std::map<std::pair<BasicBlock*, BasicBlock*>,
unsigned> EdgeCounts;
>> +    // EdgeInformation - Count the number of times a transition
between two
>> +    // blocks is executed.  As a special case, we also hold an edge
from the
>> +    // null BasicBlock to the entry block (henceforth (0,entry) to
indicate
>> +    // how many times the function was entered. (This is especially
necessary
>> +    // if there is only the entry block.)
>> +    std::map<Function*, EdgeCounts> EdgeInformation;
>> +
>> +    // BlockInformation - Count the number of times a block is
executed
>> +    std::map<Function*, BlockCounts> BlockInformation;
>> +
>> +    // FunctionInformation - Count the number of times a function is
executed
>> +    std::map<Function*, double> FunctionInformation;
> 
> Instead of multiple maps from Function* ->, would it make sense to
collect these
> three (or four) maps inside a per-function structure?Yes, makes sense, its then easier to factor out the retrieval and handling of
this structure.
>> +    // getFunctionForEdge() - This returns the Function for a given
Edge. This
>> +    // requires special handling for (0,entry) so its centralised
here.
>> +    static inline Function* getFunctionForEdge(Edge E) {
>> +      BasicBlock *BB = E.first;
>> +      if ( BB == 0 ) { BB = E.second; };
>> +      assert ( BB != 0 && "both BasicBlocks of Edge are
NULL, can not "
>> +                          "determine function");
>> +      return BB->getParent();
>> +    }
> 
> The edge destination should always be non-null, correct? Why not just
> --
>    static inline Function* getFunctionForEdge(Edge E) {
>      assert(E.second && "Invalid profile edge!");
>      return E.second->getParent();
>    }
> --Nice one. Thanks!
>> +    // getFunctionForBlock() - Same as for Edges without special
handling, for
>> +    // the sake of cleaner code.
>> +    static inline Function* getFunctionForBlock(BasicBlock* BB) {
>> +      return BB->getParent();
>> +    }
> 
> I would prefer this wasn't used, its "common knowledge" in
the LLVM code that a
> basic blocks parent is its Function.I was not sure about this one anyways, so yes, it will go.
>> +    EdgeCounts getEdgeCounts(Function *F) {
>> +      std::map<Function*, EdgeCounts>::const_iterator J >>
+        EdgeInformation.find(F);
>> +      if ( J != EdgeInformation.end() ) {
>> +        return J->second;
>> +      }
>> +      return EdgeCounts(); // Return empty EdgeCounts in case none
found.
>> +    }
> 
> This is way too expensive, returning a deep copy of the edge counts map
isn't
> necessary. The simplest alternative is to return a const reference and
insert
> into the map for missing functions. Something like:
> --
>> +    const EdgeCounts& getEdgeCounts(Function *F) {
>> +      return EdgeInformation[F];
>> +    }
> --
> Unless we really care about not inserting empty entries this seems
reasonable to
> me.Here my somewhat laking C++ skills failed me. I didn't see that this is
essentially doing a copy, of course this is not
necessary!> 
>> +    void forceSynthesis(void) {
>> +      for (std::map<Function*, EdgeCounts>::iterator
>> +           FI = EdgeInformation.begin(), FIE = EdgeInformation.end();
>> +           FI != FIE; ++FI) {
>> +        Function *F = FI->first;
>> +        for (Function::iterator BB = F->begin(), BBE = F->end();
>> +             BB != BBE; ++BB) {
>> +          getExecutionCount(BB,false);
>> +        }
>> +        getExecutionCount(F,false);
>> +      }
>> +    }
> 
> It doesn't make sense to have this function here, clients should just
do the
> right thing if the count isn't present. It may be worth providing a
ProfileInfo
> level API to query whether any information is present for a particular
function
> so that clients don't perform needless queries.I have to look into this one, I guess its possible to do this kind of check on
demand and calculate BlockCounts from EdgeCounts and FunctionCounts form
BlockCounts when necessary.
>> --- llvm-van/lib/Analysis/MaximumSpanningTree.cpp     1970-01-01
01:00:00.000000000 +0100
>> +++ llvm-c7/lib/Analysis/MaximumSpanningTree.cpp      2009-06-26
16:47:01.000000000 +0200
> ...
>> +#define WOOD_REP(M,V1,V2) \
> 
> Please use "forest" instead of "wood".Oh, sorry, thats a false friend, in German "forest" is
"Wald".
>> +  // compare two weighted edges
>> +  struct VISIBILITY_HIDDEN EdgeWeightCompare {
>> +    bool operator()(const ProfileInfo::EdgeWeight X,
>> +                    const ProfileInfo::EdgeWeight Y) const {
>> +      if ( X.second > Y.second ) return true;
>> +      if ( X.second < Y.second ) return false;
>> +      if ( X.first.first != 0 && Y.first.first == 0 ) return
true;
>> +      if ( X.first.first == 0 && Y.first.first != 0 ) return
false;
>> +      if ( X.first.first == 0 && Y.first.first == 0 ) return
false;
>> +      int c1 = 
X.first.first->getName().compare(Y.first.first->getName());
>> +      if ( c1 > 0 ) return true;
>> +      if ( c1 < 0 ) return false;
>> +      int c2 = 
X.first.second->getName().compare(Y.first.second->getName());
>> +      if ( c2 > 0 ) return true;
>> +      return false;
>> +    }
>> +  };
> 
> Comparing edges based on the basic block name is not a good idea, in some
common
> cases these will all be empty. If the ordering really needs to be
deterministic
> then you will need to number the basic blocks, but is there a compelling
reason
> to not order based on the basic block address?This slipped my notice and shouldn't have made it into production code. Its
sufficient to order based on weights.
(This implementation was necessary when the MST was also needed for reading the
profiling information, but since now I handle this with the Constant-Initializer
that provides this information directly in the profile, its only necessary to
compare the weight.)
> [...]
>> +// countSuccessors() - This gives an estimate of the number of
successors of a
>> +// basic block. Returns 0, 1 or 2 depending on the block having no,
one or more
>> +// than one successors respectively.
>> +static int countSuccessors(BasicBlock* BB) {
>> +  succ_iterator SI = succ_begin(BB), SE = succ_end(BB);
>> +  if ( SI != SE ) {       // at least one successor
>> +    ++SI;
>> +    if ( SI == SE ) {     // exactly on successor
>> +      return 1;
>> +    } else {              // more than one successor
>> +      return 2;
>> +    }
>> +  } else {
>> +    return 0;
>> +  }
>> +}
> 
> This routine is only used in one place, and its result is compared against
0. I
> think it should be removed and the code should just check (succ_begin(BB)
=> succ_end(BB)), or for better readability add a succ_empty to CFG.h and use
that.Okay.
>> --- llvm-van/lib/Analysis/ProfileEstimatorPass.cpp    1970-01-01
01:00:00.000000000 +0100
>> +++ llvm-c7/lib/Analysis/ProfileEstimatorPass.cpp     2009-06-26
16:47:01.000000000 +0200
> ...
>> +bool OptimalEdgeProfiler::runOnModule(Module &M) {
>> +  Function *Main = M.getFunction("main");
>> +  if (Main == 0) {
>> +    cerr << "WARNING: cannot insert edge profiling into a
module"
>> +         << " with no main function!\n";
>> +    return false;  // No main, no instrumentation!
>> +  }
>> +
>> +  std::set<BasicBlock*> BlocksToInstrument;
>> +  unsigned NumEdges = 0;
>> +  for (Module::iterator F = M.begin(), E = M.end(); F != E; ++F) {
>> +    if (F->isDeclaration()) continue;
>> +    // use one counter for the edge (0,entry) for each function
>> +    ++NumEdges;
>> +    for (Function::iterator BB = F->begin(), E = F->end(); BB !=
E; ++BB) {
>> +      // Keep track of which blocks need to be instrumented.  We
don't want to
>> +      // instrument blocks that are added as the result of breaking
critical
>> +      // edges!
>> +      BlocksToInstrument.insert(BB);
>> +      NumEdges += BB->getTerminator()->getNumSuccessors();
>> +    }
>> +  }
>> +
>> +  const ArrayType *ATy = ArrayType::get(Type::Int32Ty, NumEdges);
>> +  GlobalVariable *Counters >> +    new GlobalVariable(ATy,
false, GlobalValue::InternalLinkage,
>> +                       0, "OptimalEdgeProfCounters",
&M);
>> +
>> +  std::vector<Constant*> Initializer =
std::vector<Constant*>(NumEdges);
> 
> This unnecessarily copies the vector, please use:
>  std::vector<Constant*> Initializer(NumEdges);Okay.
>> +  APInt zero(32,0);      Constant* zeroc = ConstantInt::get(zero);
>> +  APInt minusone(32,-1); Constant* minusonec =
ConstantInt::get(minusone);
> 
> Please use:
>> +  Constant* zeroc = ConstantInt::get(llvm::Type::Int32Ty, 0);
>> +  Constant* onec = ConstantInt::getSigned(llvm::Type::Int32Ty, -1);Okay.

Thank you so much for taking the time and reviewing this giving me very valuable
comments.

Andi

Daniel Dunbar

2009-Jul-02 04:04 UTC

head link

[LLVMdev] Profiling in LLVM Patch

On Wed, Jul 1, 2009 at 2:36 AM, Andreas
Neustifter<e0325716 at student.tuwien.ac.at>
wrote:>> [...]
>>
>> 1. The intent of ProfileInfo is that it is the public interface used by
the rest
>> of LLVM for all "profiling" related analyses. As such, we
want it to have a
>> thin, well-defined interface which can be implemented by multiple
analysis
>> providers, just like we have with the current alias analysis interface.
Your
>> current patch has a large number of changes to ProfileInfo which should
instead
>> be inside a subclass of ProfileInfo, where most of the implementation
details
>> are private.
>>
>> I think the first task should be to separate out the changes which are
>> appropriate for the ProfileInfo class itself and apply those to the
tree. I
>> believe that the important changes which are worth reflecting in the
API are
>> changed the weights to return a double instead of an int, and defining
-1 as a
>> sentinel, "don't know", value.
> So you want the ProfileInfo to be untouched except for the double weights
and the MissingValue? Then mark all methods virtual and do a subclass that has
this sort of heavy implementation that I am using?
Exactly.
> Do you have any name suggestions for this subclass?
Not really. As with other interfaces, the actual implementation will
be completely hidden, so the only public part of the name is the
create...Pass call (and the options which end up being exposed to
'opt', etc). I would imagine something like DynamicProfileInfo,
OptimalEdgeDynamicProfileInfo, and StaticProfileInfoEstimator for the
main implementations.
> One thing I would like to patch in the current ProfileInfo implementation
is the consistent use of the (0,entry) edge which is necessary to profile
functions with only one block (namely the entry block).
Yes, I agree, that should be part of the ProfileInfo API changes.

Eventually we will want to change the ProfileInfo API so that it is
easier to chain various ProfileInfo implementations. That way a client
can use dynamic profile information, but still get useful results for
code which wasn't profiled using the static estimator. Currently the
interfaces can't really accommodate this, because they return absolute
counts and there is no way to interpret results from two separate
profile info providers. However, I think for now its more important to
get stuff working than to worry about this problem.
>> This is the most important thing to separate out, and to do early,
because it is
>> the main interface that is shared with LLVM. It is also important for
my GSoC
>> student, who was going to work on static profiling for LLVM and will
depend on
>> any changes to this interface.
> Yes, I thought so, I will try to do this quick.
Great, thanks!

...
>> 2. We shouldn't worry too much about preserving the functionality
of the current
>> llvm-prof tool, which uses the ProfileInfoLoader directly. llvm-prof
should be
>> rewritten to just use the public API provided by ProfileInfo. This
should be
>> done early so there are less dependencies on the ProfileInfoLoader
interface.
> Okay, thats no problem. I guess it makes sense to rewrite llvm-prof but I
was not sure if this sort of changes would be appreciated.
I'm pretty sure that no one uses llvm-prof, and I'm almost positive
that no one depends on it. I see its purpose as primarily to aid
developers working on profiling instrumentation -- it should be
useful, but not constraint the API or hinder development.
>> In fact, it might even make sense to just get rid of llvm-prof and
instead
>> implement a module level pass which consumes profiling information and
prints it
>> out as part of the pass. This would allow us to debug the preservation
of
>> profiling information once we start modifying optimization passes to
update it.
> Good point.
Unless anyone objects I may end up just doing this if a rainy day
presents itself.
>> 3. The static profiling estimator should be implemented as just another
>> ProfileInfo implementation. I believe this can be a separate patch.
Similarly
>> the profile verifier is just another consumer of ProfileInfo and can be
a
>> separate patch.
> The ProfileEstimator is already a subclass of ProfileInfo and implements
that interface. Yes, these two are easily split into separate patches.
Right. One nice benefit of having the ProfileInterface be completely
abstract is that it should be straightforward to move the
ProfileEstimator to being lazy and only computing information when a
client requests it.
>> 4. Finally, the new optimal edge instrumentation & ProfileInfo
implementation
>> can be brought in.
>>
>> Does this sound like a reasonable plan? Although it is a bit of work, I
don't
>> think it will be that difficult and I am happy to help out with
splitting up the
>> patch / refactoring the existing code.
> Yes, sounds good. I will try to split it up myself first and ask when there
are decisions to be made I'm not sure about.
Ok!
>>> +  /// getExitEdges - Return all pairs of
(_inside_block_,_outside_block_).
>>> +  /// (Modelled after getExitingBlocks().)
>>> +  typedef std::pair<BlockT*,BlockT*> Edge;
>>> +  void getExitEdges(SmallVectorImpl<Edge> &ExitEdges)
const {
>>> +    // Sort the blocks vector so that we can use binary search to
do quick
>>> +    // lookups.
>>> +    SmallVector<BlockT*, 128> LoopBBs(block_begin(),
block_end());
>>> +    std::sort(LoopBBs.begin(), LoopBBs.end());
>>
>> Would it be better to use a DenseSet for this lookup?
> Well, its the same that is used in getExitBlocks() where I got the
implementation from. I really don't know but I think the current
implementation is okay from what I gather form the LLVM Programmer's Manual.
There is nothing wrong with it, but using a DenseSet has better
asymptotic performance and shortens the code. OTOH it may have a
higher construction cost in most practical scenarios, some I'm not
convinced about which is really best.

Thanks,
 - Daniel

Carter Cheng

2009-Jul-02 07:09 UTC

head link

[LLVMdev] Resolving extern "C" functions declared in executible

Hi,

I am having some difficulties getting the LLVM JIT to resolve extern
"C" functions which I have defined in source file and invoking them
via EE::runFunction() after generating a Function prototype for it. Is this
possible or do I need to generate a .so for my functions are link against it?

Thanks in advanced,

Carter.

Andreas Neustifter

2009-Jul-02 15:23 UTC

head link

[LLVMdev] Profiling in LLVM Patch Followup 1

Hi,

this is the first in a series of patches to cleanup and improve the LLVM
Profiling Infrastructure.

First and foremost this patch removes duplicate functionality from
ProfileInfoLoader and ProfileInfo:
The ProfileInfoLoader performed not only the loading of the profile information
but also some synthesis of block and function execution counts from edge
profiling information. Since the ProfileInfo performs this synthesis anyways,
the ProfileInfoLoader was demoted to be really only the interface to the profile
information file.

Since the llvm-prof was the only client of the synthesis portion of the
ProfileInfoLoader it had to be changed to fetch this kind of information from
the ProfileInfo with the help of the ProfileInfoLoaderPass.

The next step are to
*) add proper handling of an edge (0,entry) for each function to the Profiling
Infrastructure
*) provide a dedicated return value if the profiling information for a certain
element is not available
*) change the profiling values itself to double as preparation for future passes
that require floating point support from ProfileInfo

Greetings, Andi
-------------- next part --------------
A non-text attachment was scrubbed...
Name: llvm-r74697.profileinfo.cleanup.patch
Type: text/x-patch
Size: 26619 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20090702/e558d4fe/attachment.bin>

Andreas Neustifter

2009-Jul-03 06:48 UTC

head link

[LLVMdev] Profiling in LLVM Patch Followup 2

Hi,

this is the second in a series of patches to cleanup and improve the
LLVM Profiling Infrastructure. It depends on the previous patch from
http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-July/023569.html

The ProfileInfoLoaderPass was only dealing with edge profiling
information, this patch adds support for block and function profiling
information.

Grettings, Andi

Andreas Neustifter

2009-Jul-03 17:22 UTC

head link

[LLVMdev] Profiling in LLVM Patch Flowup 3

Hi,

this is the third in a series of patches to cleanup and improve the
LLVM Profiling Infrastructure. It depends on the previous patches from
http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-July/023569.html
http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-July/023602.html

This patch cleans up the ProfilingInfo by:
*) supplying a MissingValue marker for communicating to clients that a
certain piece of profiling information is not available
*) change the data type that profile info is stored in from unsigned to
double (needed for future changes)
*) add caching of synthesised values (e.g. when block information is
calculated from edge information)
*) support in the ProfileInfoLoaderPass for loading also block and
function profiles (up until now only edge profiles were loaded)
*) adding support for a (0,entry) edge to all profile related code

The next part is to implement a static profile estimator to support the
final goal of having optimal edge profiling.

To all the US-Folks: have a nice holiday weekend. To the rest of the
world: have a nice weekend too!

Andi
-------------- next part --------------
A non-text attachment was scrubbed...
Name: llvm-r74779.zero-entry-edge.double.cleanup.patch
Type: text/x-patch
Size: 17901 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20090703/d46e6b44/attachment.bin>

Andreas Neustifter

2009-Jul-06 09:25 UTC

head link

[LLVMdev] Profiling in LLVM Patch Followup 3

Hi,

this is the third in a series of patches to cleanup and improve the
LLVM Profiling Infrastructure. It depends on the previous patches from
http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-July/023569.html
http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-July/023602.html

This patch cleans up the ProfilingInfo by:
*) supplying a MissingValue marker for communicating to clients that a
certain piece of profiling information is not available
*) change the data type that profile info is stored in from unsigned to
double (needed for future changes)
*) add caching of synthesised values (e.g. when block information is
calculated from edge information)
*) support in the ProfileInfoLoaderPass for loading also block and
function profiles (up until now only edge profiles were loaded)
*) adding support for a (0,entry) edge to all profile related code

The next part is to implement a static profile estimator to support the
final goal of having optimal edge profiling.

To all the US-Folks: have a nice holiday weekend. To the rest of the
world: have a nice weekend too!

Andi
-------------- next part --------------
A non-text attachment was scrubbed...
Name: llvm-r74779.zero-entry-edge.double.cleanup.patch
Type: text/x-patch
Size: 17901 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20090706/39789657/attachment.bin>

Andreas Neustifter

2009-Jul-06 09:46 UTC

head link

[LLVMdev] Profiling in LLVM Patch Followup 4

Hi,

this is the fourth in a series of patches to cleanup and improve the
LLVM Profiling Infrastructure. It depends on the previous patches from
http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-July/023569.html
http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-July/023602.html
http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-July/023642.html

This patch introduces an implementation of the ProfileInfo that
estimates a profile as proposed in [Ball94]. (It distributes incoming
flow in an block equally to the successor edges and assumes an loop
iteration count of 10.)

Andi

[Ball94] Thomas Ball, James R. Larus:
"Optimally profiling and tracing programs"
ACM TOPLAS, Volume 16, Issue 4, 1994
http://portal.acm.org/citation.cfm?id=183527

Andreas Neustifter

2009-Jul-06 10:21 UTC

head link

[LLVMdev] Profiling in LLVM Patch Followup 4

Okay, I guess I should attach the patch too...

Andreas Neustifter wrote:> Hi,
> 
> this is the fourth in a series of patches to cleanup and improve the
> LLVM Profiling Infrastructure. It depends on the previous patches from
> http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-July/023569.html
> http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-July/023602.html
> http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-July/023642.html
> 
> This patch introduces an implementation of the ProfileInfo that
> estimates a profile as proposed in [Ball94]. (It distributes incoming
> flow in an block equally to the successor edges and assumes an loop
> iteration count of 10.)
> 
> Andi
> 
> [Ball94] Thomas Ball, James R. Larus:
> "Optimally profiling and tracing programs"
> ACM TOPLAS, Volume 16, Issue 4, 1994
> http://portal.acm.org/citation.cfm?id=183527
> 
> 
> -------------- next part --------------
A non-text attachment was scrubbed...
Name: llvm-r74818.profile-estimator.patch
Type: text/x-patch
Size: 10866 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20090706/646d7f3c/attachment.bin>

Andreas Neustifter

2009-Jul-06 12:02 UTC

head link

[LLVMdev] Profiling in LLVM Patch Followup 2

I'm sorry, I saw that this patch is missing too...

Andreas Neustifter wrote:> Hi,
> 
> this is the second in a series of patches to cleanup and improve the
> LLVM Profiling Infrastructure. It depends on the previous patch from
> http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-July/023569.html
> 
> The ProfileInfoLoaderPass was only dealing with edge profiling
> information, this patch adds support for block and function profiling
> information.
> 
> Grettings, Andi
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> -------------- next part --------------
A non-text attachment was scrubbed...
Name: llvm-r74762.added.block.function.profiling.patch
Type: text/x-patch
Size: 4616 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20090706/81c221d4/attachment.bin>

Andreas Neustifter

2009-Jul-06 13:56 UTC

head link

[LLVMdev] Profiling in LLVM Patch Followup 5

Hi,

this is the fifth in a series of patches to cleanup and improve the
LLVM Profiling Infrastructure. It depends on the previous patches from
http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-July/023569.html
http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-July/023602.html
http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-July/023642.html
http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-July/023643.html

Cleans up the previous patches a little bit and fixes some small bugs in
the llvm-prof output.

Andi
-------------- next part --------------
A non-text attachment was scrubbed...
Name: llvm-r74824.small.cleanup.patch
Type: text/x-patch
Size: 6317 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20090706/ba9a8a7a/attachment.bin>

Andreas Neustifter

2009-Jul-08 13:59 UTC

head link

[LLVMdev] Profiling in LLVM Patch

Hi Daniel,

I just want to make a few remarks to the things we discussed before I split up
the patch from [1] to the patches, the last being [2]. Some of the code from the
first patch [1] was rewritten in the process so some of the things we talked
about changed a little, I want to give the rationale behind my new design
decisions.

Andreas Neustifter wrote:> Daniel Dunbar wrote:
> [...]
>> 1. The intent of ProfileInfo is that it is the public interface used by
the rest
>> of LLVM for all "profiling" related analyses. As such, we
want it to have a
>> thin, well-defined interface which can be implemented by multiple
analysis
>> providers, just like we have with the current alias analysis interface.
Your
>> current patch has a large number of changes to ProfileInfo which should
instead
>> be inside a subclass of ProfileInfo, where most of the implementation
details
>> are private.
>>
>> I think the first task should be to separate out the changes which are
>> appropriate for the ProfileInfo class itself and apply those to the
tree. I
>> believe that the important changes which are worth reflecting in the
API are
>> changed the weights to return a double instead of an int, and defining
-1 as a
>> sentinel, "don't know", value.
> So you want the ProfileInfo to be untouched except for the double weights
and the MissingValue? Then mark all methods virtual and do a subclass that has
this sort of heavy implementation that I am using? Do you have any name
suggestions for this subclass?
> One thing I would like to patch in the current ProfileInfo implementation
is the consistent use of the (0,entry) edge which is necessary to profile
functions with only one block (namely the entry block). >
 > [...]
 >> The current ProfileInfo functions should also become virtual, and all
other
 >> implementation details should be left to the subclass.
 >>
 >> Finally, I don't think the ignoreMissing parameter to
getExecutionCount makes
 >> sense, instead the provider should always return -1 if it doesn't
have the
 >> value, and clients should "do the right thing.
 > Maybe, its of course always possibile to have custom wrappers that perform
this sort of task if necessary...
 >
The so called "heavy implementation" has resolved itself partially, I
changed in ProfileInfo only the synthesis that was there already to cache the
synthesised values, for the ProfileInfo to be complete I also added BasicBlock
and Function profiling. So the interface is just slightly bigger than before
(MissingValue, Block and Function profiling) but now usable for (I guess) all
ProfileInfo tasks.

During this I killed the synthesis part of ProfileInfoLoader (which was
duplicated code from ProfileInfo) and I also had to rewrite llvm-prof to use
ProfileInfo instead of ProfileInfoLoader. (I think llvm-prof is now quite
cleanly implemented...).
>> 2. We shouldn't worry too much about preserving the functionality
of the current
>> llvm-prof tool, which uses the ProfileInfoLoader directly. llvm-prof
should be
>> rewritten to just use the public API provided by ProfileInfo. This
should be
>> done early so there are less dependencies on the ProfileInfoLoader
interface.
> Okay, thats no problem. I guess it makes sense to rewrite llvm-prof but I
was not sure if this sort of changes would be appreciated.See my comments to point 1., its rewritten now and works as before but with a
interface towards ProfileInfo.
> [...]
>>> --- llvm-van/include/llvm/Analysis/ProfileInfo.h      2009-05-12
10:37:52.000000000 +0200
>>> +++ llvm-c7/include/llvm/Analysis/ProfileInfo.h       2009-06-26
16:47:01.000000000 +0200
>>> [...]
>> Instead of multiple maps from Function* ->, would it make sense to
collect these
>> three (or four) maps inside a per-function structure?
> Yes, makes sense, its then easier to factor out the retrieval and handling
of this structure.I tried this, it cluttered up the code with another redirection, I kept the 3
maps and I think this works good enough.

 > [...]>>> --- llvm-van/lib/Analysis/MaximumSpanningTree.cpp     1970-01-01
01:00:00.000000000 +0100
>>> +++ llvm-c7/lib/Analysis/MaximumSpanningTree.cpp      2009-06-26
16:47:01.000000000 +0200
>> ...
>>> +#define WOOD_REP(M,V1,V2) \
>> Please use "forest" instead of "wood".
> Oh, sorry, thats a false friend, in German "forest" is
"Wald".All the wood is not combined into an forest :-)

 > [...]>>> +  // compare two weighted edges
>>> +  struct VISIBILITY_HIDDEN EdgeWeightCompare {
>>> +    bool operator()(const ProfileInfo::EdgeWeight X,
>>> +                    const ProfileInfo::EdgeWeight Y) const {
>>> +      if ( X.second > Y.second ) return true;
>>> +      if ( X.second < Y.second ) return false;
>>> +      if ( X.first.first != 0 && Y.first.first == 0 )
return true;
>>> +      if ( X.first.first == 0 && Y.first.first != 0 )
return false;
>>> +      if ( X.first.first == 0 && Y.first.first == 0 )
return false;
>>> +      int c1 = 
X.first.first->getName().compare(Y.first.first->getName());
>>> +      if ( c1 > 0 ) return true;
>>> +      if ( c1 < 0 ) return false;
>>> +      int c2 = 
X.first.second->getName().compare(Y.first.second->getName());
>>> +      if ( c2 > 0 ) return true;
>>> +      return false;
>>> +    }
>>> +  };
>> Comparing edges based on the basic block name is not a good idea, in
some common
>> cases these will all be empty. If the ordering really needs to be
deterministic
>> then you will need to number the basic blocks, but is there a
compelling reason
>> to not order based on the basic block address?
> This slipped my notice and shouldn't have made it into production code.
Its sufficient to order based on weights.
> (This implementation was necessary when the MST was also needed for reading
the profiling information, but since now I handle this with the
Constant-Initializer that provides this information directly in the profile, its
only necessary to compare the weight.)I have regression scripts that test the debug output of the various passes
against a reference output to have a quick check if everything works as
expected. For this its necessary to have a deterministic sorting of the edges.
Since then block name is not good in this case (I knew, but I couldn't find
another solution at that time) I use the block size now, this should stay the
same if code is translated by the regression script. But I include the
"advanced sorting" only in debug builds, in regular builds its still
only sorting by edge weight.
> [...]
The patches are still not as small as possible, I tried to go between reasonable
sized patches and flooding the list with small patches, I hope its okay.

I have no or limited EMail this and next week, I will be back on the 20th of
July.

Thanks, Andi

[1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-June/023427.html

[2] http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-July/023716.html

Daniel Dunbar

2009-Jul-14 08:28 UTC

head link

[LLVMdev] Profiling in LLVM Patch Followup 1

Hi Andreas,

Thanks for breaking things up.

I applied two pieces of this patch in separate no-functionality-change
commits, here:
  http://llvm.org/viewvc/llvm-project?view=rev&revision=75623
and here:
  http://llvm.org/viewvc/llvm-project?view=rev&revision=75625

I merged in my changes to your patch, which results in the attached
patch. There may be some missed merge errors. The main problem I have
with the rest of this patch is that it causes a regression in
llvm-prof's behavior. I tried running edge profiling on the
MultiSource/Applications/aha benchmark in the llvm test-suite, and
llvm-prof no longer prints any function information:
--
ddunbar at giles:aha$ opt -f Output/aha.linked.rbc -insert-edge-profiling -o
foo.bc
ddunbar at giles:aha$ llc foo.bc -o - | gcc -x assembler - -x none
~/llvm/Debug/lib/profile_rt.dylib
ddunbar at giles:aha$ ./a.out
...
ddunbar at giles:aha$ ~/llvm/Release/bin/llvm-prof foo.bc llvmprof.out
WARNING: profile information is inconsistent with the current program!
===-------------------------------------------------------------------------==LLVM
profiling output for executions:
  1. ./a.out

===-------------------------------------------------------------------------==Function
execution frequencies:

 ##   Frequency

  NOTE: 32 functions were never executed!

--
whereas before I would get something like:
--
===-------------------------------------------------------------------------==Function
execution frequencies:

 ##   Frequency
  1. 87161541/406099952 selle
  2. 86016884/406099952 print_pgm
  3. 80066663/406099952 check
  4. 44786540/406099952 fix_operands
  5. 17224823/406099952 divide
  6. 13735854/406099952 increment
...
--

The basic block counts also differ, sometimes very significantly, but
I'm not sure if this is to be expected. Can you investigate both of
these differences on some real example? I'd like to keep everything
working at least as good as it was before (modulo some other features,
like function counts or basic block traces, which we don't really
support).

Some minor comments on the (original) patch below...

Thanks,
 - Daniel
> Index: include/llvm/Analysis/Passes.h
> ==================================================================> ---
include/llvm/Analysis/Passes.h	(revision 74697)
> +++ include/llvm/Analysis/Passes.h	(working copy)
...> +  ModulePass *createProfileLoaderPass(ProfileInfoLoader *PIL);
I avoided the need for this by just loading the profile info twice, as
a sort of intermediate step.
> +  std::vector<unsigned> ECs = PIL->getRawEdgeCounts();
This copies the vector, it should be a (const) refererence.
> +  // Instrument all of the edges...
> +  unsigned i = 0;
> +  for (Module::iterator F = M.begin(), E = M.end(); F != E; ++F)
> +    for (Function::iterator BB = F->begin(), E = F->end(); BB != E;
++BB) {
> +      // Okay, we have to add a counter of each outgoing edge.  If the
> +      // outgoing edge is not critical don't split it, just insert the
counter
> +      // in the source or destination of the edge.
> +      TerminatorInst *TI = BB->getTerminator();
> +      for (unsigned s = 0, e = TI->getNumSuccessors(); s != e; ++s) {
> +        EdgeCounts[std::make_pair(BB, TI->getSuccessor(s))]+= ECs[i++];
This can end up reading off the end of ECs if the module changes (or
the profile information is of a different type).
> Index: tools/llvm-prof/llvm-prof.cpp
> ==================================================================> ---
tools/llvm-prof/llvm-prof.cpp	(revision 74697)
> +++ tools/llvm-prof/llvm-prof.cpp	(working copy)
> +  class LoaderInterface : public ModulePass {
> +    ProfileInfo *PI;
I renamed this class, and moved the main printing functionality to its
run method. This avoids the member state.
>    class ProfileAnnotator : public AssemblyAnnotationWriter {
> -    std::map<const Function  *, unsigned> &FuncFreqs;
> -    std::map<const BasicBlock*, unsigned> &BlockFreqs;
> -    std::map<ProfileInfoLoader::Edge, unsigned> &EdgeFreqs;
> +    ProfileInfo *PI;
I would advocate making this a reference instead of a pointer; that
makes it obvious that this class doesn't own the ProfileInfo, and that
the client is responsible for its memory management.

On Thu, Jul 2, 2009 at 8:23 AM, Andreas
Neustifter<e0325716 at student.tuwien.ac.at>
wrote:> Hi,
>
> this is the first in a series of patches to cleanup and improve the LLVM
> Profiling Infrastructure.
>
> First and foremost this patch removes duplicate functionality from
> ProfileInfoLoader and ProfileInfo:
> The ProfileInfoLoader performed not only the loading of the profile
> information but also some synthesis of block and function execution counts
> from edge profiling information. Since the ProfileInfo performs this
> synthesis anyways, the ProfileInfoLoader was demoted to be really only the
> interface to the profile information file.
>
> Since the llvm-prof was the only client of the synthesis portion of the
> ProfileInfoLoader it had to be changed to fetch this kind of information
> from the ProfileInfo with the help of the ProfileInfoLoaderPass.
>
> The next step are to
> *) add proper handling of an edge (0,entry) for each function to the
> Profiling Infrastructure
> *) provide a dedicated return value if the profiling information for a
> certain element is not available
> *) change the profiling values itself to double as preparation for future
> passes that require floating point support from ProfileInfo
>
> Greetings, Andi
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>-------------- next part --------------
A non-text attachment was scrubbed...
Name: llvm-r74697.profileinfo.cleanup.regenerated.patch
Type: application/octet-stream
Size: 19137 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20090714/89b5bbc2/attachment.obj>

Reasonably Related Threads

Search for more possibly parallel threads

llvm dev - Jul 2009 - [LLVMdev] Profiling in LLVM Patch Followup 4

[LLVMdev] Profiling in LLVM Patch

[LLVMdev] Profiling in LLVM Patch

[LLVMdev] Resolving extern "C" functions declared in executible

[LLVMdev] Profiling in LLVM Patch Followup 1

[LLVMdev] Profiling in LLVM Patch Followup 2

[LLVMdev] Profiling in LLVM Patch Flowup 3

[LLVMdev] Profiling in LLVM Patch Followup 3

[LLVMdev] Profiling in LLVM Patch Followup 4

[LLVMdev] Profiling in LLVM Patch Followup 4

[LLVMdev] Profiling in LLVM Patch Followup 2

[LLVMdev] Profiling in LLVM Patch Followup 5

[LLVMdev] Profiling in LLVM Patch

[LLVMdev] Profiling in LLVM Patch Followup 1

Reasonably Related Threads