Stephan Reiter
2010-Sep-23  19:20 UTC
[LLVMdev] Finding all values derived from a function argument
Hello!
I am trying to retrieve all instructions from a function's body that
are dependent on a specific argument. The strategy I am currently
using for that is to follow all uses of the argument and record them.
Also, whenever I encounter a store of a dependent value, I try to find
loads in the function that are dependent on that store and resume
use-tracking from there. For this purpose I am using the
MemoryDependenceAnalysis pass, but to be honest, I think I am not
using it correctly or not in the most efficient way.
Is such functionality - finding dependent instructions - present in
LLVM? If not, is there a more efficient way to determine dependent
instructions than like in the code below? In particular, I'm wondering
if it is possible to find all loads that depend on a particular store
more efficiently?
Thank you,
Stephan
    llvm::DenseSet<llvm::Instruction *> findDependents(llvm::Argument
*arg, llvm::Function *func)
    {
      std::vector<llvm::Instruction *> workList;
      for(llvm::Argument::use_iterator I = arg->use_begin(), E
arg->use_end(); I != E; ++I)
      {
        if(llvm::Instruction *user =
llvm::dyn_cast<llvm::Instruction>(*I))
          workList.push_back(user);
      }
      llvm::FunctionPassManager fpm(func->getParent());
      llvm::MemoryDependenceAnalysis *mda = new
llvm::MemoryDependenceAnalysis();
      fpm.add(mda);
      fpm.run(*func);
      // find all loads in the current function
      std::list<llvm::LoadInst *> loads;
      for(llvm::Function::iterator I = func->begin(), E = func->end();
I != E; ++I)
      {
        for(llvm::BasicBlock::iterator II = I->begin(), EE = I->end();
II != EE; ++II)
        {
          if(llvm::LoadInst *load = llvm::dyn_cast<llvm::LoadInst>(II))
            loads.push_back(load);
        }
      }
      llvm::DenseSet<llvm::Instruction *> deps;
      for(;;)
      {
        llvm::DenseSet<llvm::Instruction *> stores;
        while(!workList.empty())
        {
          llvm::Instruction *I = workList.back();
          workList.pop_back();
          if(deps.count(I) != 0)
            continue;
          deps.insert(I);
          if(llvm::StoreInst *store = llvm::dyn_cast<llvm::StoreInst>(I))
            stores.insert(I);
          for(llvm::Value::use_iterator II = I->use_begin(), EE
I->use_end(); II != EE; ++II)
          {
            if(llvm::Instruction *user =
llvm::dyn_cast<llvm::Instruction>(*II))
              workList.push_back(user);
          }
        }
        if(stores.size() > 0)
        {
          // find loads that depend on our stores, TODO: this seems fishy!
          for(std::list<llvm::LoadInst *>::iterator I = loads.begin(),
E = loads.end(); I != E; )
          {
            bool pick = false;
#if 1
            llvm::SmallVector<llvm::NonLocalDepResult, 16> results;
            mda->getNonLocalPointerDependency((*I)->getPointerOperand(),
true, (*I)->getParent(), results);
            for(llvm::SmallVector<llvm::NonLocalDepResult,
16>::iterator II = results.begin(), EE = results.end(); !pick && II
!EE; ++II)
            {
              const llvm::MemDepResult &result = II->getResult();
              if(llvm::Instruction *dep = result.getInst())
              {
                if(llvm::StoreInst *store
llvm::dyn_cast<llvm::StoreInst>(dep))
                  pick |= (stores.count(store) != 0);
              }
            }
#else
            llvm::MemDepResult result = mda->getDependency(*I);
            if(llvm::Instruction *dep = result.getInst())
            {
              if(llvm::StoreInst *store =
llvm::dyn_cast<llvm::StoreInst>(dep))
                pick = (stores.count(store) != 0);
            }
#endif
            if(pick)
            {
              workList.push_back(*I);
              I = loads.erase(I);
            }
            else
              ++I;
          }
          stores.clear();
        }
        else
          break;
      }
      return deps;
    }
Duncan Sands
2010-Sep-24  07:26 UTC
[LLVMdev] Finding all values derived from a function argument
Hi Stephan,> I am trying to retrieve all instructions from a function's body that > are dependent on a specific argument. The strategy I am currently > using for that is to follow all uses of the argument and record them. > Also, whenever I encounter a store of a dependent value, I try to find > loads in the function that are dependent on that store and resume > use-tracking from there. For this purpose I am using the > MemoryDependenceAnalysis pass, but to be honest, I think I am not > using it correctly or not in the most efficient way. > > Is such functionality - finding dependent instructions - present in > LLVM? If not, is there a more efficient way to determine dependent > instructions than like in the code below? In particular, I'm wondering > if it is possible to find all loads that depend on a particular store > more efficiently?I don't think memory dependence analysis is appropriate for this. The function llvm::PointerMayBeCaptured in CaptureTracking.cpp does something like what you are trying to do. It used to track stores and loads from them, but this was removed because it was not helpful in practice. The way it used to track stores and loads was simple. First, it only tried to analysis stores to local variables (AllocaInst). Stores to globals are harder since there are more ways that other functions can get at the stored value. If it detected a store to an AllocaInst, then it walked all uses of the AllocaInst and analysed those. A load from the AllocaInst was considered to return the stored value (even though more detailed analysis might have shown that it was loading some other value). A store of the address of the local variable would also be tracked etc. This was a flow insensitive analysis. However as I mentioned, it was all removed because in practice either alloca's did not exist (because other passes had promoted everything to registers) or their address was passed to some other function where there was no way of knowing what the function did to it. It is easy to construct toy examples where tracking AllocaInst's helped, but in real code it did not. Ciao, Duncan.
Kenneth Uildriks
2010-Sep-24  13:29 UTC
[LLVMdev] Finding all values derived from a function argument
On Fri, Sep 24, 2010 at 2:26 AM, Duncan Sands <baldrick at free.fr> wrote:> Hi Stephan, > >> I am trying to retrieve all instructions from a function's body that >> are dependent on a specific argument. The strategy I am currently >> using for that is to follow all uses of the argument and record them. >> Also, whenever I encounter a store of a dependent value, I try to find >> loads in the function that are dependent on that store and resume >> use-tracking from there. For this purpose I am using the >> MemoryDependenceAnalysis pass, but to be honest, I think I am not >> using it correctly or not in the most efficient way. >> >> Is such functionality - finding dependent instructions - present in >> LLVM? If not, is there a more efficient way to determine dependent >> instructions than like in the code below? In particular, I'm wondering >> if it is possible to find all loads that depend on a particular store >> more efficiently? > > I don't think memory dependence analysis is appropriate for this. The function > llvm::PointerMayBeCaptured in CaptureTracking.cpp does something like what you > are trying to do. It used to track stores and loads from them, but this was > removed because it was not helpful in practice. The way it used to track > stores and loads was simple. First, it only tried to analysis stores to local > variables (AllocaInst). Stores to globals are harder since there are more ways > that other functions can get at the stored value. If it detected a store to an > AllocaInst, then it walked all uses of the AllocaInst and analysed those. A > load from the AllocaInst was considered to return the stored value (even though > more detailed analysis might have shown that it was loading some other value). > A store of the address of the local variable would also be tracked etc. This > was a flow insensitive analysis. However as I mentioned, it was all removed > because in practice either alloca's did not exist (because other passes had > promoted everything to registers) or their address was passed to some other > function where there was no way of knowing what the function did to it. It is > easy to construct toy examples where tracking AllocaInst's helped, but in real > code it did not. > > Ciao, > > Duncan.I'm working on something (gcalloc removal) that would actually be easier if I could easily answer the question "what loads from memory *might* return a given pointer value or a pointer derived from it?" I don't care if it's guaranteed to return the value, but it would be helpful to be able to prove (for instance) that no pointer to object A could possibly outlive some event or some other object. Even better if I could do it interprocedurally.