Hello, Sorry for my stupid mistakes. I hope that everything is fine now. This patch fixes PR2218. There are no loads in example, however "instcombine" changes memcpy() into store/load. Regards, Jakub Staszak -------------- next part -------------- A non-text attachment was scrubbed... Name: pr2218-2.patch Type: application/octet-stream Size: 6525 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20090725/b7dd0f10/attachment.obj> -------------- next part -------------- On Jul 22, 2009, at 10:51 PM, Chris Lattner wrote:> > On Jul 22, 2009, at 1:37 PM, Jakub Staszak wrote: > >> Hello, >> >> This patch fixes PR2218. > > Very nice. Are you sure this fixes PR2218? The example there > doesn't have any loads in it. > >> However, I'm not pretty sure that this optimization should be in >> MemCpyOpt. I think that GVN is good place as well. > > Yes, you're right. My long term goal is to merge the relevant > pieces of memcpyopt into GVN/DSE at some point. To do that, some > more major surgery needs to be done to memdep to make it work both > backward (to support GVN) and forward (to support DSE). In the > meantime, enhancing memcpyopt is fine, so long as there are good > testcases (and your patch has them!). > > Nit picky stuff about the patch: > > + for(unsigned argI = 0; argI < CS.arg_size(); ++argI) { > + if(CS.getArgument(argI)->stripPointerCasts() == pointer) > > Please put spaces after if/for. You get this right in some places, > but wrong in others. > > + Value* pointer = L->getPointerOperand(); > > Please use "Value *pointer" style throughout. It looks like > MemCpyOpt is already really inconsistent about this :-/ but please > don't make it worse :). > > + /* Pointer must be a parameter (case 1) */ > > please use C++ style // comments. > > > > Some other points: > 1. I don't see a check that the load isn't volatile, please don't > hack volatile loads. > > 2. The dependency check is relatively expensive, so please reorder > it to be checked after the check for alloca: > > + Value* pointer = L->getPointerOperand(); > + MemDepResult dep = MD.getDependency(L); > + > + // Require pointer to have local dependency. > + if(!dep.getInst() || dep.isNonLocal()) > + return false; > + > + // Require pointer to be an alloca. > + if(!isa<AllocaInst>(pointer)) > + return false; > > > 3. In this code, does the argument also need to be sret, or is any > argument ok? > > + if(CallInst *C = dyn_cast<CallInst>(dep.getInst())) { > + CallSite CS = CallSite::get(C); > + > + /* Pointer must be a parameter (case 1) */ > + if(!CS.hasArgument(pointer)) > + return false; > > 4. Here, please do an alias check if the store's pointer operand is > not the same as the load's (allowing the xform if they are must- > alias). This helps in some cases where there are GEP's that are the > same but not actually CSE'd: > > + /* Store cannot be volatile (case 2) */ > + if(S->getPointerOperand() != pointer || S->isVolatile()) > + return false; > > > 5. In the logic for your xform, I don't see any check to see if the > alloca itself has uses other than the load/store. In this example, > won't it miscompile the code by eliminating the store into temporary? > > %temporary = alloca i8 > call void @initialize(i8* noalias sret %temporary) > %tmp = load i8* %temporary > store i8 %tmp, i8* %result > %tmp2 = load i8* %temporary > ... > > 6. In the xform part of the code, you strip pointer casts, but you > don't do that in the argument checking part, why? > > + for(unsigned argI = 0; argI < CS.arg_size(); ++argI) { > + if(CS.getArgument(argI)->stripPointerCasts() == pointer) > > Thanks for working on this area, I'm glad to see it getting attention! > > -Chris
On Jul 25, 2009, at 4:48 PM, Jakub Staszak wrote:> Hello, > > Sorry for my stupid mistakes. I hope that everything is fine now. > This patch fixes PR2218. There are no loads in example, however > "instcombine" changes memcpy() into store/load.Hi Jakub, Sorry for the delay, I'm way behind on code review. Generally if you respond quickly, I'll remember enough about the previous email that I can respond quickly too. If it gets out of my brain then it takes me a while to get back to it. I'll try to be better in the future :( This patch is looking like a great improvement. Some comments on formatting: Please pull this out to a helper function: + CallSite CS = CallSite::get(C); + + // Pointer must be a parameter (case 1) + for (argI = 0; argI < CS.arg_size(); ++argI) + if (CS.getArgument(argI)->stripPointerCasts() == pointer) + break; + + if (argI == CS.arg_size()) + return false; per http://llvm.org/docs/CodingStandards.html#hl_predicateloops + // Store cannot be volatile (case 2) and must-alias with our pointer. + if (S->isVolatile()) { + return false; + } else { no need for the 'else after return': http://llvm.org/docs/CodingStandards.html#hl_else_after_return + AliasAnalysis& AA = getAnalysis<AliasAnalysis>(); + if (AA.alias(S->getPointerOperand(), 1, pointer, 1) !+ AliasAnalysis::MustAlias) + return false; Knowing that the loaded and stored pointer must alias is important, but you also need to check that the loaded and stored sizes equal each other. + // Look for a replacement for our pointer. If more than one found, exit. + for (User::use_iterator I = L->use_begin(), E = L->use_end(); I != E; ++I) { + Instruction *User = cast<Instruction>(*I); + + if (StoreInst *SI = dyn_cast<StoreInst>(User)) { + if (SI->isVolatile()) + return false; + + if (!Repl) + Repl = SI->getPointerOperand(); + else if (Repl != SI->getPointerOperand()) + return false; + + } else + return false; + } Please do something like this: if (!L->hasOneUse()) return false; StoreInst *StoreOfLoad = dyn_cast<StoreInst>(L->use_back()); if (StoreOfLoad == 0 || ...) ... Actually, I see now that the code actually allows multiple stores as long as they are to the same pointer. That also seems reasonable to me, but please update the comment above the loop to make it really clear what it is doing. It is probably also reasonable to factor this out to a static helper function. I'm still not sure that this transformation is safe (it seems like we need another dependence query). In particular, given something like: %temporary = alloca i8 call void @initialize(i8* noalias sret %temporary) %tmp = load i8* %temporary store 42 -> %result store i8 %tmp, i8* %result It is not safe to transform this into: call void @initialize(i8* noalias sret %result) store 42 -> %result I think this can be fixed by doing a dependence query from the store, though I'm not sure what we'd expect. Another option is to see if the result is only used by the store. In this case, the loop above that allows multiple stores to the same pointer seems unneeded because we'd only be able to support one store. Thanks for working on this! -Chris
Hello, I fixed my patch as you asked. Sorry for the delay, I'd been working on my SSU patch (http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-August/025347.html ) I hope that everything is fine now. -Jakub -------------- next part -------------- A non-text attachment was scrubbed... Name: pr2218-3.patch Type: application/octet-stream Size: 7511 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20090902/954ace6e/attachment.obj> -------------- next part -------------- On Aug 7, 2009, at 7:40 PM, Chris Lattner wrote:> > On Jul 25, 2009, at 4:48 PM, Jakub Staszak wrote: > >> Hello, >> >> Sorry for my stupid mistakes. I hope that everything is fine now. >> This patch fixes PR2218. There are no loads in example, however >> "instcombine" changes memcpy() into store/load. > > Hi Jakub, > > Sorry for the delay, I'm way behind on code review. Generally if > you respond quickly, I'll remember enough about the previous email > that I can respond quickly too. If it gets out of my brain then it > takes me a while to get back to it. I'll try to be better in the > future :( > > This patch is looking like a great improvement. Some comments on > formatting: > > Please pull this out to a helper function: > + CallSite CS = CallSite::get(C); > + > + // Pointer must be a parameter (case 1) > + for (argI = 0; argI < CS.arg_size(); ++argI) > + if (CS.getArgument(argI)->stripPointerCasts() == pointer) > + break; > + > + if (argI == CS.arg_size()) > + return false; > > per http://llvm.org/docs/CodingStandards.html#hl_predicateloops > > > + // Store cannot be volatile (case 2) and must-alias with our > pointer. > + if (S->isVolatile()) { > + return false; > + } else { > > no need for the 'else after return': http://llvm.org/docs/CodingStandards.html#hl_else_after_return > > + AliasAnalysis& AA = getAnalysis<AliasAnalysis>(); > + if (AA.alias(S->getPointerOperand(), 1, pointer, 1) !> + AliasAnalysis::MustAlias) > + return false; > > Knowing that the loaded and stored pointer must alias is important, > but you also need to check that the loaded and stored sizes equal > each other. > > + // Look for a replacement for our pointer. If more than one > found, exit. > + for (User::use_iterator I = L->use_begin(), E = L->use_end(); I ! > = E; ++I) { > + Instruction *User = cast<Instruction>(*I); > + > + if (StoreInst *SI = dyn_cast<StoreInst>(User)) { > + if (SI->isVolatile()) > + return false; > + > + if (!Repl) > + Repl = SI->getPointerOperand(); > + else if (Repl != SI->getPointerOperand()) > + return false; > + > + } else > + return false; > + } > > Please do something like this: > > if (!L->hasOneUse()) return false; > StoreInst *StoreOfLoad = dyn_cast<StoreInst>(L->use_back()); > if (StoreOfLoad == 0 || ...) > ... > > > Actually, I see now that the code actually allows multiple stores as > long as they are to the same pointer. That also seems reasonable to > me, but please update the comment above the loop to make it really > clear what it is doing. It is probably also reasonable to factor > this out to a static helper function. > > I'm still not sure that this transformation is safe (it seems like > we need another dependence query). In particular, given something > like: > > %temporary = alloca i8 > call void @initialize(i8* noalias sret %temporary) > %tmp = load i8* %temporary > store 42 -> %result > store i8 %tmp, i8* %result > > > It is not safe to transform this into: > > call void @initialize(i8* noalias sret %result) > store 42 -> %result > > > I think this can be fixed by doing a dependence query from the > store, though I'm not sure what we'd expect. Another option is to > see if the result is only used by the store. In this case, the loop > above that allows multiple stores to the same pointer seems unneeded > because we'd only be able to support one store. > > Thanks for working on this! > > -Chris > > >