Hello, I fixed my patch as you asked. Sorry for the delay, I'd been working on my SSU patch (http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-August/025347.html ) I hope that everything is fine now. -Jakub -------------- next part -------------- A non-text attachment was scrubbed... Name: pr2218-3.patch Type: application/octet-stream Size: 7511 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20090902/954ace6e/attachment.obj> -------------- next part -------------- On Aug 7, 2009, at 7:40 PM, Chris Lattner wrote:> > On Jul 25, 2009, at 4:48 PM, Jakub Staszak wrote: > >> Hello, >> >> Sorry for my stupid mistakes. I hope that everything is fine now. >> This patch fixes PR2218. There are no loads in example, however >> "instcombine" changes memcpy() into store/load. > > Hi Jakub, > > Sorry for the delay, I'm way behind on code review. Generally if > you respond quickly, I'll remember enough about the previous email > that I can respond quickly too. If it gets out of my brain then it > takes me a while to get back to it. I'll try to be better in the > future :( > > This patch is looking like a great improvement. Some comments on > formatting: > > Please pull this out to a helper function: > + CallSite CS = CallSite::get(C); > + > + // Pointer must be a parameter (case 1) > + for (argI = 0; argI < CS.arg_size(); ++argI) > + if (CS.getArgument(argI)->stripPointerCasts() == pointer) > + break; > + > + if (argI == CS.arg_size()) > + return false; > > per http://llvm.org/docs/CodingStandards.html#hl_predicateloops > > > + // Store cannot be volatile (case 2) and must-alias with our > pointer. > + if (S->isVolatile()) { > + return false; > + } else { > > no need for the 'else after return': http://llvm.org/docs/CodingStandards.html#hl_else_after_return > > + AliasAnalysis& AA = getAnalysis<AliasAnalysis>(); > + if (AA.alias(S->getPointerOperand(), 1, pointer, 1) !> + AliasAnalysis::MustAlias) > + return false; > > Knowing that the loaded and stored pointer must alias is important, > but you also need to check that the loaded and stored sizes equal > each other. > > + // Look for a replacement for our pointer. If more than one > found, exit. > + for (User::use_iterator I = L->use_begin(), E = L->use_end(); I ! > = E; ++I) { > + Instruction *User = cast<Instruction>(*I); > + > + if (StoreInst *SI = dyn_cast<StoreInst>(User)) { > + if (SI->isVolatile()) > + return false; > + > + if (!Repl) > + Repl = SI->getPointerOperand(); > + else if (Repl != SI->getPointerOperand()) > + return false; > + > + } else > + return false; > + } > > Please do something like this: > > if (!L->hasOneUse()) return false; > StoreInst *StoreOfLoad = dyn_cast<StoreInst>(L->use_back()); > if (StoreOfLoad == 0 || ...) > ... > > > Actually, I see now that the code actually allows multiple stores as > long as they are to the same pointer. That also seems reasonable to > me, but please update the comment above the loop to make it really > clear what it is doing. It is probably also reasonable to factor > this out to a static helper function. > > I'm still not sure that this transformation is safe (it seems like > we need another dependence query). In particular, given something > like: > > %temporary = alloca i8 > call void @initialize(i8* noalias sret %temporary) > %tmp = load i8* %temporary > store 42 -> %result > store i8 %tmp, i8* %result > > > It is not safe to transform this into: > > call void @initialize(i8* noalias sret %result) > store 42 -> %result > > > I think this can be fixed by doing a dependence query from the > store, though I'm not sure what we'd expect. Another option is to > see if the result is only used by the store. In this case, the loop > above that allows multiple stores to the same pointer seems unneeded > because we'd only be able to support one store. > > Thanks for working on this! > > -Chris > > >
On Sep 2, 2009, at 1:07 AM, Jakub Staszak wrote:> Hello, > > I fixed my patch as you asked. Sorry for the delay, I'd been working > on my SSU patch (http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-August/025347.html > ) > > I hope that everything is fine now.Hey Jakub, Thanks for working on this again, one more round :) Please merge the three testcases into one file. We added a new FileCheck tool which allows you to check for the exact sequence of instructions expected, which also allows the tests to be merged into one file. +/// MemCpyOpt::pointerIsParameter - returns true iff pointer is a parameter of +/// C call instruction. +bool MemCpyOpt::pointerIsParameter(Value *pointer, CallInst *C, unsigned &argI) +{ + CallSite CS = CallSite::get(C); + for (argI = 0; argI < CS.arg_size(); ++argI) Please make this a static function, it doesn't need "this". Also, please do something like this in the for loop: + for (argI = 0, CS.arg_size(); argI != argE; ++argI) pointerIsParameter returning a bool and the argument # byref is somewhat awkward. I think it would be better if it returned an int, which was -1 if not found. + MemoryDependenceAnalysis& MD = getAnalysis<MemoryDependenceAnalysis>(); You can sink this link to right before the call to MD.getDependency. + // Require pointer to have local dependency. + if (!dep.getInst() || dep.isNonLocal()) + return false; Random thought: after this patch goes in, it would be interesting to see how many queries get rejected because they are non-local: handling non-local would be pretty easy. Please add a TODO that mentions this. In any case, the check for dep.isNonLocal() can be removed. Nonlocal queries have a null instruction. + } else if (StoreInst *S = dyn_cast<StoreInst>(dep.getInst())) { ... + uint64_t ptrSize = AA.getTypeStoreSize(pointer->getType()); ... + uint64_t memSize = AA.getTypeStoreSize(memPtr->getType()); ... + if (AA.alias(pointer, ptrSize, memPtr, memSize) != AliasAnalysis::MustAlias) This is passing in the size of the pointer, not the pointee. However, since you only care about must alias, you don't care about the size of the object, you can just pass in 1 for both sizes. Please put a comment above the call to AA.alias explaining why you require mustalias here. + // Look for a replacement for our pointer. If more than one found, exit. + if (!L->hasOneUse()) + return false; + StoreInst *StoreOfLoad = dyn_cast<StoreInst>(L->use_back()); + if (!StoreOfLoad) + return false; Please do these check before the MemDep query, they are very cheap and will make the optimization run faster (by avoiding queries). I'm pretty sure that there is still a legality check missing here. Your code will transform this: %temporary = alloca i8 call void @initialize(i8* noalias sret %temporary) %tmp = load i8* %temporary store i8 42, i8* %result store i8 %tmp, i8* %result to: %temporary = alloca i8 call void @initialize(i8* noalias sret %result) store i8 42, i8* %result which is a miscompilation. To fix this, I'd do a memdep query of "result" from StoreOfLoad and from L. If they both have the same dependence, then the transform is ok. + if (CallInst *C = dyn_cast<CallInst>(dep.getInst())) { + CallSite CS = CallSite::get(C); + CS.setArgument(argI, Repl); pointerIsParameter strips bitcasts. Because of that, the type of the call argument may not be the same as the type of "repl", which could cause an assertion failure. We need to either make pointerIsParameter not strip bitcasts, or have this code insert a bitcast when the types don't line up. -Chris
On Sep 2, 2009, at 3:15 PM, Chris Lattner wrote:> Please merge the three testcases into one file. We added a new > FileCheck tool which allows you to check for the exact sequence of > instructions expected, which also allows the tests to be merged into > one file. > > +/// MemCpyOpt::pointerIsParameter - returns true iff pointer is a > parameter of > +/// C call instruction. > +bool MemCpyOpt::pointerIsParameter(Value *pointer, CallInst *C, > unsigned &argI) > +{ > + CallSite CS = CallSite::get(C); > + for (argI = 0; argI < CS.arg_size(); ++argI) > > Please make this a static function, it doesn't need "this". Also, > please do something like this in the for loop: > > + for (argI = 0, CS.arg_size(); argI != argE; ++argI) > > > pointerIsParameter returning a bool and the argument # byref is > somewhat awkward. I think it would be better if it returned an int, > which was -1 if not found.The problem is that CS.arg_size() is "unsigned int". Of course, I can cast int -> unsigned int but it wouldn't look good I think.> > + MemoryDependenceAnalysis& MD = > getAnalysis<MemoryDependenceAnalysis>(); > > You can sink this link to right before the call to MD.getDependency. > > > > + // Require pointer to have local dependency. > + if (!dep.getInst() || dep.isNonLocal()) > + return false; > > Random thought: after this patch goes in, it would be interesting to > see how many queries get rejected because they are non-local: > handling non-local would be pretty easy. Please add a TODO that > mentions this. > > In any case, the check for dep.isNonLocal() can be removed. > Nonlocal queries have a null instruction.OK, i'll work on it.> > + } else if (StoreInst *S = dyn_cast<StoreInst>(dep.getInst())) { > ... > + uint64_t ptrSize = AA.getTypeStoreSize(pointer->getType()); > ... > + uint64_t memSize = AA.getTypeStoreSize(memPtr->getType()); > ... > + if (AA.alias(pointer, ptrSize, memPtr, memSize) != > AliasAnalysis::MustAlias) > > This is passing in the size of the pointer, not the pointee. > However, since you only care about must alias, you don't care about > the size of the object, you can just pass in 1 for both sizes.Ahh, last time you said I had to check sizes. You said: "Knowing that the loaded and stored pointer must alias is important, but you also need to check that the loaded and stored sizes equal each other." :-)> Please put a comment above the call to AA.alias explaining why you > require mustalias here. > > > > + // Look for a replacement for our pointer. If more than one > found, exit. > + if (!L->hasOneUse()) > + return false; > + StoreInst *StoreOfLoad = dyn_cast<StoreInst>(L->use_back()); > + if (!StoreOfLoad) > + return false; > > Please do these check before the MemDep query, they are very cheap > and will make the optimization run faster (by avoiding queries). > > > > I'm pretty sure that there is still a legality check missing here. > Your code will transform this: > > %temporary = alloca i8 > call void @initialize(i8* noalias sret %temporary) > %tmp = load i8* %temporary > store i8 42, i8* %result > store i8 %tmp, i8* %result > > to: > > %temporary = alloca i8 > call void @initialize(i8* noalias sret %result) > store i8 42, i8* %result > > which is a miscompilation. To fix this, I'd do a memdep query of > "result" from StoreOfLoad and from L. If they both have the same > dependence, then the transform is ok.See PR2218-dont.ll test. -Jakub