I now understand that IndVarSimplify.cpp is capable of reproducing array references when the pointer initialization from the array address is found inside the immediately enclosing loop, such that in the following code: int A[20000], B[100], Z; int main() { int i, j, *a, *b; for ( a = &A[0], i = 0; i != 200; i++ ) for ( b = &B[0], j = 0; j != 100; j++ ) Z += *a++ * *b++; } the pointer b can be transformed into an offset from B[], but a cannot be transformed into an offset from A[] because the initialization is two levels up. I started poking around in IndVarSimplify.cpp and I have a patch which should (at least partially) enable the transformation for a as well, but I find that although the code looks OK in debug printouts, the gccas output is unchanged. So two questions: 1) Am I on the right track with this transformation (patch enclosed) and 2) why does the output not reflect my changes? Index: lib/Transforms/Scalar/IndVarSimplify.cpp ==================================================================RCS file: /var/cvs/llvm/llvm/lib/Transforms/Scalar/IndVarSimplify.cpp,v retrieving revision 1.78 diff -r1.78 IndVarSimplify.cpp 310c310 < runOnLoop(*I); ---> runOnLoop(LI, *I);322,323c322,323 < void runOnLoop(Loop *L); < void EliminatePointerRecurrence(PHINode *PN, BasicBlock *Preheader, ---> void runOnLoop(LoopInfo *LI, Loop *L); > void EliminatePointerRecurrence(LoopInfo *LI, Loop *L, PHINode *PN,BasicBlock *Preheader, 361c361 < void IndVarSimplify::EliminatePointerRecurrence(PHINode *PN, ---> void IndVarSimplify::EliminatePointerRecurrence(LoopInfo *LI, Loop *L,PHINode *PN, 366a367> Value *saved_trip_count = L->getTripCount(); // save before it'sclobbered 392c393,408 < ---> std::cerr << "\tGEPI: " << *GEPI; > if (PHINode *phi = dyn_cast<PHINode>(GEPI->getOperand(0))) > { > if (ConstantExpr *CE =dyn_cast<ConstantExpr>(phi->getIncomingValue(0)))> if (CE->getOpcode() == Instruction::GetElementPtr) // Wewant this to become explicit...> { > std::cerr << "Block Before: " << *GEPI->getParent(); > Instruction *mul =BinaryOperator::createMul(saved_trip_count,> LI->getLoopFor( phi->getIncomingBlock(1) ) > ->getCanonicalInductionVariable(), "mul.",GEPI);> Instruction *add = BinaryOperator::createAdd(mul,NewAdd, "add.", GEPI);> GEPI->setOperand( 0, phi->getIncomingValue(0) ); > GEPI->setOperand( 1, add ); > std::cerr << "Block After: " << *GEPI->getParent(); > } > }605c621,622 < void IndVarSimplify::runOnLoop(Loop *L) { ---> void IndVarSimplify::runOnLoop(LoopInfo *LI, Loop *L) { >617c634 < EliminatePointerRecurrence(PN, Preheader, DeadInsts); ---> EliminatePointerRecurrence(LI, L, PN, Preheader, DeadInsts);626c643 < runOnLoop(*I); ---> runOnLoop(LI,*I);652,653c669,670 < if (SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(SCEV)) < if (AR->getNumOperands() == 2 && isa<SCEVConstant>(AR->getOperand(1))) ---> if (1)//SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(SCEV)) > if (1)//AR->getNumOperands() == 2 &&isa<SCEVConstant>(AR->getOperand(1))) Naftali
On Thu, 28 Jul 2005, Naftali Schwartz wrote:> I now understand that IndVarSimplify.cpp is capable of reproducing array > references when the pointer initialization from the array address is found > inside the immediately enclosing loop, such that in the following code:Ok.> int A[20000], B[100], Z; > int main() > { > int i, j, *a, *b; > for ( a = &A[0], i = 0; i != 200; i++ ) > for ( b = &B[0], j = 0; j != 100; j++ ) > Z += *a++ * *b++; > } > > the pointer b can be transformed into an offset from B[], but a cannot be > transformed into an offset from A[] because the initialization is two levels > up.Hrm, that should work.> I started poking around in IndVarSimplify.cpp and I have a patch which should > (at least partially) enable the transformation for a as well, but I find that > although the code looks OK in debug printouts, the gccas output is unchanged.Ok> So two questions: 1) Am I on the right track with this transformation (patch > enclosed) andI haven't taken a look at the code. Could you please give a very simple example showing what it is supposed to change that is currently not handled? Also, can you make sure to wrap the lines to fit in 80-columns? Finally, please send diffs as "unified" diffs, this makes it easier to read. You can get unified diffs by passing "-u" to cvs diff or normal diff.> 2) why does the output not reflect my changes?There are a couple of different possibilities. I'm not sure exactly what you're trying to accomplish here, but one of the following may happen: 1. Something in the compiler is running after indvars that undoes what you are trying to do. 2. The input to indvars in not what you think it is, so running the pass in the middle of gccas is not giving you the same output as running it logically at the end of gccas with the opt program. To isolate #2, it might be useful to get the full set of passes that gccas is doing. gccas will dump the passes if you run it like this: gccas /dev/null -o /dev/null -debug-pass=Arguments That way you can use the passes with 'opt' to see exactly what code is given to indvars. -Chris> Index: lib/Transforms/Scalar/IndVarSimplify.cpp > ==================================================================> RCS file: /var/cvs/llvm/llvm/lib/Transforms/Scalar/IndVarSimplify.cpp,v > retrieving revision 1.78 > diff -r1.78 IndVarSimplify.cpp > 310c310 > < runOnLoop(*I); > --- >> runOnLoop(LI, *I); > 322,323c322,323 > < void runOnLoop(Loop *L); > < void EliminatePointerRecurrence(PHINode *PN, BasicBlock *Preheader, > --- >> void runOnLoop(LoopInfo *LI, Loop *L); >> void EliminatePointerRecurrence(LoopInfo *LI, Loop *L, PHINode *PN, > BasicBlock *Preheader, > 361c361 > < void IndVarSimplify::EliminatePointerRecurrence(PHINode *PN, > --- >> void IndVarSimplify::EliminatePointerRecurrence(LoopInfo *LI, Loop *L, > PHINode *PN, > 366a367 >> Value *saved_trip_count = L->getTripCount(); // save before it's > clobbered > 392c393,408 > < > --- >> std::cerr << "\tGEPI: " << *GEPI; >> if (PHINode *phi = dyn_cast<PHINode>(GEPI->getOperand(0))) >> { >> if (ConstantExpr *CE = > dyn_cast<ConstantExpr>(phi->getIncomingValue(0))) >> if (CE->getOpcode() == Instruction::GetElementPtr) // We > want this to become explicit... >> { >> std::cerr << "Block Before: " << *GEPI->getParent(); >> Instruction *mul = > BinaryOperator::createMul(saved_trip_count, >> LI->getLoopFor( phi->getIncomingBlock(1) ) >> ->getCanonicalInductionVariable(), "mul.", > GEPI); >> Instruction *add = BinaryOperator::createAdd(mul, > NewAdd, "add.", GEPI); >> GEPI->setOperand( 0, phi->getIncomingValue(0) ); >> GEPI->setOperand( 1, add ); >> std::cerr << "Block After: " << *GEPI->getParent(); >> } >> } > 605c621,622 > < void IndVarSimplify::runOnLoop(Loop *L) { > --- >> void IndVarSimplify::runOnLoop(LoopInfo *LI, Loop *L) { >> > 617c634 > < EliminatePointerRecurrence(PN, Preheader, DeadInsts); > --- >> EliminatePointerRecurrence(LI, L, PN, Preheader, DeadInsts); > 626c643 > < runOnLoop(*I); > --- >> runOnLoop(LI,*I); > 652,653c669,670 > < if (SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(SCEV)) > < if (AR->getNumOperands() == 2 && > isa<SCEVConstant>(AR->getOperand(1))) > --- >> if (1)//SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(SCEV)) >> if (1)//AR->getNumOperands() == 2 && > isa<SCEVConstant>(AR->getOperand(1))) > > Naftali > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >-Chris -- http://nondot.org/sabre/ http://llvm.org/
OK, thanks Chris, I've found that running opt -loopsimplify -instcombine -indvars -stats gives me the setup I need for this transformation, and a small patch makes it happen in the simple case we discussed. However, I'm having some trouble when things get a bit more complicated with 3 nesting levels: int A[3000000], B[20000], C[100], Z; int main() { int i, j, k, *a, *b, *c; for ( a = &A[0], i = 0; i != 300; i++ ) for ( b = &B[0], j = 0; j != 200; j++ ) for ( c = &C[0], k = 0; k != 100; k++ ) Z += *a++ * *b++ * *c++; } My main problem is maintaining the loop nesting structure, which some pass in llvm seems to want to collapse, and then scalar evolution can't find all the predictable loop counts. However, when I insert a volatile memory operation inside of each loop, this preserves the loop structure and scalar evolution finds all the loop counts. What to do? Thanks, Naftali On Thu, 28 Jul 2005, Chris Lattner wrote:> On Thu, 28 Jul 2005, Naftali Schwartz wrote: >> I now understand that IndVarSimplify.cpp is capable of reproducing array >> references when the pointer initialization from the array address is found >> inside the immediately enclosing loop, such that in the following code: > > Ok. > >> int A[20000], B[100], Z; >> int main() >> { >> int i, j, *a, *b; >> for ( a = &A[0], i = 0; i != 200; i++ ) >> for ( b = &B[0], j = 0; j != 100; j++ ) >> Z += *a++ * *b++; >> } >> >> the pointer b can be transformed into an offset from B[], but a cannot be >> transformed into an offset from A[] because the initialization is two >> levels up. > > Hrm, that should work. > >> I started poking around in IndVarSimplify.cpp and I have a patch which >> should (at least partially) enable the transformation for a as well, but I >> find that although the code looks OK in debug printouts, the gccas output >> is unchanged. > > Ok >