Hi folks, I'm trying to analyse this piece of IR: for.body: ; preds = %for.body, %entry %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ] %0 = shl nsw i64 %indvars.iv, 1 %arrayidx = getelementptr inbounds i32* %b, i64 %0 %1 = load i32* %arrayidx, align 4, !tbaa !1 %add = add nsw i32 %1, %I %arrayidx3 = getelementptr inbounds i32* %a, i64 %0 store i32 %add, i32* %arrayidx3, align 4, !tbaa !1 %2 = or i64 %0, 1 %arrayidx7 = getelementptr inbounds i32* %b, i64 %2 %3 = load i32* %arrayidx7, align 4, !tbaa !1 %add8 = add nsw i32 %3, %I %arrayidx12 = getelementptr inbounds i32* %a, i64 %2 store i32 %add8, i32* %arrayidx12, align 4, !tbaa !1 %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1 %exitcond = icmp eq i64 %indvars.iv.next, 512 br i1 %exitcond, label %for.end, label %for.body And, when going through the GEP: %0 = shl nsw i64 %indvars.iv, 1 %arrayidx = getelementptr inbounds i32* %b, i64 %0 ScalarEvolution::createNodeForGEP() is losing the SCEV::FlagNSW, and I believe it's because of this piece of code in getMulExpr(): // Build the new addrec. Propagate the NUW and NSW flags if both the // outer mul and the inner addrec are guaranteed to have no overflow. // // No self-wrap cannot be guaranteed after changing the step size, but // will be inferred if either NUW or NSW is true. Flags = AddRec->getNoWrapFlags(clearFlags(Flags, SCEV::FlagNW)); const SCEV *NewRec = getAddRecExpr(NewOps, AddRecLoop, Flags); // If all of the other operands were loop invariant, we are done. if (Ops.size() == 1) return NewRec; I understand that you can't assume it won't wrap around with the new stride, but I couldn't find anything else later that would re-assign NSW/NUW. Maybe I need to add this code myself? Possibly in the GEP code, if I can detect the range will be safe, even for large strides? cheers, --renato -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20131113/a3cc6a9c/attachment.html>
On Nov 13, 2013, at 3:39 AM, Renato Golin <renato.golin at linaro.org> wrote:> Hi folks, > > I'm trying to analyse this piece of IR: > > for.body: ; preds = %for.body, %entry > %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ] > %0 = shl nsw i64 %indvars.iv, 1 > %arrayidx = getelementptr inbounds i32* %b, i64 %0 > %1 = load i32* %arrayidx, align 4, !tbaa !1 > %add = add nsw i32 %1, %I > %arrayidx3 = getelementptr inbounds i32* %a, i64 %0 > store i32 %add, i32* %arrayidx3, align 4, !tbaa !1 > %2 = or i64 %0, 1 > %arrayidx7 = getelementptr inbounds i32* %b, i64 %2 > %3 = load i32* %arrayidx7, align 4, !tbaa !1 > %add8 = add nsw i32 %3, %I > %arrayidx12 = getelementptr inbounds i32* %a, i64 %2 > store i32 %add8, i32* %arrayidx12, align 4, !tbaa !1 > %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1 > %exitcond = icmp eq i64 %indvars.iv.next, 512 > br i1 %exitcond, label %for.end, label %for.body > > And, when going through the GEP: > > %0 = shl nsw i64 %indvars.iv, 1 > %arrayidx = getelementptr inbounds i32* %b, i64 %0 > > ScalarEvolution::createNodeForGEP() is losing the SCEV::FlagNSW, and I believe it's because of this piece of code in getMulExpr(): > > // Build the new addrec. Propagate the NUW and NSW flags if both the > // outer mul and the inner addrec are guaranteed to have no overflow. > // > // No self-wrap cannot be guaranteed after changing the step size, but > // will be inferred if either NUW or NSW is true. > Flags = AddRec->getNoWrapFlags(clearFlags(Flags, SCEV::FlagNW)); > const SCEV *NewRec = getAddRecExpr(NewOps, AddRecLoop, Flags); > > // If all of the other operands were loop invariant, we are done. > if (Ops.size() == 1) return NewRec; > > I understand that you can't assume it won't wrap around with the new stride, but I couldn't find anything else later that would re-assign NSW/NUW. Maybe I need to add this code myself? Possibly in the GEP code, if I can detect the range will be safe, even for large strides?Hi Renato, Did you figure this out? The code you’re pointing to should preserve NSW and NUW. It is only clearing NW. Here’s what I think should happen, without actually trying it myself: - createNodeForGEP notices that inbounds implies NSW SCEV::NoWrapFlags Wrap = GEP->isInBounds() ? SCEV::FlagNSW : SCEV::FlagAnyWrap - The array index should already be an AddRecExpr with NSW const SCEV *IndexS = getSCEV(Index); - createNodeForGEP attempts to scale the index using the NSW flag we picked up earlier const SCEV *LocalOffset = getMulExpr(IndexS, ElementSize, Wrap); - getMulExpr constructs a new AddRec with NSW: Flags = AddRec->getNoWrapFlags(clearFlags(Flags, SCEV::FlagNW)); const SCEV *NewRec = getAddRecExpr(NewOps, AddRecLoop, Flags); The clearing of NW does not need to happen in your case (NSW/NUW should always imply NW). You could try this and see if it fixes the problem: Flags = AddRec->getNoWrapFlags(); If that fixes your case, then the right/safe fix will be trivial. -Andy
On 16 November 2013 06:56, Andrew Trick <atrick at apple.com> wrote:> - getMulExpr constructs a new AddRec with NSW: > > Flags = AddRec->getNoWrapFlags(clearFlags(Flags, SCEV::FlagNW)); > const SCEV *NewRec = getAddRecExpr(NewOps, AddRecLoop, Flags); >Hi Andrew, Thanks for looking at this. Clearing the flags here makes sense, but it's being too conservative. I'm thinking I'll need something like propagateSafeFlags(...) with knowledge of the loop induction ranges or something else instead of simply relying on later processes to spot the problems. The clearing of NW does not need to happen in your case (NSW/NUW should> always imply NW). You could try this and see if it fixes the problem: > > Flags = AddRec->getNoWrapFlags(); > > If that fixes your case, then the right/safe fix will be trivial. >It does, on that case. Shouldn't be too hard, but having the right info on the right place could be tricky. cheers, --renato -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20131116/76c06e78/attachment.html>
Maybe Matching Threads
- [LLVMdev] SCEV getMulExpr() not propagating Wrap flags
- [LLVMdev] SCEV getMulExpr() not propagating Wrap flags
- [LLVMdev] SCEV getMulExpr() not propagating Wrap flags
- [IndVarSimplify] Narrow IV's are not eliminated resulting in inefficient code
- [IndVarSimplify] Narrow IV's are not eliminated resulting in inefficient code