Here is original C code: void topup(int a[], unsigned long i) { for (; i < 16; i++) { a[i] = 1; } } Here is the IR before the pass where I expect SCEV to return trip-count value ; Function Attrs: nofree norecurse nounwind uwtable writeonly define dso_local void @topup(i32* nocapture %a, i64 %i) local_unnamed_addr #0 { entry: %cmp3 = icmp ult i64 %i, 16 br i1 %cmp3, label %for.body.preheader, label %for.end for.body.preheader: ; preds = %entry br label %for.body for.body: ; preds %for.body.preheader, %for.body %i.addr.04 = phi i64 [ %inc, %for.body ], [ %i, %for.body.preheader ] %arrayidx = getelementptr inbounds i32, i32* %a, i64 %i.addr.04 store i32 1, i32* %arrayidx, align 4, !tbaa !2 %inc = add nuw nsw i64 %i.addr.04, 1 %exitcond = icmp eq i64 %inc, 16 br i1 %exitcond, label %for.end.loopexit, label %for.body for.end.loopexit: ; preds = %for.body br label %for.end for.end: ; preds %for.end.loopexit, %entry ret void } I have also checked that at a pass before above pass SCEV works and returns 16. Here is the IR before a pass where SCEV works as per the expectations: ; Preheader: for.body.preheader: ; preds = %entry br label %for.body ; Loop: for.body: ; preds %for.body.preheader, %for.body %i.addr.04 = phi i64 [ %inc, %for.body ], [ %i, %for.body.preheader ] %arrayidx = getelementptr inbounds i32, i32* %a, i64 %i.addr.04 store i32 1, i32* %arrayidx, align 4, !tbaa !2 %inc = add nuw nsw i64 %i.addr.04, 1 %cmp = icmp ult i64 %inc, 16 br i1 %cmp, label %for.body, label %for.end.loopexit ; Exit blocks for.end.loopexit: ; preds = %for.body br label %for.end - Vivek On Mon, Aug 26, 2019 at 2:09 AM Sanjoy Das <sanjoy at playingwithpointers.com> wrote:> On Sun, Aug 25, 2019 at 4:49 AM vivek pandya via llvm-dev > <llvm-dev at lists.llvm.org> wrote: > > > > Hello, > > > > I am first time paying with SCEV codebase. > > I am trying to find out why ScalarEvolution is not able to give correct > back edge taken count for an expression. > > > > So in my case flow reaches to howFarToZero() and in that function, I > have following expressions as SCEV > > > > Start = (15 + (-1 * %i) (which is set to Distance SCEV) > > Step = 1 > > > > now, first of all, should I expect Start as ConstantSCEV (15) instead of > the whole expression > > the problem here is getUnsignedRangeMax(Distance) returns very large > number because of -1 in the SCEV. > > Given you have have above, in howFarToZero SCEV is reasoning about the > loop as if it were: > > unsigned i = ...; > unsigned Start = 15 - i; > while (Start != 0) { > unsigned Step = 1; > Start = Start + Step; > } > > > How we can make this work? Here we can clearly say that after 15 steps > this expression will be 0 > > and so I don't think this assertion ("after 15 steps this expression > will be 0") is correct. > > If you post a minimal example we can try to figure out if SCEV can be > smarter here. > > -- Sanjoy > > > and thus we have a value for backedge taken count. > > > > Any help will be appriciated. > > > > Thanks, > > Vivek > > _______________________________________________ > > LLVM Developers mailing list > > llvm-dev at lists.llvm.org > > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190826/013b56bc/attachment.html>
With the first example I see: Determining loop execution counts for: @topup Loop %for.body: backedge-taken count is (15 + (-1 * %i)) Loop %for.body: max backedge-taken count is -1 Loop %for.body: Predicated backedge-taken count is (15 + (-1 * %i)) So is the real problem here isn't that SCEV isn't computing the correct BE count, but that the max BE count it computes is not precise? If yes, I think you'll have to change https://github.com/llvm-project/llvm/blob/16bef0f9a051afa7b57f5fd01624086c35ec1e60/lib/Analysis/ScalarEvolution.cpp#L8746 to produce a more precise max trip count. -- Sanjoy On Sun, Aug 25, 2019 at 9:02 PM vivek pandya <vivekvpandya at gmail.com> wrote:> > Here is original C code: > void topup(int a[], unsigned long i) { > for (; i < 16; i++) { > a[i] = 1; > } > } > > Here is the IR before the pass where I expect SCEV to return trip-count value > > ; Function Attrs: nofree norecurse nounwind uwtable writeonly > define dso_local void @topup(i32* nocapture %a, i64 %i) local_unnamed_addr #0 { > entry: > %cmp3 = icmp ult i64 %i, 16 > br i1 %cmp3, label %for.body.preheader, label %for.end > > for.body.preheader: ; preds = %entry > br label %for.body > > for.body: ; preds = %for.body.preheader, %for.body > %i.addr.04 = phi i64 [ %inc, %for.body ], [ %i, %for.body.preheader ] > %arrayidx = getelementptr inbounds i32, i32* %a, i64 %i.addr.04 > store i32 1, i32* %arrayidx, align 4, !tbaa !2 > %inc = add nuw nsw i64 %i.addr.04, 1 > %exitcond = icmp eq i64 %inc, 16 > br i1 %exitcond, label %for.end.loopexit, label %for.body > > for.end.loopexit: ; preds = %for.body > br label %for.end > > for.end: ; preds = %for.end.loopexit, %entry > ret void > } > > I have also checked that at a pass before above pass SCEV works and returns 16. > > Here is the IR before a pass where SCEV works as per the expectations: > ; Preheader: > for.body.preheader: ; preds = %entry > br label %for.body > > ; Loop: > for.body: ; preds = %for.body.preheader, %for.body > %i.addr.04 = phi i64 [ %inc, %for.body ], [ %i, %for.body.preheader ] > %arrayidx = getelementptr inbounds i32, i32* %a, i64 %i.addr.04 > store i32 1, i32* %arrayidx, align 4, !tbaa !2 > %inc = add nuw nsw i64 %i.addr.04, 1 > %cmp = icmp ult i64 %inc, 16 > br i1 %cmp, label %for.body, label %for.end.loopexit > > ; Exit blocks > for.end.loopexit: ; preds = %for.body > br label %for.end > > - Vivek > > On Mon, Aug 26, 2019 at 2:09 AM Sanjoy Das <sanjoy at playingwithpointers.com> wrote: >> >> On Sun, Aug 25, 2019 at 4:49 AM vivek pandya via llvm-dev >> <llvm-dev at lists.llvm.org> wrote: >> > >> > Hello, >> > >> > I am first time paying with SCEV codebase. >> > I am trying to find out why ScalarEvolution is not able to give correct back edge taken count for an expression. >> > >> > So in my case flow reaches to howFarToZero() and in that function, I have following expressions as SCEV >> > >> > Start = (15 + (-1 * %i) (which is set to Distance SCEV) >> > Step = 1 >> > >> > now, first of all, should I expect Start as ConstantSCEV (15) instead of the whole expression >> > the problem here is getUnsignedRangeMax(Distance) returns very large number because of -1 in the SCEV. >> >> Given you have have above, in howFarToZero SCEV is reasoning about the >> loop as if it were: >> >> unsigned i = ...; >> unsigned Start = 15 - i; >> while (Start != 0) { >> unsigned Step = 1; >> Start = Start + Step; >> } >> >> > How we can make this work? Here we can clearly say that after 15 steps this expression will be 0 >> >> and so I don't think this assertion ("after 15 steps this expression >> will be 0") is correct. >> >> If you post a minimal example we can try to figure out if SCEV can be >> smarter here. >> >> -- Sanjoy >> >> > and thus we have a value for backedge taken count. >> > >> > Any help will be appriciated. >> > >> > Thanks, >> > Vivek >> > _______________________________________________ >> > LLVM Developers mailing list >> > llvm-dev at lists.llvm.org >> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
yes, I reached a similar conclusion. I have attached a patch if it looks on the correct direction I can send it for review. -Vivek On Mon, Sep 2, 2019 at 10:00 PM Sanjoy Das <sanjoy at playingwithpointers.com> wrote:> With the first example I see: > > Determining loop execution counts for: @topup > Loop %for.body: backedge-taken count is (15 + (-1 * %i)) > Loop %for.body: max backedge-taken count is -1 > Loop %for.body: Predicated backedge-taken count is (15 + (-1 * %i)) > > So is the real problem here isn't that SCEV isn't computing the > correct BE count, but that the max BE count it computes is not > precise? > > If yes, I think you'll have to change > > https://github.com/llvm-project/llvm/blob/16bef0f9a051afa7b57f5fd01624086c35ec1e60/lib/Analysis/ScalarEvolution.cpp#L8746 > to produce a more precise max trip count. > > -- Sanjoy > > On Sun, Aug 25, 2019 at 9:02 PM vivek pandya <vivekvpandya at gmail.com> > wrote: > > > > Here is original C code: > > void topup(int a[], unsigned long i) { > > for (; i < 16; i++) { > > a[i] = 1; > > } > > } > > > > Here is the IR before the pass where I expect SCEV to return trip-count > value > > > > ; Function Attrs: nofree norecurse nounwind uwtable writeonly > > define dso_local void @topup(i32* nocapture %a, i64 %i) > local_unnamed_addr #0 { > > entry: > > %cmp3 = icmp ult i64 %i, 16 > > br i1 %cmp3, label %for.body.preheader, label %for.end > > > > for.body.preheader: ; preds = %entry > > br label %for.body > > > > for.body: ; preds > %for.body.preheader, %for.body > > %i.addr.04 = phi i64 [ %inc, %for.body ], [ %i, %for.body.preheader ] > > %arrayidx = getelementptr inbounds i32, i32* %a, i64 %i.addr.04 > > store i32 1, i32* %arrayidx, align 4, !tbaa !2 > > %inc = add nuw nsw i64 %i.addr.04, 1 > > %exitcond = icmp eq i64 %inc, 16 > > br i1 %exitcond, label %for.end.loopexit, label %for.body > > > > for.end.loopexit: ; preds = %for.body > > br label %for.end > > > > for.end: ; preds > %for.end.loopexit, %entry > > ret void > > } > > > > I have also checked that at a pass before above pass SCEV works and > returns 16. > > > > Here is the IR before a pass where SCEV works as per the expectations: > > ; Preheader: > > for.body.preheader: ; preds = %entry > > br label %for.body > > > > ; Loop: > > for.body: ; preds > %for.body.preheader, %for.body > > %i.addr.04 = phi i64 [ %inc, %for.body ], [ %i, %for.body.preheader ] > > %arrayidx = getelementptr inbounds i32, i32* %a, i64 %i.addr.04 > > store i32 1, i32* %arrayidx, align 4, !tbaa !2 > > %inc = add nuw nsw i64 %i.addr.04, 1 > > %cmp = icmp ult i64 %inc, 16 > > br i1 %cmp, label %for.body, label %for.end.loopexit > > > > ; Exit blocks > > for.end.loopexit: ; preds = %for.body > > br label %for.end > > > > - Vivek > > > > On Mon, Aug 26, 2019 at 2:09 AM Sanjoy Das < > sanjoy at playingwithpointers.com> wrote: > >> > >> On Sun, Aug 25, 2019 at 4:49 AM vivek pandya via llvm-dev > >> <llvm-dev at lists.llvm.org> wrote: > >> > > >> > Hello, > >> > > >> > I am first time paying with SCEV codebase. > >> > I am trying to find out why ScalarEvolution is not able to give > correct back edge taken count for an expression. > >> > > >> > So in my case flow reaches to howFarToZero() and in that function, I > have following expressions as SCEV > >> > > >> > Start = (15 + (-1 * %i) (which is set to Distance SCEV) > >> > Step = 1 > >> > > >> > now, first of all, should I expect Start as ConstantSCEV (15) instead > of the whole expression > >> > the problem here is getUnsignedRangeMax(Distance) returns very large > number because of -1 in the SCEV. > >> > >> Given you have have above, in howFarToZero SCEV is reasoning about the > >> loop as if it were: > >> > >> unsigned i = ...; > >> unsigned Start = 15 - i; > >> while (Start != 0) { > >> unsigned Step = 1; > >> Start = Start + Step; > >> } > >> > >> > How we can make this work? Here we can clearly say that after 15 > steps this expression will be 0 > >> > >> and so I don't think this assertion ("after 15 steps this expression > >> will be 0") is correct. > >> > >> If you post a minimal example we can try to figure out if SCEV can be > >> smarter here. > >> > >> -- Sanjoy > >> > >> > and thus we have a value for backedge taken count. > >> > > >> > Any help will be appriciated. > >> > > >> > Thanks, > >> > Vivek > >> > _______________________________________________ > >> > LLVM Developers mailing list > >> > llvm-dev at lists.llvm.org > >> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190903/a26a08e4/attachment-0001.html> -------------- next part -------------- A non-text attachment was scrubbed... Name: scev.patch Type: application/octet-stream Size: 7283 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190903/a26a08e4/attachment-0001.obj>