Alexandre Isoard via llvm-dev
2018-Aug-15 20:31 UTC
[llvm-dev] [SCEV] Why is backedge-taken count <nsw> instead of <nuw>?
Is that why we do not deduce +<nsw> from "add nsw" either? Is that an intrinsic limitation of creating a context-invariant expressions from a Value* or is that a limitation of our implementation (our unification not considering the nsw flags)? On Wed, Aug 15, 2018 at 12:39 PM Friedman, Eli <efriedma at codeaurora.org> wrote:> On 8/15/2018 12:21 PM, Alexandre Isoard via llvm-dev wrote: > > Hello, > > If I run clang on the following code: > > void func(unsigned n) { >> for (unsigned long x = 1; x < n; ++x) >> dummy(x); >> } > > > I get the following llvm ir: > > define void @func(i32 %n) { >> entry: >> %conv = zext i32 %n to i64 >> %cmp5 = icmp ugt i32 %n, 1 >> br i1 %cmp5, label %for.body, label %for.cond.cleanup >> for.cond.cleanup: ; preds = %for.body, >> %entry >> ret void >> for.body: ; preds = %entry, >> %for.body >> %x.06 = phi i64 [ %inc, %for.body ], [ 1, %entry ] >> tail call void @dummy(i64 %x.06) #2 >> %inc = add nuw nsw i64 %x.06, 1 >> %exitcond = icmp eq i64 %inc, %conv >> br i1 %exitcond, label %for.cond.cleanup, label %for.body >> } > > > Over which, SCEV will provide the following analysis: > > Printing analysis 'Scalar Evolution Analysis' for function 'func': >> Classifying expressions for: @func >> %conv = zext i32 %n to i64 >> --> (zext i32 %n to i64) U: [0,4294967296) S: [0,4294967296) >> %x.06 = phi i64 [ %inc, %for.body ], [ 1, %entry ] >> --> {1,+,1}<nuw><nsw><%for.body> U: [1,-9223372036854775808) S: >> [1,-9223372036854775808) Exits: (-1 + (zext i32 %n to i64)) LoopDispositions: >> { %for.body: Computable } >> %inc = add nuw nsw i64 %x.06, 1 >> --> {2,+,1}<nuw><%for.body> U: [2,0) S: [2,0) Exits: (zext i32 %n to >> i64) LoopDispositions: { %for.body: Computable } >> Determining loop execution counts for: @func >> Loop %for.body: backedge-taken count is (-2 + (zext i32 %n to i64))<nsw> >> Loop %for.body: max backedge-taken count is -2 >> Loop %for.body: Predicated backedge-taken count is (-2 + (zext i32 %n to >> i64))<nsw> >> Predicates: >> Loop %for.body: Trip multiple is 1 > > > Now, I was surprised by the max backedge-taken count being -2, and I > suspect it is due to the backedge-taken count being marked as <nsw> instead > of <nuw>. > > Is that on purpose, is that a bug, or is my analysis incorrect? I am not > sure where to fix that issue. > > > The backedge-taken count isn't nuw because nsw/nuw markings aren't > flow-sensitive: there isn't any way to mark the trip count as nuw without > marking every computation of `(long)n-2` as nuw. > > There's some code in ScalarEvolution::howFarToZero to try to refine the > max backedge-taken count in some cases, but it isn't very general. See > https://reviews.llvm.org/D28536 . > > -Eli > > -- > Employee of Qualcomm Innovation Center, Inc. > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project > >-- *Alexandre Isoard* -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180815/bcd96b67/attachment.html>
Friedman, Eli via llvm-dev
2018-Aug-15 20:59 UTC
[llvm-dev] [SCEV] Why is backedge-taken count <nsw> instead of <nuw>?
On 8/15/2018 1:31 PM, Alexandre Isoard wrote:> Is that why we do not deduce +<nsw> from "add nsw" either?Essentially, yes.> Is that an intrinsic limitation of creating a context-invariant > expressions from a Value* or is that a limitation of our > implementation (our unification not considering the nsw flags)?It's a consequence of unification not considering nsw. (nsw on an instruction is naturally invariant because violating nsw produces poison, not UB.) -Eli> > On Wed, Aug 15, 2018 at 12:39 PM Friedman, Eli > <efriedma at codeaurora.org <mailto:efriedma at codeaurora.org>> wrote: > > On 8/15/2018 12:21 PM, Alexandre Isoard via llvm-dev wrote: >> Hello, >> >> If I run clang on the following code: >> >> void func(unsigned n) { >> for (unsigned long x = 1; x < n; ++x) >> dummy(x); >> } >> >> >> I get the following llvm ir: >> >> define void @func(i32 %n) { >> entry: >> %conv = zext i32 %n to i64 >> %cmp5 = icmp ugt i32 %n, 1 >> br i1 %cmp5, label %for.body, label %for.cond.cleanup >> for.cond.cleanup: ; preds = %for.body, %entry >> ret void >> for.body: ; preds = %entry, %for.body >> %x.06 = phi i64 [ %inc, %for.body ], [ 1, %entry ] >> tail call void @dummy(i64 %x.06) #2 >> %inc = add nuw nsw i64 %x.06, 1 >> %exitcond = icmp eq i64 %inc, %conv >> br i1 %exitcond, label %for.cond.cleanup, label %for.body >> } >> >> >> Over which, SCEV will provide the following analysis: >> >> Printing analysis 'Scalar Evolution Analysis' for function >> 'func': >> Classifying expressions for: @func >> %conv = zext i32 %n to i64 >> --> (zext i32 %n to i64) U: [0,4294967296) S: [0,4294967296) >> %x.06 = phi i64 [ %inc, %for.body ], [ 1, %entry ] >> --> {1,+,1}<nuw><nsw><%for.body> U: >> [1,-9223372036854775808) S: [1,-9223372036854775808)Exits: >> (-1 + (zext i32 %n to i64))LoopDispositions: { %for.body: >> Computable } >> %inc = add nuw nsw i64 %x.06, 1 >> --> {2,+,1}<nuw><%for.body> U: [2,0) S: [2,0)Exits: (zext >> i32 %n to i64)LoopDispositions: { %for.body: Computable } >> Determining loop execution counts for: @func >> Loop %for.body: backedge-taken count is (-2 + (zext i32 %n to >> i64))<nsw> >> Loop %for.body: max backedge-taken count is -2 >> Loop %for.body: Predicated backedge-taken count is (-2 + >> (zext i32 %n to i64))<nsw> >> Predicates: >> Loop %for.body: Trip multiple is 1 >> >> >> Now, I was surprised by the max backedge-taken count being -2, >> and I suspect it is due to the backedge-taken count being marked >> as <nsw> instead of <nuw>. >> >> Is that on purpose, is that a bug, or is my analysis incorrect? I >> am not sure where to fix that issue. > > The backedge-taken count isn't nuw because nsw/nuw markings aren't > flow-sensitive: there isn't any way to mark the trip count as nuw > without marking every computation of `(long)n-2` as nuw. > > There's some code in ScalarEvolution::howFarToZero to try to > refine the max backedge-taken count in some cases, but it isn't > very general. See https://reviews.llvm.org/D28536 . > > -Eli > > -- > Employee of Qualcomm Innovation Center, Inc. > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project > > > > -- > *Alexandre Isoard*-- Employee of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180815/e0b989a3/attachment.html>
Alexandre Isoard via llvm-dev
2018-Aug-15 21:27 UTC
[llvm-dev] [SCEV] Why is backedge-taken count <nsw> instead of <nuw>?
I'm not sure I understand the poison/undef/UB distinctions. But on this example: define i32 @func(i1 zeroext %b, i32 %x, i32 %y) {> entry: > %adds = add nsw i32 %x, %y > %addu = add nuw i32 %x, %y > %cond = select i1 %b, i32 %adds, i32 %addu > ret i32 %cond > }It is important to not propagate the nsw/nuw between the two SCEV expressions (which unification would do today, can I consider that a bug or is it a feature?). So we work-around it by not informing SCEV of the flags: Printing analysis 'Scalar Evolution Analysis' for function 'func':> Classifying expressions for: @func > %adds = add nsw i32 %x, %y > --> (%x + %y) U: full-set S: full-set > %addu = add nuw i32 %x, %y > --> (%x + %y) U: full-set S: full-set > %cond = select i1 %b, i32 %adds, i32 %addu > --> %cond U: full-set S: full-set > Determining loop execution counts for: @funcWould there be problems if we properly considered nuw/nsw flags when unifying SCEVs? On Wed, Aug 15, 2018 at 1:59 PM Friedman, Eli <efriedma at codeaurora.org> wrote:> On 8/15/2018 1:31 PM, Alexandre Isoard wrote: > > Is that why we do not deduce +<nsw> from "add nsw" either? > > > Essentially, yes. > > Is that an intrinsic limitation of creating a context-invariant > expressions from a Value* or is that a limitation of our implementation > (our unification not considering the nsw flags)? > > > It's a consequence of unification not considering nsw. (nsw on an > instruction is naturally invariant because violating nsw produces poison, > not UB.) > > -Eli > > > On Wed, Aug 15, 2018 at 12:39 PM Friedman, Eli <efriedma at codeaurora.org> > wrote: > >> On 8/15/2018 12:21 PM, Alexandre Isoard via llvm-dev wrote: >> >> Hello, >> >> If I run clang on the following code: >> >> void func(unsigned n) { >>> for (unsigned long x = 1; x < n; ++x) >>> dummy(x); >>> } >> >> >> I get the following llvm ir: >> >> define void @func(i32 %n) { >>> entry: >>> %conv = zext i32 %n to i64 >>> %cmp5 = icmp ugt i32 %n, 1 >>> br i1 %cmp5, label %for.body, label %for.cond.cleanup >>> for.cond.cleanup: ; preds = %for.body, >>> %entry >>> ret void >>> for.body: ; preds = %entry, >>> %for.body >>> %x.06 = phi i64 [ %inc, %for.body ], [ 1, %entry ] >>> tail call void @dummy(i64 %x.06) #2 >>> %inc = add nuw nsw i64 %x.06, 1 >>> %exitcond = icmp eq i64 %inc, %conv >>> br i1 %exitcond, label %for.cond.cleanup, label %for.body >>> } >> >> >> Over which, SCEV will provide the following analysis: >> >> Printing analysis 'Scalar Evolution Analysis' for function 'func': >>> Classifying expressions for: @func >>> %conv = zext i32 %n to i64 >>> --> (zext i32 %n to i64) U: [0,4294967296) S: [0,4294967296) >>> %x.06 = phi i64 [ %inc, %for.body ], [ 1, %entry ] >>> --> {1,+,1}<nuw><nsw><%for.body> U: [1,-9223372036854775808) S: >>> [1,-9223372036854775808) Exits: (-1 + (zext i32 %n to i64)) LoopDispositions: >>> { %for.body: Computable } >>> %inc = add nuw nsw i64 %x.06, 1 >>> --> {2,+,1}<nuw><%for.body> U: [2,0) S: [2,0) Exits: (zext i32 %n to >>> i64) LoopDispositions: { %for.body: Computable } >>> Determining loop execution counts for: @func >>> Loop %for.body: backedge-taken count is (-2 + (zext i32 %n to i64))<nsw> >>> Loop %for.body: max backedge-taken count is -2 >>> Loop %for.body: Predicated backedge-taken count is (-2 + (zext i32 %n to >>> i64))<nsw> >>> Predicates: >>> Loop %for.body: Trip multiple is 1 >> >> >> Now, I was surprised by the max backedge-taken count being -2, and I >> suspect it is due to the backedge-taken count being marked as <nsw> instead >> of <nuw>. >> >> Is that on purpose, is that a bug, or is my analysis incorrect? I am not >> sure where to fix that issue. >> >> >> The backedge-taken count isn't nuw because nsw/nuw markings aren't >> flow-sensitive: there isn't any way to mark the trip count as nuw without >> marking every computation of `(long)n-2` as nuw. >> >> There's some code in ScalarEvolution::howFarToZero to try to refine the >> max backedge-taken count in some cases, but it isn't very general. See >> https://reviews.llvm.org/D28536 . >> >> -Eli >> >> -- >> Employee of Qualcomm Innovation Center, Inc. >> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project >> >> > > -- > *Alexandre Isoard* > > > -- > Employee of Qualcomm Innovation Center, Inc. > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project > >-- *Alexandre Isoard* -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180815/460f1c2f/attachment-0001.html>