Alexandre Isoard via llvm-dev
2018-Aug-16 23:09 UTC
[llvm-dev] [SCEV] Why is backedge-taken count <nsw> instead of <nuw>?
Ok. To go back to the original issue, would it be meaningful to add a SCEVUMax(0, BTC) on the final BTC computed by SCEV? So that it does not use "negative values"? On Wed, Aug 15, 2018 at 2:40 PM Friedman, Eli <efriedma at codeaurora.org> wrote:> On 8/15/2018 2:27 PM, Alexandre Isoard wrote: > > I'm not sure I understand the poison/undef/UB distinctions. > > But on this example: > > define i32 @func(i1 zeroext %b, i32 %x, i32 %y) { >> entry: >> %adds = add nsw i32 %x, %y >> %addu = add nuw i32 %x, %y >> %cond = select i1 %b, i32 %adds, i32 %addu >> ret i32 %cond >> } > > > It is important to not propagate the nsw/nuw between the two SCEV > expressions (which unification would do today, can I consider that a bug or > is it a feature?). > > > It's an intentional design choice. > > So we work-around it by not informing SCEV of the flags: > > Printing analysis 'Scalar Evolution Analysis' for function 'func': >> Classifying expressions for: @func >> %adds = add nsw i32 %x, %y >> --> (%x + %y) U: full-set S: full-set >> %addu = add nuw i32 %x, %y >> --> (%x + %y) U: full-set S: full-set >> %cond = select i1 %b, i32 %adds, i32 %addu >> --> %cond U: full-set S: full-set >> Determining loop execution counts for: @func > > > Would there be problems if we properly considered nuw/nsw flags when > unifying SCEVs? > > > There would be other consequences. For example, `(%x + %y)<nsw>` and `(%x > + %y)<nuw>` wouldn't compare equal for other simplifications, and all the > places that call setNoWrapFlags would have to be rewritten. It's probably > possible to come up with some workable design, but nobody has actually > tried it, so it's not clear how much work it would be to implement, or > whether it would improve the generated code overall. > > -Eli > > > On Wed, Aug 15, 2018 at 1:59 PM Friedman, Eli <efriedma at codeaurora.org> > wrote: > >> On 8/15/2018 1:31 PM, Alexandre Isoard wrote: >> >> Is that why we do not deduce +<nsw> from "add nsw" either? >> >> >> Essentially, yes. >> >> Is that an intrinsic limitation of creating a context-invariant >> expressions from a Value* or is that a limitation of our implementation >> (our unification not considering the nsw flags)? >> >> >> It's a consequence of unification not considering nsw. (nsw on an >> instruction is naturally invariant because violating nsw produces poison, >> not UB.) >> >> -Eli >> >> >> On Wed, Aug 15, 2018 at 12:39 PM Friedman, Eli <efriedma at codeaurora.org> >> wrote: >> >>> On 8/15/2018 12:21 PM, Alexandre Isoard via llvm-dev wrote: >>> >>> Hello, >>> >>> If I run clang on the following code: >>> >>> void func(unsigned n) { >>>> for (unsigned long x = 1; x < n; ++x) >>>> dummy(x); >>>> } >>> >>> >>> I get the following llvm ir: >>> >>> define void @func(i32 %n) { >>>> entry: >>>> %conv = zext i32 %n to i64 >>>> %cmp5 = icmp ugt i32 %n, 1 >>>> br i1 %cmp5, label %for.body, label %for.cond.cleanup >>>> for.cond.cleanup: ; preds = %for.body, >>>> %entry >>>> ret void >>>> for.body: ; preds = %entry, >>>> %for.body >>>> %x.06 = phi i64 [ %inc, %for.body ], [ 1, %entry ] >>>> tail call void @dummy(i64 %x.06) #2 >>>> %inc = add nuw nsw i64 %x.06, 1 >>>> %exitcond = icmp eq i64 %inc, %conv >>>> br i1 %exitcond, label %for.cond.cleanup, label %for.body >>>> } >>> >>> >>> Over which, SCEV will provide the following analysis: >>> >>> Printing analysis 'Scalar Evolution Analysis' for function 'func': >>>> Classifying expressions for: @func >>>> %conv = zext i32 %n to i64 >>>> --> (zext i32 %n to i64) U: [0,4294967296) S: [0,4294967296) >>>> %x.06 = phi i64 [ %inc, %for.body ], [ 1, %entry ] >>>> --> {1,+,1}<nuw><nsw><%for.body> U: [1,-9223372036854775808) S: >>>> [1,-9223372036854775808) Exits: (-1 + (zext i32 %n to i64)) LoopDispositions: >>>> { %for.body: Computable } >>>> %inc = add nuw nsw i64 %x.06, 1 >>>> --> {2,+,1}<nuw><%for.body> U: [2,0) S: [2,0) Exits: (zext i32 %n >>>> to i64) LoopDispositions: { %for.body: Computable } >>>> Determining loop execution counts for: @func >>>> Loop %for.body: backedge-taken count is (-2 + (zext i32 %n to i64))<nsw> >>>> Loop %for.body: max backedge-taken count is -2 >>>> Loop %for.body: Predicated backedge-taken count is (-2 + (zext i32 %n >>>> to i64))<nsw> >>>> Predicates: >>>> Loop %for.body: Trip multiple is 1 >>> >>> >>> Now, I was surprised by the max backedge-taken count being -2, and I >>> suspect it is due to the backedge-taken count being marked as <nsw> instead >>> of <nuw>. >>> >>> Is that on purpose, is that a bug, or is my analysis incorrect? I am not >>> sure where to fix that issue. >>> >>> >>> The backedge-taken count isn't nuw because nsw/nuw markings aren't >>> flow-sensitive: there isn't any way to mark the trip count as nuw without >>> marking every computation of `(long)n-2` as nuw. >>> >>> There's some code in ScalarEvolution::howFarToZero to try to refine the >>> max backedge-taken count in some cases, but it isn't very general. See >>> https://reviews.llvm.org/D28536 . >>> >>> -Eli >>> >>> -- >>> Employee of Qualcomm Innovation Center, Inc. >>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project >>> >>> >> >> -- >> *Alexandre Isoard* >> >> >> -- >> Employee of Qualcomm Innovation Center, Inc. >> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project >> >> > > -- > *Alexandre Isoard* > > > -- > Employee of Qualcomm Innovation Center, Inc. > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project > >-- *Alexandre Isoard* -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180816/a14ee024/attachment-0001.html>
Alexandre Isoard via llvm-dev
2018-Aug-16 23:24 UTC
[llvm-dev] [SCEV] Why is backedge-taken count <nsw> instead of <nuw>?
Hum, actually that gets eliminated to nothing of course. What we need is really the NUW flag... On Thu, Aug 16, 2018 at 4:09 PM Alexandre Isoard <alexandre.isoard at gmail.com> wrote:> Ok. > > To go back to the original issue, would it be meaningful to add a > SCEVUMax(0, BTC) on the final BTC computed by SCEV? > > So that it does not use "negative values"? > > On Wed, Aug 15, 2018 at 2:40 PM Friedman, Eli <efriedma at codeaurora.org> > wrote: > >> On 8/15/2018 2:27 PM, Alexandre Isoard wrote: >> >> I'm not sure I understand the poison/undef/UB distinctions. >> >> But on this example: >> >> define i32 @func(i1 zeroext %b, i32 %x, i32 %y) { >>> entry: >>> %adds = add nsw i32 %x, %y >>> %addu = add nuw i32 %x, %y >>> %cond = select i1 %b, i32 %adds, i32 %addu >>> ret i32 %cond >>> } >> >> >> It is important to not propagate the nsw/nuw between the two SCEV >> expressions (which unification would do today, can I consider that a bug or >> is it a feature?). >> >> >> It's an intentional design choice. >> >> So we work-around it by not informing SCEV of the flags: >> >> Printing analysis 'Scalar Evolution Analysis' for function 'func': >>> Classifying expressions for: @func >>> %adds = add nsw i32 %x, %y >>> --> (%x + %y) U: full-set S: full-set >>> %addu = add nuw i32 %x, %y >>> --> (%x + %y) U: full-set S: full-set >>> %cond = select i1 %b, i32 %adds, i32 %addu >>> --> %cond U: full-set S: full-set >>> Determining loop execution counts for: @func >> >> >> Would there be problems if we properly considered nuw/nsw flags when >> unifying SCEVs? >> >> >> There would be other consequences. For example, `(%x + %y)<nsw>` and >> `(%x + %y)<nuw>` wouldn't compare equal for other simplifications, and all >> the places that call setNoWrapFlags would have to be rewritten. It's >> probably possible to come up with some workable design, but nobody has >> actually tried it, so it's not clear how much work it would be to >> implement, or whether it would improve the generated code overall. >> >> -Eli >> >> >> On Wed, Aug 15, 2018 at 1:59 PM Friedman, Eli <efriedma at codeaurora.org> >> wrote: >> >>> On 8/15/2018 1:31 PM, Alexandre Isoard wrote: >>> >>> Is that why we do not deduce +<nsw> from "add nsw" either? >>> >>> >>> Essentially, yes. >>> >>> Is that an intrinsic limitation of creating a context-invariant >>> expressions from a Value* or is that a limitation of our implementation >>> (our unification not considering the nsw flags)? >>> >>> >>> It's a consequence of unification not considering nsw. (nsw on an >>> instruction is naturally invariant because violating nsw produces poison, >>> not UB.) >>> >>> -Eli >>> >>> >>> On Wed, Aug 15, 2018 at 12:39 PM Friedman, Eli <efriedma at codeaurora.org> >>> wrote: >>> >>>> On 8/15/2018 12:21 PM, Alexandre Isoard via llvm-dev wrote: >>>> >>>> Hello, >>>> >>>> If I run clang on the following code: >>>> >>>> void func(unsigned n) { >>>>> for (unsigned long x = 1; x < n; ++x) >>>>> dummy(x); >>>>> } >>>> >>>> >>>> I get the following llvm ir: >>>> >>>> define void @func(i32 %n) { >>>>> entry: >>>>> %conv = zext i32 %n to i64 >>>>> %cmp5 = icmp ugt i32 %n, 1 >>>>> br i1 %cmp5, label %for.body, label %for.cond.cleanup >>>>> for.cond.cleanup: ; preds = %for.body, >>>>> %entry >>>>> ret void >>>>> for.body: ; preds = %entry, >>>>> %for.body >>>>> %x.06 = phi i64 [ %inc, %for.body ], [ 1, %entry ] >>>>> tail call void @dummy(i64 %x.06) #2 >>>>> %inc = add nuw nsw i64 %x.06, 1 >>>>> %exitcond = icmp eq i64 %inc, %conv >>>>> br i1 %exitcond, label %for.cond.cleanup, label %for.body >>>>> } >>>> >>>> >>>> Over which, SCEV will provide the following analysis: >>>> >>>> Printing analysis 'Scalar Evolution Analysis' for function 'func': >>>>> Classifying expressions for: @func >>>>> %conv = zext i32 %n to i64 >>>>> --> (zext i32 %n to i64) U: [0,4294967296) S: [0,4294967296) >>>>> %x.06 = phi i64 [ %inc, %for.body ], [ 1, %entry ] >>>>> --> {1,+,1}<nuw><nsw><%for.body> U: [1,-9223372036854775808) S: >>>>> [1,-9223372036854775808) Exits: (-1 + (zext i32 %n to i64)) LoopDispositions: >>>>> { %for.body: Computable } >>>>> %inc = add nuw nsw i64 %x.06, 1 >>>>> --> {2,+,1}<nuw><%for.body> U: [2,0) S: [2,0) Exits: (zext i32 %n >>>>> to i64) LoopDispositions: { %for.body: Computable } >>>>> Determining loop execution counts for: @func >>>>> Loop %for.body: backedge-taken count is (-2 + (zext i32 %n to >>>>> i64))<nsw> >>>>> Loop %for.body: max backedge-taken count is -2 >>>>> Loop %for.body: Predicated backedge-taken count is (-2 + (zext i32 %n >>>>> to i64))<nsw> >>>>> Predicates: >>>>> Loop %for.body: Trip multiple is 1 >>>> >>>> >>>> Now, I was surprised by the max backedge-taken count being -2, and I >>>> suspect it is due to the backedge-taken count being marked as <nsw> instead >>>> of <nuw>. >>>> >>>> Is that on purpose, is that a bug, or is my analysis incorrect? I am >>>> not sure where to fix that issue. >>>> >>>> >>>> The backedge-taken count isn't nuw because nsw/nuw markings aren't >>>> flow-sensitive: there isn't any way to mark the trip count as nuw without >>>> marking every computation of `(long)n-2` as nuw. >>>> >>>> There's some code in ScalarEvolution::howFarToZero to try to refine the >>>> max backedge-taken count in some cases, but it isn't very general. See >>>> https://reviews.llvm.org/D28536 . >>>> >>>> -Eli >>>> >>>> -- >>>> Employee of Qualcomm Innovation Center, Inc. >>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project >>>> >>>> >>> >>> -- >>> *Alexandre Isoard* >>> >>> >>> -- >>> Employee of Qualcomm Innovation Center, Inc. >>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project >>> >>> >> >> -- >> *Alexandre Isoard* >> >> >> -- >> Employee of Qualcomm Innovation Center, Inc. >> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project >> >> > > -- > *Alexandre Isoard* >-- *Alexandre Isoard* -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180816/d548e1fd/attachment.html>
Friedman, Eli via llvm-dev
2018-Aug-16 23:34 UTC
[llvm-dev] [SCEV] Why is backedge-taken count <nsw> instead of <nuw>?
In general, the backedge-taken count is an unsigned value; it's possible to write a loop with a trip count of 2^64 using a 64-bit induction variable. To prove your loop has a "small" trip count, you have to use either the guard or the nsw/nuw markings on the induction variable. -Eli On 8/16/2018 4:09 PM, Alexandre Isoard wrote:> Ok. > > To go back to the original issue, would it be meaningful to add a > SCEVUMax(0, BTC) on the final BTC computed by SCEV? > > So that it does not use "negative values"? > > On Wed, Aug 15, 2018 at 2:40 PM Friedman, Eli <efriedma at codeaurora.org > <mailto:efriedma at codeaurora.org>> wrote: > > On 8/15/2018 2:27 PM, Alexandre Isoard wrote: >> I'm not sure I understand the poison/undef/UB distinctions. >> >> But on this example: >> >> define i32 @func(i1 zeroext %b, i32 %x, i32 %y) { >> entry: >> %adds = add nsw i32 %x, %y >> %addu = add nuw i32 %x, %y >> %cond = select i1 %b, i32 %adds, i32 %addu >> ret i32 %cond >> } >> >> >> It is important to not propagate the nsw/nuw between the two SCEV >> expressions (which unification would do today, can I consider >> that a bug or is it a feature?). > > It's an intentional design choice. > >> So we work-around it by not informing SCEV of the flags: >> >> Printing analysis 'Scalar Evolution Analysis' for function >> 'func': >> Classifying expressions for: @func >> %adds = add nsw i32 %x, %y >> --> (%x + %y) U: full-set S: full-set >> %addu = add nuw i32 %x, %y >> --> (%x + %y) U: full-set S: full-set >> %cond = select i1 %b, i32 %adds, i32 %addu >> --> %cond U: full-set S: full-set >> Determining loop execution counts for: @func >> >> >> Would there be problems if we properly considered nuw/nsw flags >> when unifying SCEVs? > > There would be other consequences. For example, `(%x + %y)<nsw>` > and `(%x + %y)<nuw>` wouldn't compare equal for other > simplifications, and all the places that call setNoWrapFlags would > have to be rewritten. It's probably possible to come up with some > workable design, but nobody has actually tried it, so it's not > clear how much work it would be to implement, or whether it would > improve the generated code overall. > > -Eli > >> >> On Wed, Aug 15, 2018 at 1:59 PM Friedman, Eli >> <efriedma at codeaurora.org <mailto:efriedma at codeaurora.org>> wrote: >> >> On 8/15/2018 1:31 PM, Alexandre Isoard wrote: >>> Is that why we do not deduce +<nsw> from "add nsw" either? >> >> Essentially, yes. >> >>> Is that an intrinsic limitation of creating a >>> context-invariant expressions from a Value* or is that a >>> limitation of our implementation (our unification not >>> considering the nsw flags)? >> >> It's a consequence of unification not considering nsw. (nsw >> on an instruction is naturally invariant because violating >> nsw produces poison, not UB.) >> >> -Eli >> >>> >>> On Wed, Aug 15, 2018 at 12:39 PM Friedman, Eli >>> <efriedma at codeaurora.org <mailto:efriedma at codeaurora.org>> >>> wrote: >>> >>> On 8/15/2018 12:21 PM, Alexandre Isoard via llvm-dev wrote: >>>> Hello, >>>> >>>> If I run clang on the following code: >>>> >>>> void func(unsigned n) { >>>> for (unsigned long x = 1; x < n; ++x) >>>> dummy(x); >>>> } >>>> >>>> >>>> I get the following llvm ir: >>>> >>>> define void @func(i32 %n) { >>>> entry: >>>> %conv = zext i32 %n to i64 >>>> %cmp5 = icmp ugt i32 %n, 1 >>>> br i1 %cmp5, label %for.body, label %for.cond.cleanup >>>> for.cond.cleanup: ; preds >>>> %for.body, %entry >>>> ret void >>>> for.body: ; preds = %entry, %for.body >>>> %x.06 = phi i64 [ %inc, %for.body ], [ 1, %entry ] >>>> tail call void @dummy(i64 %x.06) #2 >>>> %inc = add nuw nsw i64 %x.06, 1 >>>> %exitcond = icmp eq i64 %inc, %conv >>>> br i1 %exitcond, label %for.cond.cleanup, label >>>> %for.body >>>> } >>>> >>>> >>>> Over which, SCEV will provide the following analysis: >>>> >>>> Printing analysis 'Scalar Evolution Analysis' for >>>> function 'func': >>>> Classifying expressions for: @func >>>> %conv = zext i32 %n to i64 >>>> --> (zext i32 %n to i64) U: [0,4294967296) S: >>>> [0,4294967296) >>>> %x.06 = phi i64 [ %inc, %for.body ], [ 1, %entry ] >>>> --> {1,+,1}<nuw><nsw><%for.body> U: >>>> [1,-9223372036854775808) S: >>>> [1,-9223372036854775808)Exits: (-1 + (zext i32 %n >>>> to i64))LoopDispositions: { %for.body: Computable } >>>> %inc = add nuw nsw i64 %x.06, 1 >>>> --> {2,+,1}<nuw><%for.body> U: [2,0) S: >>>> [2,0)Exits: (zext i32 %n to i64)LoopDispositions: { >>>> %for.body: Computable } >>>> Determining loop execution counts for: @func >>>> Loop %for.body: backedge-taken count is (-2 + (zext >>>> i32 %n to i64))<nsw> >>>> Loop %for.body: max backedge-taken count is -2 >>>> Loop %for.body: Predicated backedge-taken count is >>>> (-2 + (zext i32 %n to i64))<nsw> >>>> Predicates: >>>> Loop %for.body: Trip multiple is 1 >>>> >>>> >>>> Now, I was surprised by the max backedge-taken count >>>> being -2, and I suspect it is due to the backedge-taken >>>> count being marked as <nsw> instead of <nuw>. >>>> >>>> Is that on purpose, is that a bug, or is my analysis >>>> incorrect? I am not sure where to fix that issue. >>> >>> The backedge-taken count isn't nuw because nsw/nuw >>> markings aren't flow-sensitive: there isn't any way to >>> mark the trip count as nuw without marking every >>> computation of `(long)n-2` as nuw. >>> >>> There's some code in ScalarEvolution::howFarToZero to >>> try to refine the max backedge-taken count in some >>> cases, but it isn't very general. See >>> https://reviews.llvm.org/D28536 . >>> >>> -Eli >>> >>> -- >>> Employee of Qualcomm Innovation Center, Inc. >>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project >>> >>> >>> >>> -- >>> *Alexandre Isoard* >> >> >> -- >> Employee of Qualcomm Innovation Center, Inc. >> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project >> >> >> >> -- >> *Alexandre Isoard* > > > -- > Employee of Qualcomm Innovation Center, Inc. > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project > > > > -- > *Alexandre Isoard*-- Employee of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180816/bd87ff06/attachment.html>
Alexandre Isoard via llvm-dev
2018-Aug-16 23:54 UTC
[llvm-dev] [SCEV] Why is backedge-taken count <nsw> instead of <nuw>?
The loop exits iff {2,+,1}<nuw><%for.body> == (zext i32 %n to i64) The nuw marking on the "induction variable" should be sufficient to deduce a max loop trip count of 2^32. But I do not know how we compute it (we build a database and it is contrived to follow, at least to me). I saw that we annotate it with <nsw> (which is correct and can be (and probably has been) deduced from the ranges) and following our discussion, we can't annotate it with <nuw> as per limitation of our unification. So, I am kind of in a bind here... On Thu, Aug 16, 2018 at 4:34 PM Friedman, Eli <efriedma at codeaurora.org> wrote:> In general, the backedge-taken count is an unsigned value; it's possible > to write a loop with a trip count of 2^64 using a 64-bit induction > variable. To prove your loop has a "small" trip count, you have to use > either the guard or the nsw/nuw markings on the induction variable. > > -Eli > > On 8/16/2018 4:09 PM, Alexandre Isoard wrote: > > Ok. > > To go back to the original issue, would it be meaningful to add a > SCEVUMax(0, BTC) on the final BTC computed by SCEV? > > So that it does not use "negative values"? > > On Wed, Aug 15, 2018 at 2:40 PM Friedman, Eli <efriedma at codeaurora.org> > wrote: > >> On 8/15/2018 2:27 PM, Alexandre Isoard wrote: >> >> I'm not sure I understand the poison/undef/UB distinctions. >> >> But on this example: >> >> define i32 @func(i1 zeroext %b, i32 %x, i32 %y) { >>> entry: >>> %adds = add nsw i32 %x, %y >>> %addu = add nuw i32 %x, %y >>> %cond = select i1 %b, i32 %adds, i32 %addu >>> ret i32 %cond >>> } >> >> >> It is important to not propagate the nsw/nuw between the two SCEV >> expressions (which unification would do today, can I consider that a bug or >> is it a feature?). >> >> >> It's an intentional design choice. >> >> So we work-around it by not informing SCEV of the flags: >> >> Printing analysis 'Scalar Evolution Analysis' for function 'func': >>> Classifying expressions for: @func >>> %adds = add nsw i32 %x, %y >>> --> (%x + %y) U: full-set S: full-set >>> %addu = add nuw i32 %x, %y >>> --> (%x + %y) U: full-set S: full-set >>> %cond = select i1 %b, i32 %adds, i32 %addu >>> --> %cond U: full-set S: full-set >>> Determining loop execution counts for: @func >> >> >> Would there be problems if we properly considered nuw/nsw flags when >> unifying SCEVs? >> >> >> There would be other consequences. For example, `(%x + %y)<nsw>` and >> `(%x + %y)<nuw>` wouldn't compare equal for other simplifications, and all >> the places that call setNoWrapFlags would have to be rewritten. It's >> probably possible to come up with some workable design, but nobody has >> actually tried it, so it's not clear how much work it would be to >> implement, or whether it would improve the generated code overall. >> >> -Eli >> >> >> On Wed, Aug 15, 2018 at 1:59 PM Friedman, Eli <efriedma at codeaurora.org> >> wrote: >> >>> On 8/15/2018 1:31 PM, Alexandre Isoard wrote: >>> >>> Is that why we do not deduce +<nsw> from "add nsw" either? >>> >>> >>> Essentially, yes. >>> >>> Is that an intrinsic limitation of creating a context-invariant >>> expressions from a Value* or is that a limitation of our implementation >>> (our unification not considering the nsw flags)? >>> >>> >>> It's a consequence of unification not considering nsw. (nsw on an >>> instruction is naturally invariant because violating nsw produces poison, >>> not UB.) >>> >>> -Eli >>> >>> >>> On Wed, Aug 15, 2018 at 12:39 PM Friedman, Eli <efriedma at codeaurora.org> >>> wrote: >>> >>>> On 8/15/2018 12:21 PM, Alexandre Isoard via llvm-dev wrote: >>>> >>>> Hello, >>>> >>>> If I run clang on the following code: >>>> >>>> void func(unsigned n) { >>>>> for (unsigned long x = 1; x < n; ++x) >>>>> dummy(x); >>>>> } >>>> >>>> >>>> I get the following llvm ir: >>>> >>>> define void @func(i32 %n) { >>>>> entry: >>>>> %conv = zext i32 %n to i64 >>>>> %cmp5 = icmp ugt i32 %n, 1 >>>>> br i1 %cmp5, label %for.body, label %for.cond.cleanup >>>>> for.cond.cleanup: ; preds = %for.body, >>>>> %entry >>>>> ret void >>>>> for.body: ; preds = %entry, >>>>> %for.body >>>>> %x.06 = phi i64 [ %inc, %for.body ], [ 1, %entry ] >>>>> tail call void @dummy(i64 %x.06) #2 >>>>> %inc = add nuw nsw i64 %x.06, 1 >>>>> %exitcond = icmp eq i64 %inc, %conv >>>>> br i1 %exitcond, label %for.cond.cleanup, label %for.body >>>>> } >>>> >>>> >>>> Over which, SCEV will provide the following analysis: >>>> >>>> Printing analysis 'Scalar Evolution Analysis' for function 'func': >>>>> Classifying expressions for: @func >>>>> %conv = zext i32 %n to i64 >>>>> --> (zext i32 %n to i64) U: [0,4294967296) S: [0,4294967296) >>>>> %x.06 = phi i64 [ %inc, %for.body ], [ 1, %entry ] >>>>> --> {1,+,1}<nuw><nsw><%for.body> U: [1,-9223372036854775808) S: >>>>> [1,-9223372036854775808) Exits: (-1 + (zext i32 %n to i64)) LoopDispositions: >>>>> { %for.body: Computable } >>>>> %inc = add nuw nsw i64 %x.06, 1 >>>>> --> {2,+,1}<nuw><%for.body> U: [2,0) S: [2,0) Exits: (zext i32 %n >>>>> to i64) LoopDispositions: { %for.body: Computable } >>>>> Determining loop execution counts for: @func >>>>> Loop %for.body: backedge-taken count is (-2 + (zext i32 %n to >>>>> i64))<nsw> >>>>> Loop %for.body: max backedge-taken count is -2 >>>>> Loop %for.body: Predicated backedge-taken count is (-2 + (zext i32 %n >>>>> to i64))<nsw> >>>>> Predicates: >>>>> Loop %for.body: Trip multiple is 1 >>>> >>>> >>>> Now, I was surprised by the max backedge-taken count being -2, and I >>>> suspect it is due to the backedge-taken count being marked as <nsw> instead >>>> of <nuw>. >>>> >>>> Is that on purpose, is that a bug, or is my analysis incorrect? I am >>>> not sure where to fix that issue. >>>> >>>> >>>> The backedge-taken count isn't nuw because nsw/nuw markings aren't >>>> flow-sensitive: there isn't any way to mark the trip count as nuw without >>>> marking every computation of `(long)n-2` as nuw. >>>> >>>> There's some code in ScalarEvolution::howFarToZero to try to refine the >>>> max backedge-taken count in some cases, but it isn't very general. See >>>> https://reviews.llvm.org/D28536 . >>>> >>>> -Eli >>>> >>>> -- >>>> Employee of Qualcomm Innovation Center, Inc. >>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project >>>> >>>> >>> >>> -- >>> *Alexandre Isoard* >>> >>> >>> -- >>> Employee of Qualcomm Innovation Center, Inc. >>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project >>> >>> >> >> -- >> *Alexandre Isoard* >> >> >> -- >> Employee of Qualcomm Innovation Center, Inc. >> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project >> >> > > -- > *Alexandre Isoard* > > > -- > Employee of Qualcomm Innovation Center, Inc. > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project > >-- *Alexandre Isoard* -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180816/1d32c954/attachment.html>