thr3ads.net - llvm dev - [llvm-dev] [SCEV] Why is backedge-taken count <nsw> instead of <nuw>? [Aug 2018]

If this information is useful, please help other people find it:
Share via:

Alexandre Isoard via llvm-dev

2018-Aug-16 23:09 UTC

[llvm-dev] [SCEV] Why is backedge-taken count <nsw> instead of <nuw>?

Ok.

To go back to the original issue, would it be meaningful to add a
SCEVUMax(0, BTC) on the final BTC computed by SCEV?

So that it does not use "negative values"?

On Wed, Aug 15, 2018 at 2:40 PM Friedman, Eli <efriedma at codeaurora.org>
wrote:
> On 8/15/2018 2:27 PM, Alexandre Isoard wrote:
>
> I'm not sure I understand the poison/undef/UB distinctions.
>
> But on this example:
>
> define i32 @func(i1 zeroext %b, i32 %x, i32 %y) {
>> entry:
>>   %adds = add nsw i32 %x, %y
>>   %addu = add nuw i32 %x, %y
>>   %cond = select i1 %b, i32 %adds, i32 %addu
>>   ret i32 %cond
>> }
>
>
> It is important to not propagate the nsw/nuw between the two SCEV
> expressions (which unification would do today, can I consider that a bug or
> is it a feature?).
>
>
> It's an intentional design choice.
>
> So we work-around it by not informing SCEV of the flags:
>
> Printing analysis 'Scalar Evolution Analysis' for function
'func':
>> Classifying expressions for: @func
>>   %adds = add nsw i32 %x, %y
>>   -->  (%x + %y) U: full-set S: full-set
>>   %addu = add nuw i32 %x, %y
>>   -->  (%x + %y) U: full-set S: full-set
>>   %cond = select i1 %b, i32 %adds, i32 %addu
>>   -->  %cond U: full-set S: full-set
>> Determining loop execution counts for: @func
>
>
> Would there be problems if we properly considered nuw/nsw flags when
> unifying SCEVs?
>
>
> There would be other consequences.  For example, `(%x + %y)<nsw>` and
`(%x
> + %y)<nuw>` wouldn't compare equal for other simplifications, and
all the
> places that call setNoWrapFlags would have to be rewritten.  It's
probably
> possible to come up with some workable design, but nobody has actually
> tried it, so it's not clear how much work it would be to implement, or
> whether it would improve the generated code overall.
>
> -Eli
>
>
> On Wed, Aug 15, 2018 at 1:59 PM Friedman, Eli <efriedma at
codeaurora.org>
> wrote:
>
>> On 8/15/2018 1:31 PM, Alexandre Isoard wrote:
>>
>> Is that why we do not deduce +<nsw> from "add nsw"
either?
>>
>>
>> Essentially, yes.
>>
>> Is that an intrinsic limitation of creating a context-invariant
>> expressions from a Value* or is that a limitation of our implementation
>> (our unification not considering the nsw flags)?
>>
>>
>> It's a consequence of unification not considering nsw.  (nsw on an
>> instruction is naturally invariant because violating nsw produces
poison,
>> not UB.)
>>
>> -Eli
>>
>>
>> On Wed, Aug 15, 2018 at 12:39 PM Friedman, Eli <efriedma at
codeaurora.org>
>> wrote:
>>
>>> On 8/15/2018 12:21 PM, Alexandre Isoard via llvm-dev wrote:
>>>
>>> Hello,
>>>
>>> If I run clang on the following code:
>>>
>>> void func(unsigned n) {
>>>>   for (unsigned long x = 1; x < n; ++x)
>>>>     dummy(x);
>>>> }
>>>
>>>
>>> I get the following llvm ir:
>>>
>>> define void @func(i32 %n) {
>>>> entry:
>>>>   %conv = zext i32 %n to i64
>>>>   %cmp5 = icmp ugt i32 %n, 1
>>>>   br i1 %cmp5, label %for.body, label %for.cond.cleanup
>>>> for.cond.cleanup:                                 ; preds =
%for.body,
>>>> %entry
>>>>   ret void
>>>> for.body:                                         ; preds =
%entry,
>>>> %for.body
>>>>   %x.06 = phi i64 [ %inc, %for.body ], [ 1, %entry ]
>>>>   tail call void @dummy(i64 %x.06) #2
>>>>   %inc = add nuw nsw i64 %x.06, 1
>>>>   %exitcond = icmp eq i64 %inc, %conv
>>>>   br i1 %exitcond, label %for.cond.cleanup, label %for.body
>>>> }
>>>
>>>
>>> Over which, SCEV will provide the following analysis:
>>>
>>> Printing analysis 'Scalar Evolution Analysis' for function
'func':
>>>> Classifying expressions for: @func
>>>>   %conv = zext i32 %n to i64
>>>>   -->  (zext i32 %n to i64) U: [0,4294967296) S:
[0,4294967296)
>>>>   %x.06 = phi i64 [ %inc, %for.body ], [ 1, %entry ]
>>>>   -->  {1,+,1}<nuw><nsw><%for.body> U:
[1,-9223372036854775808) S:
>>>> [1,-9223372036854775808) Exits: (-1 + (zext i32 %n to i64))
LoopDispositions:
>>>> { %for.body: Computable }
>>>>   %inc = add nuw nsw i64 %x.06, 1
>>>>   -->  {2,+,1}<nuw><%for.body> U: [2,0) S: [2,0)
Exits: (zext i32 %n
>>>> to i64) LoopDispositions: { %for.body: Computable }
>>>> Determining loop execution counts for: @func
>>>> Loop %for.body: backedge-taken count is (-2 + (zext i32 %n to
i64))<nsw>
>>>> Loop %for.body: max backedge-taken count is -2
>>>> Loop %for.body: Predicated backedge-taken count is (-2 + (zext
i32 %n
>>>> to i64))<nsw>
>>>>  Predicates:
>>>> Loop %for.body: Trip multiple is 1
>>>
>>>
>>> Now, I was surprised by the max backedge-taken count being -2, and
I
>>> suspect it is due to the backedge-taken count being marked as
<nsw> instead
>>> of <nuw>.
>>>
>>> Is that on purpose, is that a bug, or is my analysis incorrect? I
am not
>>> sure where to fix that issue.
>>>
>>>
>>> The backedge-taken count isn't nuw because nsw/nuw markings
aren't
>>> flow-sensitive: there isn't any way to mark the trip count as
nuw without
>>> marking every computation of `(long)n-2` as nuw.
>>>
>>> There's some code in ScalarEvolution::howFarToZero to try to
refine the
>>> max backedge-taken count in some cases, but it isn't very
general.  See
>>> https://reviews.llvm.org/D28536 .
>>>
>>> -Eli
>>>
>>> --
>>> Employee of Qualcomm Innovation Center, Inc.
>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
>>>
>>>
>>
>> --
>> *Alexandre Isoard*
>>
>>
>> --
>> Employee of Qualcomm Innovation Center, Inc.
>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a
Linux Foundation Collaborative Project
>>
>>
>
> --
> *Alexandre Isoard*
>
>
> --
> Employee of Qualcomm Innovation Center, Inc.
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux
Foundation Collaborative Project
>
>
-- 
*Alexandre Isoard*
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20180816/a14ee024/attachment-0001.html>

Alexandre Isoard via llvm-dev

2018-Aug-16 23:24 UTC

head link

[llvm-dev] [SCEV] Why is backedge-taken count <nsw> instead of <nuw>?

Hum, actually that gets eliminated to nothing of course. What we need is
really the NUW flag...

On Thu, Aug 16, 2018 at 4:09 PM Alexandre Isoard <alexandre.isoard at
gmail.com>
wrote:
> Ok.
>
> To go back to the original issue, would it be meaningful to add a
> SCEVUMax(0, BTC) on the final BTC computed by SCEV?
>
> So that it does not use "negative values"?
>
> On Wed, Aug 15, 2018 at 2:40 PM Friedman, Eli <efriedma at
codeaurora.org>
> wrote:
>
>> On 8/15/2018 2:27 PM, Alexandre Isoard wrote:
>>
>> I'm not sure I understand the poison/undef/UB distinctions.
>>
>> But on this example:
>>
>> define i32 @func(i1 zeroext %b, i32 %x, i32 %y) {
>>> entry:
>>>   %adds = add nsw i32 %x, %y
>>>   %addu = add nuw i32 %x, %y
>>>   %cond = select i1 %b, i32 %adds, i32 %addu
>>>   ret i32 %cond
>>> }
>>
>>
>> It is important to not propagate the nsw/nuw between the two SCEV
>> expressions (which unification would do today, can I consider that a
bug or
>> is it a feature?).
>>
>>
>> It's an intentional design choice.
>>
>> So we work-around it by not informing SCEV of the flags:
>>
>> Printing analysis 'Scalar Evolution Analysis' for function
'func':
>>> Classifying expressions for: @func
>>>   %adds = add nsw i32 %x, %y
>>>   -->  (%x + %y) U: full-set S: full-set
>>>   %addu = add nuw i32 %x, %y
>>>   -->  (%x + %y) U: full-set S: full-set
>>>   %cond = select i1 %b, i32 %adds, i32 %addu
>>>   -->  %cond U: full-set S: full-set
>>> Determining loop execution counts for: @func
>>
>>
>> Would there be problems if we properly considered nuw/nsw flags when
>> unifying SCEVs?
>>
>>
>> There would be other consequences.  For example, `(%x + %y)<nsw>`
and
>> `(%x + %y)<nuw>` wouldn't compare equal for other
simplifications, and all
>> the places that call setNoWrapFlags would have to be rewritten. 
It's
>> probably possible to come up with some workable design, but nobody has
>> actually tried it, so it's not clear how much work it would be to
>> implement, or whether it would improve the generated code overall.
>>
>> -Eli
>>
>>
>> On Wed, Aug 15, 2018 at 1:59 PM Friedman, Eli <efriedma at
codeaurora.org>
>> wrote:
>>
>>> On 8/15/2018 1:31 PM, Alexandre Isoard wrote:
>>>
>>> Is that why we do not deduce +<nsw> from "add nsw"
either?
>>>
>>>
>>> Essentially, yes.
>>>
>>> Is that an intrinsic limitation of creating a context-invariant
>>> expressions from a Value* or is that a limitation of our
implementation
>>> (our unification not considering the nsw flags)?
>>>
>>>
>>> It's a consequence of unification not considering nsw.  (nsw on
an
>>> instruction is naturally invariant because violating nsw produces
poison,
>>> not UB.)
>>>
>>> -Eli
>>>
>>>
>>> On Wed, Aug 15, 2018 at 12:39 PM Friedman, Eli <efriedma at
codeaurora.org>
>>> wrote:
>>>
>>>> On 8/15/2018 12:21 PM, Alexandre Isoard via llvm-dev wrote:
>>>>
>>>> Hello,
>>>>
>>>> If I run clang on the following code:
>>>>
>>>> void func(unsigned n) {
>>>>>   for (unsigned long x = 1; x < n; ++x)
>>>>>     dummy(x);
>>>>> }
>>>>
>>>>
>>>> I get the following llvm ir:
>>>>
>>>> define void @func(i32 %n) {
>>>>> entry:
>>>>>   %conv = zext i32 %n to i64
>>>>>   %cmp5 = icmp ugt i32 %n, 1
>>>>>   br i1 %cmp5, label %for.body, label %for.cond.cleanup
>>>>> for.cond.cleanup:                                 ; preds =
%for.body,
>>>>> %entry
>>>>>   ret void
>>>>> for.body:                                         ; preds =
%entry,
>>>>> %for.body
>>>>>   %x.06 = phi i64 [ %inc, %for.body ], [ 1, %entry ]
>>>>>   tail call void @dummy(i64 %x.06) #2
>>>>>   %inc = add nuw nsw i64 %x.06, 1
>>>>>   %exitcond = icmp eq i64 %inc, %conv
>>>>>   br i1 %exitcond, label %for.cond.cleanup, label %for.body
>>>>> }
>>>>
>>>>
>>>> Over which, SCEV will provide the following analysis:
>>>>
>>>> Printing analysis 'Scalar Evolution Analysis' for
function 'func':
>>>>> Classifying expressions for: @func
>>>>>   %conv = zext i32 %n to i64
>>>>>   -->  (zext i32 %n to i64) U: [0,4294967296) S:
[0,4294967296)
>>>>>   %x.06 = phi i64 [ %inc, %for.body ], [ 1, %entry ]
>>>>>   -->  {1,+,1}<nuw><nsw><%for.body> U:
[1,-9223372036854775808) S:
>>>>> [1,-9223372036854775808) Exits: (-1 + (zext i32 %n to i64))
LoopDispositions:
>>>>> { %for.body: Computable }
>>>>>   %inc = add nuw nsw i64 %x.06, 1
>>>>>   -->  {2,+,1}<nuw><%for.body> U: [2,0) S:
[2,0) Exits: (zext i32 %n
>>>>> to i64) LoopDispositions: { %for.body: Computable }
>>>>> Determining loop execution counts for: @func
>>>>> Loop %for.body: backedge-taken count is (-2 + (zext i32 %n
to
>>>>> i64))<nsw>
>>>>> Loop %for.body: max backedge-taken count is -2
>>>>> Loop %for.body: Predicated backedge-taken count is (-2 +
(zext i32 %n
>>>>> to i64))<nsw>
>>>>>  Predicates:
>>>>> Loop %for.body: Trip multiple is 1
>>>>
>>>>
>>>> Now, I was surprised by the max backedge-taken count being -2,
and I
>>>> suspect it is due to the backedge-taken count being marked as
<nsw> instead
>>>> of <nuw>.
>>>>
>>>> Is that on purpose, is that a bug, or is my analysis incorrect?
I am
>>>> not sure where to fix that issue.
>>>>
>>>>
>>>> The backedge-taken count isn't nuw because nsw/nuw markings
aren't
>>>> flow-sensitive: there isn't any way to mark the trip count
as nuw without
>>>> marking every computation of `(long)n-2` as nuw.
>>>>
>>>> There's some code in ScalarEvolution::howFarToZero to try
to refine the
>>>> max backedge-taken count in some cases, but it isn't very
general.  See
>>>> https://reviews.llvm.org/D28536 .
>>>>
>>>> -Eli
>>>>
>>>> --
>>>> Employee of Qualcomm Innovation Center, Inc.
>>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora
Forum, a Linux Foundation Collaborative Project
>>>>
>>>>
>>>
>>> --
>>> *Alexandre Isoard*
>>>
>>>
>>> --
>>> Employee of Qualcomm Innovation Center, Inc.
>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
>>>
>>>
>>
>> --
>> *Alexandre Isoard*
>>
>>
>> --
>> Employee of Qualcomm Innovation Center, Inc.
>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a
Linux Foundation Collaborative Project
>>
>>
>
> --
> *Alexandre Isoard*
>

-- 
*Alexandre Isoard*
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20180816/d548e1fd/attachment.html>

Friedman, Eli via llvm-dev

2018-Aug-16 23:34 UTC

head link

[llvm-dev] [SCEV] Why is backedge-taken count <nsw> instead of <nuw>?

In general, the backedge-taken count is an unsigned value; it's possible 
to write a loop with a trip count of 2^64 using a 64-bit induction 
variable.  To prove your loop has a "small" trip count, you have to
use
either the guard or the nsw/nuw markings on the induction variable.

-Eli

On 8/16/2018 4:09 PM, Alexandre Isoard wrote:> Ok.
>
> To go back to the original issue, would it be meaningful to add a 
> SCEVUMax(0, BTC) on the final BTC computed by SCEV?
>
> So that it does not use "negative values"?
>
> On Wed, Aug 15, 2018 at 2:40 PM Friedman, Eli <efriedma at
codeaurora.org
> <mailto:efriedma at codeaurora.org>> wrote:
>
>     On 8/15/2018 2:27 PM, Alexandre Isoard wrote:
>>     I'm not sure I understand the poison/undef/UB distinctions.
>>
>>     But on this example:
>>
>>         define i32 @func(i1 zeroext %b, i32 %x, i32 %y) {
>>         entry:
>>           %adds = add nsw i32 %x, %y
>>           %addu = add nuw i32 %x, %y
>>           %cond = select i1 %b, i32 %adds, i32 %addu
>>           ret i32 %cond
>>         }
>>
>>
>>     It is important to not propagate the nsw/nuw between the two SCEV
>>     expressions (which unification would do today, can I consider
>>     that a bug or is it a feature?).
>
>     It's an intentional design choice.
>
>>     So we work-around it by not informing SCEV of the flags:
>>
>>         Printing analysis 'Scalar Evolution Analysis' for
function
>>         'func':
>>         Classifying expressions for: @func
>>           %adds = add nsw i32 %x, %y
>>           -->  (%x + %y) U: full-set S: full-set
>>           %addu = add nuw i32 %x, %y
>>           -->  (%x + %y) U: full-set S: full-set
>>           %cond = select i1 %b, i32 %adds, i32 %addu
>>           -->  %cond U: full-set S: full-set
>>         Determining loop execution counts for: @func
>>
>>
>>     Would there be problems if we properly considered nuw/nsw flags
>>     when unifying SCEVs?
>
>     There would be other consequences.  For example, `(%x + %y)<nsw>`
>     and `(%x + %y)<nuw>` wouldn't compare equal for other
>     simplifications, and all the places that call setNoWrapFlags would
>     have to be rewritten.  It's probably possible to come up with some
>     workable design, but nobody has actually tried it, so it's not
>     clear how much work it would be to implement, or whether it would
>     improve the generated code overall.
>
>     -Eli
>
>>
>>     On Wed, Aug 15, 2018 at 1:59 PM Friedman, Eli
>>     <efriedma at codeaurora.org <mailto:efriedma at
codeaurora.org>> wrote:
>>
>>         On 8/15/2018 1:31 PM, Alexandre Isoard wrote:
>>>         Is that why we do not deduce +<nsw> from "add
nsw" either?
>>
>>         Essentially, yes.
>>
>>>         Is that an intrinsic limitation of creating a
>>>         context-invariant expressions from a Value* or is that a
>>>         limitation of our implementation (our unification not
>>>         considering the nsw flags)?
>>
>>         It's a consequence of unification not considering nsw. 
(nsw
>>         on an instruction is naturally invariant because violating
>>         nsw produces poison, not UB.)
>>
>>         -Eli
>>
>>>
>>>         On Wed, Aug 15, 2018 at 12:39 PM Friedman, Eli
>>>         <efriedma at codeaurora.org <mailto:efriedma at
codeaurora.org>>
>>>         wrote:
>>>
>>>             On 8/15/2018 12:21 PM, Alexandre Isoard via llvm-dev
wrote:
>>>>             Hello,
>>>>
>>>>             If I run clang on the following code:
>>>>
>>>>                 void func(unsigned n) {
>>>>                   for (unsigned long x = 1; x < n; ++x)
>>>>                     dummy(x);
>>>>                 }
>>>>
>>>>
>>>>             I get the following llvm ir:
>>>>
>>>>                 define void @func(i32 %n) {
>>>>                 entry:
>>>>                   %conv = zext i32 %n to i64
>>>>                   %cmp5 = icmp ugt i32 %n, 1
>>>>                   br i1 %cmp5, label %for.body, label
%for.cond.cleanup
>>>>                 for.cond.cleanup:                ; preds
>>>>                 %for.body, %entry
>>>>                   ret void
>>>>                 for.body:                ; preds = %entry,
%for.body
>>>>                   %x.06 = phi i64 [ %inc, %for.body ], [ 1,
%entry ]
>>>>                   tail call void @dummy(i64 %x.06) #2
>>>>                   %inc = add nuw nsw i64 %x.06, 1
>>>>                   %exitcond = icmp eq i64 %inc, %conv
>>>>                   br i1 %exitcond, label %for.cond.cleanup,
label
>>>>                 %for.body
>>>>                 }
>>>>
>>>>
>>>>             Over which, SCEV will provide the following
analysis:
>>>>
>>>>                 Printing analysis 'Scalar Evolution
Analysis' for
>>>>                 function 'func':
>>>>                 Classifying expressions for: @func
>>>>                   %conv = zext i32 %n to i64
>>>>                   -->  (zext i32 %n to i64) U:
[0,4294967296) S:
>>>>                 [0,4294967296)
>>>>                   %x.06 = phi i64 [ %inc, %for.body ], [ 1,
%entry ]
>>>>                   -->
{1,+,1}<nuw><nsw><%for.body> U:
>>>>                 [1,-9223372036854775808) S:
>>>>                 [1,-9223372036854775808)Exits: (-1 + (zext i32
%n
>>>>                 to i64))LoopDispositions: { %for.body:
Computable }
>>>>                   %inc = add nuw nsw i64 %x.06, 1
>>>>                   --> {2,+,1}<nuw><%for.body> U:
[2,0) S:
>>>>                 [2,0)Exits: (zext i32 %n to
i64)LoopDispositions: {
>>>>                 %for.body: Computable }
>>>>                 Determining loop execution counts for: @func
>>>>                 Loop %for.body: backedge-taken count is (-2 +
(zext
>>>>                 i32 %n to i64))<nsw>
>>>>                 Loop %for.body: max backedge-taken count is -2
>>>>                 Loop %for.body: Predicated backedge-taken count
is
>>>>                 (-2 + (zext i32 %n to i64))<nsw>
>>>>                  Predicates:
>>>>                 Loop %for.body: Trip multiple is 1
>>>>
>>>>
>>>>             Now, I was surprised by the max backedge-taken
count
>>>>             being -2, and I suspect it is due to the
backedge-taken
>>>>             count being marked as <nsw> instead of
<nuw>.
>>>>
>>>>             Is that on purpose, is that a bug, or is my
analysis
>>>>             incorrect? I am not sure where to fix that issue.
>>>
>>>             The backedge-taken count isn't nuw because nsw/nuw
>>>             markings aren't flow-sensitive: there isn't any
way to
>>>             mark the trip count as nuw without marking every
>>>             computation of `(long)n-2` as nuw.
>>>
>>>             There's some code in ScalarEvolution::howFarToZero
to
>>>             try to refine the max backedge-taken count in some
>>>             cases, but it isn't very general.  See
>>>             https://reviews.llvm.org/D28536 .
>>>
>>>             -Eli
>>>
>>>             -- 
>>>             Employee of Qualcomm Innovation Center, Inc.
>>>             Qualcomm Innovation Center, Inc. is a member of Code
Aurora Forum, a Linux Foundation Collaborative Project
>>>
>>>
>>>
>>>         -- 
>>>         *Alexandre Isoard*
>>
>>
>>         -- 
>>         Employee of Qualcomm Innovation Center, Inc.
>>         Qualcomm Innovation Center, Inc. is a member of Code Aurora
Forum, a Linux Foundation Collaborative Project
>>
>>
>>
>>     -- 
>>     *Alexandre Isoard*
>
>
>     -- 
>     Employee of Qualcomm Innovation Center, Inc.
>     Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a
Linux Foundation Collaborative Project
>
>
>
> -- 
> *Alexandre Isoard*

-- 
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux
Foundation Collaborative Project

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20180816/bd87ff06/attachment.html>

Alexandre Isoard via llvm-dev

2018-Aug-16 23:54 UTC

head link

[llvm-dev] [SCEV] Why is backedge-taken count <nsw> instead of <nuw>?

The loop exits iff  {2,+,1}<nuw><%for.body> ==  (zext i32 %n to i64)

The nuw marking on the "induction variable" should be sufficient to
deduce
a max loop trip count of 2^32.
But I do not know how we compute it (we build a database and it is
contrived to follow, at least to me).

I saw that we annotate it with <nsw> (which is correct and can be (and
probably has been) deduced from the ranges) and following our discussion,
we can't annotate it with <nuw> as per limitation of our unification.

So, I am kind of in a bind here...

On Thu, Aug 16, 2018 at 4:34 PM Friedman, Eli <efriedma at codeaurora.org>
wrote:
> In general, the backedge-taken count is an unsigned value; it's
possible
> to write a loop with a trip count of 2^64 using a 64-bit induction
> variable.  To prove your loop has a "small" trip count, you have
to use
> either the guard or the nsw/nuw markings on the induction variable.
>
> -Eli
>
> On 8/16/2018 4:09 PM, Alexandre Isoard wrote:
>
> Ok.
>
> To go back to the original issue, would it be meaningful to add a
> SCEVUMax(0, BTC) on the final BTC computed by SCEV?
>
> So that it does not use "negative values"?
>
> On Wed, Aug 15, 2018 at 2:40 PM Friedman, Eli <efriedma at
codeaurora.org>
> wrote:
>
>> On 8/15/2018 2:27 PM, Alexandre Isoard wrote:
>>
>> I'm not sure I understand the poison/undef/UB distinctions.
>>
>> But on this example:
>>
>> define i32 @func(i1 zeroext %b, i32 %x, i32 %y) {
>>> entry:
>>>   %adds = add nsw i32 %x, %y
>>>   %addu = add nuw i32 %x, %y
>>>   %cond = select i1 %b, i32 %adds, i32 %addu
>>>   ret i32 %cond
>>> }
>>
>>
>> It is important to not propagate the nsw/nuw between the two SCEV
>> expressions (which unification would do today, can I consider that a
bug or
>> is it a feature?).
>>
>>
>> It's an intentional design choice.
>>
>> So we work-around it by not informing SCEV of the flags:
>>
>> Printing analysis 'Scalar Evolution Analysis' for function
'func':
>>> Classifying expressions for: @func
>>>   %adds = add nsw i32 %x, %y
>>>   -->  (%x + %y) U: full-set S: full-set
>>>   %addu = add nuw i32 %x, %y
>>>   -->  (%x + %y) U: full-set S: full-set
>>>   %cond = select i1 %b, i32 %adds, i32 %addu
>>>   -->  %cond U: full-set S: full-set
>>> Determining loop execution counts for: @func
>>
>>
>> Would there be problems if we properly considered nuw/nsw flags when
>> unifying SCEVs?
>>
>>
>> There would be other consequences.  For example, `(%x + %y)<nsw>`
and
>> `(%x + %y)<nuw>` wouldn't compare equal for other
simplifications, and all
>> the places that call setNoWrapFlags would have to be rewritten. 
It's
>> probably possible to come up with some workable design, but nobody has
>> actually tried it, so it's not clear how much work it would be to
>> implement, or whether it would improve the generated code overall.
>>
>> -Eli
>>
>>
>> On Wed, Aug 15, 2018 at 1:59 PM Friedman, Eli <efriedma at
codeaurora.org>
>> wrote:
>>
>>> On 8/15/2018 1:31 PM, Alexandre Isoard wrote:
>>>
>>> Is that why we do not deduce +<nsw> from "add nsw"
either?
>>>
>>>
>>> Essentially, yes.
>>>
>>> Is that an intrinsic limitation of creating a context-invariant
>>> expressions from a Value* or is that a limitation of our
implementation
>>> (our unification not considering the nsw flags)?
>>>
>>>
>>> It's a consequence of unification not considering nsw.  (nsw on
an
>>> instruction is naturally invariant because violating nsw produces
poison,
>>> not UB.)
>>>
>>> -Eli
>>>
>>>
>>> On Wed, Aug 15, 2018 at 12:39 PM Friedman, Eli <efriedma at
codeaurora.org>
>>> wrote:
>>>
>>>> On 8/15/2018 12:21 PM, Alexandre Isoard via llvm-dev wrote:
>>>>
>>>> Hello,
>>>>
>>>> If I run clang on the following code:
>>>>
>>>> void func(unsigned n) {
>>>>>   for (unsigned long x = 1; x < n; ++x)
>>>>>     dummy(x);
>>>>> }
>>>>
>>>>
>>>> I get the following llvm ir:
>>>>
>>>> define void @func(i32 %n) {
>>>>> entry:
>>>>>   %conv = zext i32 %n to i64
>>>>>   %cmp5 = icmp ugt i32 %n, 1
>>>>>   br i1 %cmp5, label %for.body, label %for.cond.cleanup
>>>>> for.cond.cleanup:                                 ; preds =
%for.body,
>>>>> %entry
>>>>>   ret void
>>>>> for.body:                                         ; preds =
%entry,
>>>>> %for.body
>>>>>   %x.06 = phi i64 [ %inc, %for.body ], [ 1, %entry ]
>>>>>   tail call void @dummy(i64 %x.06) #2
>>>>>   %inc = add nuw nsw i64 %x.06, 1
>>>>>   %exitcond = icmp eq i64 %inc, %conv
>>>>>   br i1 %exitcond, label %for.cond.cleanup, label %for.body
>>>>> }
>>>>
>>>>
>>>> Over which, SCEV will provide the following analysis:
>>>>
>>>> Printing analysis 'Scalar Evolution Analysis' for
function 'func':
>>>>> Classifying expressions for: @func
>>>>>   %conv = zext i32 %n to i64
>>>>>   -->  (zext i32 %n to i64) U: [0,4294967296) S:
[0,4294967296)
>>>>>   %x.06 = phi i64 [ %inc, %for.body ], [ 1, %entry ]
>>>>>   -->  {1,+,1}<nuw><nsw><%for.body> U:
[1,-9223372036854775808) S:
>>>>> [1,-9223372036854775808) Exits: (-1 + (zext i32 %n to i64))
LoopDispositions:
>>>>> { %for.body: Computable }
>>>>>   %inc = add nuw nsw i64 %x.06, 1
>>>>>   -->  {2,+,1}<nuw><%for.body> U: [2,0) S:
[2,0) Exits: (zext i32 %n
>>>>> to i64) LoopDispositions: { %for.body: Computable }
>>>>> Determining loop execution counts for: @func
>>>>> Loop %for.body: backedge-taken count is (-2 + (zext i32 %n
to
>>>>> i64))<nsw>
>>>>> Loop %for.body: max backedge-taken count is -2
>>>>> Loop %for.body: Predicated backedge-taken count is (-2 +
(zext i32 %n
>>>>> to i64))<nsw>
>>>>>  Predicates:
>>>>> Loop %for.body: Trip multiple is 1
>>>>
>>>>
>>>> Now, I was surprised by the max backedge-taken count being -2,
and I
>>>> suspect it is due to the backedge-taken count being marked as
<nsw> instead
>>>> of <nuw>.
>>>>
>>>> Is that on purpose, is that a bug, or is my analysis incorrect?
I am
>>>> not sure where to fix that issue.
>>>>
>>>>
>>>> The backedge-taken count isn't nuw because nsw/nuw markings
aren't
>>>> flow-sensitive: there isn't any way to mark the trip count
as nuw without
>>>> marking every computation of `(long)n-2` as nuw.
>>>>
>>>> There's some code in ScalarEvolution::howFarToZero to try
to refine the
>>>> max backedge-taken count in some cases, but it isn't very
general.  See
>>>> https://reviews.llvm.org/D28536 .
>>>>
>>>> -Eli
>>>>
>>>> --
>>>> Employee of Qualcomm Innovation Center, Inc.
>>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora
Forum, a Linux Foundation Collaborative Project
>>>>
>>>>
>>>
>>> --
>>> *Alexandre Isoard*
>>>
>>>
>>> --
>>> Employee of Qualcomm Innovation Center, Inc.
>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
>>>
>>>
>>
>> --
>> *Alexandre Isoard*
>>
>>
>> --
>> Employee of Qualcomm Innovation Center, Inc.
>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a
Linux Foundation Collaborative Project
>>
>>
>
> --
> *Alexandre Isoard*
>
>
> --
> Employee of Qualcomm Innovation Center, Inc.
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux
Foundation Collaborative Project
>
>
-- 
*Alexandre Isoard*
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20180816/1d32c954/attachment.html>

llvm dev - Aug 2018 - [SCEV] Why is backedge-taken count <nsw> instead of <nuw>?

[llvm-dev] [SCEV] Why is backedge-taken count <nsw> instead of <nuw>?

[llvm-dev] [SCEV] Why is backedge-taken count <nsw> instead of <nuw>?

[llvm-dev] [SCEV] Why is backedge-taken count <nsw> instead of <nuw>?

[llvm-dev] [SCEV] Why is backedge-taken count <nsw> instead of <nuw>?