Daniel Neilson via llvm-dev
2017-Sep-14  15:43 UTC
[llvm-dev] IVUsers pass is fragile. Is this okay? How can it be resolved?
Thank you for your thoughts, Hal. More information below...
On Sep 13, 2017, at 5:43 PM, Hal Finkel <hfinkel at anl.gov<mailto:hfinkel
at anl.gov>> wrote:
On 09/13/2017 01:01 PM, Daniel Neilson via llvm-dev wrote:
… snip
 For example, the following IR will produce different sets of IV users if
either:
i) The order of the PHI nodes in the %loop block are reordered; or
ii) The uselistorder directive is uncommented
---
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128-ni:1"
target triple = "x86_64-unknown-linux-gnu"
define void @test(i64 %v1, i32 %v2, i64* %addr) {
entry:
 br label %loop
loop:
 %iv = phi i64 [%v1, %entry], [%iv.inc, %loop]
 %iv2 = phi i32 [%v2, %entry], [%5, %loop]
 %0 = trunc i64 %iv to i32
 %1 = sub i32 %iv2, %0
 %2 = sitofp i32 %1 to double
 %3 = sub i64 0, %iv
 %4 = trunc i64 %3 to i32
 %5 = sub i32 %1, %4
 %iv.inc = add i64 %iv, 1
 store i64 %iv.inc, i64* %addr, align 8
 br i1 undef, label %loop, label %exit
exit:
 ret void
; uselistorder directives ----
; uselistorder i64 %iv, {2, 1, 0}
}
—
… snip
 So, to get back to the original questions:
1) What exactly is IVUsers supposed to be finding? For instance, in the example
above, what would be the ideal/correct set of IVUsers that the analysis should
be finding?
I don't have a good answer to this question, other than the obvious one
(that it's supposed to find all values that have interesting SCEV
expressions making use of the induction variable). Interesting here means that
it's an affine addrec or has a starting value that is one (recursively). It
also helps support post-inc transformations.
The problem here is that we're trying to avoid calling getSCEV on all
instructions just to see if they end up being an addrec of the given loop. Maybe
we should do this in two steps? First, walk the users to find the instruction on
which to call getSCEV. Then, go through the instructions in BB order, calling
getSCEV on those identified instructions.
I’m not sure, either. What it looks like, just from reading and interpreting the
implementation, is that it’s looking for the SCEVs of instructions that
terminate a def-use chain that starts at a loop-header phi. There are are a few
criteria for what constitutes “terminating” a def-use chain, but the most
fundamental one (I think) appears to be that a user of the SCEV isn’t itself an
“interesting” SCEV.
2) Is it acceptable that there is this sort of difference in the IVUsers
analysis results based on nothing more than instruction or use list ordering? I
personally hope not; it has been mildly infuriating having to narrow down on a
bug with this difference in place.
Technically speaking, the dependency on the order of the phis is okay (i.e.,
it's possible they'll be no good way to avoid that). Not having it is
clearly better. The dependency on the use-list ordering is highly discouraged.
As you've noticed, this makes problems very hard to track down.
Part of the problem here may be that, because what SCEV proves, and thus
returns, is dependent on what SCEV's have been previously constructed
(unfortunately), it's not hard to develop these kinds of processing-order
dependencies with analyses that use SCEV.
 -Hal
 Thankfully, it looks like the SCEVs that are produced for each instruction in
the DFT that I’m looking at are consistent; there doesn’t seem to be any affect
on the SCEVs themselves as a result of the traversal orders.
 It looks to me that there are two pieces of the implementation of IVUsers that
are leading to the fragility (i.e. dependence on input ordering) that I am
seeing.
A) The inclusion of Processed.count(User) as a condition of the two if
statements at approx lines 235 & 241 in IVUsers::AddUsersImpl() of
lib/Analysis/IVUsers.cpp.
B) It doesn’t terminate a DFT if it encounters a phi node in the loop-header
that it hasn’t seen before.
 To help illustrate these, here are the relevant DFT traces from the sample IR
that I provided in the original post. (1) is the IR as-is, and (2) is with the
uselistorder instruction active.
DFT (1)
%iv = phi [%v1], [%iv.inc]  (ret true)
SCEV: {%v1,+,1}<%loop>
   * User: %iv.inc = add %iv, 1 (ret true)
           SCEV: {(1 + %v1),+,1}<%loop>
      * User: store %iv.inc, %addr  (ret false) **** ADD as user of %iv.inc def
****
      * User: %iv = phi [%v1], [%iv.inc] (PHI already processed)
   *  User: %3 = sub 0, %iv (ret true)
            SCEV: {(-1 * %v1),+,-1}<%loop>
      * User: %4 = trunc %3 (ret true)
              SCEV: {(trunc i64 (-1 * %v1) to i32),+,-1}<%loop>
         * User: %5 = sub %1, %4 (ret true)
                 SCEV: {((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64
%v1 to i32)) + %v2),+,((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64
%v1 to i32)))}<%loop>
            * User: %iv2 = phi [%v2], [%5] (ret true)
                    SCEV: {%v2,+,((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 *
(trunc i64 %v1 to i32)))}<%loop>
               * User: %1 = sub %iv2, %0 (ret true)
                       SCEV: {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1
* (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop>
                  * User: %5 = sub %1, %4 (already processed) **** ADD as user
of %1 def ****
                  * User: %2 = sitofp %1 (ret false) **** ADD as user of %1 def
****
   * User: %0 = trunc %iv (ret true)
           SCEV: {(trunc i64 %v1 to i32),+,1}<%loop>
      * User: %1 = sub %iv2, %0 (already processed) **** ADD as user of %0 def
****
IV Users for loop %loop:
  %iv.inc = {(1 + %v1),+,1}<%loop> in    store i64 %iv.inc, i64* %addr,
align 8
  %1 = {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 * (trunc i64 (-1 *
%v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop> in    %5 = sub i32
%1, %4
  %1 = {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 * (trunc i64 (-1 *
%v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop> in    %2 = sitofp
i32 %1 to double
  %0 = {(trunc i64 %v1 to i32),+,1}<%loop> in    %1 = sub i32 %iv2, %0
DFT (2)
%iv = phi [%v1], [%iv.inc] (ret true)
SCEV: {%v1,+,1}<%loop>
   * User: %0 = trunc %iv (ret true)
           SCEV: {(trunc i64 %v1 to i32),+,1}<%loop>
      * User: %1 = sub %iv2, %0 (ret true)
              SCEV: {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 * (trunc
i64 (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop>
         * User: %5 = sub %1, %4 (ret true)
                 SCEV: {((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64
%v1 to i32)) + %v2),+,((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64
%v1 to i32)))}<%loop>
            * User: %iv2 = phi [%v2], [%5] (ret true)
                    SCEV: {%v2,+,((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 *
(trunc i64 %v1 to i32)))}<%loop>
               * User: %1 = sub %iv2, %0 (already processed) **** ADD as user of
%iv2 def ****
         * User: %2 = sitofp %1 (ret false) **** ADD as user of %1 def ****
   * User: %3 = sub 0, %iv (ret true)
           SCEV: {(-1 * %v1),+,-1}<%loop>
      * User: %4 = trunc %3 (ret true)
              SCEV: {(trunc i64 (-1 * %v1) to i32),+,-1}<%loop>
         * User: %5 = sub %1, %4 (already processed) **** ADD as user of %4 def
****
   * User: %iv.inc = add %iv, 1 (ret true)
           SCEV: {(1 + %v1),+,1}<%loop>
      * User: store %iv.inc, %addr (ret false) **** ADD as user of %iv.inc ****
      * User: %iv = phi [%v1], [%iv.inc] (PHI already processed)
IV Users for loop %loop:
  %iv2 = {%v2,+,((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to
i32)))}<%loop> in    %1 = sub i32 %iv2, %0
  %1 = {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 * (trunc i64 (-1 *
%v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop> in    %2 = sitofp
i32 %1 to double
  %4 = {(trunc i64 (-1 * %v1) to i32),+,-1}<%loop> in    %5 = sub i32 %1,
%4
  %iv.inc = {(1 + %v1),+,1}<%loop> in    store i64 %iv.inc, i64* %addr,
align 8
 For point (A), first take a close look at what the DFT does when it encounters
the def of %1 as a User in DFT (1). The first time that %1’s def is encountered
is as a user of %iv2. We continue traversing through the def of %1 at this point
to process the users of it because we haven’t encountered the def of %1 yet
(i.e. Processed.count(%1) is 0). However, we encounter the def of %1 again in
this same DFT later as a user of the def of %0. This second time that we
encounter %1 we don’t traverse its users because we’ve seen %1 before (i.e.
Processed.count(%1) is 1), and so we add the SCEV of %0 to the IVUsers set. In
DFT (2) we encounter the def of %1 first as a user of %0 — which allows the DFT
to process the users of %1 instead of adding the SCEV of %0 to the IVUsers set.
 For point (B), take a look at the traversal order in DFT (1). You’ll see the
chain %iv -> %3 -> %4 -> %5 -> %iv2 -> %1 -> %5. The def of %5
appears twice in the same def-use chain, and the second time we see it will be
the same situation as in point (A). The reason that we revisit %5 in the same
def-use chain is that we allow the DFT to continue through %iv2 (%iv2 is a phi
node in the loop-header) that will, in effect, allow the DFT to loop-around and
revisit. The result is adding the def of %5 as a user of %1’s SCEV in DFT (1).
We don’t do the same in DFT (2) because we don’t visit the def of %5 in the same
way. Instead, in DFT (2) the def of %1 ends up being the value that appears
twice in a def-use chain.
 What I *think* would be proper for IVUsers is to have IVUsers::AddUserImpl() :
A) Memoize its result for each given instruction instead of having the
Processed.count(User) condition at lines 235 & 241. The presence of
Processed.count(User) in the conditions at lines 235 & 241 basically has the
effect of changing the return value of AddUsersImpl() for “interesting”
instructions from true to false, which results in different behaviour depending
on the order in which instructions are visited. Interestingly, if we don’t
memoize the result of AddUsersImpl() at all, but instead just remove the
Processed.count(User) parts of these two if statements, then the return value of
AddUsersImpl() will *always* be true and we would have consistent results for
recursive calls on “interesting” SCEVS, but inconsistent results for recursive
calls on instructions that aren’t interesting.
B) Similar to the condition at line 212, don’t let the DFT continue into Users
that are phi nodes in the loop-header. We’re going to visit every phi node in
the loop-header as the root of a DFT, in turn, anyways, so this just prevents
the possibility of revisiting the same instruction multiple times in the same
def-use chain.
 If I mock these two changes up, the IVUsers sets in all three of my original
situations (IR as-is, IR with header-phi’s rearranged, and IR with use-list
ordering) all produce the exact same set of IVUsers. However, the set of IVUsers
ends up being a subset of what we previously had. I don’t know if this is
correct because the IVUsers analysis doesn’t appear to be well-defined with
respect to what it should be generating. Specifically, with this test we end up
with slight variations of this DFT:
%iv = phi i64 [ %v1, %entry ], [ %iv.inc, %loop ]
SCEV: {%v1,+,1}<%loop>
   * User: %iv.inc = add i64 %iv, 1 (ret true)
           SCEV: {(1 + %v1),+,1}<%loop>
      * User: store i64 %iv.inc, i64* %addr, align 8. (ret false) **** ADD as
user of %iv.inc def ****
      * User: %iv = phi i64 [ %v1, %entry ], [ %iv.inc, %loop ] (already
processed PHI, skip)
   * User: %3 = sub i64 0, %iv (ret true)
           SCEV: {(-1 * %v1),+,-1}<%loop>
      * User: %4 = trunc i64 %3 to i32 (ret true)
              SCEV: {(trunc i64 (-1 * %v1) to i32),+,-1}<%loop>
         * User: %5 = sub i32 %1, %4 (ret true)
                 SCEV: {((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64
%v1 to i32)) + %v2),+,((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64
%v1 to i32)))}<%loop>
            * User: %iv2 = phi i32 [ %v2, %entry ], [ %5, %loop ] (PHI in loop
header, skip)
   * User: %0 = trunc i64 %iv to i32 (ret true)
           SCEV: {(trunc i64 %v1 to i32),+,1}<%loop>
      * User: %1 = sub i32 %iv2, %0
              SCEV: {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 * (trunc
i64 (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop>
         * User: %5 = sub i32 %1, %4 (memoized - ret true)
         * User: %2 = sitofp i32 %1 to double (ret false) **** ADD as user of %1
def ****
%iv2 = phi i32 [ %v2, %entry ], [ %5, %loop ]
SCEV: {%v2,+,((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to
i32)))}<%loop>
   * User: %1 = sub i32 %iv2, %0 (memoized - ret true)
IVUsers:
  %iv.inc = {(1 + %v1),+,1}<%loop> in    store i64 %iv.inc, i64* %addr,
align 8
  %1 = {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 * (trunc i64 (-1 *
%v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop> in    %2 = sitofp
i32 %1 to double
 This looks sensible to me, but, again, I don’t really have a good sense of what
the “proper” IVUsers results are expected to be.
 Thoughts anyone?
Thanks,
 Daniel
---
Daniel Neilson, Ph.D.
Azul Systems
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170914/538cd0f2/attachment-0001.html>
Hal Finkel via llvm-dev
2017-Sep-15  02:30 UTC
[llvm-dev] IVUsers pass is fragile. Is this okay? How can it be resolved?
On 09/14/2017 10:43 AM, Daniel Neilson wrote:> Thank you for your thoughts, Hal. More information below... > >> On Sep 13, 2017, at 5:43 PM, Hal Finkel <hfinkel at anl.gov >> <mailto:hfinkel at anl.gov>> wrote: >> On 09/13/2017 01:01 PM, Daniel Neilson via llvm-dev wrote: > > … snip > >>> For example, the following IR will produce different sets of IV >>> users if either: >>> i) The order of the PHI nodes in the %loop block are reordered; or >>> ii) The uselistorder directive is uncommented >>> >>> --- >>> target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128-ni:1" >>> target triple = "x86_64-unknown-linux-gnu" >>> >>> define void @test(i64 %v1, i32 %v2, i64* %addr) { >>> entry: >>> br label %loop >>> >>> loop: >>> %iv = phi i64 [%v1, %entry], [%iv.inc, %loop] >>> %iv2 = phi i32 [%v2, %entry], [%5, %loop] >>> %0 = trunc i64 %iv to i32 >>> %1 = sub i32 %iv2, %0 >>> %2 = sitofp i32 %1 to double >>> %3 = sub i64 0, %iv >>> %4 = trunc i64 %3 to i32 >>> %5 = sub i32 %1, %4 >>> %iv.inc = add i64 %iv, 1 >>> store i64 %iv.inc, i64* %addr, align 8 >>> br i1 undef, label %loop, label %exit >>> >>> exit: >>> ret void >>> >>> ; uselistorder directives ---- >>> ; uselistorder i64 %iv, {2, 1, 0} >>> } >>> — > > … snip > >>> So, to get back to the original questions: >>> 1) What exactly is IVUsers supposed to be finding? For instance, in >>> the example above, what would be the ideal/correct set of IVUsers >>> that the analysis should be finding? >> >> I don't have a good answer to this question, other than the obvious >> one (that it's supposed to find all values that have interesting SCEV >> expressions making use of the induction variable). Interesting here >> means that it's an affine addrec or has a starting value that is one >> (recursively). It also helps support post-inc transformations. >> >> The problem here is that we're trying to avoid calling getSCEV on all >> instructions just to see if they end up being an addrec of the given >> loop. Maybe we should do this in two steps? First, walk the users to >> find the instruction on which to call getSCEV. Then, go through the >> instructions in BB order, calling getSCEV on those identified >> instructions. >> > > I’m not sure, either. What it looks like, just from reading and > interpreting the implementation, is that it’s looking for the SCEVs of > instructions that terminate a def-use chain that starts at a > loop-header phi. There are are a few criteria for what constitutes > “terminating” a def-use chain, but the most fundamental one (I think) > appears to be that a user of the SCEV isn’t itself an “interesting” SCEV. > >>> >>> 2) Is it acceptable that there is this sort of difference in the >>> IVUsers analysis results based on nothing more than instruction or >>> use list ordering? I personally hope not; it has been mildly >>> infuriating having to narrow down on a bug with this difference in >>> place. >> >> Technically speaking, the dependency on the order of the phis is okay >> (i.e., it's possible they'll be no good way to avoid that). Not >> having it is clearly better. The dependency on the use-list ordering >> is highly discouraged. As you've noticed, this makes problems very >> hard to track down. >> >> Part of the problem here may be that, because what SCEV proves, and >> thus returns, is dependent on what SCEV's have been previously >> constructed (unfortunately), it's not hard to develop these kinds of >> processing-order dependencies with analyses that use SCEV. >> >> -Hal >> > > Thankfully, it looks like the SCEVs that are produced for each > instruction in the DFT that I’m looking at are consistent; there > doesn’t seem to be any affect on the SCEVs themselves as a result of > the traversal orders. > > It looks to me that there are two pieces of the implementation of > IVUsers that are leading to the fragility (i.e. dependence on input > ordering) that I am seeing. > A) The inclusion of Processed.count(User) as a condition of the two if > statements at approx lines 235 & 241 in IVUsers::AddUsersImpl() of > lib/Analysis/IVUsers.cpp. > B) It doesn’t terminate a DFT if it encounters a phi node in the > loop-header that it hasn’t seen before. > > To help illustrate these, here are the relevant DFT traces from the > sample IR that I provided in the original post. (1) is the IR as-is, > and (2) is with the uselistorder instruction active. > > DFT (1) > %iv = phi [%v1], [%iv.inc] (ret true) > SCEV: {%v1,+,1}<%loop> > * User: %iv.inc = add %iv, 1 (ret true) > SCEV: {(1 + %v1),+,1}<%loop> > * User: store %iv.inc, %addr (ret false) **** ADD as user of > %iv.inc def **** > * User: %iv = phi [%v1], [%iv.inc] (PHI already processed) > * User: %3 = sub 0, %iv (ret true) > SCEV: {(-1 * %v1),+,-1}<%loop> > * User: %4 = trunc %3 (ret true) > SCEV: {(trunc i64 (-1 * %v1) to i32),+,-1}<%loop> > * User: %5 = sub %1, %4 (ret true) > SCEV: {((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * > (trunc i64 %v1 to i32)) + %v2),+,((-1 * (trunc i64 (-1 * %v1) to i32)) > + (-1 * (trunc i64 %v1 to i32)))}<%loop> > * User: %iv2 = phi [%v2], [%5] (ret true) > SCEV: {%v2,+,((-1 * (trunc i64 (-1 * %v1) to i32)) > + (-1 * (trunc i64 %v1 to i32)))}<%loop> > * User: %1 = sub %iv2, %0 (ret true) > SCEV: {((-1 * (trunc i64 %v1 to i32)) + > %v2),+,(-1 + (-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64 > %v1 to i32)))}<%loop> > * User: %5 = sub %1, %4 (already processed) **** ADD > as user of %1 def **** > * User: %2 = sitofp %1 (ret false) **** ADD as user > of %1 def **** > * User: %0 = trunc %iv (ret true) > SCEV: {(trunc i64 %v1 to i32),+,1}<%loop> > * User: %1 = sub %iv2, %0 (already processed) **** ADD as user > of %0 def **** > > IV Users for loop %loop: > %iv.inc = {(1 + %v1),+,1}<%loop> in store i64 %iv.inc, i64* > %addr, align 8 > %1 = {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 * (trunc i64 > (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop> in %5 > = sub i32 %1, %4 > %1 = {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 * (trunc i64 > (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop> in %2 > = sitofp i32 %1 to double > %0 = {(trunc i64 %v1 to i32),+,1}<%loop> in %1 = sub i32 %iv2, %0 > > DFT (2) > %iv = phi [%v1], [%iv.inc] (ret true) > SCEV: {%v1,+,1}<%loop> > * User: %0 = trunc %iv (ret true) > SCEV: {(trunc i64 %v1 to i32),+,1}<%loop> > * User: %1 = sub %iv2, %0 (ret true) > SCEV: {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 > * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop> > * User: %5 = sub %1, %4 (ret true) > SCEV: {((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * > (trunc i64 %v1 to i32)) + %v2),+,((-1 * (trunc i64 (-1 * %v1) to i32)) > + (-1 * (trunc i64 %v1 to i32)))}<%loop> > * User: %iv2 = phi [%v2], [%5] (ret true) > SCEV: {%v2,+,((-1 * (trunc i64 (-1 * %v1) to i32)) > + (-1 * (trunc i64 %v1 to i32)))}<%loop> > * User: %1 = sub %iv2, %0 (already processed) **** ADD > as user of %iv2 def **** > * User: %2 = sitofp %1 (ret false) **** ADD as user of %1 def > **** > * User: %3 = sub 0, %iv (ret true) > SCEV: {(-1 * %v1),+,-1}<%loop> > * User: %4 = trunc %3 (ret true) > SCEV: {(trunc i64 (-1 * %v1) to i32),+,-1}<%loop> > * User: %5 = sub %1, %4 (already processed) **** ADD as user > of %4 def **** > * User: %iv.inc = add %iv, 1 (ret true) > SCEV: {(1 + %v1),+,1}<%loop> > * User: store %iv.inc, %addr (ret false) **** ADD as user of > %iv.inc **** > * User: %iv = phi [%v1], [%iv.inc] (PHI already processed) > > IV Users for loop %loop: > %iv2 = {%v2,+,((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc > i64 %v1 to i32)))}<%loop> in %1 = sub i32 %iv2, %0 > %1 = {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 * (trunc i64 > (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop> in %2 > = sitofp i32 %1 to double > %4 = {(trunc i64 (-1 * %v1) to i32),+,-1}<%loop> in %5 = sub i32 %1, %4 > %iv.inc = {(1 + %v1),+,1}<%loop> in store i64 %iv.inc, i64* > %addr, align 8 > > > For point (A), first take a close look at what the DFT does when it > encounters the def of %1 as a User in DFT (1). The first time that > %1’s def is encountered is as a user of %iv2. We continue traversing > through the def of %1 at this point to process the users of it because > we haven’t encountered the def of %1 yet (i.e. Processed.count(%1) is > 0). However, we encounter the def of %1 again in this same DFT later > as a user of the def of %0. This second time that we encounter %1 we > don’t traverse its users because we’ve seen %1 before (i.e. > Processed.count(%1) is 1), and so we add the SCEV of %0 to the IVUsers > set. In DFT (2) we encounter the def of %1 first as a user of %0 — > which allows the DFT to process the users of %1 instead of adding the > SCEV of %0 to the IVUsers set. > > For point (B), take a look at the traversal order in DFT (1). You’ll > see the chain %iv -> %3 -> %4 -> %5 -> %iv2 -> %1 -> %5. The def of %5 > appears twice in the same def-use chain, and the second time we see it > will be the same situation as in point (A). The reason that we revisit > %5 in the same def-use chain is that we allow the DFT to continue > through %iv2 (%iv2 is a phi node in the loop-header) that will, in > effect, allow the DFT to loop-around and revisit. The result is adding > the def of %5 as a user of %1’s SCEV in DFT (1). We don’t do the same > in DFT (2) because we don’t visit the def of %5 in the same way. > Instead, in DFT (2) the def of %1 ends up being the value that appears > twice in a def-use chain. > > What I *think* would be proper for IVUsers is to have > IVUsers::AddUserImpl() : > > A) Memoize its result for each given instruction instead of having the > Processed.count(User) condition at lines 235 & 241. The presence of > Processed.count(User) in the conditions at lines 235 & 241 basically > has the effect of changing the return value of AddUsersImpl() for > “interesting” instructions from true to false, which results in > different behaviour depending on the order in which instructions are > visited. Interestingly, if we don’t memoize the result of > AddUsersImpl() at all, but instead just remove the > Processed.count(User) parts of these two if statements, then the > return value of AddUsersImpl() will *always* be true and we would have > consistent results for recursive calls on “interesting” SCEVS, but > inconsistent results for recursive calls on instructions that aren’t > interesting. > > B) Similar to the condition at line 212, don’t let the DFT continue > into Users that are phi nodes in the loop-header. We’re going to visit > every phi node in the loop-header as the root of a DFT, in turn, > anyways, so this just prevents the possibility of revisiting the same > instruction multiple times in the same def-use chain.These seems sensible.> > If I mock these two changes up, the IVUsers sets in all three of my > original situations (IR as-is, IR with header-phi’s rearranged, and IR > with use-list ordering) all produce the exact same set of IVUsers. > However, the set of IVUsers ends up being a subset of what we > previously had. I don’t know if this is correct because the IVUsers > analysis doesn’t appear to be well-defined with respect to what it > should be generating. Specifically, with this test we end up with > slight variations of this DFT: > > %iv = phi i64 [ %v1, %entry ], [ %iv.inc, %loop ] > SCEV: {%v1,+,1}<%loop> > * User: %iv.inc = add i64 %iv, 1 (ret true) > SCEV: {(1 + %v1),+,1}<%loop> > * User: store i64 %iv.inc, i64* %addr, align 8. (ret false) **** > ADD as user of %iv.inc def **** > * User: %iv = phi i64 [ %v1, %entry ], [ %iv.inc, %loop ] > (already processed PHI, skip) > * User: %3 = sub i64 0, %iv (ret true) > SCEV: {(-1 * %v1),+,-1}<%loop> > * User: %4 = trunc i64 %3 to i32 (ret true) > SCEV: {(trunc i64 (-1 * %v1) to i32),+,-1}<%loop> > * User: %5 = sub i32 %1, %4 (ret true) > SCEV: {((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * > (trunc i64 %v1 to i32)) + %v2),+,((-1 * (trunc i64 (-1 * %v1) to i32)) > + (-1 * (trunc i64 %v1 to i32)))}<%loop> > * User: %iv2 = phi i32 [ %v2, %entry ], [ %5, %loop ] (PHI > in loop header, skip) > * User: %0 = trunc i64 %iv to i32 (ret true) > SCEV: {(trunc i64 %v1 to i32),+,1}<%loop> > * User: %1 = sub i32 %iv2, %0 > SCEV: {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 > * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop> > * User: %5 = sub i32 %1, %4 (memoized - ret true) > * User: %2 = sitofp i32 %1 to double (ret false) **** ADD as > user of %1 def **** > %iv2 = phi i32 [ %v2, %entry ], [ %5, %loop ] > SCEV: {%v2,+,((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64 > %v1 to i32)))}<%loop> > * User: %1 = sub i32 %iv2, %0 (memoized - ret true) > > IVUsers: > %iv.inc = {(1 + %v1),+,1}<%loop> in store i64 %iv.inc, i64* > %addr, align 8 > %1 = {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 * (trunc i64 > (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop> in %2 > = sitofp i32 %1 to double > > > This looks sensible to me, but, again, I don’t really have a good > sense of what the “proper” IVUsers results are expected to be.Compared to the previous results, this is missing: %0 = {(trunc i64 %v1 to i32),+,1}<%loop> in %1 = sub i32 %iv2, %0 %4 = {(trunc i64 (-1 * %v1) to i32),+,-1}<%loop> in %5 = sub i32 %1, %4 %iv2 = {%v2,+,((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop> in %1 = sub i32 %iv2, %0 Can you comment on why? Thanks again, Hal> > > Thoughts anyone? > > Thanks, > Daniel > > --- > Daniel Neilson, Ph.D. > Azul Systems >-- Hal Finkel Lead, Compiler Technology and Programming Languages Leadership Computing Facility Argonne National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170914/4eb60399/attachment.html>
Daniel Neilson via llvm-dev
2017-Sep-15  03:31 UTC
[llvm-dev] IVUsers pass is fragile. Is this okay? How can it be resolved?
On Sep 14, 2017, at 9:30 PM, Hal Finkel <hfinkel at anl.gov<mailto:hfinkel
at anl.gov>> wrote:
On 09/14/2017 10:43 AM, Daniel Neilson wrote:
Thank you for your thoughts, Hal. More information below...
On Sep 13, 2017, at 5:43 PM, Hal Finkel <hfinkel at anl.gov<mailto:hfinkel
at anl.gov>> wrote:
On 09/13/2017 01:01 PM, Daniel Neilson via llvm-dev wrote:
… snip
 For example, the following IR will produce different sets of IV users if
either:
i) The order of the PHI nodes in the %loop block are reordered; or
ii) The uselistorder directive is uncommented
---
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128-ni:1"
target triple = "x86_64-unknown-linux-gnu"
define void @test(i64 %v1, i32 %v2, i64* %addr) {
entry:
 br label %loop
loop:
 %iv = phi i64 [%v1, %entry], [%iv.inc, %loop]
 %iv2 = phi i32 [%v2, %entry], [%5, %loop]
 %0 = trunc i64 %iv to i32
 %1 = sub i32 %iv2, %0
 %2 = sitofp i32 %1 to double
 %3 = sub i64 0, %iv
 %4 = trunc i64 %3 to i32
 %5 = sub i32 %1, %4
 %iv.inc = add i64 %iv, 1
 store i64 %iv.inc, i64* %addr, align 8
 br i1 undef, label %loop, label %exit
exit:
 ret void
; uselistorder directives ----
; uselistorder i64 %iv, {2, 1, 0}
}
—
… snip
 So, to get back to the original questions:
1) What exactly is IVUsers supposed to be finding? For instance, in the example
above, what would be the ideal/correct set of IVUsers that the analysis should
be finding?
I don't have a good answer to this question, other than the obvious one
(that it's supposed to find all values that have interesting SCEV
expressions making use of the induction variable). Interesting here means that
it's an affine addrec or has a starting value that is one (recursively). It
also helps support post-inc transformations.
The problem here is that we're trying to avoid calling getSCEV on all
instructions just to see if they end up being an addrec of the given loop. Maybe
we should do this in two steps? First, walk the users to find the instruction on
which to call getSCEV. Then, go through the instructions in BB order, calling
getSCEV on those identified instructions.
I’m not sure, either. What it looks like, just from reading and interpreting the
implementation, is that it’s looking for the SCEVs of instructions that
terminate a def-use chain that starts at a loop-header phi. There are are a few
criteria for what constitutes “terminating” a def-use chain, but the most
fundamental one (I think) appears to be that a user of the SCEV isn’t itself an
“interesting” SCEV.
2) Is it acceptable that there is this sort of difference in the IVUsers
analysis results based on nothing more than instruction or use list ordering? I
personally hope not; it has been mildly infuriating having to narrow down on a
bug with this difference in place.
Technically speaking, the dependency on the order of the phis is okay (i.e.,
it's possible they'll be no good way to avoid that). Not having it is
clearly better. The dependency on the use-list ordering is highly discouraged.
As you've noticed, this makes problems very hard to track down.
Part of the problem here may be that, because what SCEV proves, and thus
returns, is dependent on what SCEV's have been previously constructed
(unfortunately), it's not hard to develop these kinds of processing-order
dependencies with analyses that use SCEV.
 -Hal
 Thankfully, it looks like the SCEVs that are produced for each instruction in
the DFT that I’m looking at are consistent; there doesn’t seem to be any affect
on the SCEVs themselves as a result of the traversal orders.
 It looks to me that there are two pieces of the implementation of IVUsers that
are leading to the fragility (i.e. dependence on input ordering) that I am
seeing.
A) The inclusion of Processed.count(User) as a condition of the two if
statements at approx lines 235 & 241 in IVUsers::AddUsersImpl() of
lib/Analysis/IVUsers.cpp.
B) It doesn’t terminate a DFT if it encounters a phi node in the loop-header
that it hasn’t seen before.
 To help illustrate these, here are the relevant DFT traces from the sample IR
that I provided in the original post. (1) is the IR as-is, and (2) is with the
uselistorder instruction active.
DFT (1)
%iv = phi [%v1], [%iv.inc]  (ret true)
SCEV: {%v1,+,1}<%loop>
   * User: %iv.inc = add %iv, 1 (ret true)
           SCEV: {(1 + %v1),+,1}<%loop>
      * User: store %iv.inc, %addr  (ret false) **** ADD as user of %iv.inc def
****
      * User: %iv = phi [%v1], [%iv.inc] (PHI already processed)
   *  User: %3 = sub 0, %iv (ret true)
            SCEV: {(-1 * %v1),+,-1}<%loop>
      * User: %4 = trunc %3 (ret true)
              SCEV: {(trunc i64 (-1 * %v1) to i32),+,-1}<%loop>
         * User: %5 = sub %1, %4 (ret true)
                 SCEV: {((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64
%v1 to i32)) + %v2),+,((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64
%v1 to i32)))}<%loop>
            * User: %iv2 = phi [%v2], [%5] (ret true)
                    SCEV: {%v2,+,((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 *
(trunc i64 %v1 to i32)))}<%loop>
               * User: %1 = sub %iv2, %0 (ret true)
                       SCEV: {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1
* (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop>
                  * User: %5 = sub %1, %4 (already processed) **** ADD as user
of %1 def ****
                  * User: %2 = sitofp %1 (ret false) **** ADD as user of %1 def
****
   * User: %0 = trunc %iv (ret true)
           SCEV: {(trunc i64 %v1 to i32),+,1}<%loop>
      * User: %1 = sub %iv2, %0 (already processed) **** ADD as user of %0 def
****
IV Users for loop %loop:
  %iv.inc = {(1 + %v1),+,1}<%loop> in    store i64 %iv.inc, i64* %addr,
align 8
  %1 = {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 * (trunc i64 (-1 *
%v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop> in    %5 = sub i32
%1, %4
  %1 = {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 * (trunc i64 (-1 *
%v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop> in    %2 = sitofp
i32 %1 to double
  %0 = {(trunc i64 %v1 to i32),+,1}<%loop> in    %1 = sub i32 %iv2, %0
DFT (2)
%iv = phi [%v1], [%iv.inc] (ret true)
SCEV: {%v1,+,1}<%loop>
   * User: %0 = trunc %iv (ret true)
           SCEV: {(trunc i64 %v1 to i32),+,1}<%loop>
      * User: %1 = sub %iv2, %0 (ret true)
              SCEV: {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 * (trunc
i64 (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop>
         * User: %5 = sub %1, %4 (ret true)
                 SCEV: {((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64
%v1 to i32)) + %v2),+,((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64
%v1 to i32)))}<%loop>
            * User: %iv2 = phi [%v2], [%5] (ret true)
                    SCEV: {%v2,+,((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 *
(trunc i64 %v1 to i32)))}<%loop>
               * User: %1 = sub %iv2, %0 (already processed) **** ADD as user of
%iv2 def ****
         * User: %2 = sitofp %1 (ret false) **** ADD as user of %1 def ****
   * User: %3 = sub 0, %iv (ret true)
           SCEV: {(-1 * %v1),+,-1}<%loop>
      * User: %4 = trunc %3 (ret true)
              SCEV: {(trunc i64 (-1 * %v1) to i32),+,-1}<%loop>
         * User: %5 = sub %1, %4 (already processed) **** ADD as user of %4 def
****
   * User: %iv.inc = add %iv, 1 (ret true)
           SCEV: {(1 + %v1),+,1}<%loop>
      * User: store %iv.inc, %addr (ret false) **** ADD as user of %iv.inc ****
      * User: %iv = phi [%v1], [%iv.inc] (PHI already processed)
IV Users for loop %loop:
  %iv2 = {%v2,+,((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to
i32)))}<%loop> in    %1 = sub i32 %iv2, %0
  %1 = {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 * (trunc i64 (-1 *
%v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop> in    %2 = sitofp
i32 %1 to double
  %4 = {(trunc i64 (-1 * %v1) to i32),+,-1}<%loop> in    %5 = sub i32 %1,
%4
  %iv.inc = {(1 + %v1),+,1}<%loop> in    store i64 %iv.inc, i64* %addr,
align 8
 For point (A), first take a close look at what the DFT does when it encounters
the def of %1 as a User in DFT (1). The first time that %1’s def is encountered
is as a user of %iv2. We continue traversing through the def of %1 at this point
to process the users of it because we haven’t encountered the def of %1 yet
(i.e. Processed.count(%1) is 0). However, we encounter the def of %1 again in
this same DFT later as a user of the def of %0. This second time that we
encounter %1 we don’t traverse its users because we’ve seen %1 before (i.e.
Processed.count(%1) is 1), and so we add the SCEV of %0 to the IVUsers set. In
DFT (2) we encounter the def of %1 first as a user of %0 — which allows the DFT
to process the users of %1 instead of adding the SCEV of %0 to the IVUsers set.
 For point (B), take a look at the traversal order in DFT (1). You’ll see the
chain %iv -> %3 -> %4 -> %5 -> %iv2 -> %1 -> %5. The def of %5
appears twice in the same def-use chain, and the second time we see it will be
the same situation as in point (A). The reason that we revisit %5 in the same
def-use chain is that we allow the DFT to continue through %iv2 (%iv2 is a phi
node in the loop-header) that will, in effect, allow the DFT to loop-around and
revisit. The result is adding the def of %5 as a user of %1’s SCEV in DFT (1).
We don’t do the same in DFT (2) because we don’t visit the def of %5 in the same
way. Instead, in DFT (2) the def of %1 ends up being the value that appears
twice in a def-use chain.
 What I *think* would be proper for IVUsers is to have IVUsers::AddUserImpl() :
A) Memoize its result for each given instruction instead of having the
Processed.count(User) condition at lines 235 & 241. The presence of
Processed.count(User) in the conditions at lines 235 & 241 basically has the
effect of changing the return value of AddUsersImpl() for “interesting”
instructions from true to false, which results in different behaviour depending
on the order in which instructions are visited. Interestingly, if we don’t
memoize the result of AddUsersImpl() at all, but instead just remove the
Processed.count(User) parts of these two if statements, then the return value of
AddUsersImpl() will *always* be true and we would have consistent results for
recursive calls on “interesting” SCEVS, but inconsistent results for recursive
calls on instructions that aren’t interesting.
B) Similar to the condition at line 212, don’t let the DFT continue into Users
that are phi nodes in the loop-header. We’re going to visit every phi node in
the loop-header as the root of a DFT, in turn, anyways, so this just prevents
the possibility of revisiting the same instruction multiple times in the same
def-use chain.
These seems sensible.
 If I mock these two changes up, the IVUsers sets in all three of my original
situations (IR as-is, IR with header-phi’s rearranged, and IR with use-list
ordering) all produce the exact same set of IVUsers. However, the set of IVUsers
ends up being a subset of what we previously had. I don’t know if this is
correct because the IVUsers analysis doesn’t appear to be well-defined with
respect to what it should be generating. Specifically, with this test we end up
with slight variations of this DFT:
%iv = phi i64 [ %v1, %entry ], [ %iv.inc, %loop ]
SCEV: {%v1,+,1}<%loop>
   * User: %iv.inc = add i64 %iv, 1 (ret true)
           SCEV: {(1 + %v1),+,1}<%loop>
      * User: store i64 %iv.inc, i64* %addr, align 8. (ret false) **** ADD as
user of %iv.inc def ****
      * User: %iv = phi i64 [ %v1, %entry ], [ %iv.inc, %loop ] (already
processed PHI, skip)
   * User: %3 = sub i64 0, %iv (ret true)
           SCEV: {(-1 * %v1),+,-1}<%loop>
      * User: %4 = trunc i64 %3 to i32 (ret true)
              SCEV: {(trunc i64 (-1 * %v1) to i32),+,-1}<%loop>
         * User: %5 = sub i32 %1, %4 (ret true)
                 SCEV: {((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64
%v1 to i32)) + %v2),+,((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64
%v1 to i32)))}<%loop>
            * User: %iv2 = phi i32 [ %v2, %entry ], [ %5, %loop ] (PHI in loop
header, skip)
   * User: %0 = trunc i64 %iv to i32 (ret true)
           SCEV: {(trunc i64 %v1 to i32),+,1}<%loop>
      * User: %1 = sub i32 %iv2, %0
              SCEV: {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 * (trunc
i64 (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop>
         * User: %5 = sub i32 %1, %4 (memoized - ret true)
         * User: %2 = sitofp i32 %1 to double (ret false) **** ADD as user of %1
def ****
%iv2 = phi i32 [ %v2, %entry ], [ %5, %loop ]
SCEV: {%v2,+,((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to
i32)))}<%loop>
   * User: %1 = sub i32 %iv2, %0 (memoized - ret true)
IVUsers:
  %iv.inc = {(1 + %v1),+,1}<%loop> in    store i64 %iv.inc, i64* %addr,
align 8
  %1 = {((-1 * (trunc i64 %v1 to i32)) + %v2),+,(-1 + (-1 * (trunc i64 (-1 *
%v1) to i32)) + (-1 * (trunc i64 %v1 to i32)))}<%loop> in    %2 = sitofp
i32 %1 to double
 This looks sensible to me, but, again, I don’t really have a good sense of what
the “proper” IVUsers results are expected to be.
Compared to the previous results, this is missing:
  %0 = {(trunc i64 %v1 to i32),+,1}<%loop> in    %1 = sub i32 %iv2, %0
  %4 = {(trunc i64 (-1 * %v1) to i32),+,-1}<%loop> in    %5 = sub i32 %1,
%4
  %iv2 = {%v2,+,((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to
i32)))}<%loop> in    %1 = sub i32 %iv2, %0
Can you comment on why?
Sure thing...
First:
%0 = {(trunc i64 %v1 to i32),+,1}<%loop> in    %1 = sub i32 %iv2, %0
This result was only present in the IVUsers when the original IR was run, and
when the phi nodes in the loop-header are rearranged. It does not appears in the
IVUsers when altering the use-list order of the %iv phi.
It’s there due to both points (A) & (B). With the original IR (i.e. DFT (1),
above) we add this user when looking at the chain:
%iv -> %0 -> %1
But, we look at this chain *after* we have looked at the chains: %iv ->
%iv.inc ->  %3 -> %4 -> %5 -> %iv2 -> %1 -> [%5, %2]
Because we are looking at %iv -> %0 -> %1 *after* we have looked at a
different chain that contains %1, we will have Processed.count(%1) being 1.
Thus, the condition on line 241 will read this as chain termination, and add the
SCEV of %0 to IVUsers as used by %1. However, this is a different behaviour than
we would get had we been looking at the chain fragment %iv -> %0 -> %1
*without* having previously visited %1 in a different part of the DFT; had we
not previously seen %1, then Processed.count(%1) would be 0 and we would recurse
into AddUsersImpl(%1) which would, ultimately, return true and result in %1 not
being added as a user of the SCEV of %0.
It’s the same sort of situation in the IR with the phi nodes in the loop-header
rearranged.
This sort of thing strikes me as incorrect. It’s a situation where the DFT’s
behaviour changes just because we’re revisiting a node instead of seeing it
fresh. In the mocked-up “fixed” code, every time that we see %1 in a def-use
chain we, effectively, treat it the same way — by recursing into it, and
returning ‘true’.
Second:
%iv2 = {%v2,+,((-1 * (trunc i64 (-1 * %v1) to i32)) + (-1 * (trunc i64 %v1 to
i32)))}<%loop> in    %1 = sub i32 %iv2, %0
This one only appears in the IVUsers list when the use-list order of the %iv phi
is rearranged. (i.e. DFT (2))
Specifically, it’s added when looking at this def-use chain: %iv -> %0 ->
%1 -> %5 -> %iv2 -> %1    — notice that %1 appears as a user twice
(once for each def it uses) in this one def-use chain.
This one is only added because Processed.count(%1) is 1 when looking at the
users of %iv2 in this chain. So, we’re in the same situation where we do one
thing the first time we see %1 as a user and do something completely different
the second time. The condition on line 241 ends up being false the first time we
visit %1, and true the second time, which results in the
fragility/traversal-order-dependency that I’m seeing in IVUsers.
Again, strikes me as incorrect because the behaviour is inconsistent.
Actually, this sort of def-use chain traversal is why I think that it’s
important to stop the DFT if encountering a use that’s a phi in the loop-header
if we’re going to be memoizing the result of AddUsersImpl(). If we don’t stop
the traversal in a header-phi, then we’d have a situation where we haven’t
memoized the result of AddUsersImpl(%1) yet when calling AddUsersImpl(%1) for
the second time — because we’re still trying to determine what AddUsersImpl(%1)
will return in the first call. Depending on how the memoization is implemented,
the result is either an infinite loop, or a potentially inconsistent result.
I can’t really comment on whether there’s value in having the SCEVs of %iv2 and
%0 in the IVUsers list for this example. I genuinely don’t know whether there is
or is not.
As a point that indicates to me, at least, that the fixes that I tried are at
least on the right track: The IVUsers found with the mocked-up fix in place are
exactly the intersection of the three different IVUsers sets that I found with
this one IR example.
Cheers,
 Daniel
Thanks again,
Hal
 Thoughts anyone?
Thanks,
 Daniel
---
Daniel Neilson, Ph.D.
Azul Systems
--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
---
Daniel Neilson, Ph.D.
Azul Systems
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170915/6b636c15/attachment-0001.html>
Reasonably Related Threads
- IVUsers pass is fragile. Is this okay? How can it be resolved?
- IVUsers pass is fragile. Is this okay? How can it be resolved?
- IVUsers pass is fragile. Is this okay? How can it be resolved?
- [LLVMdev] Deriving undefined behavior from nsw/inbounds/poison for scalar evolution
- RFC for a design change in LoopStrengthReduce / ScalarEvolution